L08 Correlation versus Causation, or Ice Cream Sales Cause More Homicides

Recall from the previous lesson that one necessary condition for establishing causality is an association between two variables. With this in mind, consider two variables that are strongly associated with each other: ‘amount of ice cream sold’ and ‘homicide rate’.

Take a look at the following graph showing the change in both ice cream sales and homicide rates between the months of May and July. This graph clearly shows that as the amount of ice cream sold increases, so too does the number of murders that are committed.

As you watch the following short video that discusses this particular correlation, ask yourself: Does eating ice cream cause the homicide rate to increase?

https://www.youtube.com/watch?v=BaETnBzM7yU

Does eating ice cream cause an increase in homicides? Of course not. However, there is an important lesson here, namely that just because two variables are correlated does not mean that one causes the other.

While this case may seem somewhat trivial, the problem of confusing correlation with causation is not. More importantly, this case represents an instance in which it would be very difficult to use randomization as we would in an experimental design. So in these types of cases, specifically ones where randomization is ruled out because it is impractical, unethical, or implausible, how can we try to establish a causal relationship? This question forms the basis of the current lesson.

Just for Fun: More “Spurious Correlations”

Did you know that the number of sociology doctorates earned in the U.S. each year correlates with the frequency of deaths related to anticoagulants? Or that the annual rate of pool drownings correlates with the number of films Nicolas Cage appears in that year?

If you’re interested, visit Tyler Vigen’s website “Spurious Correlations” as a fun exercise and browse through the 30,000 correlations he has collected.

 

L08 Classical Experiments and Observational Studies in Criminological Research

Traditionally, the fields of criminology and criminal justice have not been grounded in laboratory science, nor do they descend from a social science paradigm with a strong experimental history. There are also concerns over issues like due process. Finally, many questions of interest are simply not amenable to experiments: researchers often lack control over the study sample or are interested in issues that do not fit the type of measurement experiments require.

Recall the concept of a counterfactual: a contrast between what did happen to a subject under one treatment and what would have happened under the other treatment. In an experiment, a control group is a proper counterfactual. Like experiments, nonexperimental methods seek to establish a counterfactual as well, though this is more difficult in the absence of randomization.

As we think about how we might establish causation without randomization, let’s recall the criteria for causation we covered in the last lesson:

  • The two variables must be empirically correlated with one another for a causal relationship to exist
  • The cause must precede the effect in time
  • The observed correlation between the two variables cannot be explained away by a third variable
  • A causal relationship is strengthened by identifying a causal mechanism
  • A causal relationship should be considered within its context

While the first two are generally easy to establish without an experiment (and the fourth and fifth are not required), it is the third one that presents the greatest challenge for nonexperimental methods.

Generally, in social science research, we rely on observational studies and the observational data which comes from them. An observational study is an empirical analysis of treatments or policies and the effects that they cause, which differs from an experiment in that the investigator has no control over the treatment assignments.

 

 

L08 Observational Studies and Causality

Recall the concept of internal validity from the previous lesson which asks, “Did the intervention actually cause the observed change in outcome?” In that lesson, we considered internal validity in an experimental setting in which the researcher has control over who receives (or is withheld from) an intervention and thus can assign it randomly.

But in observational studies, subjects are not assigned randomly to control and experimental groups.

So, does this mean that without randomization we can’t establish causality?

Take a moment and consider the well-known Surgeon General’s warning that is placed on packs of cigarettes.

We have no experimental evidence in which we randomly assign certain individuals to smoke (and follow up later to evaluate health problems). Yet we are still quite certain that smoking is a key risk factor for many bad health outcomes later in life. (Notice that the Surgeon General explicitly uses the word causes.)

So how do we establish causality in the absence of experimental evidence? How do we deal with selection bias?

 

 

L08 Defining ‘What Works’ in Observational Studies

When experimental controls are not an option, how do we evaluate observational studies? How do we know “what works”?

When evaluating the effect of a program or intervention, we can conduct an impact assessment.

In other words, we can ask: What effects does the program have on its intended outcomes? Are there important unintended effects? Is the program leading to change?

The key idea is that the more rigorous the research design, the higher the validity of the resulting estimate of the intervention effects. Although an experiment is the highest level of rigor, in the absence of an experiment we can utilize quasi-experimental or non-experimental methods.

Recall that in the previous lesson’s discussion of the Maryland Scientific Methods Scale, we looked at the report ‘What works, what doesn’t, and what’s promising’. This was the 1997 report to Congress, based on a systematic review of more than 500 scientific evaluations of crime prevention practices (only 16% of which used experimental methods).

Let’s take a look again at the 5-level scale the report used.

L07: Table 1: Sherman et al.’s Scientific Methods Scale (SMS)
SMS Score Description
1 Correlation between a crime prevention program and a measure of crime or crime risk factors
2 Temporal sequence between the program and the crime or risk outcome clearly observed, or a comparison group present without demonstrated comparability to the treatment group
3 A comparison between two or more units of analysis, one with and one without the program
4 Comparison between multiple units with and without the program, controlling for other factors, or a nonequivalent comparison group has only minor differences evident
5 Random assignment and analysis of comparable units to program and comparison groups

From Evaluation Level to ‘What Works’

So, given the scale above, and the studies reviewed, how did the authors define ‘what works’? The authors turned these numbers into four categories.

  • What Works: “Programs coded as ‘working’ by this definition must have at least two level 3 evaluations with statistical significance tests and the preponderance of all available evidence showing effectiveness.”
  • What Doesn’t Work: “Programs coded as ‘not working’… must have at least two level 3 evaluations with statistical significance tests showing ineffectiveness and the preponderance of all available evidence…”
  • What’s Promising: “Programs are coded as ‘promising’ if they were found effective in at least one level 3 evaluation.”
  • What’s Unknown: Any program not placed in any of the three categories above.

Notice that many programs could be classified as ‘working’ even though the supporting evaluations used much weaker research designs than level 5, or random assignment.

 

L08 Selection Bias

When randomization of an intervention is not possible, subjects can self-select into treatment. By extension, differences in outcomes between treatment and control groups may be due to preexisting differences between the two groups, rather than an actual causal effect of the intervention. Such differences are known as selection effects or selection biases. Selection can make it appear as if a causal effect is present even if one is not.

Selection bias occurs when treatment and comparison groups differ at the beginning (or, through attrition, at the end) of the study. In many applications, being able to understand and diagnose selection bias is more important than being able to ‘solve’ it: methods that correct for it can get quite complicated and are often not guaranteed to ‘fix’ anything. Be cautious of papers that use sophisticated methods to make causal claims; they are often wrong. Finally, remember that sometimes correlation is okay, and it can be a useful result.

There are lots of ways to deal with selection bias, including:

  • Regression
  • Matching
  • Within-person Comparisons
  • Case-Control Design

Most of these require some more advanced knowledge of statistical methods.

Regression, for example, allows you to explain variability in a dependent variable (y) with an independent variable (x). With multiple regression, the idea is that you can statistically ‘control’ for other confounders. Be careful with regression: although it looks sophisticated, it does not by itself license causal statements!
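As a sketch of what statistical ‘control’ means, the following Python example (simulated data, hypothetical numbers) uses the residualization logic behind multiple regression: regress both x and y on a confounder z, then relate the residuals. The naive x–y slope is badly biased by the confounder; the adjusted slope is near the true value of zero:

```python
# Sketch of how multiple regression 'controls' for a confounder: regress
# both y and x on the confounder z, then relate the residuals. All data
# here are simulated (hypothetical).
import random
from statistics import mean

def simple_ols(x, y):
    """Slope and intercept from a simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
            sum((xi - mx) ** 2 for xi in x)
    return slope, my - slope * mx

def residuals(x, y):
    """Residuals from regressing y on x."""
    b, a = simple_ols(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

random.seed(1)
z = [random.gauss(0, 1) for _ in range(2000)]       # confounder (e.g., SES)
x = [zi + random.gauss(0, 1) for zi in z]           # 'treatment' driven by z
y = [2 * zi + random.gauss(0, 1) for zi in z]       # outcome driven ONLY by z

naive_slope, _ = simple_ols(x, y)                   # biased: picks up z's effect
adj_slope, _ = simple_ols(residuals(z, x), residuals(z, y))  # controlled: near 0

print(round(naive_slope, 2), round(adj_slope, 2))
```

A full multiple regression (as in the Piquero example that follows) estimates all the coefficients at once, but the ‘holding other variables constant’ interpretation rests on this same residualization idea.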

We’ll consider examples of three of these next: Regression, Matching and Within-person Comparisons, beginning with an example of regression analysis.

 

L08 Example of Regression Analysis: Early Onset Offending and Neuropsychological Deficits

A regression analysis lets you ‘control’ for other factors which might affect both the outcome and key independent variables. Let’s consider an example of this.

In his article “Testing Moffitt’s neuropsychological variation hypothesis for the prediction of life-course persistent offending”, Piquero (2001) is interested in testing Moffitt’s theory of antisocial behavior, specifically the hypothesis that the presence of neuropsychological deficits should differentiate life-course persistent offenders (as measured by violent and non-violent offending).

To test this hypothesis, Piquero examines the association between early onset offending and neuropsychological (NP) deficits, measured at age 7 using the Wechsler Intelligence Scale for Children (WISC), notably its verbal and performance subscales.

In the regression, Piquero controls for other known correlates of offending, including sex, SES, and family structure (which, as stated above, may affect the outcome).

In the output below, the numbers reported in the table are regression coefficients (with standard errors). The negative coefficient on the verbal subscale means that as verbal intelligence increases, violent offending decreases. The other variables (e.g., sex, family structure, low birth weight) are controls that we think might be relevant confounders.

Hierarchical logistic regression estimates predicting non-violent and violent offending

Variable                      Model 1 B (se)     Model 2 B (se)
Sex                           -1.703 (0.637)     -2.006 (0.663)
Low Birth Weight               0.370 (0.418)      0.465 (0.436)
SES                           -0.010 (0.013)      0.076 (0.163)
Biosocial Interaction          0.085 (0.158)      0.076 (0.163)
WISC – Verbal Subscale              –             -0.227 (0.092)
WISC – Performance Subscale         –             -0.180 (0.111)

As Piquero found, this shows that for NP deficit measures, the association between verbal score and the outcome is negative as hypothesized, even after controlling for other correlates of offending.

 

 

L08 Example of Matching: Adult Transfer

Matching is a way to create a counterfactual by comparing subjects who ‘look alike’ in terms of many or most observable characteristics. Though the process of creating matches can get somewhat complicated, the result is very easy to understand and explain to nontechnical audiences.

Let’s consider matching using an article researching the effect of adult transfer on future juvenile recidivism.

In their research study, “Differential Effects of Adult Court Transfer on Juvenile Offender Recidivism,” Loughran et al. (2010) compared differences between transferred youth and those retained in the juvenile system in terms of possible confounders at baseline, i.e., before transfer.

Below are the baseline differences before matching. The values for juvenile and adult are the means for either of these groups, respectively, for each variable.

The t-stat is for the associated test of the null hypothesis of no difference. A t-stat with an absolute value greater than 1.96 means we can reject this null hypothesis at α = .05; that is, there is a statistically significant preexisting difference on this measure between the transferred and retained youth.

Pay attention to the t-stat values listed below. How many are greater than 1.96?

Table 1: Baseline differences between transferred youth and those retained in the juvenile system

Variable                   Transfer   Juvenile   T-stat
Age                          17.0       16.22      8.03
Male                          0.91       0.83      2.64
White                         0.23       0.36     -2.74
Suppression of Anger          2.95       2.68      2.84
Temperament                   2.87       2.68      2.39
# Priors – Ever               3.46       2.89      2.42
Exposure to Violence          5.70       4.74      3.03
Certainty of Punishment       5.47       5.91     -2.15
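The t-statistics in these tables come from standard two-sample tests of the difference in group means. Here is a minimal Python sketch (with invented numbers, not the study’s data) of how such a statistic is computed:

```python
# Computing a two-sample t-statistic by hand (Welch's version), the kind of
# balance test behind the baseline tables. Data are made up for illustration.
import math
from statistics import mean, variance

def t_stat(a, b):
    """Welch two-sample t-statistic for the difference in means."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Hypothetical ages for a handful of transferred vs. retained youth.
transfer = [17.2, 16.9, 17.4, 16.8, 17.1, 17.0, 17.3, 16.9]
juvenile = [16.1, 16.4, 16.0, 16.5, 16.2, 16.3, 15.9, 16.4]

t = t_stat(transfer, juvenile)
print(round(t, 2))
print("significant at .05" if abs(t) > 1.96 else "not significant")
```

Since |t| exceeds 1.96 here, we would reject the null hypothesis of no baseline difference at α = .05, just as for most rows in Table 1.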

 

Now let’s consider the same list, but this time after matching. What do you notice about the t-stat values after matching?

Table 2: After-matching differences between transferred youth and those retained in the juvenile system

Variable                   Transfer   Juvenile   T-stat
Age                          17.0       16.86      1.46
Male                          0.91       0.95     -1.17
White                         0.23       0.20      0.86
Suppression of Anger          2.95       2.91      0.48
Temperament                   2.87       2.77      1.21
# Priors – Ever               3.46       3.51     -0.02
Exposure to Violence          5.70       6.02     -1.01
Certainty of Punishment       5.47       5.77     -1.45

Notice that after matching, the groups tend to look much more similar than before. This allows us to assume our comparison group is a reasonable counterfactual.
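As a toy illustration of the idea (simulated data, a single covariate, nothing from Loughran et al.), the following Python sketch matches each ‘transferred’ subject to the closest available ‘retained’ subject on age and then re-checks balance:

```python
# Toy nearest-neighbor matching on one covariate (age), without replacement,
# followed by a balance check. All values are simulated.
import random
from statistics import mean

random.seed(7)
# Treated subjects skew older than the untreated pool.
treated = [random.gauss(17.0, 0.5) for _ in range(50)]
pool = [random.gauss(16.2, 0.8) for _ in range(400)]

# For each treated subject, pick the closest untreated subject still available.
available = sorted(pool)
matched = []
for t in treated:
    best = min(available, key=lambda c: abs(c - t))
    matched.append(best)
    available.remove(best)

print(round(mean(treated) - mean(pool), 2))     # sizable gap before matching
print(round(mean(treated) - mean(matched), 2))  # much smaller gap after matching
```

Real matching procedures typically balance many covariates at once (often via a propensity score), but the before/after balance-checking logic is the same as in Tables 1 and 2.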

 

 

L08 Within Person Designs

With both regression and matching, the strength of the counterfactual depends on how much we can observe and subsequently control for as possible confounders.

A within-person design lets each person serve as their own counterfactual: the estimates relate changes in the independent variable (IV) to changes in the dependent variable (DV). If there are confounders that do not change over time (such as demographic variables like sex, or relatively stable concepts like criminal propensity), they can be eliminated by studying this change. Otherwise, this tends to work much like a basic regression.
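A minimal Python sketch of the logic (simulated data, hypothetical effect size): each person is observed once before and once after an intervention, and differencing the two observations cancels out a stable, unobserved trait:

```python
# Within-person (first-difference) sketch: a time-invariant confounder
# (e.g., criminal propensity) drops out when we difference each person's
# two observations. All values are simulated.
import random
from statistics import mean

random.seed(3)
people = 1000
effect = -0.5  # true effect of the intervention on offending

diffs = []
for _ in range(people):
    propensity = random.gauss(0, 2)                 # stable trait, never observed
    y1 = propensity + random.gauss(0, 1)            # offending before intervention
    y2 = propensity + effect + random.gauss(0, 1)   # offending after intervention
    diffs.append(y2 - y1)                           # propensity cancels out

# Average change estimates the effect, free of the stable confounder.
print(round(mean(diffs), 2))
```

Notice that the estimate recovers the effect even though propensity was never measured; confounders that do change over time would still bias it.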

 

 

L08 Summary

When assessing causality in non-experiments, there are several issues to consider. First, how well can we meet the criteria for causality? Next, does one type of design do better than another?

Does the design meet the criterion of time order? In an experimental design, yes; in a nonexperimental design, maybe. Does the design meet the criterion of correlation? Both are equally capable of showing correlation.

Does the design meet the criterion of nonspuriousness? In an experimental design, yes, because the only difference between groups is the intervention. In a nonexperimental design, we can use statistical controls: that is, we can hold one or more variables constant so that relationships between two or more other variables can be examined apart from the influence of the ‘control’ variable(s).

In this lesson, we continued to think of how to make causal inference in the absence of experimental data, which is often rare and difficult to acquire in criminological studies. Instead, we must often rely on observational data in which individuals tend to nonrandomly select into interventions we wish to study. The preexisting differences between those who do and do not receive interventions are known as selection bias, and it tends to distort our estimate of the true effect of the intervention. To combat selection bias, researchers typically use methods such as regression, matching and within-person designs.

 

L08 Discussion: Is Prison Criminogenic?

 

For this week’s discussion, we are going to consider whether or not ‘placement causes crime’.

First, consider the graph below showing re-offending rates (measured in terms of both re-arrest and self-reported offending [SRO]) for a sample of juvenile offenders, some of whom were sentenced to prison (i.e., placement) and some of whom were sentenced to probation.

For rearrest, the t-statistic is -.51 and the p-value is less than .001. For self-reported offending, the t-statistic is -2.5 and the p-value is less than .05.

Notice that the differences between outcomes in both cases are statistically significant.

For your initial post this week, answer the following questions:

  • What can we say about the rates of offending for probation versus placement?
  • Can we say that placement causes crime? If not, why not?
  • What else could explain the differences we are observing here?

 

 
