I am currently the Henry Putnam University Professor of Sociology at Princeton University. I am also a Research Associate at the National Bureau of Economic Research and a faculty affiliate at the New York Genome Center. Additionally, I serve in a pro bono capacity as the Dean of Health Sciences at the University of the People, a tuition-free, accredited, online college committed to expanding access to higher education.
My sociology doctoral thesis--written eons ago--explored the role of parental net-worth in perpetuating racial inequality among the post-Civil Rights cohorts of black and white Americans. It was not until 1984 that multiple U.S., nationally-representative social science surveys such as the Panel Study of Income Dynamics (PSID) and the Survey of Income and Program Participation (SIPP) among others started collecting decent data on households' assets and debts. This allowed for a more complete picture of total family economic resources than was gained by just measuring income, occupation and education. In fact, when thrown all together in a model, parental income and occupation did not seem to matter to children's life chances. Only parental education and net worth retained predictive power. This observation was especially relevant to race dynamics in the U.S., since data showed that the median black family had an order of magnitude less wealth than the typical white family--and that income differences explained only about half this gap. Indeed, when comparing young adults who came from families with the same parental education and wealth levels, there were no observable Black-White gaps in educational attainment, welfare usage, labor market attachment and so on. Not just race, but effects of family structure (such as single parenthood) also seemed to be proxies for wealth effects.
As I was turning my thesis into a book (Being Black, Living in the Red), I came across What Money Can’t Buy: Family Income and Children’s Life Chances by Susan Mayer of the University of Chicago Harris School. Her work challenged my assumptions and altered my research trajectory forever. In this clever volume, Mayer deployed a number of counterfactuals to show that the traditional estimates of the effect of income on children’s life chances had been overstated. For example, she showed that a dollar from a transfer payment had little to no effect on children while a dollar from earnings had a much bigger effect—suggesting that it was the underlying attributes of the parents associated with earnings that were having the positive effect, not the dollars per se. Further, controlling for parental income after the fact—i.e. when the offspring was already an adult—wiped out the effect of parental income that temporally preceded the child measure in question. She also showed that additional income did not usually result in the purchase of goods or services we might expect to improve the human capital or life chances of children. While there are certainly limitations to her work and some questionable assumptions in her models, she upended the world of poverty research as far as I was concerned, lifting a methodological veil to reveal biased parameter estimates plaguing the literature. (It is important to note that much research in the subsequent decades has shown income to have a robust causal effect on children's outcomes.)
While I went on to publish my book with the appropriate warnings against interpretation of parental wealth “effects” as causal, the Mayer work sent me off in search of a correctly specified way to assess the impact of parental resources and family conditions on children’s outcomes. This journey led me first to econometrics and labor economics, which I viewed as ahead of sociology in confronting the issue of endogeneity and selection bias. Though I found difference-in-differences, instrumental variable, and regression discontinuity approaches helpful in generating more consistent estimates, such approaches all suffer from the limitation that the researcher has to "take what she could get," so to speak. In other words, the research questions that one can answer are driven by the availability of a natural experiment. There is—as far as I know—no good instrumental variable for parental wealth, for example. There is no regression discontinuity for race. Even if we consider randomized controlled trials, there remain issues such as limited external validity as pointed out by Angus Deaton and others. What's more, reliance on RCTs or natural experiments puts narrow boundaries on the sorts of factors that are manipulable (and the population of compliers who respond to treatment) and therefore amenable to being studied in a causal, counterfactual framework. To quote Penn sociologist Herb Smith, “Nobody denies that the moon causes the tides even though we can’t perform an experiment on it.”
This frustration, in turn, led me to study genetics. (I received a Ph.D. in biology from NYU in 2014.) Five decades of research in behavioral genetics has demonstrated that the additive genetic component of the variation in traits in human populations is generally large and, correspondingly, the common (or family) portion of environmental variation explains little (with the notable exception of educational attainment). This is a huge gauntlet thrown down to social science researchers--in particular, to stratification scholars. Have we been studying family background effects thinking they were capturing the impact of household, school, and neighborhood environments when they were really reflecting shared genetics? Had we been focused on the wrong dimensions of the environment and instead should have been looking at "unique" environmental influences that are not shared by siblings? (See The Pecking Order on this last issue.)
The recent addition of genetic markers (single nucleotide polymorphisms, or SNPs) to large datasets such as the Health and Retirement Study, the National Longitudinal Study of Adolescent to Adult Health, and the Wisconsin Longitudinal Study has opened up a new frontier for the social sciences. We now enjoy the possibility of directly confronting, measuring, and controlling for one of the two main "lurking" variables that bias traditional models of socioeconomic attainment. That lurking variable is, of course, genetic endowment (the other being the influence of cultural practices that are also transmitted across generations). In my current research, I attempt to model directly what has until now been a latent variable in models of socioeconomic attainment. By constructing and including polygenic indices (PGIs) for outcomes, we can obtain better-specified, less-biased parameter estimates for the variables (such as parental education and so on) that typically interest social scientists. Further, we can then interact genetic propensity with exogenous environmental variables to go from the adage "a gene for aggression lands you in the board room if you grow up rich but in prison if you grow up poor" to a robust research agenda on GxE effects. I believe this is one of the next frontiers in stratification research: integrating the big data of genomics with the established social scientific models of mobility.
The collection of these data and advances in econometric methods represent a major potential shift for the social sciences as they broaden and deepen the study of the transmission of social behavior. To date, modeling genetic influences on social outcomes among humans was mainly the province of behavioral geneticists who relied on adoption studies and identical (MZ) v. fraternal (DZ) twin comparisons in order to quantify the degree of (unmeasured) genetic influence on behavioral phenotypes. These studies are controversial, and the assumptions underlying them have been questioned (e.g., Goldberger 1979; for a defense see Conley et al. 2013). The shift to the study of specific genetic markers offers hope for those interested in an explicit research program aimed at identifying genotype-specific effects for complex traits such as behavioral phenotypes (Manski 2011).
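For readers unfamiliar with the twin comparisons mentioned above, the classical decomposition (often called Falconer's formula) can be sketched in a few lines. This is an illustrative sketch only: the function name and the example correlations are hypothetical, and the formula rests on the very assumptions that critics have questioned (equal environments for MZ and DZ pairs, purely additive genetic effects, and no assortative mating).

```python
def falconer_decomposition(r_mz, r_dz):
    """Split trait variance into additive genetic (h2), shared
    environment (c2), and unique environment (e2) shares, given the
    trait correlation among identical (MZ) and fraternal (DZ) twins.

    Assumes equal environments, additive effects, random mating.
    """
    h2 = 2.0 * (r_mz - r_dz)   # MZ pairs share twice the additive variance of DZ pairs
    c2 = 2.0 * r_dz - r_mz     # what the twin resemblance leaves after genetics
    e2 = 1.0 - r_mz            # whatever makes even MZ twins differ
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical correlations, chosen only for illustration.
example = falconer_decomposition(r_mz=0.6, r_dz=0.4)
```

With these made-up inputs the three shares come out to roughly 0.4, 0.2, and 0.4, and they sum to one by construction.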
The first payoff of actual, measured genotypes for mobility research is in the specification of proper models for traditional variables. While individual markers at the genome-wide significance level (p < 5×10⁻⁸) have limited predictive value for behavioral outcomes on their own, researchers have had more success in generating overall polygenic indices by progressively adding the coefficients associated with genetic markers to generate a single, scalar variable that performs fairly well in terms of R². When controlling for PGIs, we can observe what happens to “traditional” variables in mobility research, such as parental education, income, occupation, or wealth. (Ideally, with some datasets we can control for both parents’ genotypes in addition to the offspring genotype.)
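One common way to build such an index is as a weighted sum of allele counts, with each marker weighted by its coefficient from a prior genome-wide association study. The sketch below assumes that construction; the function name, the toy genotype matrix, and the weights are all hypothetical, and real pipelines add steps (quality control, adjustment for linkage disequilibrium, ancestry controls) that are omitted here.

```python
import numpy as np


def polygenic_index(genotypes, weights):
    """Collapse many markers into one scalar per person.

    genotypes: (n_people, n_markers) array of allele counts (0, 1, or 2).
    weights:   (n_markers,) array of per-marker coefficients.
    Returns the weighted sum, standardized to mean 0 and SD 1.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    raw = genotypes @ weights                 # sum of coefficient * allele count
    return (raw - raw.mean()) / raw.std()     # standardize within the sample


# Toy data: four people, three markers (values are made up).
toy_genotypes = np.array([
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 0, 0],
])
toy_weights = np.array([0.02, -0.01, 0.03])

pgi = polygenic_index(toy_genotypes, toy_weights)
```

The resulting `pgi` vector can then enter a regression as a control alongside parental education, income, occupation, or wealth.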
In addition to generating genetically less-biased effects of social variables that are somewhat resistant to econometric techniques such as IV regression, the move to studying SNPs and other genetic polymorphisms has opened up a particularly promising research program on genetic-(social) environmental (GxE) interactions in human populations. The estimation of such interaction effects has long been a goal of social scientists fond of expressing the dependence of genetic expression on social structure. Caspi et al. (2002, 2003) suggested an important, genetic source of heterogeneity in responses to adverse early-life events, attempting to partially answer the question of why some individuals are resilient to stressors while others suffer deleterious psychological sequelae. While these studies created substantial interest in potential gene-by-environment interactions, they also required replication and extension by other researchers using alternative data. Indeed, there are now competing meta-analyses suggesting either that the original results linking differential response to stress by genotype are reasonably robust (Karg et al. 2011) or fail to replicate (Risch et al. 2009).
The discussion generated by this line of research in the biological and social science communities has been productive because it has led to a greater appreciation of the shortcoming of Caspi et al.’s research design, namely that the alleles and the proposed environmental modifiers may not be randomly assigned in the population and may therefore correlate with unobserved causal factors. For example, it may be the case that an observed interaction between a genetic variant and environmental exposure actually reflects differential risk of exposure (e.g., “genes selecting environments”) rather than the genetic moderation of exogenous environmental exposures. This is known as gene-environment correlation (rGE). In this way, measured environments may be correlated with unmeasured genetic variation and thus could be acting as a proxy for a gene-by-gene interaction rather than a GxE interaction. Conversely, early studies of candidate genes left open the possibility that the measured genotypes were themselves serving as proxies for unmeasured environments, leading to ExE effects disguised as GxE.
While there is active involvement in enrolling individuals in RCTs and examining genetic heterogeneity of causal effects to help solve these inference problems, this is only a small area of research (typically in pharmacology or toxicology) and likely does not have the capacity to answer many important GxE questions of broader relevance to social science. Because many social interventions occur on a large scale, such as state soda taxation, Medicaid expansion, and student loan cancellation, only large epidemiological and social science data and methods, combined with genetic and biomarker measures, can be deployed to examine broad questions of major policy import, such as why some interventions (like sin taxes) work for certain individuals and sub-populations and not for others, or why certain pedagogical styles are more or less effective with given student populations.
Thus, much of my current work applies econometric methods for causal inference--namely, a natural experiment framework--to genome-wide and/or sibling/twin data available in social surveys to model gene-by-environment interaction effects. Examples in this vein include deploying the Vietnam draft lottery to examine the heterogeneous effects of military service on educational attainment and smoking behavior, using twin differences in birth weight to examine post-natal racial disparities in infant mortality, modeling genetic interactions with exogenous job loss (such as plant closure), and linking sibling differences in genotype and environments to questions of health, development and socioeconomic attainment across the life course.
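The basic logic of pairing a polygenic index with an exogenous environment can be illustrated with a simple interaction regression on simulated data. Everything below is a hypothetical sketch: the data are synthetic, the exposure is assigned by coin flip (standing in for something like a lottery), the coefficient values are invented, and the function name `fit_gxe` is not from any published analysis. Because the exposure is randomized here, it is uncorrelated with genotype by design, which is what lets the interaction term be read as GxE rather than rGE.

```python
import numpy as np


def fit_gxe(y, pgi, env):
    """OLS of y on an intercept, the PGI, the exposure, and their product."""
    X = np.column_stack([np.ones_like(pgi), pgi, env, pgi * env])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(["intercept", "pgi", "env", "pgi_x_env"], beta))


# Simulated sample (all parameters are made up for illustration).
rng = np.random.default_rng(0)
n = 20_000
pgi = rng.standard_normal(n)                      # standardized polygenic index
env = rng.integers(0, 2, size=n).astype(float)    # randomized 0/1 exposure
y = 0.3 * pgi + 0.5 * env + 0.2 * pgi * env + rng.standard_normal(n)

estimates = fit_gxe(y, pgi, env)
```

In this setup the coefficient on `pgi_x_env` recovers how much the exposure's effect varies with genetic propensity. With observational (non-randomized) exposures, the same regression would be open to the gene-environment correlation problems described earlier.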
I also use genetics as a tool to interrogate social processes. For example, in some work, we use random variation in the metagenomic environment in schools to identify peer influence in a way that is not confounded by homophily, the reflection problem, or contextual effects. We control for genotypes to better isolate the social impact of colorist discrimination on cardiovascular health among African Americans. We use offspring genotype as a random shock to child endowment to study child investment and parental behavior. And we use spouses' genotypes to identify marital socialization effects. Lastly, I am also interested in mapping the genetic architecture of phenotypic plasticity, developing approaches to test for antenatal genetic selection, interrogating the assumptions underlying twin and GREML models for heritability, and characterizing social and genetic sorting (e.g., assortative mating and differential fertility) as distinct processes.
Other recent, non-genetic work has examined the transgenerational impact of birth order on educational achievement and the effects of cousin order and cousinship size on attainment, deployed random variation in local weather to assess the impact of children's activities on cognition, and assessed the impact of the "virtual physics" of online systems on crowd wisdom, social learning, and racial/gender/SES biases in evaluations.
I should note that subsequent experimental research, like the American Dream Demonstration project and the Oklahoma SEED study, has shown very minor to modest effects of wealth, highlighting the bias inherent in bread-and-butter regressions such as those I ran at the beginning of my career.
 Similar efforts are also underway in Europe, for example: the Biobank Project in the United Kingdom (Ollier et al., 2005), MoBA in Norway, and the large-scale genotyping of subjects at several European twin registries.