I am an Assistant Professor of Sociology at Princeton University, where I am also affiliated with the Politics Department, the Office of Population Research, the Princeton Institute for Computational Science and Engineering, the Center for Information Technology Policy and the Center for the Digital Humanities. I develop new quantitative statistical methods for applications across computational social science. I completed my PhD in Government at Harvard in 2015, where I had the good fortune of working with the interdisciplinary group at IQSS. I also earned a master's degree in Statistics from Harvard in 2014.
I've worked extensively on methods for computer-assisted text analysis and, with Justin Grimmer, published an introduction to the field. Molly Roberts, Dustin Tingley and I have developed the Structural Topic Model, an unsupervised topic model geared towards inference in the social sciences. The accompanying software, stm, is available on CRAN and at structuraltopicmodel.com, and includes a full vignette demonstrating its use. I've also worked on the connections between text analysis and causal inference. Current papers cover adjusting for confounding with text, and text as treatment and outcome. I also contributed to a recent survey of these topics aimed at a computer science audience.
Justin Grimmer, Molly Roberts and I have a book coming out this spring from Princeton University Press, Text as Data: A New Framework for Machine Learning and the Social Sciences, which provides a guide to using computational text analysis to learn about the social world. A related perspective is given in our recent annual review piece on machine learning.
I've also written a new paper with my graduate students Ian Lundberg (now a postdoc at UCLA, then assistant professor at Cornell Information Science) and Rebecca Johnson (now assistant professor at Dartmouth) about the disconnect between claims and evidence in the social sciences. The paper encourages more precision in specifying the goals of quantitative work, where the target of estimation is too often specified in terms of regression coefficients rather than model-free estimands.
I teach undergraduate and graduate statistics as well as the occasional course on text analysis. Course materials for these classes, as well as from my summer methods camp and sociology statistics reading group, are available on my teaching page.