The Sociology Statistics Reading Group meets approximately biweekly to discuss interesting statistical methods papers drawn from a wide range of literatures. For each meeting we choose a paper that all members of the group read. A single discussant then walks through the material, followed by a broader group discussion. Please note that the presentations below often draw figures from the original papers and sometimes from presentations graciously provided by the paper authors.

NB: We won't be running the reading group this Fall (2020) because of COVID and my teaching schedule. I'm offering a related reading course in machine learning with social data. We plan to resume in the Spring.

For some exemplars of presentations, see the Causal Forests presentation by Ian Lundberg, the Fairness in Machine Learning presentation by Rebecca Johnson, or the Double Machine Learning presentation by Chris Felton (handout version).

**Year 3: 2018-2019**

(Coordinators: Rebecca Johnson & Simone Zhang)

May 1, 2019: Word Embeddings and Fairness

Garg et al. (2018) Word embeddings quantify 100 years of gender and ethnic stereotypes *Proceedings of the National Academy of Sciences*

Presenter: Ziyao Tian (Presentation)

April 24, 2019: Measuring Paradigmaticness with Text

Evans, Gomez and McFarland (2016) Measuring Paradigmaticness of Disciplines Using Text *Sociological Science*.

Presenter: Alex Kindel (Presentation)

March 13, 2019: Working with Body Camera Data

Voigt et al (2017) Language from police body camera footage shows racial disparities in officer respect. *Proceedings of the National Academy of Sciences*.

Presenter: Simone Zhang (Presentation)

February 20, 2019: Versions of Treatment (consistency in causal inference)

Hernán (2016) Does water kill? A call for less casual causal inferences. *Annals of Epidemiology*.

Presenter: Ian Lundberg (Presentation)

November 29, 2018: Machine Learning and Causal Inference

Chernozhukov et al (2018) Double/debiased machine learning for treatment and structural parameters. *Econometrics Journal*.

Presenter: Chris Felton (Presentation)

November 8, 2018: Sequence Analysis

Aisenbrey and Fasang (2010) New life for old ideas: The "second wave" of sequence analysis bringing the "course" back into the life course *Sociological Methods & Research*.

Presenter: Daniela Urbina Julio (Presentation)

October 25, 2018: Fairness in Machine Learning

Corbett-Davies and Goel "The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning"

Lakkaraju et al "The Selective Labels Problem: Evaluating Algorithmic Predictions in the Presence of Unobservables" in *KDD* 2017

Presenter: Rebecca Johnson (Presentation)

October 4, 2018: Computer Vision and Deep Convolutional Neural Networks

LeCun, Bengio and Hinton "Deep Learning" *Nature.* 2015.

Optional: Fei-Fei Li's TED Talk on Computer Vision

Optional: Goodfellow, Bengio and Courville *Deep Learning* Convolutional Networks Chapter

Presenter: Han Zhang (Presentation)

**Year 2: 2017-2018**

(Coordinator: Ian Lundberg)

May 3, 2018: Gradient Boosting

Efron and Hastie *Computer Age Statistical Inference*. Sections 17.2-17.4

Presenter: Jeremy Cohen (Presentation)

April 19, 2018: Stochastic Actor Oriented Models

Block et al (2018) *Social Networks*. Change we can believe in: Comparing longitudinal network models on consistency, interpretability and predictive power.

Presenter: Daniel Karell (Presentation)

March 29, 2018: Machine Learning and Personalized Policy

Bansak et al (2018) *Science* Improving refugee integration through data-driven algorithmic assignment

Presenter: Hannah Postel (Presentation)

March 8, 2018: Scale-Free Networks

Barabási (2015) *Network Science* Chapter 4: The Scale-Free Property

Broido and Clauset (2018) "Scale-free networks are rare"

Presenter: Cambria Naslund (Presentation)

March 1, 2018: LASSO and Causal Inference

Belloni, Chernozhukov and Hansen (2014) "High-Dimensional Methods and Inference on Structural and Treatment Effects." *Journal of Economic Perspectives*.

Hastie, Tibshirani and Friedman (2009) *Elements of Statistical Learning*. Selections from Chapter 3 on the LASSO.

Presenter: Daniela Urbina Julio (Presentation)

February 8, 2018: Mixture of Regressions

Imai and Tingley (2012) "A Statistical Method for Empirical Testing of Competing Theories." *American Journal of Political Science.*

Presenter: Ryan Parsons

December 14, 2017: High-Dimensional Interactions

Egami and Imai (2017) "Causal Interaction in Factorial Experiments: Application to Conjoint Analysis."

Presenter: Belén Unzueta

November 9, 2017: Sensitivity Analysis

Ding and VanderWeele (2016) "Sensitivity Analysis Without Assumptions." *Epidemiology*.

Presenter: Chris Felton (Presentation)

October 26, 2017: Exponential Random Graph Models

Robins, Pattison, Kalish and Lusher (2007) "An introduction to exponential random graph (p*) models for social networks." *Social Networks*.

Presenter: Janet Xu (Presentation)

October 12, 2017: Causal Forests: A Tutorial in High-Dimensional Causal Inference

Athey and Imbens (2016) "Recursive partitioning for heterogeneous causal effects" *Proceedings of the National Academy of Sciences*.

Presenter: Ian Lundberg (Presentation)

September 28, 2017: Problems with p-values

Simmons, Nelson and Simonsohn (2011) "False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant." *Psychological Science*.

Greenland et al. (2016) "Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations" *European Journal of Epidemiology*.

Presenter: Xinyi Duan

September 14, 2017: Predictive Social Science

Collins (1984) "Statistics versus Words" *Sociological Theory*.

Hofman, Sharma and Watts (2017) "Prediction and explanation in social systems" *Science*.

Cranmer and Desmarais (2017) "What can we learn from predictive modeling?" *Political Analysis.*

Presenter: Alex Kindel (Presentation)

**Year 1: 2016-2017**

(Coordinator: Clark Bernier)

May 4, 2017: Weights in Survey Experiments

Franco, Malhotra, Simonovits, Zigerell (2017) "Developing Standards for Post-Stratification Weighting in Population-Based Survey Experiments" *Journal of Experimental Political Science*.

Miratrix, Sekhon, Theodoridis and Campos (2017) "Worth Weighting? How to Think About and Use Sample Weights in Survey Experiments"

Presenter: Janet Xu (Presentation)

April 19, 2017: Mixed Factor Analysis

Quinn (2004) "Bayesian Factor Analysis for Mixed Ordinal and Continuous Responses" *Political Analysis*.

Presenter: Ryan Parsons

March 30, 2017: Fixed Effects

Imai and Kim (2016) "When Should We Use Linear Fixed Effects Regression Models for Causal Inference With Longitudinal Data?"

Presenter: Jeremy Cohen (Presentation)

March 16, 2017: Ecological inference

King, Rosen and Tanner (2004) "Information in Ecological Inference: An Introduction" in *Ecological Inference: New Methodological Strategies*.

Imai and Khanna (2016) "Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records" *Political Analysis*.

(Optional) King, Rosen, Tanner and Wagner "Ordinary Economic Voting Behavior in the Extraordinary Election of Adolf Hitler" *The Journal of Economic History*.

Presenter: Simone Zhang (Presentation)

*Special thanks to Gary King for sharing slides used in this section.

March 2, 2017: Conjoint Analysis

Hainmueller, Hopkins and Yamamoto (2014) "Causal Inference in Conjoint Analysis: Understanding Multi-Dimensional Choices Via Stated Preference Experiments" *Political Analysis*.

Presenter: Xinyi Duan

February 16, 2017: Mixed Membership Stochastic Blockmodels

Airoldi, Blei, Fienberg and Xing (2008) "Mixed Membership Stochastic Blockmodels" *Journal of Machine Learning Research*.

Presenter: Herissa Lamothe (Presentation)

December 1, 2016: Cross-Validation

Ward, Greenhill and Bakke (2010) "The perils of policy by p-value: Predicting civil conflicts" *Journal of Peace Research*.

Selections on Cross Validation from Andrew Ng and Hastie, Tibshirani and Friedman (2009).

Presenter: Alex Kindel (Presentation)

November 10, 2016: Front-door Adjustment Strategies

Glynn and Kashin (2014) "Front-door Versus Back-door Adjustment with Unmeasured Confounding: Bias Formulas for Front-door and Hybrid Adjustments"

Presenter: Ethan Fosse (Presentation)

October 27, 2016: Sensitivity Analysis

Blackwell (2014) "A Selection Bias Approach to Sensitivity Analysis for Causal Effects" *American Journal of Political Science*.

and

Morgan and Winship (2015) "Chapter 12: Distributional Assumptions, Set Identification, and Sensitivity Analysis" in *Counterfactuals and Causal Inference: Methods and Principles for Social Research*, Second Edition.

Presenter: Rebecca Johnson (Presentation)

October 13, 2016: Mediation

Imai, Keele, Tingley, Yamamoto (2011) "Unpacking the black box of causality: Learning about causal mechanisms from experimental and observational studies" *American Political Science Review.*

and

Acharya, Blackwell and Sen (2016) "Explaining Causal Findings Without Bias: Detecting and Assessing Direct Effects" *American Political Science Review.*

Presenter: Ian Lundberg

*Special thanks to Dustin Tingley and Matt Blackwell for providing us with slides for this session.

September 29, 2016: Interactions

Hainmueller, Mummolo, Xu (2016) "How Much Should We Trust Estimates from Multiplicative Interaction Models? Simple Tools to Improve Empirical Practice"

Presenter: Jason Windawi (Presentation)

*Special thanks to Yiqing Xu for supplying slides on which much of the presentation is based.

September 22, 2016: Colliders

Elwert and Winship (2014) "Endogenous selection bias: The problem of conditioning on a collider variable" *Annual Review of Sociology*

Presenter: Han Zhang (Presentation)