Fixed effects with panel data vs including lagged variables with cross section data

Cross Validated Asked by gannawag on January 3, 2022

I have panel data with many groups $i$ and two time periods $t$.
I want to know the effect of a binary treatment $D$ on a continuous outcome $Y$. Some groups go from untreated to treated, while others are treated in both periods, and others are untreated in both periods.
I am considering two approaches, and I’m curious about the differences between the two.

Approach 1: Fixed effects with panel data

I shape the data into long format, where each observation is a group-time period (so each group has two observations in this case). Then I run the following regression:

$Y_{it} = \delta_1 D_{it} + \alpha_i + \gamma_t + \epsilon_{it}$

where $\alpha_i$ is a group-level fixed effect and $\gamma_t$ is a time-period fixed effect (in this case it would just be a dummy for the second time period).
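For concreteness, here is a minimal sketch of Approach 1 on simulated data. Everything in it is an assumption made for illustration (the number of groups, the treatment paths, the effect sizes, the noise level, and all variable names); none of it comes from my actual data. It fits the regression with explicit group and period dummies, and also checks the standard result that with exactly two periods the same $\delta_1$ comes out of regressing $Y_{i2} - Y_{i1}$ on a constant and $D_{i2} - D_{i1}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of groups

# Hypothetical treatment paths: 80 never treated, 70 switchers, 50 always treated.
d_prev = np.repeat([0.0, 0.0, 1.0], [80, 70, 50])  # D in period 1
d_curr = np.repeat([0.0, 1.0, 1.0], [80, 70, 50])  # D in period 2

alpha = rng.normal(size=n)          # group fixed effects
true_delta, true_gamma = 2.0, 0.5   # assumed values for the simulation
y_prev = alpha + true_delta * d_prev + rng.normal(scale=0.1, size=n)
y_curr = alpha + true_gamma + true_delta * d_curr + rng.normal(scale=0.1, size=n)

# Long format: one row per group-period.
group = np.tile(np.arange(n), 2)
period = np.repeat([0.0, 1.0], n)
D = np.concatenate([d_prev, d_curr])
Y = np.concatenate([y_prev, y_curr])

# Approach 1 as a dummy-variable regression:
# columns are D, a period-2 dummy, and one dummy per group.
X_fe = np.column_stack([D, period, np.eye(n)[group]])
delta_fe = np.linalg.lstsq(X_fe, Y, rcond=None)[0][0]

# Same estimate from first differences:
# regress (Y_2 - Y_1) on a constant and (D_2 - D_1).
X_fd = np.column_stack([np.ones(n), d_curr - d_prev])
delta_fd = np.linalg.lstsq(X_fd, y_curr - y_prev, rcond=None)[0][1]

print(delta_fe, delta_fd)
```

In the first-difference form, the always-treated and never-treated groups have $D_{i2} - D_{i1} = 0$, so they only help pin down the time effect; $\delta_1$ is identified by comparing switchers with non-switchers.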

Approach 2: Including lagged variables with cross section data

I reshape the data into wide format, so each observation is a group. This gives me two new variables: the lagged outcome ($Y_{t-1}$) and the lagged treatment status ($D_{t-1}$). The $i$ subscript is gone. Then I run the following regression:

$Y_{t} = \delta_2 D_{t} + \beta_1 D_{t-1} + \beta_2 Y_{t-1} + \nu_{t}$
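A minimal sketch of Approach 2, again on simulated data. The data-generating process, sample size, and variable names are all hypothetical, and I add an intercept to the regression, which is an assumption on my part since the equation above does not include one.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200  # hypothetical number of groups

# Hypothetical treatment paths: 80 never treated, 70 switchers, 50 always treated.
d_prev = np.repeat([0.0, 0.0, 1.0], [80, 70, 50])  # D in period 1
d_curr = np.repeat([0.0, 1.0, 1.0], [80, 70, 50])  # D in period 2

alpha = rng.normal(size=n)  # group-level heterogeneity (assumed)
y_prev = alpha + 2.0 * d_prev + rng.normal(scale=0.1, size=n)
y_curr = alpha + 0.5 + 2.0 * d_curr + rng.normal(scale=0.1, size=n)

# Long-format arrays, sorted by period and then by group.
period = np.repeat([0, 1], n)
D = np.concatenate([d_prev, d_curr])
Y = np.concatenate([y_prev, y_curr])

# Wide format: one row per group; the period-1 values become the lags.
Y_t, Y_lag = Y[period == 1], Y[period == 0]
D_t, D_lag = D[period == 1], D[period == 0]

# Regress Y_t on an intercept (assumed), D_t, D_{t-1}, and Y_{t-1}.
X = np.column_stack([np.ones(n), D_t, D_lag, Y_lag])
coef = np.linalg.lstsq(X, Y_t, rcond=None)[0]
intercept, delta_2, beta_1, beta_2 = coef

print(delta_2, beta_1, beta_2)
```

Note that in this simulation the group effects are drawn independently of treatment, so it only shows the mechanics of the reshape and the regression. It says nothing about how the two approaches compare when treatment is related to unobserved group characteristics.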

What is the difference between the two approaches? Is one generally preferred, or is it context-specific? Here is a screenshot of some made-up data in long format. The wide format uses only the observations where the lags are not NA.
