TransWikia.com

How to predict income n years after graduation using income dataset with age, but not graduation year?

Economics Asked by Aidan O'Gara on December 21, 2020

I’m trying to predict the income of graduates of certain Master’s programs n years after graduation, using the American Community Survey individual level dataset to calibrate my equation. It gives me observations of individuals listing their income, age, degree program, and other variables, but doesn’t tell me years since graduation or years in the workforce. How can I best predict income n years after graduation?

I’d like my final product to be a regression equation of the form:

income_year_n = starting_salary(many variables) * (1+rate_of_raises)^n
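Taking logs of both sides turns this form into something an ordinary linear regression can estimate. This is only a sketch: it assumes income is strictly positive and that n is either observed or replaced by a proxy.

$$
\log(\text{income\_year\_n}) = \log(\text{starting\_salary}) + n \cdot \log(1 + \text{rate\_of\_raises})
$$

The intercept then estimates log(starting_salary) and the slope on n estimates log(1 + rate_of_raises).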

My initial approach is along the lines of:

  • The youngest people with this degree are 23 years old. They all must have graduated the previous year. Use them to calibrate the starting_salary function.
  • The next youngest group is 24. They must either have one year of experience, or have just graduated. I know roughly the proportion for each, because I know how many 23 year olds there are in the dataset, but I don’t know which are the new graduates and which graduated last year.
  • The next youngest group is 25. These people have 0, 1, or 2 years of experience, and I know the proportions for each.

And so on. This feels like the beginning of an approach that would work, but I can’t figure out how to go from this data to a regression equation that predicts income after a certain number of years.

Any ideas? I know there’s no perfect solution; I’m looking for a close-enough hack so that I can actually use the equation in the real world. Anything would be greatly appreciated, thanks a ton.

One Answer

I think you're looking for false precision here. "Years since graduation" is already a step function with an essentially arbitrary cutoff. Should it really matter that much to your model whether somebody is one day shy of the end of year 1 or one day into year 2? The actual difference is a few days.

Just use age minus 23, or just use age in your model. It shouldn't impact things that much either way.
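As a rough sketch of what that could look like, the snippet below fits the question's growth equation in log form, using age minus 23 as the stand-in for n. The data here is synthetic and the numbers (a 60,000 starting salary, 3% raises, the noise level) are made up purely for illustration; with the real survey extract you would substitute its age and income columns, and the variable names are my own.

```python
import numpy as np

# Synthetic stand-in for the survey data (hypothetical values, not ACS figures).
rng = np.random.default_rng(0)
age = rng.integers(23, 60, size=5000)        # ages 23..59
years_proxy = age - 23                       # proxy for years since graduation
true_start, true_rate = 60_000.0, 0.03       # assumed only to generate the data
noise = np.exp(rng.normal(0.0, 0.2, size=age.size))
income = true_start * (1 + true_rate) ** years_proxy * noise

# log(income) = log(starting_salary) + n * log(1 + rate_of_raises)
slope, intercept = np.polyfit(years_proxy, np.log(income), deg=1)
starting_salary = float(np.exp(intercept))
rate_of_raises = float(np.exp(slope) - 1)

def predict_income(n):
    """Predicted income n years after the assumed graduation age of 23."""
    return starting_salary * (1 + rate_of_raises) ** n

print(f"starting salary ~ {starting_salary:,.0f}, raises ~ {rate_of_raises:.2%}")
```

Other variables (degree program, region, and so on) could be added as extra regressors on the log scale, which would make the starting salary depend on them.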

Answered by Bill Clark on December 21, 2020
