Lab 11:  Modeling Integrated Data

This lab shows how to create geospatial data at the Census Public Use Microdata Area (PUMA) level. These area-level data are then integrated into the person-level data. You are asked to compare different fixed-effects and random-effects models for the integrated data.
  • The lab uses pumsbxak (AK 1% PUMS from Census 2000). All data and programs are also available on the SSG at /ssgprojects/courses/info7470/.
  • The lab can be performed using Stata.
  1. Compute average income and average house size (number of rooms) within each PUMA5 area.
  2. Link those data to the person-level file.
  3. Build a fixed-effects model that relates log wage-and-salary income to sex, age, educational attainment, average income in the area, and average house size in the area.
  4. In the same model replace the area variables with fixed generic geographic heterogeneity controls.
  5. In the same model replace the fixed generic geographic heterogeneity controls with random ones.
  6. Interpret the results.
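
The mechanics of steps 1, 2, and 4 can be sketched on toy data. This is only an illustration in plain Python, not the SAS or Stata program the lab expects. The records and field names (puma5, income, rooms, log_wage, age) are hypothetical stand-ins rather than actual PUMS variable names, and the fixed-effects step is reduced to a single regressor so that the within (demeaning) transformation is easy to follow.

```python
from collections import defaultdict
from statistics import mean

# Toy person-level records. All field names and values are hypothetical
# stand-ins, not actual PUMS variables or data.
persons = [
    {"puma5": "00100", "income": 40000.0, "rooms": 4, "log_wage": 10.2, "age": 30},
    {"puma5": "00100", "income": 60000.0, "rooms": 6, "log_wage": 10.8, "age": 45},
    {"puma5": "00200", "income": 30000.0, "rooms": 3, "log_wage": 9.9, "age": 25},
    {"puma5": "00200", "income": 50000.0, "rooms": 5, "log_wage": 10.5, "age": 50},
]


def area_means(rows, key, fields):
    """Step 1: average each listed field within each area."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {
        area: {f"avg_{f}": mean(r[f] for r in members) for f in fields}
        for area, members in groups.items()
    }


def link_area_data(rows, key, area_table):
    """Step 2: attach the area-level averages to every person record."""
    return [{**row, **area_table[row[key]]} for row in rows]


def within_slope(rows, key, y, x):
    """Step 4, one regressor only: OLS slope of y on x after subtracting
    area means from both. This gives the same slope as a regression that
    includes a separate intercept (dummy) for every area."""
    means = area_means(rows, key, [y, x])
    num = den = 0.0
    for row in rows:
        dy = row[y] - means[row[key]][f"avg_{y}"]
        dx = row[x] - means[row[key]][f"avg_{x}"]
        num += dx * dy
        den += dx * dx
    return num / den


areas = area_means(persons, "puma5", ["income", "rooms"])
linked = link_area_data(persons, "puma5", areas)
slope = within_slope(linked, "puma5", "log_wage", "age")
```

Note that the area averages from step 1 are constant within each area, so they would drop out of the demeaned regression. This is why step 4 replaces the area variables with the generic area controls rather than adding the controls alongside them.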

Files to submit

  • Submit a single program that achieves all of the above results. The program is expected to run without errors, starting only from the provided data. Clearly label any output listing (i.e., create proper table headings and footnotes, as if you were inserting the output into a term paper). Your program should clearly identify the author (i.e., you!). You can include the data selector program by adding the line
    %include "02.pums2000-missing.sas";
    at the top of your program. A template program is provided here and on the SSG.
  • Submit a document (Word, PDF, txt) that provides the interpretation of the results.

Submitting labs

  • Maximum group size: for programs, up to 3 students, if so declared. Otherwise: individual submissions only. Each student should still submit all required elements individually.
  • Submissions are made on the Course Management Site. The documents can be submitted here. (Note: the site is used only for submissions, not for the other functionality you will find there.)
  • Due date: April 26, 2011, 3:59PM.