The LEHD Infrastructure Files and the Creation of the Quarterly Workforce Indicators
John M. Abowd, Bryce E. Stephens and Lars Vilhuber
Cornell University
U.S. Census Bureau, LEHD Program
May 6, 2005


  1. In this paper
     ■ Describe the construction of the LEHD infrastructure
       ✦ ... in particular the imputation mechanisms used
     ■ Describe the computation of the QWI statistics
       ✦ ... in particular the imputation mechanisms used
     ■ Describe the disclosure-proofing mechanism
     ■ Describe researcher access to infrastructure files and confidential QWI files
     Talk outline: Introduction (What are QWI?; What is it?; In this paper), Input Files, Infrastructure Files, Forming Aggregated Estimates: QWI, Disclosure-proofing the QWI, Publicly available files, Conclusion

  2. Input Files
     ➲ Wage records: UI
     ➲ Employer reports: ES202
     ➲ Demographics

  7. Wage records: UI
     ■ Report of an individual’s UI-covered earnings by an employing entity
     ■ Appears if at least one dollar was earned by that individual during the quarter
     ■ Identifies EARNINGS, EMPLOYER, TIME PERIOD
     ■ Some limited other state-dependent information available
     ■ In particular, for Minnesota, the ESTABLISHMENT is reported
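
As a minimal sketch of the record described on this slide, a UI wage record can be modeled as one row per individual, employer, and quarter. The field names (`pik`, `sein`, `year`, `quarter`, `earnings`) and the helpers `is_reportable` and `build_records` are illustrative assumptions, not the actual LEHD file layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WageRecord:
    """One UI wage record (hypothetical field names)."""
    pik: str         # person identifier
    sein: str        # employer identifier
    year: int
    quarter: int     # 1..4
    earnings: float  # UI-covered earnings in dollars


def is_reportable(earnings: float) -> bool:
    """A record appears only if at least one dollar was earned in the quarter."""
    return earnings >= 1.0


def build_records(rows):
    """Turn (pik, sein, year, quarter, earnings) tuples into wage records,
    dropping rows that would not appear in the wage-record file."""
    return [WageRecord(*row) for row in rows if is_reportable(row[4])]
```

State-dependent extras, such as the establishment reported by Minnesota, would be additional optional fields in a layout like this one.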

  13. Employer reports: ES202 ... or QCEW
      ■ Collected as part of the Covered Employment and Wages (CEW) program, administered by the BLS
      ■ Also used as the input to the Business Employment Dynamics (BED)
      ■ Collects from employers covered by state unemployment insurance programs:
        ✦ employment
        ✦ payroll
        ✦ geographic information
      ■ Fundamental unit: ’reporting unit’ (≈ establishment)
      ■ One report per establishment per quarter is filed

  15. Demographics
      ■ Demographics are taken from a number of Census-internal files derived from administrative data:
        ✦ Person Characteristics File (PCF)
        ✦ Census Numident
      ■ Where available, more detailed data on individuals is also extracted from surveys and censuses:
        ✦ CPS
        ✦ SIPP
        ✦ ACS
        ✦ 1990 Census
        ✦ 2000 Census

  16. Infrastructure Files
      ➲ EHF: Employment History Files
      ➲ ICF: Individual Characteristics File
      ➲ ECF: Employer Characteristics File
      ➲ GAL: Geocoded Address List
      ➲ Flow so far

  19. EHF: Employment History Files
      ■ Job-level EHF
        ✦ complete in-state work history for each individual on UI wage records
        ✦ one record for each employee-employer combination (a job)
        ✦ earnings and employment patterns
      ■ Employer and establishment-level employment history
        ✦ QCEW-based employment-activity history for every SEIN (employer) and SEINUNIT (establishment)
      ■ Comparison of employment and activity of SEINs between UI and QCEW files is done for QA purposes, and in preparation of weighting.
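
The job-level EHF idea (one record per employee-employer combination, carrying the quarterly earnings pattern) can be sketched as a simple grouping step. This is only an illustration under assumed inputs: wage records are taken to be `(pik, sein, year, quarter, earnings)` tuples, and the function names are hypothetical.

```python
from collections import defaultdict


def build_job_histories(wage_records):
    """Collapse wage records into one entry per job.

    A job is an employee-employer pair (pik, sein). Its history maps
    (year, quarter) to the total earnings reported for that quarter.
    """
    jobs = defaultdict(dict)
    for pik, sein, year, quarter, earnings in wage_records:
        history = jobs[(pik, sein)]
        key = (year, quarter)
        # Sum multiple reports for the same job and quarter.
        history[key] = history.get(key, 0.0) + earnings
    return dict(jobs)


def quarters_worked(history):
    """Return the quarters with positive activity, in chronological order."""
    return sorted(history)
```

Keying on the pair rather than on the person alone is what makes the job, not the worker, the unit of observation.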

  29. ICF: Individual Characteristics File
      ■ Demographic information from the PCF is merged with the universe of PIKs from wage records
      ■ Records without a valid match are flagged
      ■ CPS and SIPP identifiers are merged on
      ■ ... gender, education, and age information from the CPS
      ■ Data completion
        ✦ Age
        ✦ Gender
        ✦ Education
        ✦ County of residence
        are each imputed ten times
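
The merge, flag, and impute-ten-times steps can be sketched as follows. The slide does not describe the imputation model, so this example substitutes a deliberately simple stand-in: missing values are drawn from the empirical distribution of observed values. The dictionary layout and function names are assumptions made for illustration, and only gender is shown.

```python
import random

N_IMPLICATES = 10  # the slide states that each variable is imputed ten times


def merge_demographics(piks, pcf):
    """Attach PCF demographics to every PIK and flag PIKs without a match."""
    merged = {}
    for pik in piks:
        demo = pcf.get(pik)
        merged[pik] = {
            "gender": demo["gender"] if demo else None,
            "matched": demo is not None,
        }
    return merged


def impute_gender(merged, seed=0):
    """Return N_IMPLICATES gender values per PIK.

    Observed values are repeated unchanged. Missing values are drawn from
    the observed values (a placeholder for the real imputation model).
    """
    rng = random.Random(seed)
    observed = [r["gender"] for r in merged.values() if r["gender"] is not None]
    if not observed:
        raise ValueError("no observed values to draw from")
    implicates = {}
    for pik, rec in merged.items():
        if rec["gender"] is not None:
            implicates[pik] = [rec["gender"]] * N_IMPLICATES
        else:
            implicates[pik] = [rng.choice(observed) for _ in range(N_IMPLICATES)]
    return implicates
```

Keeping all ten implicates, rather than a single filled-in value, lets downstream estimates reflect the uncertainty introduced by the missing data.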

  39. ECF: Employer Characteristics File
      ■ Two files: firm and establishment level, quarterly records
      ■ Inputs:
        1. ES202
        2. UI: supplement information on the ES202, extend published BLS county-level employment data
        3. GAL: establishment geocodes
        4. LDB (BLS) for backfilling NAICS information
      ■ Longitudinal edits for consistency and data completion
      ■ Imputation:
        ✦ impute SIC if NAICS non-missing and vice-versa
        ✦ unconditional impute of missing SIC and NAICS codes
        ✦ geography conditional on industry

  50. GAL: Geocoded Address List
      ■ ... is a data set containing unique commercial and residential addresses
      ■ Geocoded to the Census Block and latitude/longitude coordinates
      ■ Inputs:
        1. ES202 data
        2. Census Bureau’s Business Register (BR)
        3. Census Bureau’s Master Address File (MAF)
        4. American Community Survey Place of Work file (ACS-POW)
      ■ Addresses are
        1. geocoded
        2. standardized
        3. unduplicated (by firm name)
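
The standardize and unduplicate steps can be illustrated with a toy example. Real address standardization relies on dedicated software and reference data, and geocoding needs external geography files, so neither is reproduced here. The abbreviation table, the matching key (firm name plus standardized address), and the function names are all assumptions made for this sketch.

```python
import re

# Tiny illustrative abbreviation table; a real standardizer is far larger.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "suite": "ste"}


def standardize(address: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace, abbreviate common words."""
    cleaned = re.sub(r"[^\w\s]", " ", address.lower())
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in cleaned.split())


def unduplicate(entries):
    """Keep the first entry for each (firm name, standardized address) pair.

    `entries` is an iterable of (firm_name, raw_address) tuples. The result
    holds (firm_name, standardized_address) tuples in first-seen order.
    """
    seen = set()
    unique = []
    for name, address in entries:
        key = (name.strip().lower(), standardize(address))
        if key not in seen:
            seen.add(key)
            unique.append((name, key[1]))
    return unique
```

Standardizing before unduplicating matters: two spellings of the same address only collapse into one entry once they share a common form.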

  53. Flow so far
      [Figure: flow diagram; not recovered in this transcript]

  54. Forming Aggregated Estimates: QWI
      ➲ Correction of spurious worker flows
      ➲ Solution: Successor-Predecessor File
      ➲ Attaching establishment characteristics to jobs
      ➲ U2W: Unit to Worker Impute
      ➲ Probability Model
      ➲ Implementation
      ➲ Implementation
      ➲ Computing the statistics

  55. Correction of spurious worker flows ■ Firm identifier: state-specific account number ■ Account numbers can and do change: ✦ change in legal form ✦ a merger ■ Change in firm identifier is the component determining when a worker changes employers ■ → non-economic change in identifier creates spurious flow
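
To make the problem concrete, here is a minimal sketch of how a purely administrative change of account number shows up as worker flows. The record layout `(worker_id, quarter, firm_id)`, the function name `worker_flows`, and the account numbers are hypothetical and chosen only for illustration; they are not the LEHD file layout.

```python
def worker_flows(records):
    """Count separations and accessions between consecutive quarters.

    records: iterable of (worker_id, quarter, firm_id) tuples, with at most
    one employer per worker and quarter (a simplifying assumption).
    """
    employer = {(w, q): f for w, q, f in records}
    quarters = sorted({q for _, q, _ in records})
    flows = {}
    for prev, curr in zip(quarters, quarters[1:]):
        workers = {w for (w, q) in employer if q in (prev, curr)}
        separations = accessions = 0
        for w in workers:
            before = employer.get((w, prev))
            after = employer.get((w, curr))
            if before != after:
                # A change in firm identifier is read as a change of employer.
                if before is not None:
                    separations += 1
                if after is not None:
                    accessions += 1
        flows[curr] = {"separations": separations, "accessions": accessions}
    return flows


# Five workers stay at the same business, but its account number changes
# from "A-001" to "B-777" between quarter 1 and quarter 2.
records = [(w, 1, "A-001") for w in range(5)] + [(w, 2, "B-777") for w in range(5)]
flows = worker_flows(records)
```

In this toy case nobody changed jobs, yet quarter 2 records five separations and five accessions. These are the spurious flows that the correction has to remove.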

  63. Solution: Successor-Predecessor File ■ track large worker movements between SEINs ■ → link entities that have different account numbers, but constitute the same economic entity ■ SPF provides a variety of link characteristics, based on the number of workers leaving an SEIN, in both absolute and relative terms, and the number of workers entering an SEIN, again in absolute and relative terms. ■ QWI: if 80% of an SEIN’s workers (the predecessor) are observed to move to a single successor, and that successor absorbs 80% of its employees from a single predecessor, then all flows between those two account numbers are filtered out, and treated as if they had never existed.
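
The 80% rule on this slide can be sketched as follows. This is an illustration under stated assumptions, not the production SPF code: the inputs are hypothetical dictionaries mapping each worker to one SEIN per quarter, the shares are measured against the predecessor's employment in the earlier quarter and the successor's employment in the later quarter, and the threshold is applied as "at least 80%". The function names are invented for the example.

```python
from collections import Counter


def find_links(prev_jobs, curr_jobs, threshold=0.8):
    """Return (predecessor, successor) SEIN pairs that satisfy the 80% rule.

    prev_jobs, curr_jobs: dict worker_id -> SEIN in quarters t-1 and t
    (hypothetical layout, one job per worker and quarter).
    """
    prev_emp = Counter(prev_jobs.values())   # employment of each SEIN in t-1
    curr_emp = Counter(curr_jobs.values())   # employment of each SEIN in t
    moves = Counter(
        (prev_jobs[w], curr_jobs[w])
        for w in prev_jobs.keys() & curr_jobs.keys()
        if prev_jobs[w] != curr_jobs[w]
    )
    return {
        (src, dst)
        for (src, dst), n in moves.items()
        # Share of the predecessor that leaves for dst, and share of the
        # successor that arrives from src, must both reach the threshold.
        if n / prev_emp[src] >= threshold and n / curr_emp[dst] >= threshold
    }


def remaining_flows(prev_jobs, curr_jobs, links):
    """Worker moves that survive once flows along linked pairs are dropped."""
    return [
        (w, prev_jobs[w], curr_jobs[w])
        for w in sorted(prev_jobs.keys() & curr_jobs.keys())
        if prev_jobs[w] != curr_jobs[w]
        and (prev_jobs[w], curr_jobs[w]) not in links
    ]


# Ten workers at SEIN "A" and one at "C" in quarter t-1. In quarter t, nine
# of the "A" workers appear under "B", one moves to "C", and the "C" worker
# joins "B".
prev_jobs = {w: "A" for w in range(10)}
prev_jobs[10] = "C"
curr_jobs = {w: "B" for w in range(9)}
curr_jobs[9] = "C"
curr_jobs[10] = "B"

links = find_links(prev_jobs, curr_jobs)
kept = remaining_flows(prev_jobs, curr_jobs, links)
```

Here 90% of "A" moves to "B" and 90% of "B" comes from "A", so the pair is linked and those nine moves are dropped. The two remaining moves are kept as genuine flows.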

  67. Attaching establishment characteristics to jobs ■ Goal: achieve a high level of accuracy and detail ■ Problem: no establishment identification on wage record ■ 30-40% of state-wide employment in multi-establishment firms ■ Solution: probability model for employment location and imputation ■ Key elements are: 1. distance between place-of-work and place-of-residence 2. distribution of employment across establishments of multi-establishment firms. ■ Important practical aspects: ✦ Non-ignorable missing data imputation ✦ Several million imputations every quarter
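
As a rough illustration of how the two key elements could enter an imputation, the sketch below draws an establishment for a worker with probability proportional to the establishment's employment times a distance-decay term. The functional form `employment * exp(-distance / decay_km)`, the parameter `decay_km`, the planar coordinates, and all names are assumptions made for this example; the slide does not specify the model, and this is not the U2W specification.

```python
import math
import random


def establishment_probabilities(establishments, home, decay_km=25.0):
    """Probability of each establishment being the worker's place of work.

    establishments: list of dicts with 'id', 'employment', and 'location'
    as (x_km, y_km) coordinates; home: the worker's residence as (x_km, y_km).
    The weight form is an illustrative assumption, not the LEHD model.
    """
    weights = []
    for est in establishments:
        distance = math.dist(est["location"], home)
        weights.append(est["employment"] * math.exp(-distance / decay_km))
    total = sum(weights)
    return {est["id"]: w / total for est, w in zip(establishments, weights)}


def impute_establishment(establishments, home, rng, decay_km=25.0):
    """Draw one establishment id from the probabilities above."""
    probs = establishment_probabilities(establishments, home, decay_km)
    ids = list(probs)
    return rng.choices(ids, weights=[probs[i] for i in ids], k=1)[0]


# Two equally sized establishments of one firm; the worker lives near E1.
establishments = [
    {"id": "E1", "employment": 100, "location": (0.0, 0.0)},
    {"id": "E2", "employment": 100, "location": (50.0, 0.0)},
]
home = (5.0, 0.0)
probs = establishment_probabilities(establishments, home)
draw = impute_establishment(establishments, home, random.Random(0))
```

With equal employment, the nearer establishment gets the larger probability. With equal distances, the larger establishment would. Drawing from the distribution rather than always assigning the most likely establishment keeps the imputed distribution of employment across establishments closer to the modelled one.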

  78. U2W: Unit to Worker Impute ■ workers i = 1, ..., I ■ firms j = 1, ..., J
