Selected Research & Analysis: Data Sets, Linkages, Quality, and Evaluation

See also related Statistics & Data Files.

Improving the Measurement of Retirement Income of the Aged Population
ORES Working Paper No. 116 (released January 2021)
by Irena Dushi and Brad Trenkamp

Research has shown that survey-reported pension and retirement income measures may suffer from reporting errors, which lead to biased estimates of income and poverty of the aged population. In this paper, the authors evaluate income estimates from the Census Bureau's 2016 Current Population Survey (CPS) Annual Social and Economic Supplement (ASEC). The authors compare 2016 CPS ASEC public-use data with public-use survey data from the 2016 Health and Retirement Study (HRS) and with CPS ASEC data that have been merged with administrative data from the Internal Revenue Service (IRS) and the Social Security Administration. They find that for the population aged 65 or older, supplementing the CPS ASEC with IRS and Social Security administrative data results in a higher estimate of pension income's share of aggregate income, less estimated reliance on Social Security, and a lower estimated rate of poverty. They also find that the HRS provides better estimates of the income of the aged population than the public-use CPS data.

When Every Dollar Counts: Comparing Reported Earnings of Social Security Disability Program Beneficiaries in Survey and Administrative Records
from Social Security Bulletin, Vol. 78, No. 4 (released November 2018)
by David C. Wittenburg, Jeffrey Hemmeter, Holly Matulewicz, Lindsay Glassman, and Lisa Schwartz

This article examines differences between survey- and administrative data–based estimates of employment and earnings for a sample of Social Security Administration (SSA) disability program beneficiaries. The analysis uses linked records from SSA's National Beneficiary Survey and administrative data from the agency's Master Earnings File. The authors find that estimated employment rates and earnings levels based on administrative data are higher than those based on survey data for beneficiaries overall and by sociodemographic subgroup. In proportional terms, the differences between survey and administrative data tend to be greater among subgroups with survey-reported employment rates that are lower than that of beneficiaries overall.

The Longevity Visualizer: An Analytic Tool for Exploring the Cohort Mortality Data Produced by the Office of the Chief Actuary
Research and Statistics Note No. 2016-02 (released May 2016)
by Brian J. Alleva

This note introduces the Longevity Visualizer (LV), a visual-analysis tool that enables users to explore various applications of cohort life-table data compiled and calculated by the Social Security Administration's Office of the Chief Actuary. The LV presents the life-table data in two series—survival functions and age-at-death probability distributions—each of which is generated for each potential age and each sex across a long range of historical and projected birth cohorts. The LV is designed to make complex longevity projections accessible to analysts and researchers, as well as to individuals making financial and retirement plans.

Why Researchers Now Rely on Surveys for Race Data on OASDI and SSI Programs: A Comparison of Four Major Surveys
Research and Statistics Note No. 2016-01 (released January 2016)
by Patricia P. Martin

Policy interest in the sociodemographic characteristics of beneficiaries of the Old-Age, Survivors, and Disability Insurance (OASDI) and Supplemental Security Income (SSI) programs is increasing as the minority share of the senior and disabled population grows. This note discusses using four major surveys—the Current Population Survey, the Survey of Income and Program Participation, the American Community Survey, and the Health and Retirement Study—to examine OASDI and SSI program use by race and ethnicity. Survey profiles highlight each survey's history, design, and methodology; the categories with which each collects race and ethnicity data; and their strengths and limitations for analyzing SSA's program data.

Comparing Earnings Estimates from the 2006 Earnings Public-Use File and the Annual Statistical Supplement
Research and Statistics Note No. 2012-01 (released January 2012)
by Michael Compson

The Social Security Administration recently released the 2006 Earnings Public-Use File (EPUF). The EPUF contains earnings information for individuals drawn from a systematic random 1-percent sample of all Social Security numbers issued before January 2007. This note presents the process of evaluating the earnings data in EPUF. It also identifies and explains four key differences between the data in EPUF and the estimates published in the Annual Statistical Supplement to the Social Security Bulletin. The note specifically compares EPUF data with Annual Statistical Supplement estimates of earnings, number of workers with earnings, median earnings by sex and age group, and percentage of workers with earnings below the taxable maximum by sex. After accounting for the expected differences, the remaining discrepancies between EPUF and Annual Statistical Supplement estimates are relatively small.

The 2006 Earnings Public-Use Microdata File: An Introduction
from Social Security Bulletin, Vol. 71, No. 4 (released November 2011)
by Michael Compson

This article introduces the 2006 Earnings Public-Use File (EPUF), a data file containing earnings records for individuals drawn from a 1-percent sample of all Social Security numbers issued before January 2007. The EPUF contains selected demographic and earnings information for 4.3 million individuals. It provides aggregate earnings data for 1937 to 1950 and annual earnings data for 1951 to 2006.

Using Matched Survey and Administrative Data to Estimate Eligibility for the Medicare Part D Low-Income Subsidy Program
from Social Security Bulletin, Vol. 70, No. 2 (released May 2010)
by Erik Meijer, Lynn A. Karoly, and Pierre-Carl Michaud

This article uses matched survey and administrative data to estimate, as of 2006, the size of the population eligible for the Low-Income Subsidy (LIS), which was designed to provide "extra help" with premiums, deductibles, and copayments for Medicare Part D beneficiaries with low income and limited assets. The authors employ individual-level data from the Survey of Income and Program Participation and the Health and Retirement Study to cover the potentially LIS-eligible noninstitutionalized and institutionalized populations of all ages. The survey data are matched to Social Security administrative data to improve on potentially error-ridden survey measures of income components and program participation.

Social Security Administration's Master Earnings File: Background Information
from Social Security Bulletin, Vol. 69, No. 3 (released October 2009)
by Anya Olsen and Russell E. Hudson

The Social Security Administration (SSA) receives reports of earnings for the U.S. working population each year from employers and the Internal Revenue Service. The earnings information received is stored at SSA as the Master Earnings File (MEF) and is used to administer Social Security programs and to conduct research on the populations served by those programs. This article documents the history, content, limitations, complexities, and uses of the MEF (and data files derived from the MEF). It is intended for researchers who use earnings data to study work patterns and their implications, and for those interested in understanding the data used to administer the current-law programs.

Uses of Administrative Data at the Social Security Administration
from Social Security Bulletin, Vol. 69, No. 1 (released May 2009)
by Jennifer McNabb, David Timmons, Jae G. Song, and Carolyn Puckett

This article discusses the advantages and limitations of using administrative data for research, examines how linking administrative data to survey results can be used to evaluate and improve survey design, and discusses research studies and SSA statistical products and services that are based on administrative data.

The Social Security Administration's Death Master File: The Completeness of Death Reporting at Older Ages
from Social Security Bulletin, Vol. 64, No. 1 (released April 2002)
by Mark E. Hill and Ira Rosenwaike

To provide a more detailed assessment of the coverage of deaths of older adults in the Social Security Administration's Death Master File (DMF), this research note compares age-specific death counts from 1960 to 1997 in the DMF with official counts tabulated by the National Center for Health Statistics, the most authoritative source of death information for the U.S. population. Results suggest that for most years since 1973, 93 percent to 96 percent of deaths of individuals aged 65 or older were included in the DMF.

The Development of the Project NetWork Administrative Records Database for Policy Evaluation
from Social Security Bulletin, Vol. 62, No. 2 (released September 1999)
by Kalman Rupp, Dianne Driessen, Robert Kornfeld, and Michelle L. Wood

This article describes the development of SSA's administrative records database for the Project NetWork return-to-work experiment targeting persons with disabilities. The article is part of a series of papers on the evaluation of the Project NetWork demonstration. In addition to 8,248 Project NetWork participants randomly assigned to receive case management services and a control group, the simulation identified 138,613 eligible nonparticipants in the demonstration areas. The output data files contain detailed monthly information on Supplemental Security Income (SSI) and Disability Insurance (DI) benefits, annual earnings, and a set of demographic and diagnostic variables. The data allow for the measurement of net outcomes and the analysis of factors affecting participation. The results suggest that it is feasible to simulate complex eligibility rules using administrative records, and create a clean and edited data file for a comprehensive and credible evaluation. The study shows that it is feasible to use administrative records data for selecting control or comparison groups in future demonstration evaluations.

Linkages With Data From Social Security Administrative Records in the Health and Retirement Study
from Social Security Bulletin, Vol. 62, No. 2 (released September 1999)
by Janice A. Olson

The Health and Retirement Study (HRS) is a major longitudinal study designed for scientific and policy researchers for study of the economics, health, and demography of retirement and aging. This note describes the data from SSA records that have been released for linking to HRS data, linkage rates resulting from the consent process, and subgroup patterns in linkage rates.

Linkages with Data from Social Security Administrative Records in the Health and Retirement Study
ORES Working Paper No. 84 (released August 1999)
by Janice A. Olson

The Health and Retirement Study (HRS) is a major longitudinal study designed for scientific and policy researchers for study of the economics, health, and demography of retirement and aging. The primary HRS sponsor is the National Institute on Aging, and the project is being conducted by the Survey Research Center of the Institute for Social Research at the University of Michigan. Several agencies, including the Social Security Administration, are supporting the project. This is the second paper describing SSA's data support for the HRS. It describes the data from SSA records that have been released for linking to HRS data, linkage rates resulting from the consent process, and subgroup patterns in linkage rates.

The Accuracy of Survey-Reported Marital Status: Evidence from Survey Records Matched to Social Security Records
ORES Working Paper No. 80 (released January 1999)
by David A. Weaver

Many researchers have concluded that, in surveys, divorced persons often fail to report accurate marital information. In this paper, I revisit this issue using a new source of data—surveys exactly matched to Social Security data. I find that divorced persons frequently misreport their marital status, but there is evidence that the misreporting is unintentional. A discussion of possible improvements in surveys is presented. Implications for the study of differential mortality and the study of poverty among aged women are discussed.

The Development of a New Geographic Coding System for the Continuous Work History Sample
from Social Security Bulletin, Vol. 57, No. 4 (released October 1994)
by Linda M. Dill, Barry V. Bye, and Cheryl I. Williams

This article describes the statistical development of the geographic coding system used to identify worker location for the Continuous Work History Sample. The new system—which is planned for implementation for data year 1993—will provide more accurate geographic distributions of workers within a residence concept than the old system could provide within an employer location concept. The article also presents the results of a pilot study that tested the operational aspects of the new system. The results provide some preliminary estimates of the effect of the revised codes on the geographic distribution of workers.

Statistical Notes from the New Beneficiary Data System
from Social Security Bulletin, Vol. 57, No. 1 (released January 1994)

Statistical Notes From the New Beneficiary Data System
from Social Security Bulletin, Vol. 56, No. 3 (released July 1993)

The Development and Use of Industry Data by the Social Security Administration
from Social Security Bulletin, Vol. 55, No. 4 (released October 1992)
by Linda M. Dill

The New Beneficiary Data System: The First Phase
from Social Security Bulletin, Vol. 55, No. 2 (released April 1992)
by Martynas A. Yčas

The Social Security Administration's 10-Percent Sample File of OASDI Beneficiaries
from Social Security Bulletin, Vol. 55, No. 1 (released January 1992)
by John W. Wagner

Development of Diagnostic Data in the 10-Percent Sample of Disabled SSI Recipients
from Social Security Bulletin, Vol. 54, No. 7 (released July 1991)
by Satya Kochhar

The Decline in Establishment Reporting: Impact on CWHS Industrial and Geographic Data
from Social Security Bulletin, Vol. 54, No. 1 (released January 1991)
by Linda M. Dill, Adah D. Enis, and Cheryl I. Williams

The Social Security Administration's Continuous Work History Sample
from Social Security Bulletin, Vol. 52, No. 10 (released October 1989)
by Creston M. Smith

The Monthly OASDI One-Percent Sample File
from Social Security Bulletin, Vol. 52, No. 6 (released June 1989)
by Lewis F. Frain

Commentary: Interagency Data Matching Projects for Research Purposes
from Social Security Bulletin, Vol. 51, No. 7 (released July 1988)
by Daniel B. Radner

The 1973 CPS-IRS-SSA Exact Match Study
from Social Security Bulletin, Vol. 51, No. 7 (released July 1988)
by Beth Kilss and Frederick J. Scheuren

Commentary: Continuous Work History Sample
from Social Security Bulletin, Vol. 51, No. 4 (released April 1988)
by Warren Buckler

Retirement History Study: Introduction
from Social Security Bulletin, Vol. 51, No. 3 (released March 1988)
by Lola M. Irelan

Adjusted Estimates of the Size Distribution of Family Money Income for 1972
ORES Working Paper No. 24 (released October 1981)
by Daniel B. Radner

It is well-known that for most purposes income size distribution data collected in household surveys are far from ideal. The problems with those data can be separated into two types: the data items that are collected, and the accuracy of the data collected. Usually, although there are important exceptions, the income data collected are confined to cash income before taxes, thus ignoring the effects of both taxes and noncash income of all types. Also, the income estimates usually are for one year, which often is not the best accounting period for analysis. Furthermore, there usually is a lack of adequate detail by income type, and the data ordinarily are not sufficiently detailed to adjust for changes in the composition of the family unit during the income accounting period.

An Example of the Use of Statistical Matching in the Estimation and Analysis of the Size Distribution of Income
ORES Working Paper No. 18 (released October 1980)
by Daniel B. Radner

This paper discusses the use of statistical matching in the estimation and analysis of the size distribution of family unit personal income. Statistical matching is a relatively new technique that has been used to combine, at the single observation level, data from two different samples, each of which contains some data items that are absent from the other file. In a statistical match, the information brought together from the different files ordinarily is not for the same person but for similar persons; the match is made on the basis of similar characteristics. In contrast, in an "exact" match, information for the same person from two or more files is brought together using personal identifying information.

Mortality Reporting in SSA Linked Data: Preliminary Results
from Social Security Bulletin, Vol. 42, No. 11 (released November 1979)
by Wendy Alvey and Faye Aziz

Selection of Simple and Stratified Random Samples of Fixed Size Without Replacement
ORES Working Paper No. 9 (released June 1979)
by Michael H. Bostron

For the past few years, the Division of Disability Studies has been using simple random and stratified random sampling procedures for many of its studies. The beneficiary sample for the 1978 Survey of Disability and Work was a stratified random sample drawn from the Master Benefit Record. The samples used in the Study of Consistency and Validity of Initial Disability Decisions and the Trial Work Period Folder Study also used simple random sampling procedures. Simple random subsampling has been used to enable multivariate analysis to be performed on files that would otherwise have been too large for existing software.

Because of the Division of Disability Studies' wide use of simple and stratified random sampling designs, software was developed to efficiently accomplish these sampling schemes. This paper describes the algorithm and presents the computer programs that are currently being used in the division.

The 1973 CPS-IRS-SSA Exact Match Study
from Social Security Bulletin, Vol. 41, No. 10 (released October 1978)
by Beth Kilss and Frederick J. Scheuren

Access to Social Security Microdata Files for Research and Statistical Purposes
from Social Security Bulletin, Vol. 41, No. 8 (released August 1978)
by Lois A. Alexander and Thomas B. Jabine

Retirement History Study: Introduction
from Social Security Bulletin, Vol. 35, No. 11 (released November 1972)
by Lola M. Irelan

Social Security Statistical Data, Social Science Research, and Confidentiality
from Social Security Bulletin, Vol. 30, No. 10 (released October 1967)
by Joseph Steinberg and Heyman C. Cooper

Development of the Continuous Work-History Sample in Old-Age and Survivors Insurance
from Social Security Bulletin, Vol. 20, No. 3 (released March 1957)
by Benjamin Mandel

Old-Age and Survivors Insurance Records: Derivation of Byproduct Data
from Social Security Bulletin, Vol. 15, No. 7 (released July 1952)
by William H. Cummins

OASI Sampling Methods
from Social Security Bulletin, Vol. 14, No. 6 (released June 1951)

The Continuous Work-History Sample: The First 12 Years
from Social Security Bulletin, Vol. 14, No. 4 (released April 1951)
by Jacob Perlman

The Continuous Work History Sample Under Old-Age and Survivors Insurance
from Social Security Bulletin, Vol. 7, No. 2 (released February 1944)
by Jacob Perlman and Benjamin Mandel

The Protection and Use of Information Obtained Under the Social Security Act
from Social Security Bulletin, Vol. 4, No. 5 (released May 1941)
by Ida C. Merriam

The Statistical Adequacy of Employers' Occupational Records
from Social Security Bulletin, Vol. 2, No. 5 (released May 1939)
by Katherine D. Wood

Counting the Recipients of Public Assistance and the Dollars They Receive
from Social Security Bulletin, Vol. 1, No. 5 (released May 1938)
by Helen R. Jeter

Census Classifications and Social Security Categories
from Social Security Bulletin, Vol. 1, No. 4 (released April 1938)
by Laura Wendt

Applications for Public Assistance Under the Social Security Act—1937
from Social Security Bulletin, Vol. 1, No. 4 (released April 1938)

Eleven-Million Sample of Applications for Employee Account Numbers
from Social Security Bulletin, Vol. 1, No. 4 (released April 1938)