Disability Analysis File (DAF) Restricted Access File (RAF)

Objective

Our administrative systems contain the data that underlies and supports our programs, and this data is critical to our understanding of beneficiaries with disabilities. The Disability Analysis File (DAF) is designed to simplify access to this data and to support the efficient development of data for research and demonstration projects. We rebuild the DAF annually by taking data from our most relevant administrative systems and creating a formatted database that is ready for analysis, easy to use, and well-documented.

We create two versions of the DAF each year: a restricted access file (RAF) version for SSA staff, grantees, contractors, and federal partners, and a de-identified public use file (PUF) version available to the public through our website. Both versions currently include data through 2021 (DAF21).

The DAF focuses on data needed to answer questions about disability and work, and it also complements the National Beneficiary Survey (NBS), which provides information that is not available from our administrative sources. When combined, the DAF and NBS provide a complete picture of demographics, benefits, work, and work attitudes for all Social Security Disability Insurance (SSDI) and Supplemental Security Income (SSI) beneficiaries with disabilities.

Status

The DAF is an ongoing project. We have created and refined annual DAF files since 2004 (prior to 2012, the DAF was called the Ticket Research File). The latest contract to build the DAF was awarded to Mathematica Inc. in December of 2021. That contract includes DAF construction for the 2021 through 2025 data years.

  • Filenames, location, documentation, and other resources for the DAF21 RAF are available on this page in the designated sections below.
  • Construction of both the RAF and the PUF versions of DAF22 is in progress, with an expected completion date of May 2024. The DAF22 will be the first DAF build that includes data on applicants to SSA’s SSI and DI disability programs.
  • Construction of DAF23 began in November 2023, with an expected completion date of May 2025. The DAF24 and DAF25 files will follow the same 18-month schedule, and the contract ends in May 2027.

Key Activities and Design

We designed the DAF to support SSA and our research partners. With each new version of the DAF, we add two types of data. First, we add the administrative data for the people SSA added to the disability rolls in the last calendar year. Second, we add new and modified administrative data for the people who were already in the DAF. Because SSA retrospectively updates data on existing beneficiaries as we obtain new information, we cannot simply append data from the most recent calendar year. Instead, we rebuild the DAF with both new data and updated data for prior years, so the current DAF always reflects the most up-to-date information in the SSA administrative files. For some variables where these retrospective changes are important to track, such as benefit amounts, the DAF carries both the updated value (e.g., benefit payment amount due) and the point-in-time value (e.g., benefit amount paid), so users can see the effect of retrospective modifications to the administrative record. We also enhance and refine the DAF each year in consultation with DAF users.
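As a rough illustration of why both values are carried, the Python sketch below compares an updated amount due with a point-in-time amount paid across a few months. The field names (`due`, `paid`), the record layout, and the dollar figures are hypothetical and are not taken from the DAF documentation.

```python
from dataclasses import dataclass


@dataclass
class MonthlyBenefit:
    """One month of benefit data for one beneficiary (hypothetical layout)."""
    yymm: str     # month label, e.g. "2001" for January 2020
    due: float    # updated value: amount due after retrospective changes
    paid: float   # point-in-time value: amount actually paid that month


def retrospective_adjustments(months):
    """Return {yymm: due - paid} for the months where the two values differ."""
    return {m.yymm: round(m.due - m.paid, 2) for m in months if m.due != m.paid}


# Made-up figures for illustration only.
history = [
    MonthlyBenefit("2001", due=800.00, paid=800.00),
    MonthlyBenefit("2002", due=650.00, paid=800.00),  # later revised downward
    MonthlyBenefit("2003", due=800.00, paid=0.00),    # later revised upward
]

adjustments = retrospective_adjustments(history)
print(adjustments)  # {'2002': -150.0, '2003': 800.0}
```

A file built only by appending the newest year would keep the paid amounts but miss the later revisions, which is the gap the annual rebuild closes.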

We obtain the data for the DAF by extracting and combining data from various SSA administrative data systems including the:

  • Master Beneficiary Record (MBR);
  • Supplemental Security Income Record (SSR);
  • Master Files of Social Security Number (SSN) Holders and SSN Applications (NUMIDENT);
  • Earnings Recording and Self-Employment Income System (also known as the Master Earnings File (MEF));
  • Disability Determination Service Processing File (also known as the 831/832);
  • Completed Determination Record – Continuing Disability Determinations (also known as the Disability Control File (DCF)); and
  • various employment services participation and payment data files maintained by SSA.

From each of these data systems, we use various standardized data extracts, tables, or files to obtain the data for the DAF. Most extracts, such as those from the MBR, SSR, and NUMIDENT, are in fixed-column ASCII format, while others are in CSV or DB2 formats. Monthly benefit history variables include items such as state of residence, impairment codes, benefit payments, and earnings, while one-time or last-recorded variables include items such as SSN, date of birth, initial entitlement date, and last payment status. We combine data from these administrative sources into a single record per beneficiary in a common format. Critically, we also provide comprehensive documentation to make the DAF easy to use.
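The general idea of reading fixed-column extracts and combining them into one record per beneficiary can be sketched in Python as follows. This is only an illustration: the column layouts, field names, and sample values are invented, and the actual MBR and SSR layouts and the DAF build process are not shown here.

```python
# Hypothetical fixed-column layouts: (field name, start, end), using 0-based,
# end-exclusive character positions. Real extract layouts differ.
MBR_LAYOUT = [("ssn", 0, 9), ("dob", 9, 17), ("entitlement_date", 17, 25)]
SSR_LAYOUT = [("ssn", 0, 9), ("state", 9, 11), ("pay_status", 11, 14)]


def parse_fixed(line, layout):
    """Slice one fixed-column line into a dict of stripped field values."""
    return {name: line[start:end].strip() for name, start, end in layout}


def combine(sources):
    """Merge parsed rows from several extracts into one dict per SSN.

    Non-key fields are prefixed with the source name to avoid collisions.
    """
    combined = {}
    for source_name, rows in sources.items():
        for row in rows:
            record = combined.setdefault(row["ssn"], {"ssn": row["ssn"]})
            for field, value in row.items():
                if field != "ssn":
                    record[f"{source_name}_{field}"] = value
    return combined


# Made-up sample lines (adjacent string literals are joined by Python).
mbr_lines = [
    "000000001" "19600515" "19980301",
    "000000002" "19751102" "20050701",
]
ssr_lines = ["000000001" "MD" "C01"]

records = combine({
    "mbr": [parse_fixed(line, MBR_LAYOUT) for line in mbr_lines],
    "ssr": [parse_fixed(line, SSR_LAYOUT) for line in ssr_lines],
})
print(records["000000001"])
```

In this sketch, a person who appears in only one extract still gets a single combined record; the fields from the other source are simply absent.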

Results

We use the DAF for internal research, to support demonstration development and evaluation, and to answer ad hoc questions as they arise. For example, we use the DAF to examine the costs and benefits of the Ticket to Work (TTW) program, the effectiveness of TTW mailings, and the characteristics associated with successful return to work by beneficiaries. We use the DAF to model and forecast payments to Employment Networks and Vocational Rehabilitation agencies under the TTW program. We also use the DAF to support oversight by the Social Security Advisory Board, our Office of the Inspector General, the Office of Management and Budget, Congress, the Government Accountability Office, and others. Additionally, we allow non-SSA researchers to use the DAF, primarily through the Retirement and Disability Research Consortium (RDRC) and through a public use version of the DAF.

Reports

The following report published by Ben-Shalom and Stapleton used our DAF14 build to better understand the long-term program participation and employment patterns of adult SSI recipients following benefit award.

Longitudinal Statistics for New SSI Recipients, 2012

The following report provides key descriptive statistics from the DAF15 build. It examines the work activity, employment expectations and characteristics, employment services, and factors affecting employment of working-age adult DI beneficiaries and SSI recipients. It focuses on longitudinal analyses with timeframes spanning periods before and after disability award.

DI & SSI Program Participants: Characteristics & Employment, 2015

Data

We define the DAF population, data range, and content as follows:

The DAF Population: The DAF contains benefit history data and current-status data on the approximately 40 million children and adults through retirement age who participated in the SSDI or SSI programs at any time between 1996 and the most recent DAF year. The DAF also includes program application data for children and adults who applied for SSDI or SSI program benefits. The DAF has no lower age restriction.

The DAF Data Range: Many monthly benefit history variables in the DAF have data spanning from January 1994 to December of the most recent DAF year. Most of the beneficiaries and recipients on the rolls in the earliest year for the DAF population, 1996, had been on the rolls prior to 1996. This data range provides up to two years of prior monthly information for these individuals. Not all data elements are available back to January 1994. These elements are included from their earliest point of availability, but none is earlier than January 1994.
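To make the monthly range concrete, the short Python sketch below generates month labels in a "yymm" style from January of one year through December of another. The helper name `yymm_labels` is hypothetical, and the way a label attaches to a variable name is not specified here.

```python
def yymm_labels(first_year, last_year):
    """Return 'yymm' labels from January of first_year to December of last_year."""
    return [
        f"{year % 100:02d}{month:02d}"
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
    ]


labels = yymm_labels(1994, 2021)   # full range for a DAF21-style build
print(labels[0], labels[-1])       # 9401 2112

# Months available before the 1996 start of the DAF population.
prior = yymm_labels(1994, 1995)
print(len(prior))                  # 24
```

The 24 months before 1996 correspond to the "up to two years of prior monthly information" described above. Any given variable may cover fewer months, since not all data elements are available back to January 1994.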

The DAF Data Content: The content included in the DAF is extensive. Ignoring repeated occurrences of variables, the DAF contains about 1,300 unique variables. However, many of these variables have repeated occurrences (either because they are monthly “yymm” or multiple “n” variables). Counting all variables, the DAF contains approximately 40,000 individual variables.

DAF Metadata (approximate as variables increase each year)

Variable type                                          Count
All variables (ignoring repeated occurrences)          1,300
One-time variables                                     1,100
"n" variables (ranging from 5-50 occurrences)          100
"yymm" variables (ranging from 204-300 occurrences)    150
All variables (including repeated occurrences)         40,000
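As a quick plausibility check on these approximate figures, the Python sketch below computes the smallest and largest totals implied by the stated counts and occurrence ranges. It assumes every repeated variable sits at one end of its range, which is a simplification made only for this check.

```python
# Approximate counts and occurrence ranges as stated in the metadata.
one_time = 1_100
n_vars, n_occurrences = 100, (5, 50)
yymm_vars, yymm_occurrences = 150, (204, 300)

# Lower bound: every repeated variable has its minimum number of occurrences.
low = one_time + n_vars * n_occurrences[0] + yymm_vars * yymm_occurrences[0]
# Upper bound: every repeated variable has its maximum number of occurrences.
high = one_time + n_vars * n_occurrences[1] + yymm_vars * yymm_occurrences[1]

print(low, high)  # 32200 51100
```

The stated total of approximately 40,000 individual variables falls inside this range, and the monthly "yymm" variables account for most of it.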

The DAF Datasets: The dataset names of all components and linkable files of the most recent DAF build are provided in a text file of filenames in the documentation section below.

User's Code Library

To make the DAF more efficient and easier to use, we have developed SAS code for common analytical tasks run on DAF files. Researchers can use and modify this code as needed. The library provided in the documentation section below includes code to complete the following tasks:

  • determine whether a beneficiary is in current pay for either SSDI or SSI within a user-specified time period;
  • categorize impairment codes into the groupings used in SSA’s published statistics;
  • determine whether a beneficiary has been suspended or terminated due to work within a user-specified time period; and
  • reorder "n"-suffixed variables into chronological order.

In addition to providing code, we specify the DAF components necessary to run the code, an example data step, the variables used in the program, and output files and variables created by the program.
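For readers who do not work in SAS, the Python sketch below illustrates the logic behind two of these tasks: reordering "n"-suffixed variables chronologically and checking for current pay within a set of months. It is not a port of the library code. The variable stems (`EVENT_DATE`, `EVENT_TYPE`, `PAYSTAT`), the status code `"CP"`, and the sample records are all hypothetical.

```python
def reorder_n_variables(record, date_stem, other_stems, n):
    """Renumber n-suffixed variables so occurrence 1 holds the earliest date.

    Empty occurrences (date is None) are moved to the end.
    """
    slots = range(1, n + 1)
    filled = [i for i in slots if record.get(f"{date_stem}{i}") is not None]
    filled.sort(key=lambda i: record[f"{date_stem}{i}"])  # ISO dates sort as text
    out = dict(record)
    for new_i in slots:
        old_i = filled[new_i - 1] if new_i <= len(filled) else None
        for stem in [date_stem, *other_stems]:
            out[f"{stem}{new_i}"] = record.get(f"{stem}{old_i}") if old_i else None
    return out


def in_current_pay(record, stem, months, current_pay_codes):
    """True if any listed month has a status code counted as current pay."""
    return any(record.get(f"{stem}{m}") in current_pay_codes for m in months)


# Made-up record with occurrences stored out of chronological order.
person = {
    "ID": "A1",
    "EVENT_DATE1": "2015-06-01", "EVENT_TYPE1": "suspension",
    "EVENT_DATE2": "2012-03-01", "EVENT_TYPE2": "award",
    "EVENT_DATE3": None, "EVENT_TYPE3": None,
    "PAYSTAT2001": "SUSP", "PAYSTAT2002": "CP",
}

ordered = reorder_n_variables(person, "EVENT_DATE", ["EVENT_TYPE"], n=3)
print(ordered["EVENT_DATE1"], ordered["EVENT_TYPE1"])  # 2012-03-01 award

print(in_current_pay(person, "PAYSTAT", ["2001", "2002"], {"CP"}))  # True
print(in_current_pay(person, "PAYSTAT", ["2001"], {"CP"}))          # False
```

Both helpers return new values rather than modifying the input record, so the original ordering remains available for comparison.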

Restricted Access File Documentation

The complete documentation of the DAF is contained in twelve volumes. Together, these volumes provide comprehensive detail for users of all levels and backgrounds.

Related Links

DAF PUF website
NBS website
RDRC website
TTW website
Mathematica's DAF Project website