1940 PUMS Home Page

NAME(ACRONYM):
PUBLIC USE MICRODATA SAMPLES, 1940
(PUMS)

ABSTRACT/SUMMARY:

The United States Bureau of the Census gathers a variety of useful information when it conducts the census. This information covers an array of topics pertaining to the lifestyles, incomes, jobs, and families of United States residents. Small samples extracted from these data are collectively called the Public Use Microdata Samples (PUMS).

The PUMS are computer-accessible files that contain records for a sample of housing units, with information on the characteristics of the units and the people in them. Within the limits of the sample size and geographic detail, you can use the PUMS to complete several types of tabulations.

The 1940 data collection was assembled through a collaborative effort between the United States Bureau of the Census and the Center for Demography and Ecology of the University of Wisconsin. The 1940 Census Public Use Sample Project was supported by The National Science Foundation under Grant SES-7704135.

All data released by the U.S. Bureau of the Census are subject to strict confidentiality measures imposed by U.S. legislation (Title 13, U.S. Code). The census data can be used for statistical purposes only. The PUMS are extracted from the census data in a manner that avoids disclosure of information that can identify households or individuals: to protect the confidentiality of respondents, any information that would identify a household or an individual is excluded.

Microdata records identify no geographic area with fewer than 100,000 inhabitants. Microdata samples include only a small fraction of the population, drastically limiting the chance that the record of a given individual is even contained in a PUMS file, much less identifiable.

ARCHIVAL/ACCESS:

Distributor: CIESIN
- Temporal: 1940 (1 percent, national file).
- Spatial: United States.
- Resolution: Geographic identification of the location of the sampled households includes Census regions and divisions, Standard Metropolitan Areas (SMAs), State Economic Areas (SEAs), and each state (Alaska and Hawaii are not included).
- Documentation: No codebooks are on-line.
Other Sources:
- Producer: United States Department of Commerce, Bureau of the Census.
- Distributor: Inter-university Consortium for Political and Social Research.

DATASET DESCRIPTION:

Documentation:
The datafile is documented in a codebook containing a data dictionary and supporting appendix information.

Layout:
This data collection consists of 20 independently drawn subsamples, each stored as a separate physical file.
The 1940 dataset files are rectangular ASCII files with fixed-length records and no newlines delimiting the records. The 20 subsamples were selected randomly. Within each of the 20 subsamples, records are sequenced by state.
For example, to extract all of the records for a particular state, you would read through all 20 physical files and select that state's records from each subsample. Record types are ordered within households: the household record comes first, the "sample line" record second, and the person records last.
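The extraction described above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the documented procedure: the 138-character record length is taken from this description, but the column position of the state code (`STATE_FIELD`) is a hypothetical placeholder, as is the assumption that every record type carries the state code. The codebook gives the real layout.

```python
from typing import BinaryIO, Iterable, Iterator, List

RECORD_LENGTH = 138  # logical record length stated in this description

# ASSUMPTION: hypothetical column position of the state code.
# Replace with the real position from the 1940 codebook.
STATE_FIELD = slice(1, 3)


def iter_records(stream: BinaryIO, record_length: int = RECORD_LENGTH) -> Iterator[str]:
    """Yield successive fixed-length records from a stream with no newlines."""
    while True:
        chunk = stream.read(record_length)
        if not chunk:
            return  # clean end of file
        if len(chunk) != record_length:
            raise ValueError(f"truncated record: got {len(chunk)} bytes")
        yield chunk.decode("ascii")


def select_state(streams: Iterable[BinaryIO], state_code: str) -> List[str]:
    """Collect the records for one state from every subsample stream."""
    selected = []
    for stream in streams:
        for record in iter_records(stream):
            # ASSUMPTION: every record carries the state code in STATE_FIELD.
            if record[STATE_FIELD] == state_code:
                selected.append(record)
    return selected
```

To use it, open each of the 20 subsample files in binary mode and pass the open file objects as `streams`. Because the files contain no newlines, reading them line by line would not work; reading in 138-byte chunks is what recovers the individual records.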

Universe:
Information from the census was derived either from questions asked of the entire population or from questions asked of only a sample of the population. Those questions asked about every person and housing unit are called 100-percent or short-form questions. The others are called sample or long-form questions.
Those households receiving the short-form questionnaires were asked only the 100-percent questions, and those receiving the long form were asked both the 100-percent questions and the sample questions. Sampling rates vary depending on geographic location and population size.
PUMS data contain a sample of the individual long-form census records showing most population and housing characteristics with identifying information removed.

Design and Methodology:
The coding system varies from census to census, so it is important to consult the codebook for each census to determine the meaning of a specific field in a census record and its comparability across censuses. Geographic identifiers have very little comparability across the files for different censuses, but housing and population characteristics are similar.
The sample questionnaires were edited for completeness and consistency, and substitutions or allocations for any missing data were made. Allocation flags appear at the end of each record to indicate when an item has been allocated. Users who wish to tabulate only actually observed values can exclude the values flagged as allocated.
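Excluding allocated values might look like the following Python sketch. Everything about the layout here is an assumption made for illustration: the field position (`AGE_FIELD`), the flag column (`AGE_FLAG_COLUMN`), and the flag code meaning "allocated" (`ALLOCATED`) are hypothetical, and the real positions and codes must be taken from the codebook.

```python
from typing import Iterable, List

# ASSUMPTION: hypothetical layout, for illustration only.
AGE_FIELD = slice(10, 12)   # hypothetical columns holding a two-digit value
AGE_FLAG_COLUMN = 130       # hypothetical allocation-flag column near the record end
ALLOCATED = "1"             # hypothetical code meaning "value was allocated"


def observed_values(
    records: Iterable[str],
    field: slice,
    flag_column: int,
    allocated_code: str = ALLOCATED,
) -> List[str]:
    """Return the field value from each record whose flag is not 'allocated'."""
    return [
        record[field]
        for record in records
        if record[flag_column] != allocated_code
    ]
```

Filtering on the flag for one item at a time keeps a record usable for other items whose values were actually observed, rather than discarding the whole record.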

Variables:

Each of the 20 subsamples contains three record types:

Household:
Household records contain variables describing the location and composition of the household (this dataset contains 391,034 household records).

"Sample line" respondent:
"Sample line" records contain variables describing demographic characteristics such as nativity, marital status, number of children, veteran status, wage deductions for social security, and occupation (this dataset contains 391,034 "sample line" records).

Person in the household:
Person records contain demographic variables such as nativity, marital status, family membership, education, employment status, income, and occupation (this dataset contains 1,351,732 person records).

Each record type has a logical record length of 138 characters. These records were encoded from microfilm copies of original handwritten enumeration schedules from the 1940 Census of Population. The 1940 dataset contains a stratified 1% sample of households.
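The record counts and record length above allow a simple integrity check on a downloaded copy of the files. The Python sketch below is illustrative only and assumes that the stated counts cover all 20 files together, that each character occupies one byte, and that the files contain no padding, headers, or trailers; the function names are hypothetical.

```python
from typing import Iterable

RECORD_LENGTH = 138              # characters per record, as stated above
HOUSEHOLD_RECORDS = 391_034      # household records in the dataset
SAMPLE_LINE_RECORDS = 391_034    # "sample line" records in the dataset
PERSON_RECORDS = 1_351_732       # person records in the dataset


def expected_total_bytes() -> int:
    """Total size implied by the record counts (ASSUMPTION: no padding)."""
    total_records = HOUSEHOLD_RECORDS + SAMPLE_LINE_RECORDS + PERSON_RECORDS
    return total_records * RECORD_LENGTH


def sizes_match(file_sizes: Iterable[int]) -> bool:
    """Check whether the summed file sizes equal the expected total."""
    return sum(file_sizes) == expected_total_bytes()


def mean_persons_per_household() -> float:
    """Average number of person records per household record."""
    return PERSON_RECORDS / HOUSEHOLD_RECORDS
```

Under these assumptions the 20 files should hold 2,133,800 records, or 294,464,400 bytes in total, and the sample averages roughly 3.46 person records per household record.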
DATASET VARIABLES:

Wais Search of 1940 Data Dictionary.
Browse a variable listing (1940 only).

RELATED DATASETS:

Summary Tape Files (STF) are designed to provide statistics with greater subject detail for geographic areas than is feasible or desirable to provide in printed reports. The census data contained in printed reports are arranged in tables. Population and housing characteristics are presented for specified geographic areas; for example, a table may show the number of rented housing units in a census tract, the number of persons 65 years of age or older in a city, or the total population of a county. Census data at the small-area level, such as census tracts and smaller, contain limited subject matter detail. STF files, in machine-readable format, mimic this table layout.

RELATED PUBLICATIONS:

For more information on the 1940 PUMS dataset, refer to the 1940 Census of Population and Housing Public Use Microdata Sample Technical Documentation. For a copy of the documentation, contact Data User Services Division, Systems and Programming Branch, Bureau of the Census, Washington, D.C. 20233. Telephone: (301) 763-4100.

CONTACTS/REFERENCES:

NAME: Bureau of the Census
EMAIL:
PHONE: 301-763-7962 Population Division
PHONE: 301-763-7963 Housing Division
PHONE: 301-763-4100 Census Customer Services
FAX: 301-763-4794
OTHER:
ADDRESS:

Customer Services Branch Data User Services Division
Bureau of the Census
Washington, DC 20233

For additional information concerning particular subject matter topics on the files, contact Population Division, (301) 763-7962, or Housing Division, (301) 763-2873, Bureau of the Census, Washington, D.C. 20233.

FOR MORE INFORMATION CONTACT:

NAME: CIESIN User Services
EMAIL: ciesin.info@ciesin.org
PHONE: 517-797-2727
FAX: 517-797-2622
ADDRESS:

CIESIN
2250 Pierce Road,
University Center, MI USA

KEYWORDS:

Census, United States, Demographics, Populations, Housing, PUMS.