The PUMS are computer-accessible files that contain records for a sample of housing units, with information on the characteristics of the units and the people in them. Within the limits of the sample size and geographic detail, you can use the PUMS to complete several types of tabulations.
The 1940 data collection was assembled through a collaborative effort between the United States Bureau of the Census and the Center for Demography and Ecology of the University of Wisconsin. The 1940 Census Public Use Sample Project was supported by the National Science Foundation under Grant SES-7704135.
All data released by the U.S. Bureau of the Census are subject to strict confidentiality measures imposed by U.S. legislation (Title 13, U.S. Code), and census data can be used for statistical purposes only. To protect the confidentiality of respondents, the PUMS are extracted from the census data in a manner that avoids disclosure: any information that would identify a household or an individual is excluded.
Microdata records identify no geographic area with fewer than 100,000 inhabitants. Microdata samples include only a small fraction of the population, drastically limiting the chance that the record of a given individual is even contained in a PUMS file, much less identifiable.
The datafile is documented in a codebook containing a data dictionary and supporting appendix information.
This data collection consists of 20 independently drawn subsamples, stored in 20 separate physical files.
The 1940 dataset files are rectangular ASCII files with no newline characters separating the records, so each record must be located by its fixed length. The files are arranged by subsample, with each subsample stored as a separate physical file. The 20 subsamples were selected randomly. Within each of the 20 subsamples, records are sequenced by state.
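Because the files contain no newlines, a line-oriented reader will not split them into records. One possible approach is to read a fixed number of bytes at a time. The sketch below illustrates this in Python; the function name iter_records and the record length used in the demonstration are hypothetical, and the actual record length must be taken from the codebook.

```python
import io
from typing import BinaryIO, Iterator


def iter_records(stream: BinaryIO, record_length: int) -> Iterator[str]:
    """Yield fixed-length ASCII records from a stream that has no newlines.

    record_length is an assumption supplied by the caller; the real value
    comes from the codebook, not from this sketch.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    while True:
        chunk = stream.read(record_length)
        if not chunk:
            break  # clean end of file
        if len(chunk) != record_length:
            raise ValueError("file ends with a partial record")
        yield chunk.decode("ascii")


# Demonstration with made-up data and a made-up record length of 4.
demo_stream = io.BytesIO(b"AAAABBBBCCCC")
demo_records = list(iter_records(demo_stream, record_length=4))
```

With a real subsample file, the stream would be opened in binary mode, for example open(path, "rb"), so that no newline translation is applied.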
For example, to extract all of the records for a particular state, you would read through all 20 physical files and select that state's records from each of the 20 subsamples. Record types are ordered within households: household characteristics come first, "sample line" records second, and person records last.
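The state extraction described above can be sketched as a single pass over every subsample file. This is a minimal illustration, not the Bureau's procedure: the function name select_state, the record length, and the column positions of the state code are all hypothetical, and the sketch assumes that every record carries the state code in the same columns. If only the household record carries it, the state would need to be carried forward to the records that follow.

```python
import io
from typing import BinaryIO, Iterable, Iterator


def select_state(
    streams: Iterable[BinaryIO],
    record_length: int,
    state_cols: slice,
    state_code: str,
) -> Iterator[str]:
    """Yield records for one state from every subsample stream.

    record_length and state_cols are assumptions; take the real values
    from the codebook's data dictionary.
    """
    for stream in streams:
        while True:
            chunk = stream.read(record_length)
            if len(chunk) < record_length:
                break  # end of this subsample (a trailing partial record is ignored)
            record = chunk.decode("ascii")
            if record[state_cols] == state_code:
                yield record


# Demonstration with two made-up "subsamples" of 6-byte records whose
# first two columns hold a made-up state code.
subsample_a = io.BytesIO(b"01aaaa02bbbb")
subsample_b = io.BytesIO(b"02cccc03dddd")
state_02 = list(
    select_state([subsample_a, subsample_b], 6, slice(0, 2), "02")
)
```

Since records are sequenced by state within each subsample, a reader could also stop scanning a file once the block for the requested state has ended; the full scan is kept here for simplicity.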
Information from the census was derived either from questions asked of the entire population or from questions asked of only a sample of the population. Those questions asked about every person and housing unit are called 100-percent or short-form questions. The others are called sample or long-form questions.
Those households receiving the short-form questionnaires were asked only the 100-percent questions, and those receiving the long form were asked both the 100-percent questions and the sample questions. Sampling rates vary depending on geographic location and population size.
PUMS data contain a sample of the individual long-form census records showing most population and housing characteristics with identifying information removed.
The coding system varies for each census, so it is important to have access to the codebook for each census in order to assess the meaning of a specific field in a census record and its comparability across censuses. Very little comparability exists between geographic identifiers on each of the previous files, but housing and population characteristics are similar.
The sample questionnaires were edited for completeness and consistency, and substitutions or allocations were made for any missing data. Allocation flags appear at the end of each record to indicate when an item has been allocated. A user wishing to tabulate only actually observed values can use these flags to exclude allocated values.
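As an illustration of using allocation flags to keep only observed values, the sketch below filters one item by its flag. The function name observed_values, the column positions, and the flag code that means "not allocated" are assumptions made for the example; the codebook defines the actual positions and codes.

```python
from typing import Iterable, Iterator


def observed_values(
    records: Iterable[str],
    value_cols: slice,
    flag_col: int,
    not_allocated: str = "0",
) -> Iterator[str]:
    """Yield an item's value only from records where it was not allocated.

    value_cols, flag_col, and the meaning of the flag code are
    hypothetical; consult the codebook for the real layout.
    """
    for record in records:
        if record[flag_col] == not_allocated:
            yield record[value_cols]


# Made-up 4-character records: columns 1-2 hold the item's value and
# column 3 holds its allocation flag ("0" = observed, "1" = allocated).
demo = ["A250", "B311", "C170"]
kept = list(observed_values(demo, slice(1, 3), 3))
```

Filtering item by item, rather than discarding whole records, keeps a record available for tabulations of its other, observed items.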
For additional information concerning particular subject matter topics on the files, contact Population Division, (301) 763-7962, or Housing Division, (301) 763-2873, Bureau of the Census, Washington, D.C. 20233.