***************************************
* Enhanced Version of Migration Files *
***************************************

The Special Tabulation Product (STP28) files (county-to-county
migration tally, abbreviated format) are available from the US Bureau
of the Census via its FTP server. The files may be retrieved by
establishing an anonymous ftp session with the Bureau's server (ftp
ftp.census.gov; log in as user ftp; supply your email address when
prompted for a password; change directories to /pub/stp).

The files are posted by the Census Bureau as PKZIP self-extracting
executable files containing one data file per state, one documentation
file, one data file containing marginal totals, and one file
presenting county-state names and FIPS codes.

The structure of these files may cause problems in processing. For
example:

- The files cannot be processed in mainframe environments before
  decompression, which may require users to transfer the files twice:
  once to the PC for decompression, and then up to the mainframe.

- There is a single data file per state, and it holds data on
  in-migration to that state only (in addition to within-state
  migration data).

- The table matrices for the race and race/not Hispanic breakdowns are
  grouped together in the same data file (P1 Race of Person: white,
  black, asian, and other; and P2 Spanish Origin of Person: not
  spanish, mexican, puerto rican, cuban, and other).

- The P1 and P2 table matrices each contain only a single count field,
  for the number of persons who moved between 1985 and 1990.

From the data and the file structure it becomes evident that:

- Calculating net migration, which includes the flow of persons from
  the state of interest to other states, requires searching the data
  files of all other states. This involves handling 50 self-extracting
  executable files and requires a fair amount of free disk space to
  process.
  For example, the data file for Pennsylvania has no information about
  how many persons moved from Pennsylvania to New York, Ohio, or
  California during the period of 1985 to 1990.

- Similarly, large amounts of data need to be processed even if only
  race of person or Spanish origin of person is needed (i.e., only
  access to the P1 or P2 table matrix is desired).

Henk Meij of CIESIN (Saginaw, MI) and John Blodgett of the University
of Missouri St. Louis (MO SDC) have created an alternative set of
files containing the same basic data but in a restructured format. The
primary advantage of the new format is that data on both in- and
out-migration for a specific geographic area are easier to obtain. The
enhancements involve separating the table matrices, adding count
fields for both in and out moves to each table matrix, and creating a
third file for net-migration flows. In all files, information about
migration flows in both directions is stored on a single record for
ease of processing. The files are available via anonymous FTP and
Mosaic at CIESIN (see below for details).

The format of these enhanced files is described in detail below. There
are 3 separate files for each state.

o The "p1" and "p2" files contain all the data from the original
  census STP28 files, but the data by race (p1) and by Hispanic origin
  (p2) have been separated into individual state level files. The p1
  and p2 files have been enhanced by creating two count fields per
  record instead of one: the POPIN field counts persons moving into
  the county (COUNTY) from another county (COUNTY2), while POPOUT
  counts persons moving in the opposite direction. The header record
  type is gone, and the two county codes appear in every record, a
  tradeoff that costs a little more storage but makes subsetting the
  files much more convenient.

o The third state level file, "tf", contains the "total flows" of
  persons for all states (into the county, out of the county and
  within county moves).
  The total flows are reported for all counties within a state, and
  all counties outside of the state which were sources or destinations
  of moves involving the state of interest. This file was created by
  aggregating the more detailed data in the P1 file.

The layout of these files is quite simple. The following SAS (r)
statements can be used to read each of the files. The codes for the
stratifier fields are identical to those used by the Census Bureau
(consult the codebook).

   /* read tf record */
   INPUT COUNTY $1-5 COUNTY2 $6-10 POPIN POPOUT;

   /* read p1 record */
   INPUT COUNTY $1-5 COUNTY2 $6-10 RACE $11 SEX $12 NATIVITY $13
         POVSTAT $14 EDUC $15 AGE $16-17 POPIN POPOUT;

   /* read p2 record: same as p1, with SPANISH in place of RACE */
   INPUT COUNTY $1-5 COUNTY2 $6-10 SPANISH $11 SEX $12 NATIVITY $13
         POVSTAT $14 EDUC $15 AGE $16-17 POPIN POPOUT;

Note that the two population count fields (POPIN and POPOUT) are read
as "free form" instead of fixed-length fields. These fields are only
as long as their values require, with one leading blank as delimiter.
The variables may be labeled as follows (again a SAS (r) syntax
example):

   /* assign labels */
   LABEL POPIN =' IN-MIGRATION COUNTY2 TO COUNTY, 85-90'
         POPOUT='OUT-MIGRATION COUNTY TO COUNTY2, 85-90' ;

Examples of the data are presented next for the three files mentioned
above, involving Adair County, MO (29001) and St. Clair County, IL
(17163), from the MO files:

   /* p1 file */
   29001171631111305 5 0
   29001171631113204 7 0
   29001171631113304 6 0
   29001171631113305 8 0
   29001171631211205 4 0
   29001171632111304 9 0
   29001171632112406 0 5

   /* p2 file */
   29001171631111304 9 0
   29001171631111305 5 0
   29001171631112406 0 5
   29001171631113204 7 0
   29001171631113304 6 0
   29001171631113305 8 0
   29001171631211205 4 0

   /* tf file */
   2900100000 11641 0     Total number of persons in county(1)
   2900129001 4719 4719   Total moves within county 29001.
   2900117163 39 5        Total moves between counties in example.
   1716329001 5 39        Same record but from IL file.

In this example the "total flows" record indicates that 39 persons
moved into Adair from St.
Clair, while 5 persons moved to St. Clair from Adair(2).

Notes:

1- A value of '00000' in the 2nd county field indicates data
   pertaining to persons over 5 who resided in the county and did not
   move between 1985 and 1990 (this is undocumented information and
   needs verification by Bill Frey and/or the Census Bureau).
2- If there are no moves between two counties, there will be no record
   for that county pair.
3- A move in one direction with none in the opposite direction is
   indicated by a '0' in either the POPIN or the POPOUT field.

One additional type of file had to be created to generate the total
flows files: the interstate migration files, or "im" files. Their data
structure is identical to that of the original Bureau of the Census
files. The "im" files contain all migration from a state of interest
to all other states. In other words, by combining the "im" file for a
particular state with the original data file for that state, the p1,
p2 and tf files may be created. An example of these data involving the
counties mentioned above is presented next.

   /* im file */
   01716329001
   12112406 5
   21112406 5

To obtain these files from CIESIN, follow the instructions below for
an anonymous ftp session. Alternatively, the files may be retrieved
directly via Mosaic.

FTP Retrieval:

   ftp ftp.ciesin.org      (if this naming convention yields the error
                            "host unknown", try ...)
   ftp 160.39.8.201
   Name: ftp               (log in as user ftp)
   Password: shere_khan@jungle.book.org
                           (email address as password)
   ftp> cd /pub/census     (change directory to archive)
   ftp> binary             (turn binary mode on, needed for *.zip files)

From this point on, informative messages (the readme files) will be
echoed to the screen whenever you change into a directory. Descend
into the directories "usa" and "stp" to find the subdirectories "p1",
"p2" and "tf". You may retrieve as many files as you like (note:
unfortunately the ftp server does not currently support on-the-fly
decompression, so only compressed (ZIP) files can be retrieved.
If you need an unzip binary for your platform or require the source
code, retrieve the appropriate file from /pub/census/src).

Mosaic Retrieval:

Load the URL

   http://www.ciesin.org/

into Mosaic, follow the links "Data Access", "Dataset Guides" and
"US Demography", and connect to the anonymous ftp archive of census
data products. Or load the demography home page directly:

   http://www.ciesin.org/datasets/us-demog/us-demog-home.html

When you click on the link depicting the anonymous FTP server, you may
either traverse the directories yourself ("browse this archive") or
select the files you want to retrieve from the appropriate section.
When a file is selected, Mosaic will prompt you for a local
path/filename combination under which to save the retrieved file.

##################################################

If you have questions about the logistics of accessing these files at
CIESIN, you can send an e-mail message to Henk Meij (hmeij@ciesin.org)
or to ciesin.info@ciesin.org. If you have questions about the content
of the files (understanding the data, not how to do custom
applications!), you can send an e-mail message to John Blodgett
(c1921@umslvma.umsl.edu).
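
##################################################

For users who do not work in SAS, the record layout described earlier
for the p1 and tf files can also be read with a few lines of a
general-purpose language. The following is only a minimal sketch in
Python, not part of the distributed files: the function names
(parse_p1, parse_tf, total_flows) are hypothetical, the column
positions are taken from the SAS INPUT statements, and treating net
migration as POPIN minus POPOUT is an assumption made for
illustration. The sample records are the Adair County / St. Clair
County p1 records shown earlier.

```python
from collections import defaultdict

# Column positions follow the SAS INPUT statements (1-based, inclusive),
# converted here to 0-based Python slices. All names are illustrative.
P1_FIELDS = {
    "county": (0, 5),
    "county2": (5, 10),
    "race": (10, 11),
    "sex": (11, 12),
    "nativity": (12, 13),
    "povstat": (13, 14),
    "educ": (14, 15),
    "age": (15, 17),
}


def parse_p1(line):
    """Split one p1 record into fixed-width codes and free-form counts."""
    rec = {name: line[a:b] for name, (a, b) in P1_FIELDS.items()}
    popin, popout = line[17:].split()[:2]
    rec["popin"], rec["popout"] = int(popin), int(popout)
    return rec


def parse_tf(line):
    """Split one tf record; any text after the two counts is ignored."""
    popin, popout = line[10:].split()[:2]
    return {"county": line[0:5], "county2": line[5:10],
            "popin": int(popin), "popout": int(popout)}


def total_flows(p1_records):
    """Sum POPIN and POPOUT over all p1 records of each county pair."""
    totals = defaultdict(lambda: [0, 0])
    for rec in p1_records:
        pair = (rec["county"], rec["county2"])
        totals[pair][0] += rec["popin"]
        totals[pair][1] += rec["popout"]
    return {pair: tuple(t) for pair, t in totals.items()}


# p1 sample records for Adair County, MO (29001) and
# St. Clair County, IL (17163), copied from the example above.
P1_SAMPLE = """\
29001171631111305 5 0
29001171631113204 7 0
29001171631113304 6 0
29001171631113305 8 0
29001171631211205 4 0
29001171632111304 9 0
29001171632112406 0 5
""".splitlines()

records = [parse_p1(line) for line in P1_SAMPLE]
flows = total_flows(records)
popin, popout = flows[("29001", "17163")]
net = popin - popout  # assumption: net migration = in minus out
print(popin, popout, net)  # 39 5 34
```

The aggregated counts (39 in, 5 out) agree with the tf record
"2900117163 39 5" in the example, which is consistent with the tf file
having been built by aggregating the p1 data.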