U. of Minnesota and Ancestry.com Announce Plans to Create Massive Database of 1940 Census Material

April 2, 2012

From the University of Minnesota and Ancestry.com:

A collaboration between the University of Minnesota and Ancestry.com will create the largest database of detailed information about people and their households ever made available for scientific research. The National Archives and Records Administration today released images of the enumeration manuscripts from the 1940 Census of Population. The Minnesota Population Center at the University of Minnesota will leverage a substantial investment by Ancestry.com in digitizing information on the entire population of the United States.


The database will include all of the information collected on the 132 million Americans recorded in the Census of 1940. The project will involve transcription of 7.8 billion keystrokes of data describing the demographic and economic characteristics of all individuals, families, households, and group quarters present in the United States in 1940. This database will be an extraordinary new resource for economists, demographers, geographers, epidemiologists, other social science and health researchers, and the general public.


Capturing 100 percent of the U.S. population recorded in the census, the 1940 database will be significantly larger than any other census dataset created for social science and health research. Such datasets normally include only a 1 to 10 percent sample of the population, and many studies are hindered by these small samples. The new database will allow much richer studies of small populations in 1940, such as Dust Bowl migrants to California, Native Americans, and working mothers with young children.

Researchers will also be able to link recent economic and health surveys and mortality records to the 1940 database. These linkages will allow researchers to study the impact of early life conditions—including socioeconomic status, parental education, and family structure—on later health and mortality. In addition to individual and family information, the database will provide contextual information on childhood neighborhood characteristics, labor-market conditions, and environmental conditions.


All numerically coded fields in the database will be made freely available to the scientific community and the public. Data and documentation will be distributed through the Integrated Public Use Microdata Series (IPUMS) data access system (www.ipums.org). The IPUMS data access system pioneered web-based distribution of large-scale datasets, and the Minnesota Population Center continues to innovate at the cutting edge of information technology. The system offers capabilities for navigating database documentation, defining datasets, constructing customized variables that capitalize on the individual and household information in the census, and adding neighborhood information.

The project will be supported by grants from the National Science Foundation, the National Institute on Aging, and the Eunice Kennedy Shriver National Institute of Child Health and Human Development. The project also benefits from investments and support by the National Archives and Records Administration and the U.S. Census Bureau.

