DISC News contains articles about local, national, and international data issues.
It is published twice a semester by the library staff.
Editor: Joanne Juhnke, Special Librarian
Staff Contributors: Lu Chou, Senior Special Librarian; Benjamin Cowan, PhD candidate in Economics
“Census 2010: We can’t move forward until you send it back!” goes the exhortation from the U.S. Census Bureau, encouraging people to fill out and return their decennial census forms this year. The census data collection is already underway in some areas, kicked off in January in remote parts of Alaska. Most people will receive their census forms in the mail (or dropped off at their door in harder-to-count areas) sometime in March.
If you pay close attention, you’ll find dollar amounts attached to a lot of the conversation around Census 2010. One of the more impressive figures that the Census Bureau frequently cites is that upwards of $400 billion in federal funding across 140 programs is allocated based on formulas that make use of census-related data. (Included in the general term “census-related data” are the decennial census, the American Community Survey, and the Current Population Survey.) A December 2009 report from the U.S. Government Accountability Office, at http://tinyurl.com/GAO10263, focuses on the top ten among those programs. In 2008, the top ten programs totaled about $334.9 billion, representing about 73 percent of total federal assistance. In 2009, the same programs are estimated to total about $478.3 billion, including about $122.7 billion funded by the Recovery Act, and representing about 84 percent of total federal assistance. The top five of those ten programs and their 2009 estimates are:
- Medicaid: $266.6 billion
- Highway Planning and Construction: $54.1 billion
- State Fiscal Stabilization Fund-Education State Grants: $39.7 billion
- Title I Grants to Local Education Agencies: $24.5 billion
- Individuals with Disabilities Education Act Part B: $22.8 billion
The decennial census itself is an expensive undertaking. According to the Washington Post (http://tinyurl.com/wp17Feb2010), “This year’s national head count will be the most expensive in U.S. history, costing an estimated $14.7 billion. Roughly $7.4 billion will be spent in the fiscal year that ends in September, mostly on payroll and advertising budgets.” The advertising expenditures have drawn recent criticism, particularly an ad spot run during Super Bowl XLIV that cost $2.5 million. The Census Bureau, however, responds that for every one percentage point increase in the initial mail-back response rate, taxpayers save roughly $85 million in costs associated with door-to-door, in-person interviews. The Bureau also points to Census 2000, the first decennial census with a paid advertising budget, crediting the $100 million ad campaign with turning around the trend of declining response rate and saving at least $305 million. (Census Bureau press release, 7 February 2010, http://tinyurl.com/cb7feb2010)
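The Bureau's cost argument can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above ($85 million saved per percentage point, the $2.5 million ad spot, and the Census 2000 campaign's $100 million cost and $305 million in savings). The function and variable names are illustrative, and treating the savings per percentage point as constant is an assumption.

```python
# Figures quoted in the article, in millions of dollars.
SAVINGS_PER_POINT = 85.0      # saved per 1-point rise in mail-back response
SUPER_BOWL_AD_COST = 2.5
CENSUS_2000_AD_COST = 100.0
CENSUS_2000_SAVINGS = 305.0   # "at least", per the Census Bureau


def break_even_points(cost_millions, savings_per_point=SAVINGS_PER_POINT):
    """Percentage-point rise in response rate needed to repay a cost.

    Assumes savings scale linearly with the response rate.
    """
    return cost_millions / savings_per_point


def savings_per_dollar(savings_millions, cost_millions):
    """Dollars saved for each dollar spent."""
    return savings_millions / cost_millions


if __name__ == "__main__":
    print(f"{break_even_points(SUPER_BOWL_AD_COST):.3f}")
    print(f"{savings_per_dollar(CENSUS_2000_SAVINGS, CENSUS_2000_AD_COST):.2f}")
```

On these figures, the Super Bowl spot would pay for itself if it raised the mail-back response rate by about 0.03 percentage points, and the Census 2000 campaign returned at least $3.05 for each advertising dollar.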
In addition to the paid advertising, the Census Bureau continues to rely on civic-minded, unfunded efforts of organizations nationwide. UW-Madison is spreading the word through a partnership between University Communications, University Housing, and various student groups. Students are frequently undercounted; to help reverse that trend, students should be sure to look for and fill out their forms, whether in residence halls or off-campus housing. If a student is living in Madison on April 1, that’s where he or she should fill out the census! More information for students is available at http://2010.census.gov/campus/.
Census payroll dollars, flowing primarily to temporary census workers, make a noticeable economic impact. According to the New York Times (http://tinyurl.com/nyt19dec09), the 1.2 million temporary census-taking jobs—some of which have already come and gone—will amount to a $2.3 billion injection into the economy.
On the more personal side, ever since the Census Act of 1790, response to the decennial census has been mandatory and non-response punishable by fine. The fine established in the first Census Act was $20, an amount which would be equivalent to $500 today. The current fine could be seen as a relative bargain at $100, but in fact the Census Bureau has rarely attempted to collect such fines, preferring instead to promote positive messages.
In my research on labor markets and inequality, I have often used the IPUMS version of the March Supplement to the Current Population Survey (CPS). March CPS 1962-2009 is freely available online from the University of Minnesota Population Center, at http://cps.ipums.org/cps/, complete with thorough documentation and a user-friendly extraction system.
In order to facilitate comparisons over time, IPUMS has harmonized the CPS variables across the years. Variables have been harmonized not only over time but also with the variables from the decennial censuses and the American Community Survey, which is a great advantage. IPUMS has also created a number of additional useful variables, such as industry and occupation of employment based on the 1950 classification for all years.
In general, both the documentation and the variable harmonization are superb, at least for the variables I have used in my research. The codebook is clear, complete, and very user-friendly. The IPUMS documentation includes notes on sample sizes, as well as on problems the IPUMS project has found with the weights provided by the survey in some years and how it has fixed them.
Although the IPUMS version of the March CPS is a great resource, it does not include all the variables in the original March CPS. A researcher who needs any of the excluded variables will find them in the Unicon version of the March CPS, available through a CD-ROM subscription in the DISC library. (Editor’s Note: DISC has Unicon’s CPS Utilities for the March CPS on CD-ROM through 2006.) Also important, the IPUMS and Unicon versions of the March CPS do not fully agree in their treatment of variables; self-employment in the early years is a good example. In these cases it is up to the researcher to decide which treatment, if either, is correct.
Pablo Mitnik earned his PhD in Sociology at UW-Madison in 2009, and is currently a postdoctoral scholar at Stanford University’s Center for the Study of Poverty and Inequality.
DISC is pleased to announce that we have published the Catasto study (15th Century Census and Property Survey for Florentine Domains) in our BADGIR (Better Access to Data for Global Interdisciplinary Research) catalog. The Catasto study can be found in BADGIR at http://tinyurl.com/ycr86kf.
The Catasto study data were coded between 1966 and 1976 from the official manuscripts of the tax declarations (Campioni) for the city of Florence and environs (Florentine domains) from 1427 to 1429. Also included are 10% samples of the 1458 and 1480 declarations for Florence and the 1425 and 1502 declarations for Verona. The Catasto study has been part of our collection since 1981. It consists of 31 data files, with hierarchical data in two record types: economic and demographic. The economic record provides data on the entire household, while the demographic record lists information on individual members of the household. There is only one economic record per household but there can be more than one demographic record (0 to 5) per household depending on the number of members in the household.
DISC staff used SPSS to concatenate the 31 data files of the Catasto study and recode their location variables. Metadata were added to describe this concatenated file. Users can now access all 65,204 census and property survey records via a friendly user interface, without needing to address the complicated hierarchical file structure and create statements to read in the raw data. Catasto data can be downloaded in SPSS, SAS, Stata or delimited format from BADGIR.
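For readers curious about what reading a hierarchical file involves, here is a minimal Python sketch of the general technique: group each household's demographic records under its economic record, then flatten to one row per person. The layout it assumes (comma-delimited lines whose first field is `E` or `D`) is hypothetical and is not the actual Catasto file format, and the names `Household`, `read_households`, and `flatten` are invented for illustration. DISC staff did this work in SPSS, not with this code.

```python
from dataclasses import dataclass, field


@dataclass
class Household:
    economic: list                                # fields of the one economic record
    members: list = field(default_factory=list)   # zero or more demographic records


def read_households(lines):
    """Group demographic records under the preceding economic record.

    Hypothetical layout: comma-delimited lines whose first field is
    'E' (economic) or 'D' (demographic). Not the real Catasto format.
    """
    households = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec_type, *fields = line.split(",")
        if rec_type == "E":
            households.append(Household(economic=fields))
        elif rec_type == "D":
            if not households:
                raise ValueError("demographic record before any economic record")
            households[-1].members.append(fields)
        else:
            raise ValueError(f"unknown record type: {rec_type!r}")
    return households


def flatten(households):
    """Return one row per household member, household fields repeated.

    A household with no demographic records yields a single row holding
    only its household fields, so no household is dropped.
    """
    rows = []
    for h in households:
        for member in (h.members or [[]]):
            rows.append(h.economic + member)
    return rows
```

With made-up input such as `["E,H1,120", "D,H1,head,45", "E,H2,30"]`, `flatten(read_households(...))` gives one row for the member of household H1 and one household-only row for H2. The concatenated file in BADGIR spares users this step.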
BADGIR (http://nesstar.ssc.wisc.edu/webview/) is powered by the Nesstar software suite. It allows users to search, browse and analyze data online. Viewing documentation and univariate statistics in BADGIR does not require registration. However, users must be registered and receive an e-mail confirmation in order to analyze or download data using BADGIR. To register, first-time users can visit the BADGIR registration page (http://www.ssc.wisc.edu/cdha/badgir/terms.htm?submit2=Register).
The Inter-University Consortium for Political and Social Research (ICPSR) is now accepting applications for the 2010 Summer Program in Quantitative Methods of Social Research. The central component of the program takes place in two four-week sessions, held in Ann Arbor at the University of Michigan June 21-July 16, and July 19-August 13. The sessions include lectures and workshops on a wide variety of topics in research design, quantitative reasoning, statistical methods, and data processing.
The 2010 ICPSR Summer Program is also offering more than 20 three- to five-day workshops on both statistical and subject-specific topics throughout the summer. These shorter workshops are held in a variety of locations: Amherst, MA; Ann Arbor, MI; Bloomington, IN; Chapel Hill, NC; and New Haven, CT.
Complete information on course descriptions, fees, registration, instructors, and housing can be found at http://www.icpsr.umich.edu/sumprog/.
As part of the DISC campus-wide subscription to the Roper Center Public Opinion Archive, UW-Madison has access to the iPOLL question databank, allowing searches of around half a million public opinion survey questions going back to 1935. At the beginning of February, iPOLL began using a new interface, still at http://digital.library.wisc.edu/1711.web/ipolld. Instead of having to enter an email address before using iPOLL, users can now search and view individual question results without logging in.
However, in order to download datasets, users must complete a one-time free registration and then log in on subsequent visits. Even if you registered in the old system, you will still need to register in the new system. The “Register” link is in blue near the top of the iPOLL screen. Please let DISC staff know if you have any questions!
Crossroads Corner highlights web sites recently added to the searchable Internet Crossroads in Social Science Data on the DISC web site.
Business Monitor Online: Global Country Risk & Industry Analysis Service
Business Monitor Online’s Global Country Risk & Industry Analysis Service provides both reports and data covering country markets across Asia, Latin America, Europe, the Middle East and Africa. Reports include news analysis, political and economic risk assessment, macroeconomic forecasts, and country-specific analysis for 24 industry sectors. Though the reports include tables, these are generally not downloadable in manipulable form; the data download area of the site can be found in a right-hand menu bar labeled “Data and Forecasts: Compare and Export Data.” Users select geography (countries or country groups) and indicators (be sure to click on the blue triangles to drill down to individual indicators). The data, exported in Excel, include annual figures back to the early 1990s and forecasts several years beyond the present. An additional data-download topic is Risk Ratings, which appears when “Country Risk” or “Financial Markets” is chosen at the left of the screen.
UW-Madison subscribes to Business Monitor Online; current students, faculty and staff at UW-Madison can access the service at http://digital.library.wisc.edu/1711.web/businessmonitor.
World Government Data
The Guardian, a prominent British newspaper, has created a site that aims to be a gateway to government sites around the globe that carry freely downloadable data. This ambitious project, at http://www.guardian.co.uk/world-government-data, currently covers four English-speaking countries (U.S., UK, Australia and New Zealand) but plans to expand. The collection is keyword searchable and also browsable by 14 categories, the two largest of which are Health and Population.
The Guardian also carries a data blog, at http://www.guardian.co.uk/news/datablog, where every entry contains a link to the downloadable data and most entries end with a challenge, asking readers: what can you do with the data?
Data Resource Center for Child and Adolescent Health
The Data Resource Center for Child and Adolescent Health, at http://www.childhealthdata.org/, focuses on two surveys: the National Survey of Children’s Health (2007, 2003) and the National Survey of Children with Special Health Care Needs (2005/06, 2001). The query system enables users to drill down to specific indicators in each survey, presenting a table and graph that can then be modified to compare states or subgroups. State profiles are another way to approach the indicators; each profile displays a table of pre-selected indicators which can then be clicked to view results pages.
One of the site’s strengths is in educating users to find the data in a way that will be useful for advocacy. The Quick Guide to Data Query Topics under the How to use this site menu option includes “tips on choosing a starting point, searching the data, and telling a story with your results!” Users can save queries (free registration required) or e-mail them.
One option that does not appear readily available is a way to download the results tables in manipulable form. The datasets themselves cannot be downloaded directly from the site either, though instructions for acquiring the data are provided and many users—including academic researchers—can get the data for free. The site does include downloadable SAS and SPSS codebooks.