DISC News contains articles about local, national, and international data issues.
It is published twice a semester by the library staff.
Editor: Joanne Juhnke, Special Librarian
Staff Contributors: Lu Chou, Senior Special Librarian; Cindy Severt, Senior Special Librarian
(Visit our PDF edition as well!)
The new combination of the “short-form only” Census 2010 and the ongoing American Community Survey (ACS) is delivering results. Between the Census 2010 data products that are now being released, and the ACS data for 2010 emerging between September and December this year, we will soon have the full suite of data to replace the data products of Census 2000.
A decade ago, as the Census 2000 data was being released, census data users were eagerly anticipating the once-a-decade new data. Census 2000 had the traditional “short-form” 100% counts of how many people there were and where they lived, along with basic demographic information such as age, sex, race and Hispanic origin. A sample of one in six households also received a “long-form” survey that covered characteristics such as income, employment, length of commute, education, detailed housing characteristics, and disabilities. However, the ten-year wait between surveys left the nation with data that had grown very stale by the time the next decennial census came around.
The ACS, launched nationwide in 2005, has now taken the place of the long form sample data, with about one in every 40 households receiving the ACS form on a rolling basis in any given year. For areas with populations of 65,000 or more, the Census Bureau has produced 1-year ACS estimates every year since, including the 2010 1-year estimates that were released on September 22 and are the first ACS estimates to be released in the new American FactFinder interface at http://factfinder2.census.gov/. The ACS 1-year estimates represent an average across the entire year, as opposed to the single-point-in-time collection of the decennial census.
To adequately represent geographies with populations smaller than 65,000, the ACS is also producing 3-year estimates for areas with populations larger than 20,000, and 5-year estimates for small geographic areas. The 3-year estimates, each representing an average across three years of data collection, have been produced annually since 2007, and the first set of 5-year estimates was released in December 2010. Because they are released annually, the estimates are refreshed far more often than the previous “long-form” data; however, they do not capture a snapshot in time the way the previous approach did. The 5-year estimates, for example, by definition contain at least some 5-year-old data collected at the beginning of the 5-year cycle.
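The population thresholds and multi-year averaging described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the cutoffs are taken as stated in this article (65,000 or more for 1-year estimates, larger than 20,000 for 3-year estimates), and the sketch assumes that 5-year estimates are published for every area regardless of size.

```python
def acs_products_for(population: int) -> list[str]:
    """Return the ACS estimate series an area of this size would receive.

    Thresholds follow the article's wording; treating the 5-year series
    as available everywhere is an assumption of this sketch.
    """
    products = []
    if population >= 65_000:
        products.append("1-year")
    if population > 20_000:
        products.append("3-year")
    products.append("5-year")  # assumed to be published for all areas
    return products


def collection_years(final_year: int, span: int) -> list[int]:
    """List the calendar years pooled into a multi-year estimate."""
    return list(range(final_year - span + 1, final_year + 1))


print(acs_products_for(250_000))
print(acs_products_for(30_000))
print(acs_products_for(4_500))
print(collection_years(2010, 5))
```

A 5-year estimate ending in 2010, for instance, pools data collected from 2006 through 2010, which is why it cannot serve as a single-year snapshot.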
The now-annually-recurring ACS data is joined this year by the decennial Census 2010 data. Census 2010 redistricting data files were provided to the states in February and March 2011, with a national summary file released in April. Demographic profiles for the states were released in April and May, and the full Summary File 1 data (covering age, sex, households, families, the population in group quarters, and housing units for smaller geographies) was released over the summer of 2011. Coming next will be Summary File 2, beginning in December, with population and housing characteristics for detailed race, ethnic, and tribal categories. The Census 2010 data release schedule is online at http://www.census.gov/population/www/cen2010/glance/.
The U.S. Department of Health and Human Services (HHS) has called for public comments on major proposed revisions to the federal regulations on protections for human research subjects. The proposal appeared in the July 25 Federal Register as an Advance Notice of Proposed Rulemaking (ANPRM), under the title “Human Subjects Research Protections: Enhancing Protections for Research Subjects and Reducing Burden, Delay, and Ambiguity for Investigators.”
The revisions would be the first major overhaul of federal regulations on human research subjects, often referred to as the Common Rule, since 1991.
UW-Madison’s Office of Research Policy plans to submit formal comments to HHS by the deadline of October 26, 2011. UW-Madison faculty or staff interested in commenting on the proposal should send responses via e-mail to email@example.com by 5pm on Tuesday, October 25 for inclusion in the joint UW-Madison response.
Several of the proposed revisions would have implications for social science data. The proposal calls for the creation of new data security and information protection standards, including bringing the Common Rule into closer alignment with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule used in the health sciences. Also proposed is a new category of Institutional Review Board oversight called “excused,” replacing the current category of “exempt.”
More information about the ANPRM is online at http://www.hhs.gov/ohrp/humansubjects/anprm2011page.html, including the full text and a tabular summary of the proposal. The page also includes a link for submitting comments directly to HHS.
The ANPRM is the first step in a multi-part process. The next step will be a Notice of Proposed Rulemaking (NPRM), which will afford another opportunity for public comment.
Eight New Data-Driven Learning Guides
Eight new Data-Driven Learning Guides (DDLGs) have been added to ICPSR’s Online Learning Center at http://www.icpsr.umich.edu/icpsrweb/OLC/, bringing the total number to 46. Guides on a wide range of topics, including Homelessness, Attitudes About Gun Control, Partisanship, and Altruism, are uniformly packaged for teaching data literacy to undergraduates. Each DDLG is designed as a lesson plan with a goal (e.g. to explore citizens’ perceptions of ethics in politics through partisanship), a dataset to be analyzed, a method of analysis (cross tabulation, summary statistics), an interpretation and summary of results, and a bibliography for further exploration.
2011 marks the 10th anniversary of the September 11th terrorist attacks, and ICPSR’s data holdings provide a unique opportunity to analyze public opinion and the impact of the attacks ten years later. Some of the surveys ask the same question over multiple years (“How often do you think about Sept. 11?”) while others address specific issues such as assessing the level of security in large shopping malls. In addition to searching across ICPSR, users can also search or browse the Terrorism & Preparedness Data Resource Center, housed within ICPSR at http://www.icpsr.umich.edu/icpsrweb/TPDRC/, for 9/11-related data.
ICPSR Turns 50
ICPSR recently celebrated its 50th birthday with a reception at the American Sociological Association’s annual meeting in Las Vegas and the launch of a web site (http://www.icpsr.umich.edu/icpsrweb/ICPSR/fifty/) recounting the history of the Consortium. Features of the new web site include a personal history written by former interim Director Erik Austin, who retired in 2006 after 41 years at ICPSR; video interviews with several current and former Consortium leaders; staff profiles; and a list of upcoming receptions at various professional meetings. Let the celebration continue!
BADGIR (Better Access to Data for Global Interdisciplinary Research), our Nesstar-based online data archive at http://nesstar.ssc.wisc.edu/index.html, received a major overhaul this summer. BADGIR was migrated to a virtual server with 8 GB of RAM and the latest Intel Xeon processor, running Windows Server 2008 R2 (64-bit OS). The advantages of this virtual server include failover protection, faster and easier restoration in case of failures, and easier allocation of additional memory and processing power.
In addition, the Nesstar software suite that powers BADGIR was upgraded to version 4.0. The Nesstar WebView interface in this new version is cleaner with some changes in organization, and navigation and searches are improved. The search window now remains open and the search terms are highlighted in the results. Metadata of the studies in BADGIR are now searchable by Google. Help pages are no longer on our server but directly linked to the Nesstar company site. To help our users quickly get familiar with the version 4.0 WebView interface, we have created four tutorial videos linked from the “Hints and Helps” page.
Please contact DISC if you would like to add your studies to BADGIR. We can work with you to document your social science datasets according to the DDI (Data Documentation Initiative) standard and distribute them from our BADGIR archive.
On October 25th DISC librarian Cindy Severt will be conducting a hands-on workshop about finding and analyzing datasets as part of the UW-Madison Libraries Graduate Support Series. Though open to all members of UW, this workshop is geared toward the general reference librarian with no statistical analysis experience. The workshop will take place from 4-5:30pm in Memorial Library, Room 231.
Crossroads Corner highlights web sites recently added to the searchable Internet Crossroads in Social Science Data on the DISC web site.
ED Data Express
In August the US Department of Education announced a major upgrade to ED Data Express, http://www.eddataexpress.ed.gov/, an online initiative to present user-friendly K-12 education data to the public. The site includes data from EDFacts, Consolidated State Performance Reports (CSPR), State Accountability Workbooks, the National Center for Education Statistics (NCES), the National Assessment of Educational Progress (NAEP), the College Board, and the Education Department's Budget Service office.
ED Data Express has three main sections: a State Snapshots page, a Data Element Explorer, and a Build a State Table page. The State Snapshots page includes ready-made charts and tables with key data for each state. The Data Element Explorer presents tools for users to interact with the data on the site (graphs, tables, maps, trend lines). The Build a State Table page lets users build customized tables by choosing specific data elements and states. Before entering any of these sections, users are directed to a page on appropriate use of the data, particularly regarding comparisons across states, and must indicate that they have read the information before proceeding.
Kenya Open Data
“Our information is a national asset, and this site is about sharing it,” declares the Kenya Open Data project at http://www.opendata.go.ke/. With the site’s release in July 2011, Kenya became the first sub-Saharan country to launch a national open-data initiative. Operated by the Kenyan government in partnership with the World Bank, the site carries data at both the national and sub-national level. The collection of over 160 datasets includes the 2009 census, national budget data, national and county public expenditure data, and data on health care and school facilities.
Though the project is aimed at the people of Kenya first and foremost, anyone can make use of Kenya Open Data. Data can be freely downloaded, as well as displayed online in customizable maps, charts and tables.
BadgerStat, online at http://badgerstat.org/, is a non-partisan organization that offers online data-driven performance measures of Wisconsin for citizens and policymakers. The site is organized by policy areas, such as agriculture, economy & business, transportation, and workforce. Within each policy area is a growing collection of BadgerStat briefs, each taking a topic such as “state government employment” or “adult obesity.” Each brief explains why the topic is important, then presents quick bulleted facts, details (with charts), and links to data sources.
Human Fertility Database
The Max Planck Institute for Demographic Research, in partnership with the Vienna Institute of Demography, hosts the Human Fertility Database (HFD) at http://www.humanfertility.org/. The HFD currently covers 21 countries and includes detailed data on births, unconditional and conditional fertility rates, cohort and period fertility tables, total fertility rates, mean ages at childbearing, and parity progression ratios. Historical coverage varies by country (the oldest data, for births in Sweden, goes back to 1891), and the project continues to add updates, with plans to add more countries in the future. The main page notes that at this point, the site contents are still considered preliminary and intended primarily for testing and evaluation. Free registration is required to access data.