DISC News contains articles about local, national, and international data issues.
It is published twice a semester by the library staff.
Editor: Joanne Juhnke, Special Librarian
Staff Contributors: Lu Chou, Senior Special Librarian; Cindy Severt, Senior Special Librarian
On the door of the Data & Information Services Center (DISC) library hangs a sign clipped from an advertisement in an information-industry magazine. The clipping features an image of a shadowy human figure in motion behind a railing, with a tagline in red letters: "I am data."
The sign is appropriate for the door at DISC for several reasons. First, it carries a message along the lines of “Data R Us” – think of DISC when you’re in need of social science data! Second is the warning message of the original ad: data can slip through your fingers if you don’t protect it.
The human figure in the image conveys a few more important things about social science data. At the heart of it, social science data is about people, real human beings who deserve (and require) consideration when you analyze data about them. And, while the figure in the image is a real person, he or she is represented as a shadow. The person’s identity is hidden.
So it is with public-use data. The datasets are designed so as not to disclose the identities of the individuals whose data is involved.
At UW-Madison, the required consideration for the real people behind social science data is mediated by the Human Research Protection Program. The UW-Madison Institutional Review Boards (IRBs) are responsible for reviewing research involving human subjects — including research that uses previously-collected data. The Human Research Protection Program website at http://www.grad.wisc.edu/research/hrpp/ includes step-by-step instructions for requesting approval, otherwise known as submitting a protocol. The first step involves determining whether or not your project meets the specific regulatory definition of “human subjects research” at UW-Madison.
Because of the identity protection built into public-use data, the Human Research Protection Program at UW-Madison has a policy regarding secondary analysis of existing data. The policy is online at http://my.gradsch.wisc.edu/hrpp/10023.htm. Research involving public-use data from the following datasets and repositories does not require IRB approval:
- Better Access to Data for Global Interdisciplinary Research (BADGIR)
- Inter-University Consortium for Political and Social Research (ICPSR)
- National Center for Health Statistics
- National Center for Education Statistics
- National Election Studies
- Roper Center for Public Opinion Research
- University of Wisconsin-Madison Data and Information Services Center (DISC)
- U.S. Bureau of the Census
- Luxembourg Income Study (LIS)
- Integrated Public Use Microdata Series (IPUMS-International)
- Integrated Public Use Microdata Series (IPUMS-USA)
- Integrated Public Use Microdata Series (IPUMS-CPS)
- Medical Expenditure Panel Survey (Household Component)
The policy is a living, growing document; the last four items on the list are additions since last semester. The policy also represents exceptions rather than the rule. Anyone at UW-Madison planning research with already-existing social science data from a source other than those listed above, or planning to merge datasets, will need to seek IRB approval. Staff at DISC can help you sort through the human-subjects issues involving data, including restricted data, which contains confidential information and requires additional agreements and protections. The primary contact at DISC for human-subjects issues and restricted data is DISC Director Jack Solock.
More data, no subscription fee. That’s the message from the World Bank Group, announcing in April their new Open Data initiative, online at http://data.worldbank.org/. Their new Data Catalog offers free access to over 2,000 time series indicators on global development, for over 200 economies, with some data going back as far as 50 years.
UW-Madison has subscribed in the past to World Development Indicators (WDI), providing data back to 1960 on development-related variables covering education, health, poverty, environment, economy, trade, and more. WDI is now among the products available for free on the World Bank Data site, in combination with Global Development Finance (GDF), which covers debt and financial flows indicators.
The Data Catalog page, at http://data.worldbank.org/data-catalog, presents a list of World Bank data products now available for free. Click the blue Databank icon beside a data product to enter the database program and select countries, indicators, and time frame. Datasets in the Databank currently include:
- World Development Indicators
- Global Development Finance
- Africa Development Indicators
- Millennium Development Indicators
- Global Economic Monitor
- Education Statistics
- Enterprise Surveys
- Gender Statistics
- Health Nutrition and Population
- International Comparison Program
- Joint External Debt Hub
- Quarterly External Debt Statistics
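For readers who would rather script a download than click through the Databank, the World Bank also offers a web API, which is not covered in this article. The sketch below is an illustration under assumptions, not official guidance: it assumes the version 2 URL pattern at api.worldbank.org, the indicator code SP.POP.TOTL (total population), and a JSON response made of a metadata object followed by a list of rows. The function names are hypothetical, and the sample rows are placeholders, not real figures.

```python
from urllib.parse import urlencode

# Assumed API root; check the World Bank developer documentation before relying on it.
API_ROOT = "https://api.worldbank.org/v2"


def indicator_url(countries, indicator, start_year, end_year):
    """Build a request URL for one indicator, several countries, and a year range."""
    country_part = ";".join(countries)  # country codes are separated by semicolons
    query = urlencode(
        {"format": "json", "date": f"{start_year}:{end_year}", "per_page": 1000},
        safe=":",  # keep the year range readable instead of percent-encoding the colon
    )
    return f"{API_ROOT}/country/{country_part}/indicator/{indicator}?{query}"


def parse_series(payload):
    """Turn an assumed [metadata, rows] response into {(country, year): value}."""
    _metadata, rows = payload
    return {
        (row["country"]["value"], int(row["date"])): row["value"]
        for row in rows
        if row["value"] is not None  # skip years with no reported value
    }


# Placeholder payload in the assumed response shape; the values are made up.
sample_payload = [
    {"page": 1, "pages": 1, "total": 2},
    [
        {"country": {"id": "BR", "value": "Brazil"}, "date": "2009", "value": 1.0},
        {"country": {"id": "BR", "value": "Brazil"}, "date": "2008", "value": None},
    ],
]

url = indicator_url(["BRA", "IND"], "SP.POP.TOTL", 1960, 2010)
series = parse_series(sample_payload)
```

In real use, the text fetched from `url` would be decoded with `json.loads` and passed to `parse_series`. Building the URL and parsing the response are kept separate so each can be checked without a network connection.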
In addition to benefitting from these newly-free data resources, UW-Madison now subscribes to World Bank e-library, a searchable online collection of World Bank publications and working papers, online at http://digital.library.wisc.edu/1711.web/worldbank-elibrary.
The Inter-university Consortium for Political and Social Research (ICPSR) has expanded its thematic collections and its Social Science Variable Database. Campus users are invited to check out the rich content in these ICPSR collections.
Integrated Fertility Survey Series (IFSS)
The newly released IFSS data includes more than 90 sociodemographic variables from 10 fertility studies conducted over five decades, with over 71,000 respondents. These variables have been harmonized to make comparisons between the component surveys easier. IFSS also has an online analysis tool that uses Survey Documentation and Analysis (SDA). Searches can be done at the variable level using the SOLR system. SOLR searches multiple fields and displays the results in facets that allow users to quickly refine and expand the search results.
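For readers curious about what a faceted, multi-field search does, here is a minimal sketch in plain Python. It does not use SOLR or any ICPSR interface; the function name, field names, and sample records are all hypothetical and serve only to show the idea of matching a query against several fields and then counting the hits by category.

```python
from collections import Counter


def facet_counts(records, query, fields, facet_field):
    """Match `query` against several fields, then count the hits per facet value."""
    needle = query.lower()
    hits = [
        record
        for record in records
        if any(needle in str(record.get(field, "")).lower() for field in fields)
    ]
    # The facet is a tally of how many hits fall under each value of facet_field.
    facets = Counter(record.get(facet_field, "unknown") for record in hits)
    return hits, facets


# Hypothetical variable records; not taken from the IFSS collection.
records = [
    {"name": "AGEMARR", "label": "Age at first marriage", "study": "Study A"},
    {"name": "PARITY", "label": "Number of live births", "study": "Study A"},
    {"name": "MARSTAT", "label": "Marital status", "study": "Study B"},
]

hits, facets = facet_counts(records, "mar", ["name", "label"], "study")
```

A search interface would show the facet tallies beside the results, so that choosing one study narrows the list to that study's variables.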
National Addiction & HIV Data Archive Program (NAHDAP)
NAHDAP was created to acquire, prepare, and disseminate data resulting from research funded by the National Institute on Drug Abuse (NIDA). Researchers can use data from the NAHDAP archive to conduct secondary analysis on issues related to drug addiction, mental health, youth, crime, sexual behavior, and HIV infection.
Social Science Variable Database (SSVD)
ICPSR’s Social Science Variable Database uses structured variable documentation in XML, tagged according to Data Documentation Initiative (DDI) standards. Recently 300,000 new variables and 750 studies have been added to SSVD. Users can now search more than 1.5 million variables across the ICPSR collection. This means that SSVD now contains variables from over 2,000 studies, approaching 40 percent of ICPSR’s holdings.
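To give a sense of what structured variable documentation in XML looks like, the sketch below reads variable names and labels from a small DDI-style fragment. The fragment is a hand-written placeholder modeled on the DDI Codebook elements `var` and `labl`, not an excerpt from an ICPSR study, and the function name is hypothetical. Real DDI files normally declare an XML namespace, which this simplified example leaves out.

```python
import xml.etree.ElementTree as ET

# Hand-written placeholder in the style of a DDI codebook; not real ICPSR content.
SAMPLE = """<codeBook>
  <dataDscr>
    <var ID="V1" name="AGE"><labl>Age of respondent</labl></var>
    <var ID="V2" name="EDUC"><labl>Highest grade completed</labl></var>
  </dataDscr>
</codeBook>"""


def variable_labels(xml_text):
    """Return {variable name: label} for every <var> element in the document."""
    root = ET.fromstring(xml_text)
    return {
        var.get("name"): (var.findtext("labl") or "").strip()
        for var in root.iter("var")
    }


labels = variable_labels(SAMPLE)
```

Because every variable carries its name and label in the same tagged positions, a database can index more than a million variables across studies in a uniform way.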
Coordinated by the United Nations, the first-ever World Statistics Day will be commemorated on October 20, 2010. Emphasizing service, professionalism, and integrity, World Statistics Day is devoted to “acknowledging and celebrating the role of statistics in the social and economic development of our societies” (Ban Ki-moon, Secretary-General of the U.N.). Want to know what Tajikistan, Mongolia, and the Bahamas are doing to celebrate? Find them on Facebook, or go to the official site at http://unstats.un.org/unsd/wsd/ to find out!
How many times have you read the words “In a recent survey…” in a journal article and wondered exactly what survey was being discussed? So often we read the report or article based on a survey, but what about the underlying data itself? DISC staff can answer this and other questions about finding quantitative datasets to analyze for term papers, in a hands-on lab session for your class. Contact Cindy Severt, Senior Special Librarian (firstname.lastname@example.org) for more information or to schedule a session.
by Joanne Juhnke
Crossroads Corner highlights web sites recently added to the searchable Internet Crossroads in Social Science Data on the DISC web site.
County Health Rankings
Launched in February 2010 at http://www.countyhealthrankings.org/, County Health Rankings is a collaboration between the UW Population Health Institute and the Robert Wood Johnson Foundation. The rankings are designed to identify areas for potential improvement in community health, so that community leaders and state and local health departments can work on solutions.
The County Health Rankings site uses data from a variety of sources to create county rankings based on health outcomes and four types of health factors: health behaviors, clinical care, social and economic, and physical environment factors. Each county receives an overall summary rank within its state for health outcomes and health factors, downloadable in Excel and viewable in HTML or as a map graphic. The site also provides tables broken down by health factors and health outcomes, as well as reports specific to each county. For each of the measures used, the site provides a definition of the measure, where the data comes from, and why the measure is important.
International Encyclopedia of the Social Sciences
UW-Madison has subscribed to the online version of the International Encyclopedia of the Social Sciences, at http://digital.library.wisc.edu/1711.web/iess. The IESS online is the digital counterpart to the 17-volume print edition, originally published in 1968, with 3,000 new articles added for its re-release in 2008.
Disciplines covered include sociology, political science, economics, anthropology, psychology, and more. The encyclopedia includes entries on survey research and data methodology, including articles on specific major surveys such as the Survey of Income & Program Participation and the National Longitudinal Survey of Youth.
“Government spending at your fingertips” is the promise of USAspending.gov 2.0, a recently released update from the Office of Management and Budget at http://usaspending.gov/. The site features detailed information on federal funding awards, including recipients, locations, amounts, and types of awards.
The USAspending.gov site provides several different ways to search and browse the data. Most obvious is a single search box on the home page, and home-page links to popular search requests such as “Gulf Oil Spill Contracts.” An advanced-search page lets you specify search options such as fiscal year (back to 2000), agency, recipient location, performance location and more.
Site users can browse the data through Summaries or Trends. The Summaries section of the site provides overview information on federal spending by agency, recipient, or location. Each item of the summary can be clicked for more detailed information; for example, click a contractor name on the recipients page for an expanded summary of contracts. The Trends section allows users to explore spending over time categorized by location, contracting agency, assistance type, federal program, and type of spending.
One can also download data from the Data Feeds section of the site, either the raw data from the agencies or a user-defined dataset. The datasets are sometimes large; the site cautions that the 2009 contracts data from the Department of Defense is a full 2 gigabytes.
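A file of that size is easier to handle one row at a time than all at once. The sketch below is one way to total a dollar column by category while reading a CSV file as a stream, so that memory use stays small however large the download is. The column names `agency` and `amount` and the sample rows are hypothetical; the actual USAspending.gov downloads use their own column headings.

```python
import csv
import io
from collections import defaultdict


def total_by_column(lines, key_column, amount_column):
    """Stream CSV rows and sum amount_column for each value of key_column."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):  # reads one row at a time, never the whole file
        try:
            totals[row[key_column]] += float(row[amount_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing column or a non-numeric amount
    return dict(totals)


# Made-up sample standing in for a large download; hypothetical column names.
sample = io.StringIO(
    "agency,amount\n"
    "Agency A,10\n"
    "Agency B,7\n"
    "Agency A,5\n"
    "Agency B,not available\n"
)

totals = total_by_column(sample, "agency", "amount")
```

With a real download, the same function can be given a file opened with `open(path, newline="")` in place of the in-memory sample.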