DPLS News, November 2002

DPLS News contains articles about local, national, and international data issues.
It is published twice a semester by the library staff.

Editor: Joanne Juhnke, Associate Special Librarian
Contributors: Lu Chou, Senior Special Librarian, Jay Dougherty, Library Assistant, & Cindy Severt, Senior Special Librarian

Table of Contents

"So You Want to Use Restricted Data..."
Census Product Update
DPLS Holiday Closings
Researcher's Notes
ICPSR by Proxy
New Studies at DPLS
IPUMS International: Census Microdata Worldwide
Data Web Site Suggestions?

Internet Corner

Association of Research Libraries Statistics
U.S. Census Resources on the Web
The Human Life-Table Database
No Child Left Behind

"So You Want to Use Restricted Data...."

(Note: Text appearing in italics can be found on the UW Protection of Human Subjects in Research web site http://www.rsp.wisc.edu/humansubs/index.html)

You’ve found the perfect dataset: it’s rich, it has the right geographic breakdown, it’s socially relevant... and it’s restricted. Someone — you don’t know who — has to sign off on a use agreement, and a red flag suddenly appears. First of all, relax. Secondly, pay attention to the red flag.

It is ironic that lowered technical barriers to data have sometimes been offset by increased restrictions on access, imposed to protect human subjects. Is human subject protection actually increasing? Or are more restricted datasets simply making an appearance in the public arena? The University of Wisconsin–Madison is guided by the ethical principles set forth in the report of the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research entitled Ethical Principles and Guidelines for the Protection of Human Subjects of Research, also known as The Belmont Report. UW–Madison has pledged that the institution and all investigators will follow the US Department of Health and Human Services (DHHS) regulations for protecting human research subjects. In practice, this means that institutional mechanisms are in place to protect human subjects (their physical safety, confidentiality, etc.), and researchers must adhere to them. Noncompliance with these regulations could result in a loss of federal funding, which could in turn jeopardize further research.

The good news is that the vast majority of data requested by DPLS users is exempt from institutional review. Research projects involving secondary data set analysis will NOT require prior IRB (Institutional Review Board) approval if the data set has been preapproved by the UW–Madison IRBs as indicated by posting on the following list:
· Inter-University Consortium for Political and Social Research (ICPSR)
· Data & Program Library Service (DPLS)
· Center for Demography and Ecology (CDE)
· U.S. Bureau of the Census
· National Center for Health Statistics
· National Center for Education Statistics
· National Election Studies

DPLS recommends that social science researchers familiarize themselves with the policies and procedures for conducting human subject research (including secondary analysis) at UW. The Protection of Human Subjects in Research web site (http://www.rsp.wisc.edu/humansubs/index.html) is a well-organized resource that covers nearly every aspect of human subject review, including application forms and procedures and an online tutorial on investigator obligations.
Be sure to contact DPLS if you have any questions about restricted data, or data you think might be restricted. Our goal is for users to spend more time analyzing data than getting approval to analyze it.

Census Product Update

The U.S. Bureau of the Census is continuing to release data products from the 2000 census on a flow basis. With the recently completed state-by-state release of Summary File 3, only a few major data products remain to be released.
Summary File 3 presents long-form sample data in 813 detailed tables covering social, economic, and housing characteristics. Summary File 3 tables and maps are available through American FactFinder at http://factfinder.census.gov/ or for FTP download at http://www2.census.gov/census_2000/datasets/Summary_File_3/.

Products still awaiting release include Summary File 4 (detailed for many race and ethnicity/tribal/ancestry categories); Public Use Microdata Sample files; and the Congressional District Data Summary File for the redistricted 108th Congress. A product schedule is available at http://www.census.gov/population/www/censusdata/c2kproducts.html.

DPLS Holiday Closings

Happy Holidays! DPLS will be CLOSED:

Thu./Fri. Nov. 28 & 29, for Thanksgiving.

Tue./Wed. Dec. 25-26, for Christmas.

Tue./Wed. Dec. 31-Jan. 1, for New Year's.

Researcher's Notes

by Adam Signatur

I am a Project Assistant in the La Follette School of Public Affairs and an inexperienced user of the data library. Recently, while attempting to retrieve some data from the National Longitudinal Survey of Youth (NLSY), I came upon an interesting sort of problem. The files that I was using to extract data on some pre-determined variables were incompatible with the most current release of the data.

DPLS staff worked with me to get to the bottom of the problem. We tried several approaches, including digging up previous versions of the data that were archived in the library, changing the file extensions of the extraction files, and opening the files to examine their contents. None of these worked on its own, but eventually we zeroed in on the solution. In the end, updating the files was a simple matter of creating new files with the proper file extensions and pasting the lists of necessary variables into the files.
In talking with NLS User Services, I learned that extraction file extensions are often changed with new releases of the data, but that the files themselves remain the same. In future releases, they hope to keep file extensions the same to make them somewhat easier to use. They generally prefer that users work with the most current release of the data because, over time, errors and inconsistencies are corrected.

The temporary difficulty that I encountered in extracting data highlights the need for data programs and files to remain compatible with current technology. In this case, the extraction files and data releases were created for the DOS environment and could not be run on available computers. Fortunately, updating them to the Windows environment was a simple process.

Editor’s Note: More information about the National Longitudinal Surveys is available online at http://www.bls.gov/nls/home.htm.

ICPSR by Proxy

As you may be aware, data from ICPSR has been much easier to retrieve in the past year than previously. Thanks to a service called ICPSR Direct, UW-Madison campus users can download data directly without going through DPLS librarians. All it takes is a free registration and a campus computer or WiscWorld connection and you’re in!
But did you know that students, faculty and staff at UW-Madison can use ICPSR Direct even from a non-campus Internet service provider (ISP)? Like many other UW library resources, ICPSR Direct is available through the library proxy server, EZproxy. With EZproxy, members of the UW-Madison community can log in to restricted library resources from non-campus Internet connections.

A link to ICPSR via EZproxy is available on the DPLS home page (http://dpls.dacc.wisc.edu); the direct link is http://ezproxy.library.wisc.edu/login?url=http://www.icpsr.umich.edu/.
For instructions on how to use the proxy server, visit http://www.library.wisc.edu/help/remote/.
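
For readers who keep their own lists of bookmarks, the direct link above can be rebuilt in a few lines of Python. This is only a sketch: the login prefix is copied from the ICPSR link given here, the helper name `proxied` is hypothetical, and whether any other address will actually work depends on which resources the library has configured for the proxy.

```python
from urllib.parse import urlsplit

# Login prefix copied from the direct ICPSR link given in this article.
EZPROXY_LOGIN = "http://ezproxy.library.wisc.edu/login?url="


def proxied(target_url: str) -> str:
    """Return an EZproxy login link that forwards to target_url.

    Hypothetical helper for illustration; it only assembles the link and
    does not check that the library proxies the target site.
    """
    parts = urlsplit(target_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute web address: {target_url!r}")
    # The target is appended unchanged, matching the form of the ICPSR link.
    return EZPROXY_LOGIN + target_url


print(proxied("http://www.icpsr.umich.edu/"))
```

The check on the address simply guards against pasting in a bare host name such as icpsr.umich.edu, which would produce a link the proxy could not follow.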

New Studies at DPLS

· 2001 Religion and Public Life Survey
· National longitudinal surveys of labor market experience, youth cohort: 1979-1998 [male fertility file].
· PollingReport.com and The Polling Report subscription
· Scientists and engineers in the United States, 1993.
· Scientists and engineers in the United States, 1995.
· Scientists and engineers in the United States, 1997.
· Scientists and engineers in the United States, 1999.

IPUMS International: Census Microdata Worldwide

Since 1998, United States census microdata from 1850 to 1990 has been available through IPUMS, the Integrated Public Use Microdata Series at the University of Minnesota. The project pulled together microdata produced both by the U.S. Census Bureau and other researchers, brought uniform coding and formatting to the data, and provided an online extraction system.

IPUMS has now expanded its scope worldwide with the first release of IPUMS International in May 2002. IPUMS International began by making an inventory of international census microdata around the globe, and has begun to make the data available online. The initial release includes samples from Colombia, France, Kenya, Mexico, the United States, and Vietnam. A second release, slated for March 2004, is expected to include Brazil, China, Ghana, Hungary, and Spain. Additional negotiations are ongoing.

Both IPUMS USA and IPUMS International can be accessed at http://www.ipums.umn.edu/. A free online registration is required, in which users describe their research and agree to terms of usage including the over-arching IPUMS motto: “Use it for GOOD – never for EVIL.”

Data Web Site Suggestions?

DPLS staff are always on the lookout for good data-related web sites to add to the Internet Crossroads, our annotated searchable guide to social science data on the web. Are your favorite bookmarks included? We’d love to hear about your data-related web “finds.” Just send an e-mail to disc@mailplus.wisc.edu with the URL and a sentence or two about why the site is a favorite of yours. If you haven’t visited Crossroads lately, take a look at http://dpls.dacc.wisc.edu/newcrossroads/. Crossroads has gone through some major upgrades in the past year!

Internet Corner

Association of Research Libraries Statistics

The Association of Research Libraries (ARL) presents a web site related to academic library statistics, hosted at the University of Virginia libraries. The interactive database interface allows users to produce rankings of member institutions by various criteria; generate summary statistics across member institutions; and extract and download data in comma-delimited format.

The ARL has collected library statistics since 1961/62. Current statistics include data on collections, staffing, expenditures, library services, and library and university characteristics. The site also carries the historical Gerould statistics for sixty institutions, going back as far as 1907 in some cases, presented in HTML table format.

The ARL Statistics site is available at http://fisher.lib.virginia.edu/arl/index.html.

U.S. Census Resources on the Web

Beth Harper, a Government Documents Reference librarian at UW-Madison, compiled this online bibliography of web sites regarding the U.S. Census. The site is divided into three color-coded categories: blue represents statistics about people, green represents statistics about economics, and yellow indicates general statistical sources. Users can also browse the sites within each category. The general listing is arranged in alphabetical order.

The U.S. Census Resources guide may be reached at http://www.library.wisc.edu/guides/govdocs/census/sitelist.htm.

The Human Life-Table Database

The Human Life-Table Database provides population life tables covering a number of countries and years, describing to what extent a particular cohort dies off with age. Many of the tables provided on this site are official data from national statistical offices, while others represent non-official work by other researchers. The tables, both complete and abridged, are available in ASCII text format and in PDF. Links to mortality databases are also included on the site.
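
For readers new to life tables, the idea of a cohort "dying off with age" can be sketched in a few lines of Python. The death probabilities below are invented purely for illustration and are not taken from the database; the starting cohort of 100,000 and the function name `survivors` are likewise assumptions made for the example.

```python
# Hypothetical death probabilities for five successive age intervals.
# Invented for illustration only; NOT values from the Human Life-Table Database.
q = [0.010, 0.002, 0.005, 0.020, 0.100]

RADIX = 100_000  # assumed starting size of the cohort


def survivors(death_probs, radix=RADIX):
    """Return the number alive at the start of each interval, plus the final count."""
    alive = [float(radix)]
    for qx in death_probs:
        # Those who reach the next interval are the ones who did not die in this one.
        alive.append(alive[-1] * (1.0 - qx))
    return alive


for start, l_x in enumerate(survivors(q)):
    print(f"interval {start}: {l_x:,.0f} alive")
```

Published tables typically carry more columns than this, but each row is built on the same step: apply the interval's death probability to those still alive.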

The Human Life-Table Database is a joint project of the Max Planck Institute for Demographic Research (Germany), the Department of Demography at the University of California at Berkeley, and the Institut national d’études démographiques (France).

The Human Life-Table Database is available at

No Child Left Behind

The No Child Left Behind (NCLB) site was published by the White House in conjunction with President Bush’s 2002 education proposal of the same name. The site features an easy-to-read version of the bill as well as links leading to further detail.

The statistics page for the NCLB site includes a few chart graphics from The Nation’s Report Card 2000 by the National Center for Education Statistics. At the very bottom of the page is a link to a 7.6 MB downloadable research file in Access 2000 format containing funding allocations for the 2002-2003 school year, by state, school district, and congressional district.

The No Child Left Behind statistics page is at http://www.nclb.gov/next/stats/index.html.

