Please note: Older issues of the newsletter are likely to contain
broken links -- the newsletter is presented here "as published."
DPLS News contains articles
about local, national, and international data issues.
It is published twice a semester by the library staff.
Editor: Joanne Juhnke,
Associate Special Librarian
Contributors: Lu Chou, Senior Special Librarian & Cindy Severt, Senior Special Librarian
The 2000 Census was a marvel of tabulation technology, with state-of-the-art processes for reading hand-marked paper forms. But unpublicized amid the raft of mailings and enumerators' visits was an alternative for short-form recipients: visit the Census Bureau web site and fill out the form online. Around 60,000 individuals chose this new option.
Online survey research carries a host of potential benefits. It can be speedy and relatively inexpensive. Responses can be funneled directly into databases, bypassing the need for data entry. Audio and video survey components are becoming increasingly possible, and the technology is still new enough that some respondents are attracted by the very novelty of answering survey questions online.
The downsides are notable too, however. Internet use is far from universal, and users represent a skewed subset of the general population (more likely to be male, Caucasian, wealthier, highly educated). Even within the Internet population, there is no comprehensive directory and no real e-mail equivalent to random-digit dialing. Online data collection may be vulnerable to spamming or multiple automated responses from a single user. And unscientific surveying, from CNN's admittedly self-selected QuickPolls to misuse of easily available survey software, can cast doubt on the credibility of all online surveys.
In spite of all the drawbacks, online surveying is clearly a force to be reckoned with, from both a marketing and a social science perspective. Creative techniques can minimize the impact of the challenges of the electronic survey.
One approach to the lack of universal penetration of the technology is to focus the surveys on populations that tend to be online. University students and faculty, for example, tend to be highly connected, as do groups such as journalists and programmers.
Another option is to recruit participants from the larger population by traditional methods and then provide the technology to answer the online surveys. One company called Knowledge Networks provides WebTV equipment and free Internet access to individuals in exchange for participation in survey panels. The speedy and broad-based responses to such surveys make the Knowledge Networks product attractive for media and marketing research.
To answer the charge regarding the overall credibility of online surveys, it is important to remember that the new technology has no monopoly on poor methodology. Biased or poorly stated questions, bad sample selection, and twisted conclusions can be inflicted on any survey technology. As experience with online survey technology grows, best practices will be even easier to identify.
Surveys do dot the Net, and perhaps by 2010 the U.S. Census will be primarily online as well!
For links to conferences, firms, software, and bibliographies regarding online survey research, visit http://www.websm.org/.
About midway through each semester the staff at DPLS typically field daily requests for data ranging from why Minnesotans voted for Jesse Ventura to whether or not physical appearance influences one's job prospects and earning potential.
While DPLS staff are happy to track down data for users, we also encourage users to find sources on their own. To that end we will gladly tailor a "How to Find Data" presentation to any class that will be using datasets as part of its course work.
It always pays to look before you leap when diving off the data deep end. Here are a few of the types of questions to consider when starting off on the quest for the perfect dataset.
- Do you want a snapshot (cross-sectional data), or a motion picture (longitudinal data)?
- Do you want the unit of analysis to be, for example, individuals with an average income of $50,000, or households with an average income of $50,000?
- Do you want microdata (responses from individuals), or aggregate data (data that has been summarized from microdata)?
- Do you want data on, for example, the escalation of violence in high schools, or the incidence of handguns in high schools?
Also keep in mind that the data you need may not be addressed by a single study; it may be necessary to combine information from two or more datasets.
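As a rough illustration of two of the questions above (microdata versus aggregate data, and the choice of unit of analysis), here is a minimal Python sketch. The records, field names, and function names are all invented for the example and do not come from any DPLS dataset; the point is only that the same microdata can give different averages depending on whether individuals or households are the unit.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical microdata: one record per individual (all values invented).
MICRODATA = [
    {"person_id": 1, "household_id": "A", "income": 50000},
    {"person_id": 2, "household_id": "A", "income": 30000},
    {"person_id": 3, "household_id": "B", "income": 50000},
    {"person_id": 4, "household_id": "C", "income": 20000},
]


def household_incomes(records):
    """Summarize individual records into one total income per household."""
    totals = defaultdict(int)
    for record in records:
        totals[record["household_id"]] += record["income"]
    return dict(totals)


def average_income_by_unit(records):
    """Return the average income using each unit of analysis."""
    individual_avg = mean(record["income"] for record in records)
    household_avg = mean(household_incomes(records).values())
    return {"individual": individual_avg, "household": household_avg}


if __name__ == "__main__":
    print(household_incomes(MICRODATA))
    print(average_income_by_unit(MICRODATA))
```

With these invented records, the household totals are a small aggregate table built from the microdata, and the average income per household comes out higher than the average per individual, which is why the unit of analysis is worth settling before choosing a dataset.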
This summer Senior Special Librarian Cindy Severt taught a four-week summer school course for the School of Library and Information Studies. LIS839 Special Collections: Data Libraries introduced library school students to the world of data librarianship (a world of acronyms) with the intention of presenting methods and principles that are pertinent to any field of librarianship. The class was divided into four modules (What is the Collection?; What is Done With the Collection?; Managing the Collection; and Data Services Policy) and included lectures, lab exercises, guest speakers, and much give and take of ideas.
The guest speakers were a highlight of the course because the material they presented bridged the gap between a data collection as an assortment of machine-readable files and the final policy, product, or output that results from using the collection. To non-social scientists this was an eye-opening experience. Eric Grodsky, Ph.D. candidate in sociology, talked about how he is using education data in his dissertation. Charlie Fiss, Information Manager of the Center for Demography & Ecology (CDE), demonstrated how to manipulate data with statistical packages to illustrate the obvious disparity between 1990 and 2000 redistricting data for the community of Packwaukee. CDE Data Librarian Jack Solock uncovered data sources on the Internet, and David Long, Associate Researcher from the Applied Population Lab, demonstrated the use of geographic information systems (GIS) to transform numeric data into maps.
The most rewarding aspect of all was the students themselves, who were exceptionally participatory, inquisitive, and sharp. If they are any indication, the future of data librarianship - of librarianship in general - is in good hands.
Below are some of the new studies DPLS has added to its data collection:
- CPS Utilities, March 1962-2000
- CPS Utilities, October 1968-1999
- Current Population Survey, February 2000: Displaced Workers, Employee Tenure, and Occupational Mobility Supplement
- Current Population Survey, August 2000: Internet and Computer Use Supplement
- Data on Women and Crime
- General Household Survey: Time Series, 1973-1982 [Great Britain]
- General Social Survey, 1972-2000 [Cumulative File]
- Global Development Finance, 2001
- OECD Social Expenditure Database, 1980-1997
- Statistical Abstract of the United States, 2000
- World Development Indicators, 2001
Data on Women and Crime is a topical CD-ROM from ICPSR. It contains 55 data collections on crime and females of all ages. The topics cover classic family violence studies; court responses to violence against women; female offenders; police responses to domestic violence; sex offenders; victimization; and victim advocacy.
OECD Social Expenditure Database, 1980-1997 provides a unique tool for monitoring trends and analyzing changes in aggregate social expenditure, covering 28 OECD countries. Data are classified under 13 social policy areas.
Statistical Abstract of the United States, 2000 is a broad-based source of statistics on the social, political, and economic organization of the United States. This CD-ROM contains over 1,500 tables from over 250 different governmental, private and international organizations.
While DPLS users rarely see our systems administrator, DPLS could
not function without the skills and expertise of a computer guru!
This summer we threw a tailgate farewell party for graduating senior Ed Niles, who provided DPLS with computer support for four years. A Computer Science major and trombone player in the UW Marching Band, Ed not only kept the machines running smoothly but also shepherded our systems through several major transitions. We wish him well in his new endeavors.
Fortunately we are able to welcome sophomore Jamie Voight to DPLS! A native of McFarland, Jamie is majoring in Computer Engineering at UW and also plays trombone in the UW Marching Band. Although he often works after hours, please be sure to greet him if your paths cross.
UW-Madison has recently subscribed to a broad-based new service from the Organisation for Economic Cooperation and Development (OECD). The service is named SourceOECD, and the address is http://www.sourceoecd.com/. Campus users can also register for off-campus access.
SourceOECD contains a library of OECD publications available as PDF documents, and twenty-four statistical databases. The databases include sources that have been available from DPLS on CD-ROM, such as Education at a Glance and some portions of the OECD Statistical Compendium. More databases are scheduled to be added in the future.
The results of database queries are available in Excel, CSV, and Beyond 20/20 formats.
The Incarceration Atlas is a special report from the Mother Jones website, at http://www.motherjones.com/prisons/atlas.html.
The site includes an interactive U.S. map with figures back to 1980 for overall incarceration rates, drug offenders, racial breakdown, and comparison of education spending versus prison spending. Data is available for download in text format.
Accompanying articles are also provided, examining such issues
as how the U.S. Census Bureau counts prisoners.
The FedScope database comes from the Office of Personnel Management and contains data on federal employment. The data covers hirings (accessions) and departures (separations) from 1996 to 2000. The data is organized into cubes that allow examination of three data elements simultaneously, from a list of thirteen: agency, location, Metropolitan Statistical Area, occupation, occupational category, gender, age, length of service, and more. Results can be charted or graphed online, or exported in Excel or PDF format.
Also available are pre-packaged reports for some of the more popular queries.
FedScope can be found online at http://www.fedscope.opm.gov/index.htm.
From the Bureau of Transportation Statistics (BTS) comes this basic handbook on statistical practice. The document explains that since the BTS is a relative newcomer in comparison to other federal statistical agencies, it has drawn on other agencies' expertise to outline best practices. The issues covered include defining errors in data; analyzing data; presenting data; and documenting data quality.
Appendices cover suggested readings and links to other related web sites.
The BTS Guide can be found at http://www.bts.gov/statpol/btsguide.html.
Also from the BTS comes the Intermodal Transportation Database, a beta version of a broad collection of transportation data from various federal sources such as the Department of Transportation and the Census Bureau.
The site also contains a list of links to additional transportation data sites.
The Intermodal Transportation Database can be found at http://www.itdb.bts.gov/.