DISC News contains articles about local, national, and international data issues.
It is published twice a semester by the library staff.
Editor: Joanne Juhnke, Special Librarian
Staff Contributors: Lu Chou, Senior Special Librarian
Census Day, April 1, has come and gone, and so has the April 16th deadline for mailing your ten-question form so that no Census employee will need to ring your doorbell. The Census Bureau has continued to encourage people to “take 10” (or fewer) minutes to answer the ten questions and mail in the form; however, each passing day increases the chance that the enumerator will come by before the form has been processed. Wisconsin has been leading the nation in mail participation rate--see “Mapping Census 2010 Returns.”
Once you’ve sent in your form or spoken with an enumerator, however, the process continues at a breakneck pace.
The next step involves the Decennial Response Integration System (DRIS), a contract between the Census Bureau and Lockheed Martin. A recent article in the Washington Post (4/19/2010) outlined what happens next: Lockheed collects the forms, scans the responses, and assembles the data that are eventually returned to the Census Bureau for further processing.
The completed forms come in to one of three processing centers: Baltimore, Phoenix, and Jeffersonville, IN. Within 48 hours of receipt, mail sorting machines read the barcodes on the forms so that Lockheed can let the Census Bureau know which addresses have returned their forms. From this information, the Bureau can determine where to send enumerator staff to knock on doors.
Once the barcodes are processed, the forms are scanned for content, using a specialized formula comprising optical character recognition, mark recognition, and additional algorithms. When necessary, employees examine forms by hand. Lockheed has hired around 5,000 temporary employees to staff the data centers, plus 8,000 additional employees in 11 contact centers nationwide for telephone assistance and follow-up.
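The barcode check-in step described above amounts to a set difference: addresses that were sent a form, minus addresses whose returned form has been scanned, leaves the addresses an enumerator may need to visit. The following is a minimal sketch of that idea only. The function name and the address identifiers are hypothetical placeholders, and nothing here reflects the actual DRIS software or its data formats.

```python
def follow_up_addresses(mailed, checked_in):
    """Return address IDs that were sent a form but have no check-in scan.

    Both arguments are iterables of address identifiers. The identifiers
    used below are hypothetical placeholders, not real census address IDs.
    """
    return sorted(set(mailed) - set(checked_in))


# Hypothetical example: four addresses were mailed forms, two came back.
mailed = ["ADDR-001", "ADDR-002", "ADDR-003", "ADDR-004"]
checked_in = ["ADDR-002", "ADDR-004"]

print(follow_up_addresses(mailed, checked_in))
```

A set is used because the order of check-in does not matter and a form scanned twice should count only once.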
The Census Bureau is required by law to deliver population counts by state to the President within nine months of Census Day, by December 31, 2010. These counts determine how many seats each state receives in the US House of Representatives.
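This article does not describe how the state counts are turned into seats. Since the 1940 census, the 435 House seats have been apportioned using the method of equal proportions: every state first receives one seat, and each remaining seat goes to the state with the highest priority value, computed as the state's population divided by the square root of n(n + 1), where n is the number of seats the state already holds. The sketch below illustrates that method under simplifying assumptions. The function name is our own, and the three "states" and their populations are hypothetical, not census figures.

```python
import heapq
from math import sqrt


def apportion(populations, seats):
    """Allocate seats by the method of equal proportions.

    populations: dict mapping state name to apportionment population.
    seats: total number of seats to distribute (435 for the US House).
    """
    if seats < len(populations):
        raise ValueError("need at least one seat per state")

    # Every state is guaranteed one seat.
    allocation = {state: 1 for state in populations}

    # heapq is a min-heap, so priority values are stored negated.
    heap = [(-pop / sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)

    # Hand out each remaining seat to the state with the top priority value.
    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)
        allocation[state] += 1
        n = allocation[state]
        heapq.heappush(heap, (-populations[state] / sqrt(n * (n + 1)), state))

    return allocation


# Hypothetical populations for three made-up states sharing ten seats.
example = apportion({"A": 1000, "B": 500, "C": 100}, 10)
print(example)
```

With real data, the same function would be called with the 50 state apportionment populations and 435 seats.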
The next major data deadline is imposed by Public Law (P.L.) 94-171, an act passed in 1975 that requires the Census Bureau to provide certain data to state governments for the purposes of redistricting, no later than a year after Census Day. The P.L. 94-171 data provides counts of the total population for a variety of geographic areas, along with race, ethnicity, voting age population, and housing unit data.
Data products such as demographic profiles, summary files, and additional reports will be released as completed between April 2011 and September 2013. The online gateway for these data at the Census Bureau is the American FactFinder interface, at http://factfinder.census.gov/.
As the files become available through the Census Bureau, other organizations gather, add value to, and distribute the data further. Examples include:
- ICPSR - http://www.icpsr.umich.edu/
- IPUMS USA - http://usa.ipums.org/usa/
- Missouri Census Data Center - http://mcdc2.missouri.edu/
- Geolytics - http://www.geolytics.com/ (DISC has the Geolytics Census CDs and Neighborhood Change Database)
Where do you get information on current events? Do you read daily newspapers, visit news websites, watch TV, and/or follow Twitter?
To monitor the pulse of current events and to learn how the public consumes them, the Pew Research Center’s Project for Excellence in Journalism (PEJ) uses empirical methods to study content in both mainstream media (print, network TV, cable TV, and radio, in their traditional formats as well as on their websites) and new media (blogs, YouTube, and Twitter). PEJ publishes two weekly indexes at its Journalism.org website to summarize the most-covered events of the previous week: the News Coverage Index covers the traditional mass media, and the New Media Index follows blogs, YouTube, and Twitter. In addition, the Campaign Index tracked the presidential campaign from January 1 to November 3, 2008, and the Talk Show Index looks at the topics covered on talk and opinion shows on cable and radio.
Anyone interested in analyzing and monitoring the change in news coverage can check out these four News Coverage Index data sets from PEJ: 2007 News Coverage Index Data Set, 2008 News Coverage Index Data Set, 2008 Additional Content Studies, and 2008 Campaign Coverage Index Data Set. They are freely downloadable from http://www.journalism.org/by_the_numbers/datasets/ after you fill out a simple form and agree to the terms of data usage.
TeachingWithData.org
The Inter-University Consortium for Political and Social Research (ICPSR) has expanded its instructional resource offerings through the Teaching With Data program, at http://www.TeachingWithData.org/.
Through the Teaching With Data web site, the program offers annotated links to data-driven teaching materials primarily aimed at the undergraduate level, though the site-wide search tool includes a K-12 option. Classroom resources include lessons and lectures, exercises and modules, syllabi and reading lists. Data resources include both tabular and downloadable data, data-based maps, and links to various data archives. Tools for analysis, visualization and course development are highlighted as well. Users can browse the site by discipline: anthropology, economics, environmental sciences, geography, history, political science, public policy, social work, and sociology. A “Data in the News” feature links the site to current events.
TeachingWithData.org is a partnership between ICPSR and the Social Science Data Analysis Network (SSDAN), both at the University of Michigan. The project is funded by the National Science Foundation.
NCAA Student-Athlete Experiences Data Archive
The National Collegiate Athletic Association (NCAA) has for many years been collecting data from its member institutions and student athletes. These data are used to inform athletics policies at a national level and also to provide answers for stakeholders in the collegiate athletics enterprise: college administration, athletics staff, students and faculty, and others in higher education. ICPSR has partnered with the NCAA to archive and distribute data through the NCAA Student-Athlete Experiences Data Archive, on the ICPSR site at http://www.icpsr.umich.edu/icpsrweb/NCAA/.
The first study at the new archive is the NCAA Division I Academic Progress Rate, 2009. The data includes team-level Academic Progress Rates, eligibility rates, retention rates, and penalty and award information on Division I student-athletes from the 2003-2004 season through the 2007-2008 season. This dataset is available to ICPSR subscribers both for download and for online analysis.
Several datasets are slated to be added to the archive in 2010. Individual-level data on the experiences of current and former student-athletes will be available from the Growth, Opportunities, Aspirations and Learning of Students in college study (GOALS) and the Study of College Outcomes and Recent Experiences (SCORE). Also forthcoming in 2010 is the Graduation Success Rate Public-Use Dataset.
Wisconsin is #1! At least, we’ve been running in first place when it comes to Census 2010 mailback participation, with an 80% rate as of April 22.
The Census Bureau has encouraged the competitive urge with their “Take 10 Map,” a partnership with Google Maps at http://2010.census.gov/2010census/take10map/. “Take 10” refers to the ten questions on the census form, and the ten minutes (or less!) it takes to respond. The map displays a daily update of the percentage of forms mailed back by the households that received them, otherwise known as the mail participation rate. The main display of the site features the top states and cities but also allows users to view “how they’re doing” by counties or by incorporated places and the tracts within them.
For example, a local view zooming in toward the 53706 ZIP code displays the census tracts around campus. The census tract that includes Eagle Heights boasts a rate of 75%; the tract for central campus shows a rate of 73%; and the tract covering State/Langdon has a rate of 63%. Census forms were distributed in dormitories on April 12, and enumerators began door-to-door work on campus that same week. While personal visits from census workers don’t start until May 1 in most areas, campuses got an earlier start due to the May departure of many students as the semester ends.
The Take 10 map site also allows for text-file downloads of the mail participation rate as of the current day. As of April 22, Wisconsin has yet to match its mail participation rate of 82% from the year 2000—but we’re getting close!
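For readers who download the figures, the mail participation rate is simply the number of forms mailed back divided by the number of forms delivered, expressed as a percentage. The short sketch below shows that calculation and compares the rates quoted in this article. The function name is our own, and the layout of the Census Bureau's downloadable text file is not assumed here, so the numbers are entered by hand.

```python
def mail_participation_rate(returned, delivered):
    """Percentage of delivered census forms that were mailed back."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    if not 0 <= returned <= delivered:
        raise ValueError("returned must be between 0 and delivered")
    return 100.0 * returned / delivered


# Rates quoted in this article, in percent.
campus_tracts = {"Eagle Heights": 75, "Central campus": 73, "State/Langdon": 63}
wisconsin_2010 = 80  # as of April 22, 2010
wisconsin_2000 = 82  # final rate for Census 2000

# Rank the campus tracts from highest to lowest rate.
ranked = sorted(campus_tracts.items(), key=lambda item: item[1], reverse=True)
gap_to_2000 = wisconsin_2000 - wisconsin_2010

for name, rate in ranked:
    print(f"{name}: {rate}% ({rate - wisconsin_2010:+d} points vs. state)")
print(f"Wisconsin trails its 2000 rate by {gap_to_2000} points")
```

The same comparison could be repeated each day with freshly downloaded rates to follow the trend.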
Please note: DISC will be closed
- Friday May 21: State Furlough Day
- Monday May 31: Memorial Day
- Monday July 5: Independence Day
- Monday September 6: Labor Day
Crossroads Corner highlights web sites recently added to the searchable Internet Crossroads in Social Science Data on the DISC web site.
AidData: Tracking Development Finance
The AidData site, at http://aiddata.org/, offers a portal for data “describing the universe of development finance project-by-project, including all grants and loans committed by all major bilateral and multilateral aid donors.” This project, a partnership between Development Gateway, the College of William & Mary, and Brigham Young University, was unveiled in March at the Aid Transparency and Development Finance Conference at University College in Oxford, UK.
The core of the AidData site is a searchable catalog with close to a million records of development finance flows, which can be filtered by donor, recipient, purpose, activity, and years. Results can be viewed online or exported for analysis. Much of the data comes from the OECD Creditor Reporting System (CRS), going back to 1973. Other data was collected from donor organizations, either through public documents or through direct contact. So far, AidData covers only funding that originates from governments, but the project plans to cover funds from private foundations and non-governmental organizations (NGOs) in the future.
The site also includes research publications from AidData, a blog by AidData staff, and links to other sources of development data.
StatPlanet is an interactive tool for data visualization and mapping that can be used either online at http://www.sacmeq.org/statplanet/ or as a free educational download. StatPlanet is a project of the Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ), which conducts large-scale educational surveys across its 15 member countries. The Flash-based StatPlanet online tool includes variables from the first two SACMEQ surveys (1995 and 2000), along with a range of other world development indicators from a variety of sources, for users to create interactive graphs, maps and charts. The StatPlanet application and data can also be downloaded for computers running Windows 95 and newer.
In addition to the main StatPlanet tool, the site also offers the downloadable tools StatPlanet MapMaker and Graph Maker, which allow users to import their own data and publish interactive maps or graphs online in Flash.
The SACMEQ parent site, at http://www.sacmeq.org/, also includes application documents for researchers to request access to the full data from the SACMEQ studies.
SEDLAC—Socioeconomic Database for Latin American Countries
The SEDLAC database, online at http://www.depeco.econo.unlp.edu.ar/sedlac/eng/index.php, brings together data from household surveys in Latin America and the Caribbean, focusing on poverty and other social variables. The site is a joint project of the Center for Distributional, Labor and Social Studies (CEDLAS) of the University of La Plata in Argentina, and the World Bank’s Latin America and the Caribbean Poverty and Gender Group.
The database incorporates over 200 surveys from 25 countries, presenting annual figures at the national and subnational level in Excel spreadsheets, text briefs, and maps. Users can browse the site by thematic category, or create their own tables using pull-down menus. A “Statistics by Gender” area of the site, updated December 2009, breaks out the data by gender and also presents gender inequality variables.