CLIR PUBLICATIONS

Publication

Preserving the Whole: A Two-Track Approach to Rescuing Social Science Data and Metadata

Published
1999-06-01
Creators
Green, Ann; Dionne, JoAnn; Dennis, Martin

About

Focusing on the experience of the Yale University (Connecticut) social science data preservation project, this document presents a case study of migration as a preservation strategy. It explores options for migrating data stored in a technically obsolete format, along with the associated documentation stored on paper. The first section provides background and a project description, an overview of the Yale Roper Collection of public opinion research data sets and paper records, and a summary of the literature search. The second section describes the nine steps of the data track: identify equipment; copy files from mainframe-based media to local hard disks; examine documentation; define the column binary format; develop standard variable-naming classifications; read in data with SAS (Statistical Analysis System) and SPSS (Statistical Package for the Social Sciences); identify migration formats; recode data files with SAS; and create spread ASCII data files without recoding. The third section addresses the documentation track, including software and equipment, TextBridge Pro optical character recognition software, PDF (Portable Document Format) files from Adobe Capture, and HTML and SGML/XML marked-up files. The fourth section presents findings and recommendations, including a user evaluation, findings about data and documentation conversion, and recommendations to data producers. A glossary is included, and support documents are appended.

Files