About the HELIOS Project
In February 1994, Carnegie Mellon University Libraries embarked on an ambitious project to convert the bulk of the congressional papers of U.S. Senator John Heinz (R-PA) into digital format. Innovative image-management and text-retrieval software created at the university provides the ability to search and retrieve nearly 800,000 images from the collection. The project is supported by the Teresa and H. John Heinz III Foundation, Heinz Company Foundation, and the Howard and Vira I. Heinz Endowments, with additional funding from Carnegie Mellon and CLARITECH Corporation.
The Heinz collection is a rich source of information about the senator's contributions to the U.S. Congress and the social and political concerns of the nation during his tenure. The digital version of this collection promises to dramatically transform the way in which an archives can serve its users.
The HELIOS project responds to a number of critical problems:
- Preservation. HELIOS pioneers a digital archival method of high-resolution scanning and image storage.
- Access. Since HELIOS is available on the Internet, the collection is accessible to many people simultaneously from locations across the world at all hours. In this way the archive can better serve its primary audience (university students and faculty members, scholars, and public-policy professionals) and extend the reach of primary source materials to new users such as legislative assistants, campaign workers, interest group members, and high school students.
- Scholarship. Researchers can explore the digital archive in new ways using powerful text-searching and retrieval tools, including automatic, on-the-fly discovery of related terminology, concept-based matching, and relevance ranking. Moreover, HELIOS provides rich contextual information regarding the original source of the digital images.
Building the digital archive involves several processes:
- Preparing two exhaustive finding aids using traditional archival methods to process the source documents.
- Scanning complete folder contents of series and subseries to create digitized images.
- Converting the digitized images to searchable text (ASCII) using optical character recognition (OCR).
- Verifying the quality of the images and corresponding text, transcribing selected portions of handwritten notes, editing page or document attributes, adding comments at the folder and/or document level, and rearranging portions of the digital collection, if necessary.
- Indexing the documents with CLARIT, a full-text and image-management system distributed by Pittsburgh-based CLARITECH Corporation.
Researchers can then access HELIOS through the Internet. HELIOS offers unprecedented facilities to search, browse, and analyze the archives' holdings.
Researchers are able to:
- Submit plain-English queries that search the collection for relevant documents and present them in order of relevance.
- View documents in both high-quality image format and ASCII text.
- Browse the archives, moving forward and backward through any level of the collection.
- Edit queries and weight search terms.
- Enhance queries with additional terms suggested by CLARIT and taken from relevant documents.
- Work with documents in the context of their original organization (subgroup-, series-, subseries-, and folder-level).
- View notes prepared by the Heinz Archives' staff.
- Print copies of selected images.
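As a rough illustration of what weighting search terms and presenting documents in order of relevance could look like, the sketch below ranks a few invented documents by a length-normalized, weighted term count. It is not CLARIT's method, which this page does not describe; the function names, document identifiers, sample texts, and weights are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank(documents, weighted_terms):
    """Return (doc_id, score) pairs sorted from most to least relevant.

    documents: dict mapping a document id to its (OCR) text.
    weighted_terms: dict mapping a query term to a user-chosen weight.
    Score = weighted count of query terms / number of tokens in the document.
    Documents that match no query term are left out.
    """
    results = []
    for doc_id, text in documents.items():
        tokens = tokenize(text)
        if not tokens:
            continue  # skip empty documents to avoid dividing by zero
        counts = Counter(tokens)
        weighted_hits = sum(
            weight * counts[term.lower()]
            for term, weight in weighted_terms.items()
        )
        score = weighted_hits / len(tokens)
        if score > 0:
            results.append((doc_id, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data, not taken from the Heinz collection.
documents = {
    "folder-1/doc-1": "Medicare reform hearing notes on Medicare funding",
    "folder-1/doc-2": "Letter on steel trade and import policy",
    "folder-2/doc-1": "Memo on trade policy and Medicare",
}

# The user weights "medicare" twice as heavily as "trade".
query = {"medicare": 2.0, "trade": 1.0}

ranked = rank(documents, query)
for doc_id, score in ranked:
    print(f"{doc_id}: {score:.3f}")
```

Raising or lowering a term's weight in `query` changes the ordering, which is the basic idea behind letting a researcher edit a query and re-run it. A production retrieval system would use far richer scoring than this simple count.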
Moreover, HELIOS and its underlying technology will serve as the "seed" and framework for a digital archive of wider scope: a unified public-policy resource for a widespread community of researchers. The project plans to forge links to other resources related to the themes of the Heinz collection. This will allow researchers to explore topics in depth and from multiple perspectives, discover thematic relationships between documents from different sources, and synthesize ideas across conventional disciplinary boundaries.
For more information or to receive a brochure about the HELIOS project, please contact Gabrielle V. Michalek, Head of Archives/Digital Library Initiatives. Other resources for information include:
- Welcome to HELIOS
- H. John Heinz III Archives
- Carnegie Mellon University Libraries
For help or to post your comments, please contact firstname.lastname@example.org.
Updated August 2000 / http://diva.library.cmu.edu/HELIOS/about.html