K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources


Author

Crabtree, Jonathan
Schug, Jonathan
Overton, Chris
Stoeckert, Christian J.

Abstract

The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, "on-the-fly" integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear "winner". Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application.

Publication date

2001-03-01

Comments

Postprint version. Published in IBM Systems Journal, Volume 40, Issue 2, March 2001, pages 512-531. Publisher URL: http://search.ebscohost.com/login.aspx?direct=true&db=aph&AN=4628447&site=ehost-live
