Reconstructing Strings from Random Traces

Loading...
Thumbnail Image

Related Collections

Degree type

Discipline

Subject

Funder

Grant number

License

Copyright date

Distributor

Related resources

Author

Batu, Tugkan
McGregor, Andrew

Contributor

Abstract

We are given a collection of m random subsequences (traces) of a string t of length n where each trace is obtained by deleting each bit in the string with probability q. Our goal is to exactly reconstruct the string t from these observed traces. We initiate here a study of deletion rates for which we can successfully reconstruct the original string using a small number of samples. We investigate a simple reconstruction algorithm called Bitwise Majority Alignment that uses majority voting (with suitable shifts) to determine each bit of the original string. We show that for random strings t, we can reconstruct the original string (w.h.p.) for q = O(1/ log n) using only O(log n) samples. For arbitrary strings t, we show that a simple modification of Bitwise Majority Alignment reconstructs a string that has identical structure to the original string (w.h.p.) for q = O(1/n1/2+ε) using O(1) samples. In this case, using O(n log n) samples, we can reconstruct the original string exactly. Our setting can be viewed as the study of an idealized biological evolutionary process where the only possible mutations are random deletions. Our goal is to understand at what mutation rates, a small number of observed samples can be correctly aligned to reconstruct the parent string. In the process of establishing these results, we show that Bitwise Majority Alignment has an interesting selfcorrecting property whereby local distortions in the traces do not generate errors in the reconstruction and eventually get corrected.

Advisor

Date of presentation

2004-01-11

Conference name

Departmental Papers (CIS)

Conference dates

2023-05-16T22:32:05.000

Conference location

Date Range for Data Collection (Start Date)

Date Range for Data Collection (End Date)

Digital Object Identifier

Series name and number

Volume number

Issue number

Publisher

Publisher DOI

relationships.isJournalIssueOf

Comments

Postprint version. Copyright ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms 2004 (SODA 2004), pages 910-918. Publisher URL: http://portal.acm.org/dl.cfm


Postprint version. Copyright ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 15th Annual Symposium on Discrete Algorithms 2004 (SODA 04), pages 910-918. Publisher URL: http://www.siam.org/meetings/da04/index.htm

Recommended citation

Collection