To be clear, we’re not talking about storing DNA itself. DNA is actually
surprisingly stable and, under the right conditions, can last thousands of
years without degrading. No, the idea is to use DNA as a medium for storing
other kinds of information. For example, Nick Goldman and his colleagues from
the European Bioinformatics Institute have used DNA to store all of
Shakespeare’s sonnets, a color photograph and a sound recording of Martin
Luther King Junior. The authors believe that their new technique could one day
solve all our data archiving needs in perpetuity.
The idea of using DNA to store information is not new. DNA
has long been thought an attractive data depository because all it requires for
long-term maintenance is a cool, dark environment. You can also fit an amazing
amount of data in a small space. The authors estimate that all the data that’s
ever been created could fit in the back of one pick-up truck. And best
of all, because the four-letter nucleotide alphabet doesn’t change, whereas formats like cassette tapes and DVDs become obsolete, the same decoding technology
should work a thousand years from now.
To use DNA for data storage, you simply manufacture DNA
using the sequence of As, Ts, Gs and Cs as a code to spell out whatever you
wish. To be clear, these synthetic strands of DNA will not encode any genes.
That is, like magnetic tape or ink, they will not have any function other than
to store or retrieve information. Unfortunately, at this time, it’s exceedingly difficult to
synthesize DNA that’s much longer than a hundred bases, barely enough for
a sentence. Almost any data file would have to be broken into a huge number of
pieces that would then have to be faithfully joined together. Goldman and his
colleagues improved upon this both by creating a novel code and by using
four-fold redundancy.
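The basic "spell it out in As, Ts, Gs and Cs" idea, and the reason a hundred bases is barely a sentence, can be sketched in a few lines of Python. This is a toy illustration only, not the code Goldman and colleagues used: the two-bits-per-base mapping, the 100-base limit as a hard cutoff and all function names are assumptions made for the example.

```python
# Toy sketch: store bytes as DNA letters, two bits per base.
# Assumed mapping (not the authors' code): 00->A, 01->C, 10->G, 11->T.
BASES = "ACGT"

def bytes_to_dna(data):
    """Turn each byte into four bases, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def dna_to_bytes(dna):
    """Inverse of bytes_to_dna: read four bases back into one byte."""
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

def split_into_pieces(dna, max_len=100):
    """Cut a long strand into pieces no longer than max_len bases."""
    return [dna[i:i + max_len] for i in range(0, len(dna), max_len)]

line = b"Shall I compare thee to a summer's day?"
strand = bytes_to_dna(line)
pieces = split_into_pieces(strand)
print(len(line), "characters ->", len(strand), "bases ->", len(pieces), "pieces")
```

Under this assumed mapping every character costs four bases, so a single 100-base piece holds only 25 characters, and even one line of a sonnet has to be split and later rejoined.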
Briefly, the researchers took the information to be
DNA-itized (a sonnet, in their illustration) and converted it first into binary code and then
into a novel ternary code (0s, 1s and 2s), where each digit is represented by a pair of neighboring
nucleotides: the digit determines which base follows the previous one, so the same base never appears twice in a row. The resulting sequence of DNA was synthesized in short overlapping
fragments, so that each data point was found in four distinct pieces. Each
fragment of DNA contained tags indicating where to fit it in order to
regenerate the original sequence. The high degree of redundancy ensured accurate
retrieval.
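For the curious, here is a rough Python sketch of that pipeline: data to base-3 digits, digits to bases, then overlapping tagged fragments. It is a simplified stand-in under stated assumptions, not the authors' implementation. The fixed six-digits-per-byte conversion, the rule that a digit picks one of the three bases differing from the previous base, the 20-base fragment length and the use of a plain integer as the position tag are all choices made for this example.

```python
from collections import Counter

BASES = "ACGT"

def bytes_to_trits(data):
    """Assumed stand-in code: each byte becomes 6 base-3 digits (3**6 = 729 >= 256)."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            byte, remainder = divmod(byte, 3)
            digits.append(remainder)
        trits.extend(reversed(digits))  # most significant digit first
    return trits

def trits_to_bytes(trits):
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)

def trits_to_dna(trits, start="A"):
    """Each digit selects one of the three bases that differ from the previous base."""
    prev, out = start, []  # start is a virtual previous base and is not written out
    for t in trits:
        choices = [b for b in BASES if b != prev]
        prev = choices[t]
        out.append(prev)
    return "".join(out)

def dna_to_trits(dna, start="A"):
    prev, trits = start, []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits

def fragment(dna, frag_len=20, copies=4):
    """Overlapping fragments, each tagged with its start position (hypothetical tag format)."""
    if len(dna) <= frag_len:
        return [(0, dna)]
    step = frag_len // copies
    starts = list(range(0, len(dna) - frag_len + 1, step))
    if starts[-1] + frag_len < len(dna):  # make sure the tail is covered
        starts.append(len(dna) - frag_len)
    return [(s, dna[s:s + frag_len]) for s in starts]

def reassemble(fragments):
    """Place fragments by tag and take a majority vote at every position."""
    length = max(s + len(seq) for s, seq in fragments)
    votes = [Counter() for _ in range(length)]
    for s, seq in fragments:
        for i, base in enumerate(seq):
            votes[s + i][base] += 1
    return "".join(v.most_common(1)[0][0] for v in votes)

message = b"Shall I compare thee to a summer's day?"
strand = trits_to_dna(bytes_to_trits(message))
pieces = fragment(strand)

# Simulate a single-base error in one fragment from the middle of the strand.
s, seq = pieces[5]
wrong = "A" if seq[3] != "A" else "C"
pieces[5] = (s, seq[:3] + wrong + seq[4:])

recovered = trits_to_bytes(dna_to_trits(reassemble(pieces)))
print(recovered == message)
```

Because the damaged position is also present in three undamaged fragments, the majority vote outvotes the error. In this sketch the very ends of the strand are covered by fewer than four fragments, so they are less well protected than the middle.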
Nature PMID: 23354052.
The scientists were able to send their DNA from the U.S. to Germany, where it was correctly reconstructed and decoded.
As of now, even this new method of DNA storage is far too
expensive and has too slow a retrieval rate to be of any practical use. The
authors have every expectation that this will change. Perhaps in as little as
ten years, DNA will be the medium of choice for our data storage needs.
Goldman N, Bertone P, Chen S, Dessimoz C, Leproust EM, Sipos B, & Birney E (2013). Towards practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature PMID: 23354052.