How much information can DNA contain?

«DNA is like a computer program, but far, far more advanced than any other program ever created.»

(Bill Gates)

DNA is a long, filamentous molecule, curled up in the nucleus of cells. It contains the information needed to build living beings, and so acts as a kind of manual whose “chapters” are called chromosomes and whose “paragraphs” are called genes. With this extremely stable molecule, far more durable than any optical or magnetic storage medium, nature has found a foolproof way to store information.

Life is based on the use and maintenance of a genetic memory inherited from the previous generation, which allows an organism not only to live and reproduce but also to pass this inheritance on to the next generation.

What if, in the world of big data, DNA molecules became a more sustainable way to store digital data?

A single milligram of DNA would be enough to store the text of all the books in the world’s largest library, with room to spare. By comparison, one gram of DNA can hold up to 700 TB of data, equivalent to 14,000 50 GB Blu-ray discs or more than 200 3 TB hard drives.
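The comparison above is simple arithmetic and can be checked with a short sketch. It assumes decimal units (1 TB = 1,000 GB); the function name `equivalent_units` is only illustrative.

```python
# Check the per-gram comparison, assuming decimal units (1 TB = 1,000 GB).
GB_PER_TB = 1000
DNA_CAPACITY_TB_PER_GRAM = 700  # figure quoted in the text


def equivalent_units(total_tb: float, unit_capacity_gb: float) -> float:
    """Return how many storage units of the given size hold total_tb terabytes."""
    return total_tb * GB_PER_TB / unit_capacity_gb


blu_ray_discs = equivalent_units(DNA_CAPACITY_TB_PER_GRAM, 50)    # 50 GB discs
hard_drives = equivalent_units(DNA_CAPACITY_TB_PER_GRAM, 3000)    # 3 TB drives

print(f"Blu-ray discs: {blu_ray_discs:,.0f}")
print(f"3 TB hard drives: {hard_drives:,.1f}")
```

Under these assumptions one gram corresponds to 14,000 Blu-ray discs and a little over 233 hard drives, consistent with the figures quoted.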

Researchers at the University of Washington and Microsoft Research have developed a fully automated system for writing, storing and reading data encoded in DNA. Several companies, including Microsoft and Twist Bioscience, are working to improve DNA storage technology. As early as 2017, Church’s group at Harvard used CRISPR DNA-editing technology to record images of a human hand in the E. coli genome, which were then read back with more than 90% accuracy. Researchers at the Technion-Israel Institute of Technology in Haifa and the Interdisciplinary Center (IDC) in Herzliya, for their part, have shown that by significantly improving the writing process they can store information at a density greater than 10 petabytes (one petabyte is one million gigabytes) in a single gram of DNA. To give a sense of scale, this would allow all the information stored on YouTube to fit in a single teaspoon of DNA.

At the end of 2021, the Italian Institute of Technology (IIT) began coordinating a project called DNA-Fairylight, led by Roman Krahne and Denis Garoli of the IIT, which involves an international team of scientists and is funded by the European Union with 3.1 million euros over the next three years. DNA-Fairylight aspires to combine DNA synthesis and sequencing technologies with the optical properties of nanomaterials: in this way, scientists expect to obtain DNA sequences integrated with colored nano-lights, which will allow faster data reading and writing and more efficient encryption systems. The approach is essentially based on synthesizing DNA molecules from the digital data to be archived: the binary code (made up of 0s and 1s) is converted into the code of the four DNA letters (A, C, G and T, for adenine, cytosine, guanine and thymine, the bases of which DNA is composed). To go back to digital data, the molecules are read by sequencing, decoded and transformed back into digital information.
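The binary-to-DNA conversion described above can be illustrated with a minimal sketch. The two-bits-per-base mapping used here (00→A, 01→C, 10→G, 11→T) is an assumption chosen for simplicity and is not the DNA-Fairylight scheme; practical encodings also add error correction and avoid problematic sequences such as long runs of the same base.

```python
# Toy binary <-> DNA codec: each pair of bits becomes one base.
# The mapping below is an illustrative assumption, not a published standard.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert bytes into a string of DNA letters, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Convert a string of DNA letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"DNA"
strand = encode(message)
print(strand)          # CACACATGCAAC
print(decode(strand))  # b'DNA'
```

In this toy scheme every byte becomes four bases, so the strand is four times as long as the input in symbols; the round trip through `decode` recovers the original bytes exactly only because no read or write errors are simulated.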

Over the last ten years, much progress has been made in this field: the scalability and practicality of archiving have improved, as have the algorithms for encoding and data recovery. However, DNA data storage is still expensive, and reading and writing DNA is still too slow to compete with electronic storage: whereas a millisecond is enough to read data from NAND memory, extracting data from DNA takes hours.

This is precisely the limitation DNA-Fairylight aims to overcome: by integrating colored nanoparticles into a DNA sequence, scientists plan to write, read and decode the stored information in a faster, more compact and more efficient way.

Exploiting DNA therefore means being able to count on a high-density data storage medium with an almost unlimited lifetime, which can be used to innovate biosensing and biorecording technology and, of course, to try to retire traditional storage units.


AEBiosystem