DNA Data Storage: The Future of Archival Solutions

As digital data storage approaches its physical and economic limits, scientists are exploring innovative solutions to the growing data crisis. One of the most promising technologies under investigation is DNA data storage, which could revolutionise how we archive information by offering long-term stability and immense storage capacity.

DNA data storage is the encoding of digital information into synthetic DNA molecules. The technology leverages a natural property of DNA: it can hold vast amounts of data in a compact form. For instance, the DNA contained in all the cells of a human body could theoretically store all the movies created in the 21st century three billion times over. This makes it a potential solution to the growing data crisis, as traditional magnetic and optical media are reaching their limits in capacity and durability.

The Data Storage Crisis

The global data explosion is staggering. Ninety per cent of the world’s digital data was created in the past two years. With the rise of search engines, social media, intelligent vehicles, and the Internet of Things, data production continues to surge. For example, Google receives around 3.5 billion search requests a day, WhatsApp users exchange up to 65 billion messages a day, and Tesla drivers have accumulated 3 billion miles of data through autopilot systems.

The International Data Corporation projects that by 2025, global data storage demand will reach 175 trillion gigabytes, exceeding the capabilities of current storage technologies such as magnetic tapes, optical discs, and hard drives. These devices are nearing their limits, are prone to damage, and have short lifespans. Consequently, organisations incur high costs in migrating data between outdated and newer systems. Cloud-based storage presents its own challenges, including high operational costs, energy consumption, and ongoing migration expenses.

As of 2020, global data storage capacity was estimated at 64 zettabytes, with projections reaching 175 zettabytes by 2025. Storing this volume requires significant energy; data centres consume about 200 terawatt-hours of electricity annually, roughly equivalent to Argentina’s energy consumption. To accommodate the projected 175 zettabytes, approximately 8.75 billion hard drives would be needed, assuming 20 terabytes per drive, which highlights the immense scale of the data storage crisis.
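
The hard-drive figure follows from simple division. As a minimal sketch, assuming decimal units and a hypothetical 20-terabyte drive (the drive size is not given in the figures above; it is the capacity that reproduces the 8.75 billion estimate):

```python
# Decimal (SI) units, expressed in bytes.
TERABYTE = 10**12
ZETTABYTE = 10**21


def drives_needed(total_bytes: int, drive_capacity_bytes: int) -> int:
    """Return how many drives are needed to hold total_bytes, rounded up."""
    return -(-total_bytes // drive_capacity_bytes)  # ceiling division


projected_data = 175 * ZETTABYTE  # projection quoted in the text
drive_capacity = 20 * TERABYTE    # assumption: one 20 TB drive

print(f"{drives_needed(projected_data, drive_capacity):,}")  # 8,750,000,000
```

Changing `drive_capacity` shows how sensitive the estimate is to the assumed drive size.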

DNA: A Revolutionary Storage Medium

Synthetic DNA presents a compelling alternative to conventional storage methods. DNA is a remarkably dense medium, capable of holding vast amounts of information. For perspective, the DNA in all the cells of a human body could theoretically store all the movies produced in the 21st century more than three billion times.

The concept of using DNA for data storage was first proposed by Richard Feynman in 1959. The idea gained traction in 2012 when researchers at Harvard University, led by George Church, successfully encoded a 52,000-word book into DNA strands. This breakthrough demonstrated that DNA could indeed store digital data.

More recent advancements have further validated this technology. In 2017, scientists Yaniv Erlich and Dina Zielinski developed a new coding system, DNA Fountain, that increased DNA storage capacity by packaging data into small tagged “droplets” before converting them into DNA. This work was refined by Twist Bioscience, which created a platform to print DNA on silicon chips. Today, DNA technology can store an impressive 215 million gigabytes (215 petabytes) of information per gram, vastly outperforming traditional hard drives.
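
To put that density in perspective, the short calculation below estimates how much DNA would be needed to hold the 175 zettabytes projected for 2025. It is a rough sketch only: it assumes decimal units and takes the 215-petabytes-per-gram figure at face value, ignoring the redundancy, indexing and packaging that a real archive would require.

```python
# Decimal (SI) units, expressed in bytes.
PETABYTE = 10**15
ZETTABYTE = 10**21

# Density quoted in the text: 215 million gigabytes (215 PB) per gram.
DENSITY_BYTES_PER_GRAM = 215 * PETABYTE


def dna_mass_kg(total_bytes: float) -> float:
    """Return the mass of DNA, in kilograms, needed at the quoted density."""
    grams = total_bytes / DENSITY_BYTES_PER_GRAM
    return grams / 1000


mass = dna_mass_kg(175 * ZETTABYTE)
print(f"{mass:.0f} kg")  # 814 kg
```

Under these assumptions, the projected global total would fit in well under a tonne of DNA.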

How DNA Storage Works

DNA data storage involves two primary processes: writing and reading the data. Writing begins with translating binary code into DNA sequences, where adenine (A), cytosine (C), guanine (G), and thymine (T) represent binary values. These sequences are synthesised using enzyme catalysts or chemical reactions and then indexed.
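
The writing step can be illustrated with a toy encoder. The sketch below assumes the simplest possible mapping, two bits per base (00 to A, 01 to C, 10 to G, 11 to T), and a hypothetical indexing scheme that prefixes each short strand with its position. The function names and strand lengths are illustrative only; practical schemes add constraints, such as avoiding long runs of the same base, that are omitted here.

```python
BASES = "ACGT"
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}  # assumed mapping


def encode_bytes_to_dna(data: bytes) -> str:
    """Translate bytes into a base sequence, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def index_prefix(n: int, width: int) -> str:
    """Write the strand number n as a fixed-width base-4 address (A=0 ... T=3)."""
    if n >= 4**width:
        raise ValueError("index does not fit in the given width")
    digits = []
    for _ in range(width):
        digits.append(BASES[n % 4])
        n //= 4
    return "".join(reversed(digits))


def split_into_indexed_strands(sequence: str, payload_len: int = 4,
                               index_len: int = 2) -> list[str]:
    """Cut a long sequence into short strands, each prefixed with its index."""
    return [
        index_prefix(n, index_len) + sequence[start:start + payload_len]
        for n, start in enumerate(range(0, len(sequence), payload_len))
    ]


sequence = encode_bytes_to_dna(b"Hi")
print(sequence)                              # CAGACGGC
print(split_into_indexed_strands(sequence))  # ['AACAGA', 'ACCGGC']
```

The index matters because synthesised strands are stored as an unordered pool, so each one has to carry its own position in the original file.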

To read the data, the DNA is sequenced by machines designed for genome sequencing. The encoded data is decoded back into its original digital format using error-correction algorithms to ensure accuracy.
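
The reading step can be sketched in the same spirit. The example below assumes a two-bits-per-base mapping (A = 00, C = 01, G = 10, T = 11) and uses a simple majority vote across several reads of the same strand as a stand-in for the error-correction algorithms mentioned above. It handles only substitution errors in reads of equal length; it is not how production systems work, which typically rely on stronger codes.

```python
from collections import Counter

BITS_FOR_BASE = {"A": "00", "C": "01", "G": "10", "T": "11"}  # assumed mapping


def consensus(reads: list[str]) -> str:
    """Take the most common base at each position across equal-length reads."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


def decode_dna_to_bytes(sequence: str) -> bytes:
    """Translate a base sequence back into bytes, two bits per base."""
    bits = "".join(BITS_FOR_BASE[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Three reads of the same strand; the third contains one substitution error.
reads = ["CAGACGGC", "CAGACGGC", "CATACGGC"]
recovered = decode_dna_to_bytes(consensus(reads))
print(recovered)  # b'Hi'
```

Because sequencing machines read each strand many times, redundancy of this kind comes almost for free; the harder cases are inserted or deleted bases, which this sketch does not attempt to handle.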

DNA’s stability is a significant advantage. Under optimal conditions—such as being stored in cold environments—DNA can remain intact for hundreds of thousands of years. This surpasses the longevity of conventional storage media, making DNA a viable option for long-term archival.

Example of How an Ordinary Person Might Use It

Imagine you have a lifetime of family photos and videos that you want to preserve indefinitely without worrying about hard drive failures or cloud subscriptions. With DNA data storage, you could upload these digital files to a service that converts them into DNA sequences. The service would store this DNA in a secure, stable environment. When you wanted to access your files, the service would sequence the DNA, decode it back into a digital format, and provide the files to you through a secure download link.

Current Challenges and Future Prospects

Despite its potential, DNA data storage faces several challenges. The cost of DNA synthesis and sequencing remains high, and current methods are prone to errors due to their reliance on organic chemistry techniques designed for other purposes. However, ongoing research and technological advancements are expected to address these issues.

By 2024, DNA data storage technology had seen notable breakthroughs. Twist Bioscience has synthesised over one million DNA pieces on a single silicon chip and is developing next-generation chips capable of writing 10 gigabytes of data into DNA per chip. This progress could significantly reduce storage costs. Automated storage and retrieval systems have been refined to minimise errors, while new encoding and decoding algorithms improve data accuracy, making DNA storage increasingly viable for long-term archival use.

Beyond Data Centres: The DNA of Things

Innovative applications of DNA storage are emerging. Researchers Robert Grass of ETH Zurich and Yaniv Erlich, who is based in Israel, are exploring the concept of “DNA of Things” (DoT), in which everyday objects are embedded with DNA that stores information. For example, a 3D-printed plastic bunny was created with its own digital blueprint encoded in DNA embedded in the plastic, and a YouTube video was embedded in a pair of glasses.

These experiments highlight the potential of DNA storage to transform not only data archiving but also how we interact with everyday objects, allowing them to retain and convey information long after traditional storage methods have become obsolete.

Conclusion

DNA data storage presents a promising solution to the growing data crisis due to its high storage density, long-term stability, and low energy consumption. Ongoing research is bringing adoption closer. Companies such as Microsoft are exploring how DNA storage might be integrated into their cloud services. Advances in error-correction codes and enzymatic synthesis are expected to improve efficiency and reduce costs. The technology’s capability to store 74 million million bytes (74 terabytes) in a DNA archive the size of a poppy seed underscores its potential. With continued investment and research, DNA storage could become a mainstream alternative to traditional data centres for long-term archival.

Aric Jabari is a Fellow at the Sixteenth Council