Biohackers encoded malware in a strand of DNA

When biologists synthesize DNA, it is assumed that they take care not to create or spread a dangerous piece of genetic code that could be used to create a toxin or, even worse, an infectious disease. However, a group of biohackers showed how DNA can carry a less expected threat – a threat designed to infect not humans or animals but computers.

In a new study that they intend to present at the USENIX Security conference, a group of researchers from the University of Washington has demonstrated for the first time that it is possible to encode malicious software into physical DNA strands, so that when a gene sequencing device analyzes it, the resulting data can be converted into a program that corrupts the gene sequencing software and takes control of the underlying computer. Although this attack is far from practical application for any real spy or criminal, the researchers argue that it could become more likely over time, as DNA sequencing becomes more common, more powerful, and is carried out by third-party services on sensitive computer systems. And, perhaps more importantly for cybersecurity, it also represents an impressive, science fiction-like feat of pure hacker ingenuity.

“We know that if an adversary has control over the data that a computer processes, they can potentially compromise the computer,” says Tadayoshi Kohno, a computer science professor at the University of Washington who led the project, comparing the technique to traditional hacker attacks that package malicious code on websites or in email attachments. “This means that when you’re examining the security of computational biology systems, you don’t just think about network connectivity and the USB drive and the user at the keyboard, but also the information stored in the DNA to be sequenced. This is about examining a different category of threats.”

Sci-Fi Hack

For now, this threat remains more of a plot point in a novel than something that should concern biologists. However, as genetic sequencing is increasingly handled by centralized services, often run by university laboratories that own the expensive gene-sequencing equipment, this trick of delivering malware through DNA is becoming more realistic, particularly given that DNA samples come from external sources that may be difficult to vet properly.

If hackers did manage to pull off the trick, the researchers say, the attackers could potentially gain access to valuable intellectual property or possibly alter the genetic analysis. Companies could even potentially embed malicious code into the DNA of genetically modified products as a way to protect trade secrets, the researchers suggest. “There are many interesting – or threatening might be a better word – applications of this coming in the future,” says Peter Ney, a researcher on the project.

Regardless of any practical reason for the research, however, the idea of constructing a computer attack, known as an “exploit,” using only the information stored in a DNA strand posed an epic challenge for the University of Washington team. The researchers began by writing a well-known type of exploit called a “buffer overflow,” which is designed to fill the space in a computer’s memory allocated for a specific piece of data and then spill over into another part of memory to plant its own malicious instructions.
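To make the idea of a buffer overflow concrete, here is a minimal toy model in Python. It is only a sketch: Python itself is memory-safe, so the “memory” below is a simulated byte array, and the names `make_memory`, `unsafe_copy`, `safe_copy` and `control_slot` are hypothetical and are not taken from the researchers’ work.

```python
# Toy model of a buffer overflow. "Memory" is one flat bytearray in which an
# 8-byte data buffer sits directly in front of a 4-byte control slot.
# All names and sizes here are illustrative assumptions.
BUFFER_SIZE = 8
CONTROL_SIZE = 4


def make_memory() -> bytearray:
    """Return fresh simulated memory with a known value in the control slot."""
    memory = bytearray(BUFFER_SIZE + CONTROL_SIZE)
    memory[BUFFER_SIZE:] = b"SAFE"  # stands in for something like a return address
    return memory


def unsafe_copy(memory: bytearray, data: bytes) -> None:
    """Copy data into the buffer without checking its length (the bug)."""
    for i, byte in enumerate(data):
        memory[i] = byte  # writes past BUFFER_SIZE when data is too long


def safe_copy(memory: bytearray, data: bytes) -> None:
    """Same copy, but refuse input that does not fit in the buffer."""
    if len(data) > BUFFER_SIZE:
        raise ValueError("input longer than buffer")
    memory[: len(data)] = data


def control_slot(memory: bytearray) -> bytes:
    """Return the bytes that sit just after the data buffer."""
    return bytes(memory[BUFFER_SIZE:])
```

Feeding `unsafe_copy` twelve bytes, such as `b"A" * 8 + b"EVIL"`, leaves `EVIL` in the control slot, which is the spill-over the paragraph above describes; `safe_copy` rejects the same input with a `ValueError`.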

But encoding this attack into actual DNA proved more difficult than initially imagined. DNA sequencing machines work by mixing DNA with chemicals that bind differently to the basic units of DNA code—the chemical bases A, T, G, and C—and emit different colors of light, which are captured in a photograph of the DNA molecules. To speed up processing, images of millions of bases are divided into thousands of pieces and analyzed in parallel. Thus, all the data that comprised their attack had to fit into a few hundred of these bases, to increase the likelihood that they would remain intact throughout the sequencer’s parallel processing.

When the researchers sent their carefully designed attack to the Integrated DNA Technologies service in the form of As, Ts, Gs and Cs, they discovered that DNA also has other physical constraints. To keep their DNA sample stable, they had to maintain a specific ratio of Gs and Cs to As and Ts, because the physical stability of DNA depends on a regular ratio of A-T and G-C pairs. And while a buffer overflow often involves using the same data sequences repeatedly, doing so in this case caused the DNA strand to fold back on itself. All of this meant that the team had to repeatedly rewrite the exploit code to find a form that could survive as actual DNA, which Integrated DNA Technologies would eventually send them in a plastic vial through the mail.
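The constraints described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers’ tooling: the two-bits-per-base mapping (A=00, C=01, G=10, T=11), the GC bounds, the repeat length `k` and all function names are placeholders chosen for the example.

```python
# Sketch: turn bytes into bases, then check two of the physical constraints
# mentioned in the article. Mapping and thresholds are assumptions.
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def has_repeated_kmer(seq: str, k: int = 8) -> bool:
    """True if any stretch of k bases occurs more than once in the sequence."""
    seen = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


def looks_synthesizable(seq: str, gc_min: float = 0.4,
                        gc_max: float = 0.6, k: int = 8) -> bool:
    """Rough screen: balanced GC content and no repeated k-base stretch."""
    return gc_min <= gc_fraction(seq) <= gc_max and not has_repeated_kmer(seq, k)
```

Under this assumed mapping, the byte `0x1b` encodes as `ACGT`. A payload that repeats one byte, such as `b"\x90" * 4`, encodes as a repeating `GCAA` pattern and fails the repeat check even though its GC content is balanced, which hints at why the exploit had to be rewritten several times.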

The result, ultimately, was attack software that could survive translation from physical DNA into the digital format, known as FASTQ, that is used to store DNA sequences. When that FASTQ file is compressed with a common compression program known as fqzcomp, the malicious code triggers a buffer overflow in the compression software, escapes from the program, and reaches into the memory of the computer running it to execute its own arbitrary commands.
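For readers unfamiliar with FASTQ: each record is four lines of text, namely a header starting with `@`, the base sequence, a separator line starting with `+`, and one quality character per base. The sketch below reads such records and, unlike the vulnerable code path described above, checks lengths before accepting them. The `read_fastq` function, the `FastqRecord` type and the `max_len` limit are hypothetical names for this example and have nothing to do with fqzcomp’s internals.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO, max_len: int = 1000) -> Iterator[FastqRecord]:
    """Yield four-line FASTQ records, rejecting malformed or oversized ones."""
    while True:
        header = handle.readline()
        if not header:           # end of input
            return
        if not header.strip():   # tolerate blank lines between records
            continue
        sequence = handle.readline().rstrip("\n")
        plus = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        if len(sequence) > max_len:  # max_len is an assumed limit
            raise ValueError("read longer than allowed maximum")
        yield FastqRecord(header[1:].strip(), sequence, quality)


# Usage: parse one record from an in-memory file.
example = io.StringIO("@read1\nACGT\n+\nIIII\n")
records = list(read_fastq(example))
```

The point of the explicit length checks is that the sequence line is attacker-controlled data, exactly as Kohno notes earlier in the article; software that assumes a fixed maximum read length without enforcing it is the kind of target this attack needs.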

A Distant Threat

But even then, the attack was fully translated only 37% of the time, as the sequencer’s parallel processing often cut it short or the program decoded it backwards. (A DNA strand can be sequenced in either direction, but the code is meant to be read in only one. The researchers suggest in their paper that future, improved versions of the attack could be crafted as a palindrome.)
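The direction problem can be illustrated with a short Python sketch. A strand read from the opposite direction yields the reverse complement of the intended sequence, so a payload survives both orientations only if it equals its own reverse complement. Reading “palindrome” in that molecular-biology sense is an assumption made for this example, and the function names are hypothetical.

```python
# Translation table for base pairing: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence obtained when the opposite strand is read in its own direction."""
    return seq.translate(COMPLEMENT)[::-1]


def is_reverse_palindrome(seq: str) -> bool:
    """True if the sequence reads the same from either strand."""
    return seq == reverse_complement(seq)
```

For instance, `GAATTC` is its own reverse complement, whereas `AACG` comes back as `CGTT` when read from the other strand, which would scramble any bytes encoded in it.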

[…]

Beyond hacking, though, the use of DNA for handling computer information is gradually becoming a reality, says Seth Shipman, a member of a Harvard team that recently encoded a video into a DNA sample. This storage method, although mostly theoretical for now, could someday allow data to be preserved for hundreds of years, thanks to DNA’s ability to maintain its structure far longer than the encoding in flash memory or on a magnetic hard drive. And if DNA-based computer storage arrives, DNA-based computer attacks might not be so far off, he says.

“I read about this research and I think it’s clever,” Shipman says. “Is it something we should start checking for now? I doubt it.” But he adds that, with an era of DNA-based data possibly on the horizon, the ability to place malicious code in DNA is more than a hacker’s trick. “At some point, when more information is stored in DNA and is constantly being imported and sequenced,” Shipman says, “we’ll be happy we started thinking about these things.”

Source: Wired, August 2017
Original: https://www.wired.com/story/malware-dna-hack
Translation: Harry Tuttle