Since its discovery, researchers have hailed Cas9, a protein “machine” that can be programmed by a strand of RNA to target specific DNA sequences and to precisely cut, paste, and turn genes on or off, as a potential key to unlocking a host of new treatments for genetic conditions. That promise holds, however, only if they fully understand how it works.

That’s where David R. Liu and his students Vikram Pattanayak and John Guilinger come in.

Liu, a Harvard professor of chemistry and chemical biology and an investigator with the Howard Hughes Medical Institute, joined with Professor Jennifer Doudna of the University of California, Berkeley, to lead an effort to develop a detailed “specificity profile” for Cas9 — data to reveal how accurately Cas9 can home in on the DNA sequence it is programmed to target, and how susceptible the protein/RNA complex is to acting on decoy off-target sequences instead. The work was described in a paper published last month in Nature Biotechnology.

“A major issue that will determine the extent to which technologies like Cas9 are useful research tools, especially for human therapeutics, is how specific they are,” Liu said.

“It’s widely understood that the ability to manipulate the structure of our genomes has the potential to have a profound impact on human health,” he added. “But before you give a patient some treatment that will change their genes, you need to be very confident that it’s not going to have unintended effects elsewhere, because the difference between cutting an on-target site and an off-target site could mean treating a disease or triggering the development of cancer.”

The Cas9 research is just the latest effort by Liu and colleagues to characterize such tools. A paper published last year outlined the specificity profile for a similar genetic tool, called a zinc finger nuclease, and the lab has also submitted for publication a third paper — on a genome-engineering technique called TALEs (transcription activator-like effectors).

While it holds promise for manipulating the human genome, the Cas9 system originated as the basis of an immune system in bacteria.

Whereas the human immune system produces antibodies to protect against disease, bacteria incorporate a small part of a pathogen’s DNA into their own genome. Using that segment of foreign DNA to program the Cas9 machine, bacteria can fight off later infections.

Using Cas9 in the lab begins with researchers identifying a unique site of interest, typically between 20 and 23 base pairs in length, from the billions that make up the genome.

Researchers then design a single strand of “guide” RNA that matches only the target segment of DNA. When the Cas9 machine and the guide RNA encounter the target DNA site, Cas9 performs its function. Cas9 cuts DNA naturally, but researchers have engineered Cas9 variants that instead turn on or turn off the expression of targeted genes.

The problem, Liu explained, is that although the Cas9/RNA complex in theory can bind only to a specific location in the genome, its accuracy had never been fully studied.

“We can design RNA that is a perfect match for a specific DNA locus, but your genome is huge,” he said. “Somewhere else in your genome there will be a sequence that might be very similar. The question is: If there is another site in your genome that differs from the target sequence by just a single base, or two bases, or four bases, will Cas9 still cut these off-target sites?”
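The question Liu poses can be pictured with a toy search. The sketch below is only an illustration, not the method used in the study: it slides a window along a made-up DNA string and reports every site that differs from the target by no more than a chosen number of bases. The function names, the 20-base target, and the "genome" are all invented for this example, and plain mismatch counting is a deliberate simplification that says nothing about whether Cas9 would actually cut a given site.

```python
def mismatches(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def substitute(seq: str, changes: dict) -> str:
    """Return a copy of seq with the bases at the given positions replaced."""
    bases = list(seq)
    for pos, base in changes.items():
        bases[pos] = base
    return "".join(bases)


def find_near_matches(genome: str, target: str, max_mismatches: int):
    """List (position, site, mismatch_count) for every window within the limit."""
    k = len(target)
    hits = []
    for i in range(len(genome) - k + 1):
        site = genome[i:i + k]
        d = mismatches(site, target)
        if d <= max_mismatches:
            hits.append((i, site, d))
    return hits


# Toy data, invented for illustration only (not real genomic sequence).
target = "GACGTTACCGGATCAAGTCT"                               # 20-base target
one_off = substitute(target, {9: "T"})                        # differs at 1 base
three_off = substitute(target, {2: "A", 10: "C", 17: "A"})    # differs at 3 bases
genome = "TTT" + target + "AAAA" + one_off + "CCCC" + three_off + "GG"

# Report every site within two mismatches of the target.
for position, site, count in find_near_matches(genome, target, 2):
    print(position, site, count)
```

Raising `max_mismatches` widens the net, which mirrors the concern in the quote: the more mismatches a cutting tool tolerates, the more look-alike sites in a large genome become candidates.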

The researchers also addressed a number of other questions, including whether multiple mismatches at various locations in the targeted DNA were tolerated, whether mismatches in the 20-base-pair sequence mattered more or less depending on their position along the guide RNA, and how the architecture of the guide RNA might affect the system’s accuracy.

They found that while the entire RNA sequence is important in delivering Cas9 to the proper location, the tolerance for errors can change depending on which site in the genome is targeted and where the mismatches occur along the guide RNA sequence.

Perhaps most importantly, Liu said, the researchers found that both increasing the concentration of Cas9 and changing the architecture of the guide RNA to increase its activity led to a decrease in accuracy.

“A key message is that there is a tradeoff between activity and specificity,” Liu said. “That’s an important lesson, because when scientists develop these tools into — hopefully — therapeutics, they need to make sure that as they improve the activity they’re not introducing new off-target cleavage effects.”

To avoid those problems going forward, Liu and other researchers are working to engineer versions of Cas9 — as well as zinc fingers and TALEs — that are even more accurate than the system evolved by nature.

If Cas9 becomes a valuable tool for research and future therapies, Liu said, a detailed understanding of the principles that determine its DNA targeting specificity will play a key role.

“For obvious reasons, scientists and doctors are very wary of manipulating our genome in ways that are unknown to us,” Liu said. “This data highlights the need to be very careful because some sequences with several mismatches can still be targeted by Cas9.”