This blog post is by University of Idaho graduate student Clinton Elg.
Evolution of a Deadly Bacterium
Vibrio cholerae is a bacterium that lives in water and causes the deadly disease cholera. While areas of the world with functional sewage systems and potable water are largely unaffected, there is still no definitive cure for the disease. It remains rampant in less developed regions and often strikes as a deadly second act after natural disasters and wars destroy infrastructure.
Bacteria are constantly evolving, and this includes illness-causing bacteria like V. cholerae. Evolution leads to new major outbreaks called pandemics, with each new pandemic strain of bacteria acting like an updated software version that outcompetes and outperforms older versions. V. cholerae is now in its seventh pandemic, and the latest strain, named “El Tor V. cholerae”, contains two unique pieces of DNA not found in earlier pandemic strains. Scientists have named these unique pieces of DNA Vibrio Seventh Pandemic Islands (VSP) I & II. Together the VSPs contain around 33 genes, with each gene a DNA “blueprint” that the bacteria will convert into a protein “machine”. What kind of machinery do these VSP genes encode? How does this new cellular machinery help El Tor V. cholerae outcompete the strains of older pandemics?
Breakthrough at Michigan State University and Tufts University
In 2018, a remarkable discovery was made by PhD Candidate Geoffrey Severin and Miriam Ramliden, in the Chris Waters lab at Michigan State University (MSU) and the Wai-Leung Ng lab at Tufts University, respectively. It had been known that a VSP gene named dncV was important for El Tor V. cholerae to cause disease, but the reason remained elusive. The team discovered that increasing the number of “blueprint” copies of dncV within El Tor V. cholerae produced smaller populations of bacteria when grown on solid surfaces. These bacterial populations are called “colonies”, and scientists call colonies that shrink “small colony variants”. They later discovered that dncV encodes a molecular “switch” that activates the cell-shrinking machinery of another VSP gene called capV. In the larger picture, this small colony variation may help explain why El Tor is the leading strain sickening people around the world, and it is an early clue to unraveling the novel functions of the VSPs in El Tor V. cholerae.
Building a Fence (Around Gene Networks)
Despite this scientific home run, much work remains. Imagine a human analogy: the tools required to build a fence include a hammer, nails, wood, a level, chalk string, a shovel, and concrete. From looking at these tools piled on the ground, one might reasonably predict that somebody is planning to build a fence. In the VSPs of El Tor V. cholerae, we have a pile of 33 new and strange tools and we only know what 2 of them do! Now imagine the tools for fence building were jumbled in a pile of other random tools. Without knowing what the tools are, or what tasks they might accomplish, how would you pick out the seven tools and their association with fence building? In a biological sense, we have 31 unknown genes in the VSPs of El Tor V. cholerae that group into an unknown number of “gene networks”. Each gene network is a group of genes that work together to accomplish a specific cellular task.
Geoff and Chris decided that they needed a way to predict the number of gene networks and the genes that constitute each network. They reached out to Dr. Eva Top at the University of Idaho and began a collaboration with her PhD student Clint Elg from the Bioinformatics and Computational Biology (BCB) program.
Using the Math Behind Facebook to Predict Gene Networks
To provide predictions of the gene networks in the VSP islands of El Tor V. cholerae, Clint turned to what may seem an unlikely place: Facebook. Have you ever been on Facebook and seen a new friend recommended to you? Underlying this are complex mathematical models that predict social circles like your family or your co-workers. The predictions are made by seeing how mutually related (or “correlated”) you are to another user profile: the more you post to or respond to another person, the higher your mathematical correlation.
In a similar fashion, we can use the thousands of bacterial genomes available on the internet to see how often certain genes occur together. Instead of predicting social networks, we are predicting gene networks! For example, consider the two genes found by Geoff, Wai-Leung, and Chris that provide small colony morphology, vc0178 and vc0179. These genes do nothing individually, but when found together they allow a change in bacterial size. Since evolution selects for DNA that provides some sort of advantage, we should expect these two genes to co-reside in bacterial genomes at a much higher rate than two randomly chosen genes.
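To make the idea of genes "co-residing" concrete, here is a minimal sketch of one common way to score it: record whether each gene is present (1) or absent (0) in each genome, then compute the phi coefficient, which is the ordinary correlation applied to two yes/no lists. This is only an illustration under stated assumptions. The post does not describe the statistic correlogy actually uses, the function name phi_coefficient and the gene "geneX" are hypothetical, and the presence/absence values below are invented toy data, not real genome data.

```python
from math import sqrt


def phi_coefficient(a, b):
    """Phi correlation between two 0/1 presence vectors (one entry per genome)."""
    if len(a) != len(b):
        raise ValueError("presence vectors must cover the same genomes")
    # Count the four possible presence/absence combinations across genomes.
    n11 = sum(1 for x, y in zip(a, b) if x and y)          # both genes present
    n10 = sum(1 for x, y in zip(a, b) if x and not y)      # only the first gene
    n01 = sum(1 for x, y in zip(a, b) if not x and y)      # only the second gene
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)  # neither gene
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # A gene found in every genome (or in none) carries no signal here.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


# Invented toy data: each position is one genome, 1 = gene present, 0 = absent.
presence = {
    "vc0178": [1, 1, 1, 0, 0, 1, 0, 0],
    "vc0179": [1, 1, 1, 0, 0, 1, 0, 0],  # always travels with vc0178
    "geneX":  [1, 0, 1, 0, 1, 0, 1, 0],  # hypothetical unrelated gene
}

partner_score = phi_coefficient(presence["vc0178"], presence["vc0179"])
random_score = phi_coefficient(presence["vc0178"], presence["geneX"])
```

In this toy example the two genes that always appear together score 1.0, while the unrelated gene scores 0.0. Real genome collections are far noisier, so scores would fall somewhere in between.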
The result is an alpha version of a software tool, correlogy, built with the help of mathematician Ben Riddenhour from the Institute for Modeling Collaboration and Innovation (IMCI). Using data from thousands of bacterial genomes, correlogy predicted vc0178 and vc0179 to be highly correlated, matching what Geoff and Chris had demonstrated biologically in the lab! More importantly, the software has predicted gene networks for the remaining 31 VSP genes of unknown function and interactions. These predictions give protein specialists like Geoff and Chris a place to start investigating the VSP genes that fuel the modern Seventh Cholera Pandemic.
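One simple way to turn pairwise correlations into candidate gene networks is to link every pair of genes whose score clears a cutoff and then collect the groups of genes that end up connected. The sketch below illustrates that idea only. It is not a description of how correlogy works, and the function name group_into_networks, the 0.8 cutoff, the genes "geneA" through "geneC", and every score are made up for the example.

```python
def group_into_networks(genes, pair_scores, threshold=0.8):
    """Group genes into networks: connected components of the graph that
    links two genes whenever their correlation score is >= threshold."""
    neighbors = {gene: set() for gene in genes}
    for (a, b), score in pair_scores.items():
        if score >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    networks, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        # Depth-first walk collects every gene reachable from this one.
        stack, component = [gene], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        networks.append(sorted(component))
    return networks


# Invented scores for illustration; only vc0178 and vc0179 are names from the post.
genes = ["vc0178", "vc0179", "geneA", "geneB", "geneC"]
pair_scores = {
    ("vc0178", "vc0179"): 0.95,
    ("geneA", "geneB"): 0.90,
    ("vc0178", "geneA"): 0.10,
    ("geneB", "geneC"): 0.30,
}

networks = group_into_networks(genes, pair_scores)
```

With these made-up numbers the genes fall into three groups: vc0178 with vc0179, geneA with geneB, and geneC on its own. The cutoff is a judgment call, since a lower value merges groups and a higher one splits them apart.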
BEACON and Collaborative Science
Our research into the deadly disease of cholera is making important discoveries. We would like to express gratitude to the taxpayer-funded National Science Foundation (NSF) and particularly the BEACON program. The NSF BEACON program enabled important insight into the evolution of a lethal bacterium by encouraging and funding a meaningful collaboration among biologists, protein specialists, computer scientists, and mathematicians.