BEACON Researchers at Work: What do evolution, cancer, and optical character recognition have in common?

This week’s blog post is by MSU postdoc David Knoester.

Photo of Dave KnoesterIn 2012, cancer accounted for about 1 of every 4 deaths in the United States. That’s 1,500 people each day. By 2020, annual cancer deaths are expected to increase to 40 million people worldwide. (That’s right — 40 Million per year.) The economic ramifications of cancer are similarly staggering: In 2007, the overall cost of cancer in the US is pegged at $227 billion. Odds are good that cancer will, in some way, affect your life. (All statistics: Cancer facts and figures.)

One of the many factors contributing to cancer survival is early detection. Early detection, of course, requires frequent screening. This sounds like a simple thing — But it’s not. Malignant and benign masses tend to look remarkably similar when viewed in X-Ray (see images below), and they can both be difficult to distinguish from normal tissue. For example, even in the case of breast cancer (one of the “easier” cancers), between 10-30% of cancers are not detected (see here, here, and here), and 70% of biopsies following a suspicious mammogram come back negative. More frequent screening also increases physician workload, which in turn raises the cost of health care.

fig2
Example mammograms showing benign mass (left) and malignant tumor (right). Images are quite similar (delineations of masses were added manually). Images from Nandi et al 2006.

I’m researching the use of evolutionary algorithms (EAs) to help detect and diagnose breast cancer. Why do I think EAs could help? Well, for the past few months I’ve been using EAs to detect and identify handwritten characters in raw image data (also known as optical character recognition, or OCR). This is similar to what the post office has to do in order to route mail, but a bit more general. Using a novel EA that discovers Markov networks, we’ve been able to perfectly discriminate among different characters, and correctly classify those images with high accuracy.

However, one of the nice things about using an evolutionary algorithm for a data mining task like this is that we can apply the same technique to different datasets. For example, with few changes to the evolutionary algorithm itself, we’re able to swap out the database of handwritten numerals for Chinese characters and even for human faces.

fig3Example images from the MNIST database of handwritten numerals (A), the CASIA Chinese character database (B), and the ATT Olivetti face database (C). The actual databases contain many more images than those shown here.

In a BEACON project that spans four different departments at MSU, including investigators from Computer Science, Microbiology and Molecular Genetics, Clinical and Translational Sciences, and Radiology, we hope to extend this approach to detect and diagnose breast cancer in digital mammograms. We’ve just started this project, and are excited about the potential benefits to society, both in survival rates and reduced health care cost, not to mention sparing people anxiety, discomfort, and expense.

For more information about Dave’s work, you can contact him at dk at msu dot edu.

This entry was posted in BEACON Researchers at Work and tagged , , , , . Bookmark the permalink.

Comments are closed.