In the last 10 years, it’s become far more common for physicians to keep records electronically. Those records could contain a wealth of medically useful data: hidden correlations between symptoms, treatments and outcomes, for instance, or indications that patients are promising candidates for trials of new drugs. Much of that data, however, is buried in physicians’ freeform notes. One of the difficulties in extracting data from unstructured text is what computer scientists call word-sense disambiguation. In a physician’s notes, the word “discharge,” for instance, could refer to a bodily secretion — but it could also refer to release from a hospital. The ability to infer words’ intended meanings makes it much easier for computers to find useful patterns in mountains of data.
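To make the idea concrete, here is a minimal sketch of context-based word-sense disambiguation for the "discharge" example. It is an illustration only, not the method used in the research described here: the cue-word lists, the `SENSE_CUES` table, and the `disambiguate` function are all hypothetical, and a real system would learn such cues from annotated clinical notes rather than hard-code them.

```python
import re

# Hypothetical cue words for each sense of "discharge" (assumed for this
# sketch; not taken from the research described in the article).
SENSE_CUES = {
    "bodily secretion": {"wound", "purulent", "drainage", "nasal", "fluid"},
    "hospital release": {"home", "released", "summary", "planning", "follow"},
}


def disambiguate(note, target="discharge", window=5):
    """Guess the sense of `target` from the words that surround it.

    Returns the sense whose cue words overlap most with the context,
    or None if the target is absent or no cue word appears nearby.
    """
    tokens = re.findall(r"[a-z]+", note.lower())
    if target not in tokens:
        return None
    i = tokens.index(target)
    # Collect up to `window` words on each side of the target word.
    context = set(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])
    # Score each sense by how many of its cue words occur in the context.
    scores = {sense: len(context & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate("Patient has purulent discharge from the wound."))
print(disambiguate("Plan discharge home tomorrow with follow up instructions."))
```

Counting overlapping cue words is about the simplest approach that works on clear-cut sentences; it is shown here only because it makes the role of surrounding context easy to see.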
Graduate student Rachel Chasin is a co-author of a paper on algorithmically distinguishing between the possible meanings of ambiguous words in medical records. Postdoc Anna Rumshisky led the research with Peter Szolovits, MIT professor of computer science and engineering and of health sciences and technology, and Özlem Uzuner, a research affiliate. Continue reading the article on MIT News.