Benyuan Liu was in a remote region of Peru in 2013 when he realized just what kind of an impact he could have on people’s health. The computer science professor was laying the groundwork for a digital health tool that aims to accelerate and improve the diagnosis of tuberculosis (TB), a curable bacterial infection that kills 1.6 million people globally each year, according to the World Health Organization.
“It was an eye-opening experience for me to see how technology can be used to improve the quality of health care,” says Liu, whose parents were both medical doctors. “It opened the door for me to develop technical solutions to address important health problems in our society.”
It also helped spawn UML’s Center for Digital Health, which Liu co-directs with Computer Science Assoc. Prof. Yu Cao. Started in 2016 with a $125,000 grant from the UMass President’s Science & Technology Initiatives Fund, the center brings together computer scientists, biostatisticians, epidemiologists, clinical practitioners, biomedical researchers and information security specialists from UML, UMass Chan Medical School and UMass Boston. Their goal is to use digital technology such as cloud computing, Big Data analytics, sensor monitoring and mobile devices to improve the quality, efficiency and effectiveness of public health care.
Their recently completed TB diagnostic tool, called “eRx,” was developed with input from U.S. and Peruvian physicians, clinicians and other public health professionals. The web-based system enables nurses and health care workers at remote TB clinics to send a patient’s digitized chest X-rays to a cloud-computing server via a smartphone app. A pulmonary specialist can log in to view the images remotely on a computer or tablet and make an immediate diagnosis.
The team also created a database of over 10,000 X-ray images, “one of the largest and best-annotated” of its kind, Cao says, which it used to develop a machine-learning algorithm that can automatically analyze new X-rays and assist physicians in identifying possible signs of TB.
“It can be a game-changer for TB diagnosis,” Cao says of the project, which was funded by a four-year, $1.3 million grant from the National Institutes of Health and the National Science Foundation through the interagency program Smart and Connected Health.
The hope, Liu and Cao say, is that the eRx system can be adapted to better diagnose other infectious diseases such as COVID-19.
“The idea is to develop a social technology platform for ‘citizen scientists’ in the community to monitor the drinking water safety,” explains Liu, who is working on the project with Cao and Computer Science Asst. Prof. Mohammad Arif Ul Alam. “Using a smartphone app, they can upload the results to the cloud. We will develop a machine-learning algorithm to help identify where the source of contaminants is coming from in the water pipe systems.”
For Cao, projects such as these are an exciting opportunity to apply his theoretical research.
“When you’re contributing to lifesaving tools and infrastructures for health care, you feel you’re making a real impact,” he says. “It’s not just your paper contribution, but also really impacting human life.”
Following his controversial $44 billion acquisition of Twitter, Elon Musk tweeted that he bought the social media platform to “help humanity.”
Researchers from the Kennedy College are a step ahead of him. Second-year computer science Ph.D. student Vijeta Deshpande has been working with Prof. Hong Yu on a project that uses natural language processing and machine learning to analyze Twitter data and create an algorithm that can predict adverse health outcomes at the community level. Their tool could eventually be used by public health officials to direct resources and plan interventions in advance of a crisis.
“I love this project,” says Yu, who hopes that it receives NIH funding to match the support it has already garnered from Chancellor Julie Chen and U.S. Rep. Lori Trahan.
The work stems from a project that Yu and her students took on during the early days of the COVID-19 pandemic, when they noticed more people experiencing mental health issues and food insecurity.
“We wanted to use AI to help,” says Yu, whose team began by mapping the geolocations of the nearly 40,000 food pantries across the U.S. Using data from the 2010 U.S. Census, they then analyzed the socioeconomic status of more than 200,000 “block groups” across the country, a subdivision of census data that zooms into population clusters as small as 600 people and provides a more “homogenous” view than city or county data provides.
“In New York City, for instance, two different blocks can have a huge income disparity,” Yu says.
Their soon-to-be-published findings show that in some rural and urban communities, “it’s the rich neighborhoods that have access to food pantries, while the poor neighborhoods have much less access,” Yu says. “There is a great deal of disparity.”
Knowing that census figures can be quickly outdated, however, Yu’s team then turned to Twitter (and its millions of active U.S. users) for better real-time data on mental health and food insecurity.
“There is a lot of diversity in the population that uses Twitter, so we have a good representation. And the data accessibility is great,” says Deshpande, who notes that Twitter allows academic bodies to download 10 million tweets per month for research purposes.
So far, Deshpande has analyzed 30 million tweets from 1,000 block groups across the country. He began by targeting tweets with the keywords “mental health” and “food insecurity,” then showed a positive correlation to the existing survey data on those topics in the corresponding block groups. He took the model a step further by augmenting it for tweets that mention social determinant factors of health, such as housing insecurity, which only improved the results.
“We were quite excited to see that we are able to get better results with the social determinant data,” says Deshpande, who went on to examine the full text content of the tweets using neural networks and natural language processing, an approach that Yu says reproduces survey data at nearly 80% accuracy.
“If we can train the neural network for future years, we can detach ourselves from conducting surveys and make quick predictions about health outcomes,” Deshpande says.
As co-director of CHORDS, Yu has worked on AI projects that help detect physician errors and identify people at risk of suicide. While she could never be a medical doctor, she is proud of the impact she is making on people’s lives.
“This field is full of golden opportunities,” she says. “We can save lots of lives because of AI. It is possible we could have a way to cure cancer and other diseases. All these major advances in medicine, to a large extent, are because of AI.”
Harnessing Big Data To Fight Cancer
In her Computational Cancer Biology Lab, Asst. Prof. of Biology Rachel Melamed connects health data to molecular data to try to understand what causes not only cancer, but also other diseases such as Alzheimer’s and diabetes.
“All we do is look at data,” says Melamed, who mines datasets from health records, biobanks, cancer genomics projects and experimental drug studies to investigate how a disease’s development might be influenced by other health conditions and drug combinations.
“Once you have all that information on people, you can try to figure out all these complex causes of disease, like how genetics might interact with something that happens during your life to impact your risk of having cancer,” she says. “Now, we don’t have to look at one factor at a time; we can try to look at the combinations of factors.”
In a recent study of drug combinations, for instance, Melamed and her team found that a combination of fish oil and fenofibrate, a medication used to treat abnormal blood lipid levels, could affect a person’s odds of getting cancer.
“They’re not the most common drugs, and they would have never been tested together before,” she says. “So, the only way to discover something like that is by looking at huge datasets and seeing the association between people who take these drugs and whether or not they get cancer down the line.”
Melamed’s path to becoming a computational biologist — someone who uses data analysis, mathematical modeling and computational simulations to understand biological systems and relationships — started with a bachelor’s degree in computer science from Brown University. A brief stint as a software engineer proved unfulfilling, however, so she joined an immunology lab as a data analyst.
“A lot of people who get a computer science degree go on to do software engineering and work at companies like Facebook or Google, and I was just never really interested in that,” says Melamed, who wanted to make a more “positive societal impact with my work” through public health.
She began to notice “more and more biology data being generated” and realized there was a demand for people who had the computational background to work with that data. So, she got a Ph.D. in biomedical informatics from Columbia University, where she worked on data from the Cancer Genome Atlas, a landmark program that analyzed thousands of samples from 33 types of cancer.
“It used to be that one person, maybe an M.D. at a big medical center, would gather data from their patients, and only they could use it. But now there’s a big effort to make as much data as possible and let everyone use it, which creates so much more possibility,” she says.
After a postdoc in biomedical data science at the University of Chicago, Melamed joined the Kennedy College in 2020. She teaches courses in cancer genomics and data science.
“When a lot of people think of biology, they’re like, ‘OK, what can I do with that? I could be a biology teacher or a doctor.’ And they don’t know about all this other stuff,” she says. “But there are so many companies in the Boston area doing this work and looking for computational biologists. It’s a good field for an undergraduate to pursue.