Web-based analytics have demonstrated their value in predicting the spread of infectious disease, and a new study from Mayo Clinic indicates the value of analyzing Google web searches for keywords related to COVID-19.
Strong correlations were found between Google keyword searches, as measured by Google Trends, and COVID-19 outbreaks in parts of the U.S., according to a study published in Mayo Clinic Proceedings. These correlations were observed up to 16 days prior to the first reported cases in some states.
“Our study demonstrates that there is information present in Google Trends that precedes outbreaks, and with predictive analysis, this data can be used for better allocating resources with regards to testing, personal protective equipment, medications and more,” says Mohamad Bydon, M.D., a Mayo Clinic neurosurgeon and principal investigator at Mayo’s Neuro-Informatics Laboratory.
“The Neuro-Informatics team is focused on analytics for neural diseases and neuroscience. However, when the novel coronavirus emerged, my team and I directed resources toward better understanding and tracking the spread of the pandemic,” says Dr. Bydon, the study’s senior author. “Looking at Google Trends data, we found that we were able to identify predictors of hot spots, using keywords, that would emerge over a six-week timeline.”
Several studies have noted the role of internet surveillance in the early prediction of previous outbreaks, such as H1N1 and Middle East respiratory syndrome. Internet surveillance offers several benefits over traditional methods, and the study suggests that a combination of the two is likely the key to effective surveillance.
The study examined 10 keywords, chosen based on how commonly they were used and on patterns emerging on the internet and in Google News at the time.
The keywords included:
Sore throat+shortness of breath+fatigue+cough
Coronavirus testing center
Loss of smell
COVID stimulus check
Most of the keywords had moderate to strong correlations days before the first COVID-19 cases were reported in specific areas, with diminishing correlations following the first case.
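The idea of a search signal that leads case counts can be illustrated with a lagged correlation. The sketch below is not the study's actual method or data. It assumes a simple Pearson correlation, uses made-up daily numbers, and the function names and the 16-day window are illustrative choices only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)

def lagged_correlations(search_interest, daily_cases, max_lead_days=16):
    """Correlate search interest on day t with case counts on day t + lead.

    Both inputs are assumed to be daily series of the same length.
    Returns a dict mapping each lead (in days) to its correlation.
    """
    results = {}
    for lead in range(max_lead_days + 1):
        searches = search_interest[: len(search_interest) - lead]
        cases = daily_cases[lead:]
        if len(searches) < 3:
            break  # too few overlapping days to say anything
        results[lead] = pearson(searches, cases)
    return results

# Synthetic example (not real data): cases mirror searches 5 days later.
search_interest = [0, 1, 3, 7, 12, 18, 22, 20, 15, 10, 6, 3, 1, 0, 0, 0, 0, 0, 0, 0]
daily_cases = [0] * 5 + search_interest[:-5]

correlations = lagged_correlations(search_interest, daily_cases)
best_lead = max(correlations, key=correlations.get)
print(f"Strongest correlation at a lead of {best_lead} days")
```

With real data, the lead with the strongest correlation would hint at how many days in advance a given keyword tends to signal rising cases in a region. Such a signal would be one input alongside traditional surveillance, not a replacement for it.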
“Each of these keywords had varying strengths of correlation with case numbers,” says Dr. Bydon. “If we had looked at 100 keywords, we may have found even stronger correlations to cases. As the pandemic progresses, people will search for new and different information, so the search terms also need to evolve.”
The use of web search surveillance data is important as an adjunct for data science teams who are attempting to predict outbreaks and new hot spots in a pandemic. “Any delay in information could lead to missed opportunities to improve preparedness for an outbreak in a certain location,” says Dr. Bydon.
Traditional surveillance, including widespread testing and public health reporting, can lag behind the incidence of infectious disease. The need for more testing, and more rapid and accurate testing, is paramount. Delayed or incomplete reporting of results can lead to inaccuracies when data is released and public health decisions are being made.
“If you wait for the hot spots to emerge in the news media coverage, it will be too late to respond effectively,” Dr. Bydon says. “In terms of national preparedness, this is a great way of helping to understand where future hot spots will emerge.”
Mayo Clinic recently introduced an interactive COVID-19 tracking tool that reports the latest data for every county in all 50 states, and in Washington, D.C., with insight on how to assess risk and plan accordingly. “Adding variables such as Google Trends data from Dr. Bydon’s team, as well as other leading indicators, have greatly enhanced our ability to forecast surges, plateaus and declines of cases across regions of the country,” says Henry Ting, M.D., Mayo Clinic’s chief value officer.
Dr. Ting worked with Mayo Clinic data scientists to develop content sources, validate information and correlate expertise for the tracking tool, which is in Mayo’s COVID-19 resource center on mayoclinic.org.
The study was conducted in collaboration with the Mayo Clinic Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery. The authors report no conflicts of interest.
Correlations Between COVID-19 Cases and Google Trends Data in the United States: A State-by-State Analysis. Mayo Clinic Proceedings, 20 August 2020.