Human-learned lessons about machine learning in public health surveillance

Presented December 13, 2018.

For public health surveillance, is machine learning worth the effort? What methods are relevant? Do you need special hardware? This talk was motivated by these and other questions asked by ISDS members. It will focus on providing practical—and slightly opinionated—advice about how to determine whether machine learning could be a useful tool for your problem.


December 21, 2018

Digital Epidemiology: designing machine learning approaches to combine Internet-based data sources to monitor and forecast disease activity in multiple locations and spatial resolutions

Presented May 24, 2018.

Mauricio Santillana, MS, PhD, describes machine learning methodologies that leverage Internet-based information from search engines, Twitter microblogs, crowd-sourced disease surveillance systems, electronic medical records, and historical synchronicities in disease activity across spatial regions to monitor and forecast disease outbreaks in multiple locations around the globe in near real-time.


May 24, 2018

Social Network Analysis across Healthcare Entities, Orange County, FL, 2016

In public health, interest in social network analyses (SNAs) has been growing. SNAs are methodological and theoretical tools that describe the connections of people, partnerships, disease transmission, the interorganizational structure of health systems, the role of social support, and social capital [1].

January 19, 2018

Automated Processing of Electronic Data for Disease Surveillance

National initiatives, such as Meaningful Use, are automating the detection and reporting of reportable disease events to public health, which has led to more complete, timely, and accurate public health surveillance data. However, electronic reporting has also led to significant increases in the number of cases reported to public health. For this data to be useful to public health, it must be processed and made available to epidemiologists and investigators in a timely fashion for intervention and monitoring.

January 25, 2018

Forecasting Emergency Department Admissions for Pneumonia in Tropical Singapore

Pneumonia, an infection of the lung due to bacterial, viral or fungal pathogens, is a significant cause of morbidity and mortality worldwide. In the past few decades, the threat of emerging pathogens presenting as pneumonia, such as Severe Acute Respiratory Syndrome, avian influenza A(H5N1) and A(H7N9), and Middle East Respiratory Syndrome coronavirus has emphasised the importance of the surveillance of pneumonia and other severe respiratory infections.

January 19, 2018

How can we assess the effects of urban environment on obesity using aggregated data?

'Where we live' affects 'how we live'. Information about 'how one lives' is collected from public health surveillance data such as the Behavioral Risk Factor Surveillance System (BRFSS). The neighborhood environment surrounding individuals influences their health behavior and health status, as do their own traits. Meanwhile, geographical information on subjects recruited in health behavior surveillance data is usually aggregated at administrative levels such as a county.

January 25, 2018

Animals positive for Yersinia pestis in Armenia

Plague was first identified in Armenia in 1958 when Y. pestis was isolated and cultured from the flea species Ct. teres collected from the burrows of common voles in the northwestern part of the country. In the process of digitalizing archived data, a statistical and spatial analysis of the species composition of mammals and parasites involved in the epizootic process of plague between 1958 and 2016 was performed.


January 21, 2018

Machine Learning for Identifying Relevance to Biosurveillance in Multilingual Text

Global biosurveillance is an extremely important, yet challenging task. One form of global biosurveillance comes from harvesting open source online data (e.g. news, blogs, reports, RSS feeds). The information derived from this data can be used for timely detection and identification of biological threats all over the world. However, the more inclusive the data harvesting procedure is, in order to ensure that all potentially relevant articles are collected, the more irrelevant data also gets harvested. This issue can become even more complex when the online data is in a non-native language.

January 25, 2018

Leveraging Discussions on Reddit for Disease Surveillance

In recent years, individuals have been using social network sites like Facebook, Twitter, and Reddit to discuss health-related topics. These social media platforms have consequently become new avenues for research and applications such as disease surveillance. Reddit, in particular, can potentially provide more in-depth contextual insights than Twitter, and Reddit members discuss potentially more diverse topics than Facebook members. However, identifying relevant discussions remains a challenge in large datasets like Reddit.

January 21, 2018



This Knowledge Repository is made possible through the activities of the Centers for Disease Control and Prevention Cooperative Agreement/Grant #1 NU500E000098-01, National Surveillance Program Community of Practice (NSSP-CoP): Strengthening Health Surveillance Capabilities Nationwide, which is in the interest of public health.
