Interacting with Web Data using R: Web APIs and Web Scraping (including RSelenium)

Description: 

Presented April 22, 2019.

Scientists who use data to inform their operational or research work often need to extract data from web pages or APIs. This can be done manually, but without automation or scripting it can take orders of magnitude longer, especially when a task becomes routine and must be repeated on a recurring basis. This webinar will demonstrate working with an API from R to extract information from healthdata.gov. We will also demonstrate scraping static web content using the rvest package, and scraping dynamic content by driving a web browser using RSelenium. Real-time demos navigating the websites we scrape will be given, and resources for learning how to navigate a website’s structure (the Document Object Model, DOM) using CSS selectors and XPath will be provided.
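As a rough illustration of what navigating a page's structure with CSS selectors and XPath can look like in rvest, here is a minimal sketch. It is not taken from the webinar materials: the HTML snippet, the table id "reports", the class "name", and the file paths are all invented for the example, and it assumes rvest 1.0.0 or later (for html_elements). The page is parsed from an inline string so that no network access is needed.

```r
# Minimal sketch, not the presenter's code. The HTML below is made up
# for illustration and does not come from any site used in the webinar.
library(rvest)

page <- read_html('
<html><body>
  <table id="reports">
    <tr><th>Dataset</th><th>Link</th></tr>
    <tr><td class="name">Hospital visits</td><td><a href="/data/visits.csv">csv</a></td></tr>
    <tr><td class="name">Lab results</td><td><a href="/data/labs.csv">csv</a></td></tr>
  </table>
</body></html>')

# CSS selector: every <td> with class "name" inside the element with id "reports"
name_cells <- html_elements(page, css = "#reports td.name")
dataset_names <- html_text(name_cells, trim = TRUE)

# XPath: every <a> that is a descendant of the table whose id is "reports"
link_nodes <- html_elements(page, xpath = "//table[@id='reports']//a")
csv_links <- html_attr(link_nodes, "href")

print(dataset_names)
print(csv_links)
```

The same two selector styles apply to a page fetched from a URL with read_html; which one is more convenient depends mostly on how the target page is marked up.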

Presenter

Spencer George Lourens, Indiana University 

Primary Topic Areas: 
Original Publication Year: 
2019
Event/Publication Date: 
April 2019

April 25, 2019

Contact Us

National Syndromic Surveillance Program

Email: nssp@cdc.gov

The National Syndromic Surveillance Program (NSSP) is a collaboration among states and public health jurisdictions that contribute data to the BioSense Platform, public health practitioners who use local syndromic surveillance systems, Centers for Disease Control and Prevention programs, other federal agencies, partner organizations, hospitals, healthcare professionals, and academic institutions.
