Webscraping by Example: An introduction to BeautifulSoup
11:15 AM - 12:10 PM on July 16, 2016, Room CR5
Stevie Slotterback
- Audience level: novice
- Watch: https://www.youtube.com/watch?v=5U702pICY8k
Description
This is a basic tutorial on the features of the popular HTML parser BeautifulSoup. We will cover the basic functions and data structures that make up the BeautifulSoup package, then apply them to automate some data extraction tasks on the Buildings Information System (BIS) published by the New York City Department of Buildings.
Abstract
The BeautifulSoup Python package is a useful tool for automating data extraction from web-based data sources. One such source, the Buildings Information System (BIS) from the New York City Department of Buildings, consistently provides access to a rich data set in a straightforward format. In this tutorial, we will demonstrate how to use the features of BeautifulSoup to automate data extraction from the BIS database.
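As a rough illustration of the kind of extraction the tutorial covers, the sketch below parses a small HTML table with BeautifulSoup and collects its data rows. The sample markup, the table id `permits`, the column names, and the helper `extract_rows` are all hypothetical stand-ins; the actual BIS pages may be structured differently, and this is not taken from the talk itself.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a BIS results page.
# The real BIS markup may differ; the id and columns are assumptions.
SAMPLE_HTML = """
<html><body>
<table id="permits">
  <tr><th>Job Number</th><th>Status</th></tr>
  <tr><td>100000001</td><td>Approved</td></tr>
  <tr><td>100000002</td><td>Pending</td></tr>
</table>
</body></html>
"""


def extract_rows(html):
    """Return the text of each data row in the (hypothetical) permits table."""
    # Parse with Python's built-in parser so no extra parser package is needed.
    soup = BeautifulSoup(html, "html.parser")

    # find() returns the first matching Tag, or None if nothing matches.
    table = soup.find("table", id="permits")
    if table is None:
        return []

    rows = []
    # find_all() returns every matching descendant Tag.
    for tr in table.find_all("tr"):
        cells = tr.find_all("td")
        if not cells:
            continue  # header row contains only <th> cells
        rows.append([td.get_text(strip=True) for td in cells])
    return rows


for row in extract_rows(SAMPLE_HTML):
    print(row)
```

In practice the HTML string would come from an HTTP response rather than a literal, but the `find` / `find_all` / `get_text` pattern stays the same.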