Tutorial 2: Scraping an E-commerce Site

A video version of this tutorial is also available (it opens in a new window, which may take a moment to load).

In this tutorial we'll scrape search results from a basic e-commerce site and demonstrate logging in to a web site before scraping data. The data you scrape from web sites often takes the form of "records": data that would fit into the rows and columns of a spreadsheet. It's also often necessary to log in to a web site before you can reach the data you're interested in. Practicing both situations here should let you apply the experience to similar cases. For example, you would likely take the same approach to extracting data from online directories, real estate listings, or product descriptions.
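To make the idea of "records" concrete, here is a minimal Python sketch of rows-and-columns data written out as tab-delimited text. It is not part of screen-scraper and is not the method used in this tutorial. The field names (`title`, `price`), the sample values, and the `records_to_delimited` helper are all hypothetical, and they are not taken from the shop site or from the dvds.txt output file.

```python
import csv
import io

# Hypothetical records: the field names and values are invented for
# illustration only, not taken from the shop site or from dvds.txt.
records = [
    {"title": "Example DVD A", "price": "19.99"},
    {"title": "Example DVD B", "price": "24.50"},
]


def records_to_delimited(rows, fieldnames, delimiter="\t"):
    """Serialize a list of dict records as delimited text, one row per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()    # first line holds the column names
    writer.writerows(rows)  # each record becomes one line
    return buffer.getvalue()


output = records_to_delimited(records, ["title", "price"])
print(output)
```

Each dictionary plays the role of one spreadsheet row, and each key plays the role of a column, which is the shape of data this tutorial is concerned with extracting.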

If you haven't already gone through Tutorial 1, we'd recommend that you do so before continuing with this one. This tutorial doesn't, however, depend on scraping sessions or other objects you might have created in the previous tutorial.

The site we'll be scraping information from is found here: http://www.screen-scraper.com/shop/. Feel free to click around and explore for a minute.

If you're interested in seeing the final scraping session you'll be creating, along with the output file it generates, you'll find them in the table below. You can import the scraping session by following the instructions found here. If you want to learn to use screen-scraper, you're probably better off not importing the scraping session and instead following along closely with the tutorial. If, however, you're just trying to get a feel for what it's like to use screen-scraper, importing the scraping session might be helpful.

Attachment                              Size
dvds.txt                                897 bytes
Shopping Site (Scraping Session).sss    10.2 KB