Tutorial 2: Scraping an E-commerce Site

In this tutorial we'll scrape search results from a basic e-commerce site. We'll also demonstrate logging in to a web site before scraping data. The data you scrape from web sites often takes the form of "records", that is, data that would fit into the rows and columns of a spreadsheet. It's also often necessary to log in to a web site before you can reach the data you're interested in. Practicing both situations here should help you apply the experience to similar cases. For example, you would likely take the same approach to extract data from online directories, real estate listings, or product descriptions.
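
To make the idea of "records" concrete, here is a minimal sketch in Python, which is not the tool this tutorial uses. It pulls two fields out of each row of a made-up HTML fragment and writes them as tab-delimited rows. The markup, the field names title and price, and the helper names are all assumptions chosen for illustration; the real shop pages will look different.

```python
import csv
import io
import re

# Hypothetical markup for two search results; the real site's HTML will differ.
SAMPLE_HTML = """
<tr><td class="title">Example DVD One</td><td class="price">$10.00</td></tr>
<tr><td class="title">Example DVD Two</td><td class="price">$12.50</td></tr>
"""

# One match per result row; each named group becomes a column of the record.
ROW_PATTERN = re.compile(
    r'<td class="title">(?P<title>.*?)</td>\s*'
    r'<td class="price">(?P<price>.*?)</td>'
)


def extract_records(html):
    """Return one dict per matched row, keyed by the pattern's group names."""
    return [match.groupdict() for match in ROW_PATTERN.finditer(html)]


def to_tab_delimited(records):
    """Serialize the records as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["title", "price"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = extract_records(SAMPLE_HTML)
print(to_tab_delimited(records))
```

Each dictionary corresponds to one spreadsheet row and each key to one column, which is the shape of data the rest of this tutorial works toward.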

If you haven't already gone through Tutorial 1, we recommend that you do so before continuing with this one. This tutorial, however, doesn't depend on the scraping session or other objects you might have created in the previous tutorial. You may wish to download and import the completed scraping session that goes with this tutorial. The scraping session and complete output file are available below.

The site we'll be scraping information from is found here: http://www.screen-scraper.com/shop/. Feel free to click around and explore for a minute.

The scraping session you are about to create and the output file the scraping session will generate:

Attachment                              Size
dvds.txt                                897 bytes
Shopping Site (Scraping Session).sss    10.2 KB