Tutorial 5: Page 2: Tutorial Details

There are several ways to insert scraped data into a database, which we outline in this FAQ. Take a minute now to look through it. This tutorial gives an example of the last option mentioned there, which is one of the easier methods to implement.

If you're using the Enterprise Edition of screen-scraper, you should be aware that it can handle scraped data in real time (a feature available only in that edition). Currently this has been implemented in the Java and PHP drivers for screen-scraper. If you want to interact with screen-scraper using either of those languages, read the "Handling Scraped Data in Real Time" section of our Invoking screen-scraper from Java or Invoking screen-scraper from PHP page for details. This tutorial doesn't cover that approach, although it is quite a bit easier and cleaner to implement than the method described here. Later on we'll likely create a tutorial for Enterprise Edition users that makes use of it.

The basic idea in this tutorial is that we'll have a special scrapeable file that will POST data to a PHP file, which will handle inserting the data into a database. The flow of events will look like this:

  1. Extract the data from the site, saving each value (product title, price, etc.) in session variables.
  2. Invoke a "Save product" scrapeable file, which will POST the extracted data to a PHP file.
  3. The PHP file accepts and validates the data. If the data is incomplete, it returns an error message and stops. Otherwise, it attempts to insert the data into the database. If the insert succeeds, it returns a "success" message; if not, it returns an error message.
  4. In screen-scraper, we use an extractor pattern to check whether the response from the PHP file reports success or failure.
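
The logic of steps 3 and 4 can be sketched in a few lines. This is only an illustration of the flow, not the PHP file used in this tutorial: it is written in Python with an in-memory SQLite database so that it runs on its own, and the field names (title, price), the products table, and the "SUCCESS"/"ERROR" response strings are all assumptions made for the example.

```python
import re
import sqlite3

# Hypothetical field names; the real script would use whatever
# fields the "Save product" scrapeable file POSTs.
REQUIRED_FIELDS = ("title", "price")


def save_product(conn, form):
    """Validate the POSTed fields, insert them, and return a status string."""
    # Step 3a: reject incomplete data before touching the database.
    missing = [f for f in REQUIRED_FIELDS if not form.get(f, "").strip()]
    if missing:
        return "ERROR: missing " + ", ".join(missing)
    # Step 3b: attempt the insert; any database failure becomes an error reply.
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT INTO products (title, price) VALUES (?, ?)",
                (form["title"], form["price"]),
            )
    except sqlite3.Error as exc:
        return "ERROR: " + str(exc)
    return "SUCCESS"


# Step 4: the scraper only needs to pull the status word out of the reply,
# which is the role the extractor pattern plays in screen-scraper.
STATUS_PATTERN = re.compile(r"^(SUCCESS|ERROR)")


def status_of(response):
    """Return 'SUCCESS' or 'ERROR', or None if the reply is unrecognized."""
    match = STATUS_PATTERN.match(response)
    return match.group(1) if match else None


# Demo setup: an in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (title TEXT NOT NULL, price TEXT NOT NULL)")
complete_reply = save_product(conn, {"title": "Widget", "price": "9.99"})
incomplete_reply = save_product(conn, {"title": "Widget"})
```

Keeping the reply to a short, predictable string is what makes step 4 simple: the scraper only has to match the leading status word rather than parse a full page.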

We'll start by modifying our existing "Shopping Site" scraping session a bit, adding to it the scrapeable file that will POST the data to our PHP file.