Mapping Extracted Data

The web interface is only available for enterprise edition users of screen-scraper.

Overview

The mapping tab allows you to alter extracted values. Once you extract data from a web page, you often need to put it into a consistent format. For example, you may want products with very similar names to end up with identical names.

screen-scraper uses mapping sets to determine how to map a given extracted value. A mapping set may contain any number of mappings, which screen-scraper analyzes in sequence until it finds a match or runs out of mappings. As such, you'll often want to put more specific mappings earlier in the sequence than more general ones.

Example

Consider the screen-shot of the mapping tab. If the extracted value were "Widget 123", screen-scraper would first try the "Widget 1" mapping. Because that mapping uses an equals match, and "Widget 123" is not exactly equal to "Widget 1", it would not apply, so screen-scraper would proceed to the second mapping. The second mapping would match because it was designated as a contains type: the text "Widget 123" contains the text "Widget". As a result, the extracted value "Widget 123" would become "Product ABC", the To value designated for the second mapping.
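The first-match-wins behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not screen-scraper's own code: the names Mapping, apply_mappings and MAPPING_SET are hypothetical, the To value of the first mapping is a placeholder because the text does not state it, and returning the value unchanged when nothing matches is an assumption.

```python
from typing import List, NamedTuple


class Mapping(NamedTuple):
    """One row of a mapping set (hypothetical structure for illustration)."""
    from_value: str
    match_type: str  # "equals" or "contains"
    to_value: str


def apply_mappings(value: str, mappings: List[Mapping]) -> str:
    """Return the To value of the first mapping that matches `value`."""
    for mapping in mappings:
        if mapping.match_type == "equals" and value == mapping.from_value:
            return mapping.to_value
        if mapping.match_type == "contains" and mapping.from_value in value:
            return mapping.to_value
    # Assumption: a value that matches no mapping is left as it was.
    return value


# The more specific mapping comes first, the more general one second.
MAPPING_SET = [
    # Placeholder To value: the original text does not say what it is.
    Mapping("Widget 1", "equals", "PLACEHOLDER"),
    Mapping("Widget", "contains", "Product ABC"),
]

print(apply_mappings("Widget 123", MAPPING_SET))  # prints "Product ABC"
```

If the two mappings were swapped, the contains mapping would match every value containing "Widget", including "Widget 1" itself, and the equals mapping would never be reached. That is why the more specific mapping belongs earlier in the sequence.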

Using Regular Expressions

When using regular expressions in your mappings, you can also make use of back references. Back references allow you to carry parts of the original text over into the To value. For example, to map the value "Widget 123" you could enter the regular expression Widget (\d*) in the From column and the value Product \1 in the To column. When mapped, "Widget 123" would become "Product 123": the text matched by the parenthesized group in the From column is inserted wherever the \1 marker appears in the To column.
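The back-reference substitution can be tried outside screen-scraper with any regular expression engine that supports \1 in the replacement text. The sketch below uses Python's re module as a stand-in; the function name map_with_regex is hypothetical, and screen-scraper's own engine may differ in details such as whether the whole value or only the matched portion is replaced.

```python
import re


def map_with_regex(value: str, from_pattern: str, to_template: str) -> str:
    """Replace text matching `from_pattern`, expanding \\1-style back references."""
    return re.sub(from_pattern, to_template, value)


# From column: Widget (\d*)    To column: Product \1
print(map_with_regex("Widget 123", r"Widget (\d*)", r"Product \1"))  # prints "Product 123"
```

Note that \d* matches zero or more digits, so the pattern also matches "Widget " with no number after it and would produce "Product " with nothing following. Using \d+ in the From column would require at least one digit.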