General Technical

Questions regarding how screen-scraper works or how to get it to do something.

My sub-extractor pattern only gets one instance of my data. How can I get all of the data?

A sub-extractor pattern will, by design, match only once per dataRecord.

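If you need to match a datum that appears more than once within a record, call scrapeableFile.extractData() from a script instead. It applies a full extractor pattern to the text you pass in, so it can return every match rather than just the first.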
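The difference between matching once and matching every occurrence can be sketched outside of screen-scraper with an ordinary regular expression. This is only an analogy in plain Python, not screen-scraper's API: the sample record, the price pattern, and the function names first_match and all_matches are all made up for illustration.

```python
import re

# Hypothetical record text: one table row that contains several prices.
record = "<tr><td>Widget</td><td>$4.00</td><td>$3.50</td><td>$3.00</td></tr>"

# Hypothetical pattern standing in for the token a sub-extractor would capture.
price_pattern = re.compile(r"\$(\d+\.\d{2})")


def first_match(text):
    """Mimic sub-extractor behavior: at most one value per record."""
    match = price_pattern.search(text)
    return match.group(1) if match else None


def all_matches(text):
    """Mimic re-applying a pattern to the record: every occurrence is returned."""
    return price_pattern.findall(text)


if __name__ == "__main__":
    print(first_match(record))
    print(all_matches(record))
```

The first function stops at the first price it finds, which is what a sub-extractor pattern does within a dataRecord. The second walks the whole string, which is roughly the role scrapeableFile.extractData() plays when you need every instance.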

I'm unable to install screen-scraper on 64-bit Ubuntu. How do I resolve this?

If you attempt to install screen-scraper on a 64-bit Ubuntu system, you may get the following output when launching the installer:

Unpacking JRE ...
Preparing JRE ...
./setup_ss_enterprise.sh: 314: bin/unpack200: not found
Error unpacking jar files. Aborting.
You might need administrative priviledges for this operation.

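This error typically means the installer's bundled 32-bit JRE cannot run because the 32-bit compatibility libraries are missing. To resolve it, make sure the ia32-libs package ("ia32 shared libraries for use on amd64 and ia64 systems") is installed on your system, for example by running sudo apt-get install ia32-libs on Ubuntu releases that provide it.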

I'd like to scrape data from a mainframe/tn3270 application. Can screen-scraper handle this?

No. screen-scraper is designed only to scrape data from web sites. If you're looking for a solution that can extract data from older mainframe-type applications, we'd recommend looking at Jagacy.

My web site is hosted on a shared server (virtual hosting). Can I use screen-scraper with it?

To install screen-scraper on a machine, you'll likely need administrative or root access. Shared (virtual) hosting plans generally don't grant that level of access, so you probably won't be able to run screen-scraper on such a server.

How do I set up screen-scraper on BSD or Solaris?

We currently have installers that may work for you on Solaris. If these are of interest, please contact us directly.

Can screen-scraper extract information from PDF files?

Sort of, yes. See this blog posting.

Can screen-scraper be scheduled to scrape sites on a periodic basis?

If you're using the Enterprise Edition of screen-scraper, this can be done via the web interface.

Does screen-scraper follow redirects?

screen-scraper automatically follows certain redirects, so whether a given redirect is followed depends on which type the web site uses. Three types of redirects are commonly used on the web: