scrapeableFile

Scrapeable File

setRequestEntity

void scrapeableFile.setRequestEntity ( String requestEntity ) (professional and enterprise editions only)

Description

Set POST payload data. This is particularly helpful when scraping some sites' implementations of AJAX, where the payload is explicitly set as XML.

setReferer

void scrapeableFile.setReferer ( String url ) (professional and enterprise editions only)

Description

Set referer HTTP header.

Parameters

  • url URL of the referer, as a string.

Return Values

Returns void.

setContentType

void scrapeableFile.setContentType ( String contentType ) (professional and enterprise editions only)

Description

Set POST payload type. This is particularly helpful when scraping some sites' implementations of AJAX, where the payload is explicitly set as XML.
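As an illustration of how setContentType and setRequestEntity might be used together for an XML payload, here is a minimal sketch. The ScrapeableFileLike interface and RecordingFile class are hypothetical stand-ins so the example runs outside screen-scraper; inside a script you would call the same two methods on the scrapeableFile object directly. The "text/xml" content type and the sample payload are assumptions chosen for the example.

```java
class XmlPostSketch {
    // Hypothetical stand-in for the scrapeableFile object that screen-scraper
    // supplies to scripts; only the two setters discussed here are modelled.
    interface ScrapeableFileLike {
        void setContentType(String contentType);
        void setRequestEntity(String requestEntity);
    }

    // Hypothetical recorder used only to make the sketch runnable.
    static class RecordingFile implements ScrapeableFileLike {
        String contentType;
        String requestEntity;

        public void setContentType(String contentType) { this.contentType = contentType; }
        public void setRequestEntity(String requestEntity) { this.requestEntity = requestEntity; }
    }

    // Declare the payload type, then supply the XML body itself.
    static void configureXmlPost(ScrapeableFileLike scrapeableFile, String xml) {
        scrapeableFile.setContentType("text/xml");
        scrapeableFile.setRequestEntity(xml);
    }

    public static void main(String[] args) {
        RecordingFile file = new RecordingFile();
        configureXmlPost(file, "<query><term>widgets</term></query>");
        System.out.println(file.contentType + " : " + file.requestEntity);
    }
}
```

In a real scrape these calls would presumably run in a script that executes before the file is requested, so that the payload is in place when the request is sent.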

saveFileBeforeTidying

void scrapeableFile.saveFileBeforeTidying ( String filePath ) (professional and enterprise editions only)

Description

Write non-tidied contents of the scrapeable file response to a text file.

Parameters

  • filePath File path, as a string, where the file should be saved.

Return Values

Returns void.

wasErrorOnRequest

boolean scrapeableFile.wasErrorOnRequest ( )

Description

Determine if an error occurred with the request. A server timeout or any status code outside of the range 200-399 is considered an error.

Parameters

This method does not receive any parameters.

Return Values

Returns true for server timeouts as well as any status code outside of the range 200-399; otherwise, it returns false.
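The rule described above can be written as a small predicate. This is only a sketch of the documented behaviour, not screen-scraper's own code; the class name, method name, and the timedOut flag are hypothetical and exist only for illustration.

```java
class RequestErrorRule {
    // Mirrors the documented rule: a timeout, or a status code outside 200-399,
    // counts as an error.
    static boolean isError(boolean timedOut, int statusCode) {
        return timedOut || statusCode < 200 || statusCode > 399;
    }

    public static void main(String[] args) {
        System.out.println(isError(false, 200)); // false: within 200-399
        System.out.println(isError(false, 404)); // true: outside 200-399
        System.out.println(isError(true, 200));  // true: timeouts are errors
    }
}
```

Note that redirects (3xx) fall inside the accepted range, so under this rule they are not reported as errors.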

saveFileOnRequest

void scrapeableFile.saveFileOnRequest ( String filePath ) (enterprise edition only)

Description

Save the file returned from a scrapeable file request.

Parameters

  • filePath File path, as a string, where the file should be saved.

Return Values

Returns void.

removeAllHTTPParameters

void scrapeableFile.removeAllHTTPParameters ( ) (professional and enterprise editions only)

Description

Remove all of the HTTP parameters from the current scrapeable file.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

noExtractorPatternsMatched

boolean scrapeableFile.noExtractorPatternsMatched ( )

Description

Determine whether any extractor patterns associated with the scrapeable file found a match.

Parameters

This method does not receive any parameters.

Return Values

Returns true if none of the extractor patterns associated with the scrapeable file matched; otherwise, it returns false.
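A typical use is to flag pages where nothing was extracted, which often indicates that a site's layout has changed. The sketch below assumes that a return value of true means no pattern matched, as the method name suggests. ScrapeableFileLike is a hypothetical stand-in for the scrapeableFile object, and the messages are made up for the example.

```java
class NoMatchSketch {
    // Hypothetical stand-in for the scrapeableFile object; only the method
    // discussed here is modelled.
    interface ScrapeableFileLike {
        boolean noExtractorPatternsMatched();
    }

    // Build a log message describing whether anything was extracted.
    static String describeOutcome(ScrapeableFileLike scrapeableFile) {
        if (scrapeableFile.noExtractorPatternsMatched()) {
            return "No extractor patterns matched; the page layout may have changed.";
        }
        return "At least one extractor pattern matched.";
    }

    public static void main(String[] args) {
        System.out.println(describeOutcome(() -> true));
        System.out.println(describeOutcome(() -> false));
    }
}
```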

getStatusCode

int scrapeableFile.getStatusCode ( ) (professional and enterprise editions only)

Description

Determine the HTTP status code sent by the server.

Parameters

This method does not receive any parameters.

Return Values

Returns an integer corresponding to the HTTP status code of the response.

getNonTidiedHTML

String scrapeableFile.getNonTidiedHTML ( ) (enterprise edition only)

Description

Retrieve the non-tidied HTML of the scrapeable file.

Parameters

This method does not receive any parameters.

Return Values

Returns the non-tidied contents of the scrapeable file, as a string. On failure it returns null.
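Because the method returns null on failure, callers will usually want to guard against that before working with the result. A minimal sketch follows, in which ScrapeableFileLike is a hypothetical stand-in for the scrapeableFile object and rawHtmlOrFallback is an illustrative helper, not part of the API.

```java
class RawHtmlSketch {
    // Hypothetical stand-in for the scrapeableFile object; only the method
    // discussed here is modelled.
    interface ScrapeableFileLike {
        String getNonTidiedHTML();
    }

    // Return the raw HTML, or the supplied fallback when the call failed (null).
    static String rawHtmlOrFallback(ScrapeableFileLike scrapeableFile, String fallback) {
        String html = scrapeableFile.getNonTidiedHTML();
        return (html != null) ? html : fallback;
    }

    public static void main(String[] args) {
        System.out.println(rawHtmlOrFallback(() -> "<html><body>raw</body></html>", ""));
        System.out.println(rawHtmlOrFallback(() -> null, "(no response body)"));
    }
}
```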