Invoking screen-scraper through SOAP |
SOAP is a standard XML-based protocol most often used for accessing web services. Libraries in many popular programming languages allow for the rapid development of SOAP clients.
Below is the API specification for the methods available in screen-scraper through SOAP. Many SOAP libraries can generate the code needed to interact with a specific SOAP interface from a WSDL file. The following links are specific examples of creating a SOAP client for screen-scraper: Java and .NET
Important Note: unlike the standard screen-scraper server, the SOAP server does not deny connections based on domain or IP address. This presents a potential security risk, so please be cautious by using firewalls and other mechanisms in order to protect the SOAP server when it is running.
Method Summary |
Logging Methods |
| string |
getLog(string filename)
Return the content of a given log file. |
| string |
getLog(string filename, boolean start, int lines)
Returns a portion of the content of a given log file. |
| string[] |
getLogNames()
Returns the names of all the files in the log directory of the remote server. |
| long |
getLogSize(string filename)
Return the size of the given logfile in bytes. |
| int |
removeLog(string filename)
Remove a log file from the log directory on the remote server. |
Scraping Methods |
| string[] |
getCompletedScrapingSessions()
Returns the IDs of the completed scraping sessions. |
| string[] |
getDataRecord(string id, string var)
Get a data record for the given variable. |
| string[][] |
getDataSet(string id, string var)
Get the data set contained in a variable in a scraping session. |
| string[] |
getRunningScrapingSessions()
Return the IDs of the currently running scraping sessions. |
| string |
getScrapingSessionName(string id)
Returns the name of the scraping session whose ID is id. |
| string[] |
getScrapingSessionNames()
Returns an array of names of scraping sessions which this server currently has. |
| long |
getScrapingSessionStartTime(string id)
Returns the starting time of a particular scraping session as a long. |
| string[] |
getScriptNames()
Returns the names of scripts in this server. |
| string |
getVariable(string id, string var)
Get the value of a certain variable in a scraping session. |
| string |
initializeScrapingSession(string name)
Initialize this scraping session to allow it to be scraped. |
| int |
isFinished(string id)
Returns whether the session with the given ID is finished. |
| int |
removeCompletedScrapingSession(string id)
Remove the scraping session given by id from the list of completed scraping sessions. |
| int |
removeScrapingSession(string name)
Remove a scraping session from the remote server and from its database. |
| int |
removeScript(string name)
Remove a script from the remote server and its database. |
| int |
scrape(string id)
Scrape the session given by this ID. |
| int |
setTimeout(string id, int minutes)
Set the timeout, in minutes, of a scraping session. |
| int |
setVariable(string id, string var, string value)
Set a variable within a scraping session. |
| int |
stopScrapingSession(string id)
Stop a scraping session in progress. |
| int |
update(string xml)
Update the remote server with an exported scraping session or script. |
Server Methods |
| boolean |
isAcceptingConnections()
Returns the value of acceptingConnections, which dictates whether the server handles remote requests to scrape. |
| int |
setAcceptingConnections(boolean accepting)
Sets the value for acceptingConnections, which will either stop the server from handling requests for remote scrapes or allow them. |
isAcceptingConnections
public static boolean
isAcceptingConnections()
- Returns the value of acceptingConnections, which dictates whether the server handles remote requests to scrape.
- Returns:
- true if the server will accept requests to scrape.
setAcceptingConnections
public static int
setAcceptingConnections(boolean accepting)
- Sets the value for acceptingConnections, which will either stop the server from handling requests for remote scrapes or allow them.
- Parameters:
accepting - value to change acceptingConnections to.
- Returns:
- int which represents success or a specific error code.
getScrapingSessionNames
public string[]
getScrapingSessionNames()
- Returns an array of names of scraping sessions which this server currently has.
- Returns:
- names of scraping sessions.
getScriptNames
public string[]
getScriptNames()
- Returns the names of scripts in this server.
- Returns:
- names of scripts.
getRunningScrapingSessions
public string[]
getRunningScrapingSessions()
- Return the IDs of the currently running scraping sessions.
- Returns:
- an array of strings, which are the IDs.
getCompletedScrapingSessions
public string[] getCompletedScrapingSessions()
- Returns the IDs of the completed scraping sessions. (Also updates the list.)
- Returns:
- the IDs of completed scraping sessions.
removeCompletedScrapingSession
public int
removeCompletedScrapingSession(string id)
- Remove the scraping session given by id from the list of completed scraping sessions.
- Parameters:
id - the ID of the scraping session to be removed.
- Returns:
- an int representing success or a failure code.
isFinished
public int
isFinished(string id)
- Returns whether the session with the given ID is finished.
- Parameters:
id - the ID of the scraping session whose status is checked.
- Returns:
- an int representing finished (1), not finished (0) or error (0)
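Because isFinished reports 0 both for "not finished" and for an error, a client that polls it probably wants an upper bound on how long it waits. The following is a minimal sketch of such a loop, not part of screen-scraper itself. It assumes you pass in a callable that forwards to the remote isFinished(id) through whatever SOAP client you generated; the function name and its parameters are illustrative.

```python
import time
from typing import Callable


def wait_until_finished(
    is_finished: Callable[[str], int],
    session_id: str,
    poll_seconds: float = 5.0,
    max_wait_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until isFinished reports 1 or the wait budget is used up.

    `is_finished` is a stand-in for the remote isFinished(id) call.
    A result of 0 is ambiguous (not finished, or an error), so the
    loop is bounded instead of running forever.
    """
    waited = 0.0
    while True:
        if is_finished(session_id) == 1:
            return True
        if waited >= max_wait_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
```

A return value of False means the session did not report completion in time; it does not distinguish a slow session from an invalid ID.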
getScrapingSessionName
public string
getScrapingSessionName(string id)
- Returns the name of the scraping session whose ID is id.
- Parameters:
id - the ID of a scraping session.
- Returns:
- the name of a scraping session, or "-1" if not found.
getScrapingSessionStartTime
public long
getScrapingSessionStartTime(string id)
- Returns the starting time of a particular scraping session as a long.
- Parameters:
id - the ID of a scraping session.
- Returns:
- the starting time of the scraping session, -1 if not yet started, or 0 if session not found.
initializeScrapingSession
public string
initializeScrapingSession(string name)
- Initialize this scraping session to allow it to be scraped.
- Parameters:
name - the name of the scraping session to initialize.
- Returns:
- the ID of this scraping session on success, otherwise "-1".
scrape
public int
scrape(string id)
- Scrape the session given by this ID.
- Parameters:
id - the ID of a scraping session.
- Returns:
- 0 if an error occurred or 1 if successfully started.
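The usual order of calls appears to be initializeScrapingSession, then setVariable for each input, then scrape. The sketch below strings those together and checks the return codes documented on this page. It assumes a generated SOAP proxy that exposes the methods under the same names; `ScrapingClient`, `start_session`, and `InMemoryClient` are hypothetical names, and the in-memory class exists only so the sketch runs without a server.

```python
from typing import Dict, Protocol


class ScrapingClient(Protocol):
    """Hypothetical shape of a generated SOAP proxy; names mirror this page."""

    def initializeScrapingSession(self, name: str) -> str: ...
    def setVariable(self, id: str, var: str, value: str) -> int: ...
    def scrape(self, id: str) -> int: ...


def start_session(client: ScrapingClient, name: str, variables: Dict[str, str]) -> str:
    """Initialize a session, set its variables, start it, and return its ID."""
    session_id = client.initializeScrapingSession(name)
    if session_id == "-1":  # documented failure value
        raise RuntimeError(f"could not initialize scraping session {name!r}")
    for var, value in variables.items():
        if client.setVariable(session_id, var, value) != 1:
            raise RuntimeError(f"could not set variable {var!r}")
    if client.scrape(session_id) != 1:  # 0 means an error occurred
        raise RuntimeError(f"could not start scraping session {session_id!r}")
    return session_id


class InMemoryClient:
    """Stand-in for a real proxy, used only to exercise the sketch."""

    def __init__(self) -> None:
        self.variables: Dict[str, str] = {}
        self.started = False

    def initializeScrapingSession(self, name: str) -> str:
        return "session-1" if name == "Shopping Site" else "-1"

    def setVariable(self, id: str, var: str, value: str) -> int:
        self.variables[var] = value
        return 1

    def scrape(self, id: str) -> int:
        self.started = True
        return 1


if __name__ == "__main__":
    demo = InMemoryClient()
    print(start_session(demo, "Shopping Site", {"SEARCH": "dvd"}))
```

Raising on the first failed call keeps a half-configured session from being started; whether that is the right policy depends on the caller.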
setVariable
public int
setVariable(string id, string var, string value)
- Set a variable within a scraping session. Disallowed if acceptingConnections is false.
- Parameters:
id - the ID of a scraping session that has been initialized.
var - the name of the variable to set.
value - the value to set the variable to.
- Returns:
- 1 if successfully set, 0 otherwise.
setTimeout
public int
setTimeout(string id, int minutes)
- Set the timeout, in minutes, of a scraping session.
- Parameters:
id - the ID of a scraping session.
minutes - the number of minutes before this session will time out.
- Returns:
- 1 if successful, 0 otherwise.
stopScrapingSession
public int
stopScrapingSession(string id)
- Stop a scraping session in progress.
- Parameters:
id - the ID of a scraping session.
- Returns:
- 1 if successful, 0 otherwise.
getVariable
public string
getVariable(string id, string var)
- Get the value of a certain variable in a scraping session. Note that currently only Strings, DataRecords, and DataSets can be accessed by this method.
- Parameters:
id - the ID of a scraping session.
var - the name of the variable to get the value of.
- Returns:
- if this is a valid scraping session and the value of this variable is a string, then the value is returned, "NULL" if the value is null, and "-1" otherwise.
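Since getVariable signals its special cases in-band, a caller may want to translate them once, close to the call. The helper below is a small sketch of that idea; `decode_variable` and `VariableUnavailable` are illustrative names, not part of the screen-scraper API. Note the inherent ambiguity: a variable whose real value is the text "NULL" or "-1" cannot be told apart from the sentinel.

```python
from typing import Optional


class VariableUnavailable(Exception):
    """Raised when getVariable reports "-1" (invalid session or non-string value)."""


def decode_variable(raw: str) -> Optional[str]:
    """Map the documented sentinel strings of getVariable onto Python values."""
    if raw == "NULL":
        return None  # the variable exists but its value is null
    if raw == "-1":
        raise VariableUnavailable("session not valid or value is not a string")
    return raw
```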
getDataRecord
public string[]
getDataRecord(string id, string var)
- Get a data record for the given variable.
- Parameters:
id - the ID of a scraping session.
var - the name of a variable in this scraping session.
- Returns:
- an array of strings like "key=value", or an empty array if an error happened or the variable is empty.
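The "key=value" strings are usually easier to work with as a mapping. As a sketch, the function below splits each entry on its first "=" so that values containing "=" (URLs, for example) survive intact. It assumes keys themselves never contain "="; `record_to_dict` is an illustrative name.

```python
from typing import Dict, List


def record_to_dict(entries: List[str]) -> Dict[str, str]:
    """Turn ["key=value", ...] as returned by getDataRecord into a dict."""
    record: Dict[str, str] = {}
    for entry in entries:
        key, sep, value = entry.partition("=")  # split on the first '=' only
        if not sep:
            continue  # skip entries that do not follow the key=value form
        record[key] = value
    return record
```

An empty array yields an empty dict, which means an error and an empty variable remain indistinguishable, just as they are in the raw return value.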
getDataSet
public string[][]
getDataSet(string id, string var)
- Get the data set contained in a variable in a scraping session.
- Parameters:
id - the ID of a scraping session.
var - the name of a variable.
- Returns:
- an array of data records as translated to arrays of strings.
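A data set can be converted to a list of rows in the same spirit. This sketch assumes that each inner array uses the same "key=value" form that getDataRecord documents; this page does not state that explicitly, so verify it against real output. `data_set_to_rows` is an illustrative name.

```python
from typing import Dict, List


def data_set_to_rows(data_set: List[List[str]]) -> List[Dict[str, str]]:
    """Convert string[][] from getDataSet into one dict per data record.

    Assumption: inner entries look like "key=value", as for getDataRecord.
    """
    rows: List[Dict[str, str]] = []
    for record in data_set:
        row: Dict[str, str] = {}
        for entry in record:
            key, sep, value = entry.partition("=")  # first '=' separates key from value
            if sep:
                row[key] = value
        rows.append(row)
    return rows
```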
update
public int
update(string xml)
- Update the remote server with an exported scraping session or script. As a warning, if the version of screen-scraper this xml was exported from is different from the version of screen-scraper which is running as a server, then the update may not work.
- Parameters:
xml - the XML contained within an exported scraping session file.
- Returns:
- 0 for failure, 1 for success.
removeScrapingSession
public int
removeScrapingSession(string name)
- Remove a scraping session from the remote server and from its database.
- Parameters:
name - the name of the scraping session to be removed.
- Returns:
- 0 for failure, 1 for success.
removeScript
public int
removeScript(string name)
- Remove a script from the remote server and its database.
- Parameters:
name - the name of a script to be removed.
- Returns:
- 0 for failure, 1 for success.
getLogNames
public string[]
getLogNames()
- Returns the names of all the files in the log directory of the remote server.
- Returns:
- an array of the names of the log files, or null if there is no log directory.
getLogSize
public long
getLogSize(string filename)
- Return the size of the given logfile in bytes.
- Parameters:
filename - the name of a file in the log directory.
- Returns:
- a long representing the length in bytes of this file, or 0 if the file does not exist or is empty.
getLog
public string
getLog(string filename)
- Return the content of a given log file.
- Parameters:
filename - the name of the file to get the contents of.
- Returns:
- a string of the contents of the file, or "" if not possible.
getLog
public string
getLog(string filename, boolean start, int lines)
- Returns a portion of the content of a given log file.
- Parameters:
filename - the name of a log file.
start - true to return content from the beginning of a file, false to start counting lines from the end.
lines - the number of lines from the log file to return.
- Returns:
- a portion of the content of the given log file, or "" if anything goes wrong.
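Passing false for start gives behaviour similar to the Unix tail command, which is handy for checking on a long-running session without downloading the whole log. The sketch below wraps that call; it assumes you supply a callable that forwards to the three-argument getLog on your generated client, and `tail_log` is an illustrative name.

```python
from typing import Callable, List


def tail_log(
    get_log: Callable[[str, bool, int], str],
    filename: str,
    lines: int = 50,
) -> List[str]:
    """Fetch the last `lines` lines of a log file as a list of strings.

    `get_log` stands in for the remote getLog(filename, start, lines).
    The server returns "" when anything goes wrong, which yields [].
    """
    content = get_log(filename, False, lines)  # False: count lines from the end
    return content.splitlines()
```

An empty list therefore means either an empty log or a failed request; getLogSize can help tell the two apart.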
removeLog
public int
removeLog(string filename)
- Remove a log file from the log directory on the remote server.
- Parameters:
filename - the name of the file to remove.
- Returns:
- 0 for failure, 1 for success.