SqlDataManager

Overview

This object simplifies your interactions with a JDBC-compliant SQL database. It can work with various types of databases and even in a multi-threaded format to allow scrapes to continue without having to wait for the queries to process. View an example of how to use the SqlDataManager.

This feature is only available for Professional and Enterprise editions of screen-scraper.

Prefer a more traditional approach? See an example of Working with MySQL databases.

In order to use the SqlDataManager you will need to make sure to install the appropriate JDBC driver. This can be done by downloading the driver and placing it in the lib/ext folder in the screen-scraper installation directory.

Event Callbacks

Overview

Add an event callback to SqlDataManager object.

This feature is only available for Professional and Enterprise editions of screen-scraper.

Before adding an event to the SqlDataManager, you must build the schema of any tables you will use because events are related to table operations such as inserting data

Parameters

  • schema Case insensitive schema (table) name
  • when The event assiciated with the schema that should trigger the callback
    • onCreate Triggered whenever the DataManager creates a new DataNode, such as the first addData since the last commit
    • onAddData Triggered after dm.addData is called
    • onWrite Triggered immediately before the DataNode is written (DataWriter.write). Applies to both inserts and updates
    • onInsert Triggered immediately before the data is going to be inserted as a new row in the database as opposed to updating an existing row
    • onUpdate Triggered immediately before existing database values are going to be updated as opposed to inserted as a new row
    • onWriteError Triggered if an exception was thrown when trying to write to the database
    • afterWrite Triggered immediately after the DataNode is written. At this point any values written are in the DataNode, including autogenerated keys
  • listener A callback interface that must be implemented by the client. There is a single method public void handleEvent(DataManagerEvent event) that needs to be implemented. The DataManagerEvent has a method getDataNode() to retrieve the relevant DataNode.

Return Values

Returns a DataManagerEventListener. The same DataManagerEventListener object that was passed in

Change Log

Version Description
5.5 Available for professional and enterprise editions.

Class Locations

  • com.screenscraper.datamanager.DataManager
  • com.screenscraper.datamanager.DataManagerEventListener
  • com.screenscraper.datamanager.DataManagerEventSource.EventFireTime

Examples

Register a callback to log out database write errors to 'person' table to the web interface

import com.screenscraper.datamanager.*;
import com.screenscraper.datamanager.sql.SqlDataManager;
import org.apache.commons.dbcp.BasicDataSource;

// BasicDataSource
BasicDataSource ds = new BasicDataSource();
ds.setDriverClassName( "com.mysql.jdbc.Driver" );
ds.setUsername( "user" );
ds.setPassword( "psswrd" );
ds.setUrl( "jdbc:mysql://127.0.0.1:3306/mydb?UTF8ENCODING" );
ds.setMaxActive( 10 );

// Create Data Manager
dm = new SqlDataManager( ds, session );
dm.buildSchemas();
_session = session;

//This will log out any write errors to the 'person' table to the screen-scraper web interface
dm.addEventListener("person", DataManagerEventSource.EventFireTime.onWriteError,
  new DataManagerEventListener() {
    public void handleEvent(DataManagerEvent event) {
      DataNode n = event.getDataNode();
      _session.webError("Database Write Error",n.getObjectMap());
    }
  }
);

addData

void sqlDataManager.addData ( String table, Map data ) (professional and enterprise editions only)
void sqlDataManager.addData ( String table, String columnName, Object value ) (professional and enterprise editions only)

Description

Add data to fields, in preparation for insertion into a database.

When adding data in a many-to-many relation, if setAutoManyToMany is set to false, a null row should be inserted into the relating table so the datamanager will link the keys correctly between related tables. For example, dm.addData("many_to_many", null);

Before adding data the first time, you must build the schema of any tables you will use, as well as add foreign keys if you are not using a database engine that natively supports them (such as InnoDB for MySQL).

Parameters

  • table Name of the database table that the data corresponds to, as a string.
  • data (this or columnName and value) Map using field names as keys to desired values to be added in the database for fields. This can be a dataRecord object.
  • columnName (requires value) The name of the column/field in the database table that the data is being added for, as a string.
  • value (requires columnName) The value being inserted into the column/field.
  • The SqlDataManager will attempt to convert a value that is given to the correct format for the database. For example, if the database requires an int for a column named age, dm.addData("table", "age", "32") will convert the String "32" to an int before adding it to the database. See the table below the examples for other types of java objects and how they map to SQL types.

The table and columnName parameters are not case sensitive. The same is true for the key values in the data map.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Add Data from Data Record

 // Get datamanager
 dm = session.getv( "DATAMANAGER" );

 // Add DataRecord Information into person table
 dm.addData( "person", dataRecord );

 // Create and add query to buffer
 dm.commit( "person" );

Add Data In a Specific Field

 // Get datamanager
 dm = session.getv( "DATAMANAGER" );

 // Add DataRecord Information into person table
 dm.addData( "person", dataRecord );

 // Add Specific Other Data
 dm.addData( "person", "date_collected", "2010-07-13" );

 // Create and add query to buffer
 dm.commit( "person" );

Java Object and SQL Type Mappings

Since the DataManager is designed with screen-scraper in mind all inputs support using the String type in addition to their corresponding Java object type, but the String needs to be parseable into the corresponding data type. For example if there is a column that is defined as an Integer in the database then the String needs to be parseable by Integer.parseInt(String value). Here is a mapping of the sql types (based on java.sql.Types) to Java objects:

SQL Type       Java Object
java.sql.Types.CHAR String
java.sql.Types.VARCHAR String
java.sql.Types.LONGVARCHAR String
java.sql.Types.LONGNVARCHAR String
java.sql.Types.NUMERIC BigDecimal
java.sql.Types.DECIMAL BigDecimal
java.sql.Types.TINYINT Integer
java.sql.Types.SMALLINT Integer
java.sql.Types.INTEGER Integer
java.sql.Types.BIGINT Long
java.sql.Types.REAL Float
java.sql.Types.FLOAT Double
java.sql.Types.DOUBLE Double
java.sql.Types.BIT Boolean
java.sql.Types.BINARY ByteArray
java.sql.Types.VARBINARY ByteArray
java.sql.Types.LONGVARBINARY ByteArray
java.sql.Types.DATE SQLDate or Long
java.sql.Types.TIME SQLTime or Long
java.sql.Types.TIMESTAMP SQLTime or Long
java.sql.Types.ARRAY Object
java.sql.Types.BLOB ByteArray
java.sql.Types.CLOB Object
java.sql.Types.JAVA_OBJECT Object
java.sql.Types.OTHER Object

See Also

  • commit() [SqlDataManager] - Commits a prepared row of data to the queue to be updated in the database
  • commitAll() [SqlDataManager] - Commits all prepared rows of data to the queue to be updated in the database

addForeignKey

void sqlDataManager.addForeignKey ( String table, String columnName, String foreignTable, String foreignColumnName ) (professional and enterprise editions only)

Description

Manually setup table connection (key matching).

If SqlDataManager.buildSchemas is called, any foreign keys manually added before that point will be overridden or erased.

Parameters

  • table Name of the database table with the primary key, as a string.
  • columnName Column/field name of the primary key, as a string.
  • foreignTable Name of the database table with the foreign key, as a string.
  • foreignColumnName Column/field name of the foreign key, as a string.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

If the database has some indication of foreign keys then these will be followed automatically. If the database does not allow for foreign key references then you will need to build the table connections using this method.

Examples

Setup Table Connections

 // Import classes
 import com.screenscraper.datamanager.*;
 import com.screenscraper.datamanager.sql.*;
 import org.apache.commons.dbcp.BasicDataSource;

 // Set Variables
 host = "127.0.0.1:3306";
 database = "mydb";
 username = "user";
 password = "pwrd";
 parameters = "autoReconnect=true&useCompression=true";

 // Build the BasicDataSource for the database connection
 BasicDataSource ds = new BasicDataSource();
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( username );
 ds.setPassword( password );
 ds.setUrl( "jdbc:mysql://" + host + "/" + database + "?" + parameters );

 // Get MySQL datamanager
 dm = new SqlDataManager( ds, session );

 // Build Schemas
 dm.buildSchemas();

 // Setup table connections
 // parameter order: "child_table", "child_column", "parent_table", "parent_column"
 dm.addForeignKey( "job", "person_id", "person", "id");
 dm.addForeignKey( "address", "person_id", "person", "id");

addSessionVariables

void sqlDataManager.addSessionVariables ( String table ) (professional and enterprise editions only)

Description

Manually add session variable data to fields, in preparation for insertion into a database.

Parameters

  • table Name of the database table that the data corresponds to, as a string.

The keys from the session will be matched in a case insensitive way to the column names of the database.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Add Data from Session Variables

 // Get datamanager
 dm = session.getv( "DATAMANAGER" );

 // Add session variables into person table
 dm.addSessionVariables( "person" );

 // Create and add query to buffer
 dm.commit( "person" );

See Also

  • addSessionVariablesOnCommit() [SqlDataManager] - Adds the session variables to a database table when a row of the table is committed
  • clearSessionVariables() [SqlDataManager] - Clears the session variables corresponding to columns in a database table
  • clearSessionVariablesOnCommit() [SqlDataManager] - Clears the session variables corresponding to columns in a database table when a row of data is committed to the table

addSessionVariablesOnCommit

void sqlDataManager.addSessionVariablesOnCommit ( boolean automate ) (professional and enterprise editions only)

Description

Add corresponding session variables to the tables automatically when it is committed.

Parameters

  • automate If true then session variables whose names match field names (case insensitive) will be automatically added to queries when the fields are committed.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Automate Session Variables

 // Import classes
 import com.screenscraper.datamanager.*;
 import com.screenscraper.datamanager.sql.*;
 import org.apache.commons.dbcp.BasicDataSource;

 // Set Variables
 host = "127.0.0.1:3306";
 database = "mydb";
 username = "user";
 password = "pwrd";
 parameters = "autoReconnect=true&useCompression=true";

 // Build the BasicDataSource for the database connection
 BasicDataSource ds = new BasicDataSource();
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( username );
 ds.setPassword( password );
 ds.setUrl( "jdbc:mysql://" + host + "/" + database + "?" + parameters );

 // Get MySQL datamanager
 dm = new SqlDataManager( ds, session );

 // Build Schemas For all Tables
 dm.buildSchemas();

 // Write Information to Database
 // automatically using session variables
 dm.addSessionVariablesOnCommit( true );

See Also

  • addSessionVariables() [SqlDataManager] - Adds the session variables to a database table for columns that match the session variable names
  • clearSessionVariables() [SqlDataManager] - Clears the session variables corresponding to columns in a database table
  • clearSessionVariablesOnCommit() [SqlDataManager] - Clears the session variables corresponding to columns in a database table when a row of data is committed to the table

buildSchemas

void sqlDataManager.buildSchemas ( ) (professional and enterprise editions only)
void sqlDataManager.buildSchemas ( List tables ) (professional and enterprise editions only)

Description

Collect the database schema information, including foreign key relations between tables.

Schemas must be built for any tables that will be used by this DataManager before data can be added.

Parameters

  • tables (option) A list of table names, as strings, for which to build schemas.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Build Database Schema using a BasicDataSource

 // Import classes
 import com.screenscraper.datamanager.*;
 import com.screenscraper.datamanager.sql.*;
 import org.apache.commons.dbcp.BasicDataSource;

 // Set Variables
 host = "127.0.0.1:3306";
 database = "mydb";
 username = "user";
 password = "pwrd";
 parameters = "autoReconnect=true&useCompression=true";

 // Build the BasicDataSource for the database connection
 BasicDataSource ds = new BasicDataSource();
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( username );
 ds.setPassword( password );
 ds.setUrl( "jdbc:mysql://" + host + "/" + database + "?" + parameters );

 // Get MySQL datamanager
 dm = new SqlDataManager( ds, session );

 // Build Schemas For all Tables
 dm.buildSchemas();

Build Database Schema using an SshDataSource

 import com.screenscraper.datamanager.sql.*;

 // SshDataSource
 ds = new SshDataSource( "[email protected]", "ssPass" );
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( "user" );
 ds.setPassword( "psswrd" );

 // Accepted values for the first parameter of setUrl are:
 //   SshDataSource.MYSQL
 //   SshDataSource.MSSQL
 //   SshDataSource.ORACLE
 //   SshDataSource.POSTGRESQL
 ds.setUrl( SshDataSource.MYSQL, 3306, "database" );

 // Create Data Manager
 dm = new SqlDataManager( ds, session );

 // Build Schemas For all Tables
 dm.buildSchemas();

clearAllData

void sqlDataManager.clearAllData ( ) (professional and enterprise editions only)

Description

Clear all data from the data manager without writing it to the database. This includes all data previously committed but not yet written.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Write to Database

 // Get data manager
 dm = session.getv( "DATAMANAGER" );

 // Clear information from the datamanager
 dm.clearAllData();

clearSessionVariables

void sqlDataManager.clearSessionVariables ( String table ) (professional and enterprise editions only)

Description

Clear session variables corresponding to the fields of a specific table (case insensitive).

Parameters

  • table Name of the table whose field names will be used to clear session variables.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Clear Session Variables

 // Get data manager
 dm = session.getv( "DATAMANAGER" );

 // Clear session variables for people table
 dm.clearSessionVariables( "people" );

See Also

  • addSessionVariables() [SqlDataManager] - Adds the session variables to a database table for columns that match the session variable names
  • addSessionVariablesOnCommit() [SqlDataManager] - Adds the session variables to a database table when a row of the table is committed
  • clearSessionVariablesOnCommit() [SqlDataManager] - Clears the session variables corresponding to columns in a database table when a row of data is committed to the table

clearSessionVariablesOnCommit

void sqlDataManager.clearSessionVariablesOnCommit ( boolean clearVars ) (professional and enterprise editions only)

Description

Clear session variables corresponding to a committed table automatically.

Parameters

  • clearVars If true then session variables whose names match field names (case insensitive) will be automatically cleared when the table is committed.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Automate Session Variables

 // Import classes
 import com.screenscraper.datamanager.*;
 import com.screenscraper.datamanager.sql.*;
 import org.apache.commons.dbcp.BasicDataSource;

 // Set Variables
 host = "127.0.0.1:3306";
 database = "mydb";
 username = "user";
 password = "pwrd";
 parameters = "autoReconnect=true&useCompression=true";

 // Build the BasicDataSource for the database connection
 BasicDataSource ds = new BasicDataSource();
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( username );
 ds.setPassword( password );
 ds.setUrl( "jdbc:mysql://" + host + "/" + database + "?" + parameters );

 // Get MySQL datamanager
 dm = new SqlDataManager( ds, session );

 // Build Schemas For all Tables
 dm.buildSchemas();

 // Write Information to Database
 // automatically using session variables
 dm.addSessionVariablesOnCommit( true );

 // Clear session variables on commit
 // to avoid carry over
 dm.clearSessionVariablesOnCommit( true );

See Also

  • addSessionVariables() [SqlDataManager] - Adds the session variables to a database table for columns that match the session variable names
  • addSessionVariablesOnCommit() [SqlDataManager] - Adds the session variables to a database table when a row of the table is committed
  • clearSessionVariables() [SqlDataManager] - Clears the session variables corresponding to columns in a database table

close

void sqlDataManager.close ( ) (professional and enterprise editions only)

Description

Close data manager's connections.

If there is data that has not yet been written to the database when this method is called it will not be written.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Close Data Manager

 // Get Data Manager
 dm = session.getv( "DATAMANAGER" );

 // Close Data Manager
 dm.close();

commit

void sqlDataManager.commit ( String table ) (professional and enterprise editions only)

Description

Commit a prepared row of data into queue. Once called the data can no longer be edited. When working with multiple tables that relate by a foreign key, it is important to commit rows in the correct order. The rows in each of the child tables should be committed before the parent, or they will not be correctly linked when written to the database.

This does not write the row of data to the database, but rather puts it in queue to be written at a later time.

Parameters

  • table Name of the database table that the data corresponds to, as a string.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Commit Database Row

 // Get datamanager
 dm = session.getv( "DATAMANAGER" );

 // Add session variables into person table
 dm.addSessionVariables( "person" );

 // Create and add query to buffer
 dm.commit( "person" );

See Also

  • flush() [SqlDataManager] - Writes committed rows to the database
  • commitAll() [SqlDataManager] - Commits all prepared rows of data to the queue to be updated in the database

commitAll

void sqlDataManager.commitAll ( ) (professional and enterprise editions only)

Description

Commit prepared rows of data for all tables into queue. Once called the data can no longer be edited.

Parameters

This method does not receive any parameters.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Commit Database Row

 // Get datamanager
 dm = session.getv( "DATAMANAGER" );

 // Add session variables into tables
 dm.addSessionVariables( "person" );
 dm.addSessionVariables( "address" );
 dm.addSessionVariables( "jobs" );

 // Create and add queries to buffer
 dm.commitAll();

See Also

  • flush() [SqlDataManager] - Writes committed rows to the database
  • commit() [SqlDataManager] - Commits a prepared row of data to the queue to be updated in the database

flush

boolean sqlDataManager.flush ( ) (professional and enterprise editions only)

Description

Write committed data to the database. Any data that has not been committed using either the commit or commitAll method will be lost and not written to the database.

Parameters

This method does not receive any parameters.

Return Values

Returns true data was successfully written to the database; otherwise, it returns false.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Write to Database

 // Get data manager
 dm = session.getv( "DATAMANAGER" );

 // Write Information to Database
 dm.flush();

See Also

  • commit() [SqlDataManager] - Commits a prepared row of data to the queue to be updated in the database
  • commitAll() [SqlDataManager] - Commits all prepared rows of data to the queue to be updated in the database

getConnection

Connection sqlDataManager.getConnection ( ) (professional and enterprise editions only)

Description

Retrieve the connection object of the data manager. This can be helpful if you want to do something that the data manager cannot do easily, such as query the database.

Be sure to close the connection once it is no longer needed. Failure to do so could exhaust the connection pool used by the datamanger, which will cause the scraping session to hang.

Parameters

This method does not receive and parameters.

Return Values

Returns a connection object matching the one used in the data manager.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Retrieve Database Connection

 // Import SQL object
 import java.sql.*;

 // Get datamanager
 dm = session.getv( "DATAMANAGER" );

 // Retrieve connection
 connection = dm.getConnection();

 try {
     PreparedStatement ps = connection.prepareStatement( "UPDATE table SET status=?" );
     ps.setString( 1, session.getv("STATUS") );
     ps.executeUpdate();
 } finally {
      connection.close();
 }

getLastAutoIncrementKey

DataObject sqlDataManager.getLastAutoIncrementKey (String table) (professional and enterprise editions only)

Description

Retrieve the last autogenerated primary key, if any, for the given table

Parameters

case insensitve table name

Return Values

Returns a com.screenscraper.datamanager.DataObject containing the primary key.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Retrieve AutoIncrement Key

 //Save some data
 dm.addData("table", "column", "important data");
 dm.commit("table");
 dm.flush("table");

 //Retrieve the key associated with the data we just saved as an Integer
 key = dm.getLastAutoIncrementKey("table").getInt();

setAutoManyToMany

void sqlDataManager.setAutoManyToMany ( boolean enable ) (professional and enterprise editions only)

Description

Sets whether or not the data manager should automatically take care of many-to-many relationships.

Parameters

  • enable Whether the data manager should automatically run a commit for many-to-many tables when the connected tables are committed, as a boolean.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

If the many-to-many table has more information than just the keys then you will want to leave this feature turned off so that you can add more data than just the keys before committing.

Examples

Set Automatic Commits for Many-to-many Tables

 // Import classes
 import com.screenscraper.datamanager.*;
 import com.screenscraper.datamanager.sql.*;
 import org.apache.commons.dbcp.BasicDataSource;

 // Set Variables
 host = "127.0.0.1:3306";
 database = "mydb";
 username = "user";
 password = "pwrd";
 parameters = "autoReconnect=true&useCompression=true";

 // Build the BasicDataSource for the database connection
 BasicDataSource ds = new BasicDataSource();
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( username );
 ds.setPassword( password );
 ds.setUrl( "jdbc:mysql://" + host + "/" + database + "?" + parameters );

 // Get MySQL datamanager
 dm = new SqlDataManager( ds, session );

 // Set Automatic Commit on Many-to-many tables
 dm.setAutoManyToMany( true );

setGlobalMergeEnabled

void sqlDataManager.setGlobalMergeEnabled ( boolean merge )

This feature is only available for Professional and Enterprise editions of screen-scraper.

Description

Set global merge status. When conflicts exist in data, a merge of true will take the newer values and save them over previous null values.

When merging or updating values in a table, that table must have a Primary Key. When the Primary Key is set to autoincrement, if the value of that key was not set with the addData method the DataManager will create a new row rather than update or merge with an existing row. One solution is to use an SqlDuplicateFilter to set fields that would identify an entry as a duplicate and automatically insert the value of the autoincrement key when data is committed.

By default if the data that you are inserting has the same primary key as data already in the database it will ignore the insert. This behavior can be modified by the dm.setGlobalUpdateEnabled and dm.setGlobalMergeEnabled methods of the DataManager. This allows for four different settings to insert data:

Update Merge Resulting Action
false false Ignore row on duplicate
true false Update only values whose corresponding columns are currently NOT NULL in the database
false true Update only values whose corresponding columns are currently NULL in the database
true true Update all values to new data

Parameters

  • merge Whether to turn on global merge or not, as a boolean.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Set Global Database Merge

 // Import classes
 import com.screenscraper.datamanager.*;
 import com.screenscraper.datamanager.sql.*;
 import org.apache.commons.dbcp.BasicDataSource;

 // Set Variables
 host = "127.0.0.1:3306";
 database = "mydb";
 username = "user";
 password = "pwrd";
 parameters = "autoReconnect=true&useCompression=true";

 // Build the BasicDataSource for the database connection
 BasicDataSource ds = new BasicDataSource();
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( username );
 ds.setPassword( password );
 ds.setUrl( "jdbc:mysql://" + host + "/" + database + "?" + parameters );

 // Get MySQL datamanager
 dm = new SqlDataManager( ds, session );

 // Build Schemas For all Tables
 dm.buildSchemas();

 // Set Global Update
 dm.setGlobalUpdateEnabled( true );

 // Set Global Merge
 dm.setGlobalMergeEnabled( true );

See Also

  • setGlobalUpdateEnabled() [SqlDataManager] - Sets whether values in the database should be overwritten by committed values, even when the committed value is null
  • setMergeEnabled() [SqlDataManager] - Sets whether null values in a specific table of the database should be overwritten by committed values
  • setUpdateEnabled() [SqlDataManager] - Sets whether values in a specific table of the database should be overwritten by committed values, even when the committed value is null

setGlobalUpdateEnabled

void sqlDataManager.setGlobalUpdateEnabled ( boolean update )

This feature is only available for Professional and Enterprise editions of screen-scraper.

Description

Set update status globally. When conflicts exist in data, an update of true will take the newer values and save them over previous non-null values.

When merging or updating values in a table, that table must have a Primary Key. When the Primary Key is set to autoincrement, if the value of that key was not set with the addData method the DataManager will create a new row rather than update or merge with an existing row. One solution is to use an SqlDuplicateFilter to set fields that would identify an entry as a duplicate and automatically insert the value of the autoincrement key when data is committed.

By default if the data that you are inserting has the same primary key as data already in the database it will ignore the insert. This behavior can be modified by the dm.setGlobalUpdateEnabled and dm.setGlobalMergeEnabled methods of the DataManager. This allows for four different settings to insert data:

Update Merge Resulting Action
false false Ignore row on duplicate
true false Update only values whose corresponding columns are currently NOT NULL in the database
false true Update only values whose corresponding columns are currently NULL in the database
true true Update all values to new data

Parameters

  • update Whether to turn on global update or not, as a boolean.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Set Global Database Update

 // Import classes
 import com.screenscraper.datamanager.*;
 import com.screenscraper.datamanager.sql.*;
 import org.apache.commons.dbcp.BasicDataSource;

 // Set Variables
 host = "127.0.0.1:3306";
 database = "mydb";
 username = "user";
 password = "pwrd";
 parameters = "autoReconnect=true&useCompression=true";

 // Build the BasicDataSource for the database connection
 BasicDataSource ds = new BasicDataSource();
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( username );
 ds.setPassword( password );
 ds.setUrl( "jdbc:mysql://" + host + "/" + database + "?" + parameters );

 // Get MySQL datamanager
 dm = new SqlDataManager( ds, session );

 // Build Schemas For all Tables
 dm.buildSchemas();

 // Set Global Update
 dm.setGlobalUpdateEnabled( true );

See Also

  • setGlobalMergeEnabled() [SqlDataManager] - Sets whether null values in the database should be overwritten by committed values
  • setMergeEnabled() [SqlDataManager] - Sets whether null values in a specific table of the database should be overwritten by committed values
  • setUpdateEnabled() [SqlDataManager] - Sets whether values in a specific table of the database should be overwritten by committed values, even when the committed value is null

setLoggingLevel

void sqlDataManager.setLoggingLevel ( Level level ) (professional and enterprise editions only)

Description

Set the error logging level. Currently only DEBUG and ERROR levels are supported. At the DEBUG level, all queries and results will be output to the log.

Parameters

  • level log4j logging level object.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Set Logging Level

 // Get MySQL datamanager
 dm = session.getVariable( "DATAMANAGER" );

 // Set Logging Level
 dm.setLoggingLevel( org.apache.log4j.Level.ERROR );

 // Build Schemas
 dm.buildSchemas();

setMergeEnabled

void sqlDataManager.setMergeEnabled ( boolean merge )

This feature is only available for Professional and Enterprise editions of screen-scraper.

Description

Set merge status for a table. When conflicts exists in data, a merge of true will take the newer values and save them over previous null values.

When merging or updating values in a table, that table must have a Primary Key. When the Primary Key is set to autoincrement, if the value of that key was not set with the addData method the DataManager will create a new row rather than update or merge with an existing row. One solution is to use an SqlDuplicateFilter to set fields that would identify an entry as a duplicate and automatically insert the value of the autoincrement key when data is committed.

By default if the data that you are inserting has the same primary key as data already in the database it will ignore the insert. This behavior can be modified for a specific table by the dm.setUpdateEnabled and dm.setMergeEnabled methods of the DataManager. This allows for four different settings to insert data:

Update Merge Resulting Action
false false Ignore row on duplicate
true false Update only values whose corresponding columns are currently NOT NULL in the database
false true Update only values whose corresponding columns are currently NULL in the database
true true Update all values to new data

Parameters

  • table Name of the database table, as a string.
  • merge Whether to turn on global merge or not, as a boolean.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Set Database Table Merge

 // Get datamanager
 dm = session.getv( "DATAMANAGER" );

 // Set Merge
 dm.setMergeEnabled( "person", true );

See Also

  • setGlobalMergeEnabled() [SqlDataManager] - Sets whether null values in the database should be overwritten by committed values
  • setGlobalUpdateEnabled() [SqlDataManager] - Sets whether values in the database should be overwritten by committed values, even when the committed value is null
  • setUpdateEnabled() [SqlDataManager] - Sets whether values in a specific table of the database should be overwritten by committed values, even when the committed value is null

setMultiThreadWrite

void sqlDataManager.setMultiThreadWrite ( int numThreads ) (professional and enterprise editions only)

Description

Set number of threads that the data manager can have open at once. When set higher than one, the scraping session can continue to run and download pages while the database is being written. This can decrease the time required to run a scrape, but also makes debugging harder as there is no guarantee about the order in which data will be written. It is recommended to leave this setting alone while developing a scrape. Also, the flush method will always return true if more than one thread is being used to write to the database, even if the write failed.

Parameters

  • numThreads The number of threads that the data manager can start and use to write data, as an integer.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Set Thread Count

 // Import classes
 import com.screenscraper.datamanager.*;
 import com.screenscraper.datamanager.sql.*;
 import org.apache.commons.dbcp.BasicDataSource;

 // Set Variables
 host = "127.0.0.1:3306";
 database = "mydb";
 username = "user";
 password = "pwrd";
 parameters = "autoReconnect=true&useCompression=true";

 // Build the BasicDataSource for the database connection
 BasicDataSource ds = new BasicDataSource();
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( username );
 ds.setPassword( password );
 ds.setUrl( "jdbc:mysql://" + host + "/" + database + "?" + parameters );
 ds.setMaxActive( 100 );

 // Get MySQL datamanager
 dm = new SqlDataManager( ds, session );

 // Set number of threads that can be opened
 // when interacting with the database
 dm.setMultiThreadWrite(10);

 // Build Schemas For all Tables
 dm.buildSchemas();

setUpdateEnabled

void sqlDataManager.setUpdateEnabled ( boolean update )

This feature is only available for Professional and Enterprise editions of screen-scraper.

Description

Set update status for a given table. When conflicts exists in data, an update of true will take the newer values and save them over previous non-null values.

When merging or updating values in a table, that table must have a Primary Key. When the Primary Key is set to autoincrement, if the value of that key was not set with the addData method the DataManager will create a new row rather than update or merge with an existing row. One solution is to use an SqlDuplicateFilter to set fields that would identify an entry as a duplicate and automatically insert the value of the autoincrement key when data is committed.

By default if the data that you are inserting has the same primary key as data already in the database it will ignore the insert. This behavior can be modified for a specific table by the dm.setUpdateEnabled and dm.setMergeEnabled methods of the DataManager. This allows for four different settings to insert data:

Update Merge Resulting Action
false false Ignore row on duplicate
true false Update only values whose corresponding columns are currently NOT NULL in the database
false true Update only values whose corresponding columns are currently NULL in the database
true true Update all values to new data

Parameters

  • table The name of the database table, as a string.
  • update Whether to turn on global update or not, as a boolean.

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Set Database Table Update

 // Get datamanager
 dm = session.getv( "DATAMANAGER" );

 // Set Update on person table
 dm.setUpdateEnabled( "person", true );

See Also

  • setGlobalMergeEnabled() [SqlDataManager] - Sets whether null values in the database should be overwritten by committed values
  • setGlobalUpdateEnabled() [SqlDataManager] - Sets whether values in the database should be overwritten by committed values, even when the committed value is null
  • setMergeEnabled() [SqlDataManager] - Sets whether null values in a specific table of the database should be overwritten by committed values

SqlDataManager

SqlDataManager SqlDataManager ( BasicDataSource dataSource, ScrapingSession session ) (professional and enterprise editions only)

Description

Initiate a SqlDataManager object.

Before adding data to the SqlDataManager, you must build the schema of any tables you will use, as well as add foreign keys if you are not using a database engine that natively supports them (such as InnoDB for MySQL).

Parameters

  • dataSource A BasicDataSource object.
  • session The scraping session to which the data manager should be associated.

Return Values

Returns a SqlDataManager. If an error is experienced it will be thrown.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Class Location

com.screenscraper.datamanager.sql.SqlDataManager

Examples

Create a SQL Data Manager

 import com.screenscraper.datamanager.sql.*;
 import org.apache.commons.dbcp.BasicDataSource;

 // BasicDataSource
 BasicDataSource ds = new BasicDataSource();
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( "user" );
 ds.setPassword( "psswrd" );
 ds.setUrl( "jdbc:mysql://127.0.0.1:3306/mydb?UTF8ENCODING" );
 ds.setMaxActive( 100 );

 // Create Data Manager
 dm = new SqlDataManager( ds, session );

Create a SQL Data Manager Over SSH Tunnel

 import com.screenscraper.datamanager.sql.*;

 // SshDataSource
 ds = new SshDataSource( "[email protected]", "ssPass" );
 ds.setDriverClassName( "com.mysql.jdbc.Driver" );
 ds.setUsername( "user" );
 ds.setPassword( "psswrd" );

 // Accepted values for the first parameter of setUrl are:
 //   SshDataSource.MYSQL
 //   SshDataSource.MSSQL
 //   SshDataSource.ORACLE
 //   SshDataSource.POSTGRESQL
 ds.setUrl( SshDataSource.MYSQL, 3306, "database" );

 // Create Data Manager
 dm = new SqlDataManager( ds, session );

SqlDuplicateFilter

Overview

SqlDuplicateFilters are designed to filter duplicates when more information than just a primary key might define a duplicate entry. For example, you might define a unique person by their SSN, driver's license number, or by their first name, last name, and phone number. It is also possible that a single person may have multiple phone numbers, and if any of them match then the duplicate constraint should be met. Using an SqlDuplicateFilter can check for conditions such as this and correctly recognize duplicate entries.

This feature is only available for Professional and Enterprise editions of screen-scraper.

Examples

Register a new duplicate filter

 // Import classes
 import com.screenscraper.datamanager.sql.*;

 //Get the data manager
 SqlDataManager dm = session.getVariable( "_DATAMANAGER" );

 // Register a new duplicate filter
 // Check for duplicate people, so register it for the people table
 SqlDuplicateFilter nameFilter = SqlDuplicateFilter.register("people", dm);

 //Add constraints to match when a first name, middle initial, and last name match a different row in the database
 nameFilter.addConstraint( "people", "first_name" );
 nameFilter.addConstraint( "people", "middle_initial" );
 nameFilter.addConstraint( "people", "last_name" );

Match Duplicates across tables

Sometimes the data will need to be filtered across multiple tables, or possibly different constaints might indicate a duplicate. An example of this is a person might be a duplicate if their SSN matches OR if their driver's license number matches. Alternatively, they may be a duplicate when they have the same first name, last name, and phone number.

import com.screenscraper.datamanager.sql.SqlDuplicateFilter;

/*
Perform the setup of the SqlDataManager, as shown previously, and name the variable dm.
*/


//register an SqlDuplicateFilter with the DataManager for the social security number
SqlDuplicateFilter ssnDuplicate = SqlDuplicateFilter.register( "person", dm );
ssnDuplicate.addConstraint( "person", "ssn" );

//register an SqlDuplicateFilter with the DataManager for the drivers license
SqlDuplicateFilter licenseDuplicate = SqlDuplicateFilter.register( "person", dm );
licenseDuplicate.addConstraint( "person", "drivers_license" );

//register an SqlDuplicateFilter with the DataManager for the name/phone number
//where the person table has a child table named phone.
SqlDuplicateFilter namePhoneDuplicate = SqlDuplicateFilter.register( "person", dm );
namePhoneDuplicate.addConstraint( "person", "first_name" );
namePhoneDuplicate.addConstraint( "person", "last_name" );
namePhoneDuplicate.addConstraint( "phone", "phone_number" );

Duplicate filters are checked in the order they are added, so consider perfomance when creating duplicate filters. If, for instance, most duplicates will match on the social security number, create that filter before the others. Also make sure to add indexes into your database on those columns that you are selecting by or else performance will rapidly degrade as your database gets large.

Duplicates will be filtered by any one of the filters created. If multiple fields must all match for an entry to be a duplicate, create a single filter and add each of those fields as constraints, as shown in the third filter created above. In other words, constraints added to a single filter will be ANDed together, while seperate filters will be ORed.

addConstraint

void sqlDuplicateFilter.addConstraint ( String table, String column ) (professional and enterprise editions only)

Description

Add a constraint that checks the value of new entries against the value of entries already in the database for a given column and table.

Parameters

  • table Name of the database table, either the same table the filter is registered to or one of it's children
  • column The column that will be checked in the table for a duplicate with new values

Return Values

Returns void.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Register a new duplicate filter

 // Import classes
 import com.screenscraper.datamanager.sql.*;

 //Get the data manager
 SqlDataManager dm = session.getVariable( "_DATAMANAGER" );

 // Register a new duplicate filter
 // Check for duplicate people, so register it for the people table
 SqlDuplicateFilter nameFilter = SqlDuplicateFilter.register("people", dm);

 //Add constraints to match when a first name, middle initial, and last name match a different row in the database
 nameFilter.addConstraint( "people", "first_name" );
 nameFilter.addConstraint( "people", "middle_initial" );
 nameFilter.addConstraint( "people", "last_name" );

Sometimes the data will need to be filtered across multiple tables, or possibly different constaints might indicate a duplicate. An example of this is a person might be a duplicate if their SSN matches OR if their driver's license number matches. Alternatively, they may be a duplicate when they have the same first name, last name, and phone number.

import com.screenscraper.datamanager.sql.SqlDuplicateFilter;

/*
Perform the setup of the SqlDataManager, as shown previously, and name the variable dm.
*/


//register an SqlDuplicateFilter with the DataManager for the social security number
SqlDuplicateFilter ssnDuplicate = SqlDuplicateFilter.register( "person", dm );
ssnDuplicate.addConstraint( "person", "ssn" );

//register an SqlDuplicateFilter with the DataManager for the social security number
SqlDuplicateFilter licenseDuplicate = SqlDuplicateFilter.register( "person", dm );
licenseDuplicate.addConstraint( "person", "drivers_license" );

//register an SqlDuplicateFilter with the DataManager for the name/phone number
//where the person table has a child table named phone.
SqlDuplicateFilter namePhoneDuplicate = SqlDuplicateFilter.register( "person", dm );
namePhoneDuplicate.addConstraint( "person", "first_name" );
namePhoneDuplicate.addConstraint( "person", "last_name" );
namePhoneDuplicate.addConstraint( "phone", "phone_number" );

Duplicate filters are checked in the order they are added, so consider perfomance when creating duplicate filters. If, for instance, most duplicates will match on the social security number, create that filter before the others. Also make sure to add indexes into your database on those columns that you are selecting by or else performance will rapidly degrade as your database gets large.

Duplicates will be filtered by any one of the filters created. If multiple fields must all match for an entry to be a duplicate, create a single filter and add each of those fields as constraints, as shown in the third filter created above. In other words, constraints added to a single filter will be ANDed together, while seperate filters will be ORed.

register

SqlDuplicateFilter SqlDuplicateFilter.register ( String table, SqlDataManager dataManager ) (professional and enterprise editions only)

Description

Create an SqlDuplicateFilter for a specific table and register it with the data manager.

Parameters

  • table Name of the database table with the primary key, as a string.
  • dataManager The data manager that will use this filter when adding entries to the database.

Return Values

Returns an SqlDuplicateFilter that can then be configured for duplicate entries.

Change Log

Version Description
5.0 Available for professional and enterprise editions.

Examples

Register a new duplicate filter

 // Import classes
 import com.screenscraper.datamanager.sql.*;

 //Get the data manager
 SqlDataManager dm = session.getVariable( "_DATAMANAGER" );

 // Register a new duplicate filter
 // Check for duplicate people, so register it for the people table
 SqlDuplicateFilter nameFilter = SqlDuplicateFilter.register("people", dm);

 //Add constraints to match when a first name, middle initial, and last name match a different row in the database
 nameFilter.addConstraint( "people", "first_name" );
 nameFilter.addConstraint( "people", "middle_initial" );
 nameFilter.addConstraint( "people", "last_name" );

Match Duplicates across tables

 // Import classes
 import com.screenscraper.datamanager.sql.*;

 //Get the data manager
 SqlDataManager dm = session.getVariable( "_DATAMANAGER" );

 // Register a new duplicate filter
 // Check for duplicate people, so register it for the people table
 SqlDuplicateFilter personFilter = SqlDuplicateFilter.register("people", dm);

 // Catch duplicates when a new entry has the same first name, last name, and phone number as another entry
 // Note that phone is a child table of people
 personFilter.addConstraint( "people", "first_name" );
 personFilter.addConstraint( "people", "last_name" );
 personFilter.addConstraint( "phone", "phone_number" );