Most efficient way for scripts to get dynamic configuration

We've got a TON of scrapes, and we have sample test cases for them all. The information about what goes into a test case is stored in another system.

I can think of a lot of different ways to get this info when a script runs (read a flat file from the filesystem, read XML from the filesystem, talk to a database, hit a web service, etc.).

But what is the BEST and MOST EFFICIENT method for doing something like this? I'm not familiar enough with Java to know whether, say, the filesystem is more expensive to hit than a database.

Anyone else doing something like this?

Hi,

If speed is your main goal, I don't know of any method faster than simply hitting the file system (especially on a Linux box), so I suppose it would be option 4 (the XML file on the filesystem). A database may be quite a bit more convenient, though, even if it's a bit slower, so that's worth considering as well. Bear in mind, too, that you're simply running Java, so a bit of research on this topic as it pertains to Java in general would probably turn up more helpful detail.

Kind regards,

Todd

I didn't phrase that right. This is more about what happens inside screen-scraper, not about invoking screen-scraper externally.

I've got screen-scraper running some default scripts at the beginning and end of each scrape. Rather than hard-coding "x should default to y" in a script, I'd like to have a datasource that can be modified externally.

I'd like screen-scraper, upon starting a scraping session, to hit a database (or an XML file, or a web service, or ...?) to find out which parameters it should default to which values.

Which method would (in theory) be most efficient, assuming that whichever technology you suggest is implemented in a reasonably proper manner (i.e., best practices, not just hacking it)?

So the question might be this:

Which of the following is the quickest and least resource-intensive method for screen-scraper scripts (via your interpreted Java scripting) to communicate with a datasource prior to or during the scraping session?

1. Web service to external box
2. Database access to external box
3. Database access to database on same box as server
4. XML file on the filesystem
5. Other: _______________
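
For option 4, a minimal sketch of what the lookup could look like in plain Java is below. It assumes the defaults live in a file in the standard java.util.Properties XML format; the class name ScrapeDefaults and any keys or paths used with it are made up for illustration. The parsed file is cached and only re-read when its path or last-modified time changes, so repeated scraping sessions would not re-parse it every time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Properties;

/** Hypothetical helper: loads scrape defaults from an XML properties file and caches them. */
final class ScrapeDefaults {
    // Cache for the single most recently loaded file.
    private static Path cachedPath;
    private static FileTime cachedStamp;
    private static Properties cached;

    private ScrapeDefaults() {}

    /** Returns the defaults in the file, re-parsing only when the path or modification time differs. */
    static synchronized Properties load(Path file) throws IOException {
        FileTime stamp = Files.getLastModifiedTime(file);
        if (cached == null || !file.equals(cachedPath) || !stamp.equals(cachedStamp)) {
            Properties fresh = new Properties();
            try (InputStream in = Files.newInputStream(file)) {
                // Expects the java.util.Properties XML format (<properties><entry key="...">...</entry></properties>).
                fresh.loadFromXML(in);
            }
            cached = fresh;
            cachedPath = file;
            cachedStamp = stamp;
        }
        return cached;
    }

    /** Looks up one default, returning the fallback when the key is absent from the file. */
    static String get(Path file, String key, String fallback) throws IOException {
        return load(file).getProperty(key, fallback);
    }
}
```

A script run at the start of a scraping session could then call something like ScrapeDefaults.get(Paths.get("/path/to/defaults.xml"), "maxPages", "10"), where both the path and the key are placeholders. Whether this actually beats a database on the same box would need measuring on your own setup.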

Hi,

We've typically handled this kind of thing by storing the parameters in a database. We retrieve them from the database in a Java application (though it could just as well be .NET, PHP, or whatever), then invoke screen-scraper remotely as needed.
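
The database half of that approach might look roughly like the sketch below, using plain JDBC. The table scrape_defaults and its columns scrape_name, param_name, and param_value are assumptions made up for this example, as is the class name. The step of handing the values to screen-scraper remotely is left out, since it depends on your setup.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reads the default parameters for one scrape from a database. */
final class DefaultsFromDatabase {
    // Assumed schema: scrape_defaults(scrape_name, param_name, param_value).
    static final String QUERY =
        "SELECT param_name, param_value FROM scrape_defaults WHERE scrape_name = ?";

    private DefaultsFromDatabase() {}

    /** Returns parameter names mapped to values for the given scrape, in query order. */
    static Map<String, String> load(Connection conn, String scrapeName) throws SQLException {
        Map<String, String> defaults = new LinkedHashMap<>();
        // A prepared statement keeps the scrape name out of the SQL text itself.
        try (PreparedStatement stmt = conn.prepareStatement(QUERY)) {
            stmt.setString(1, scrapeName);
            try (ResultSet rows = stmt.executeQuery()) {
                while (rows.next()) {
                    defaults.put(rows.getString(1), rows.getString(2));
                }
            }
        }
        return defaults;
    }
}
```

The caller supplies the Connection (for example from DriverManager.getConnection with whatever JDBC URL and driver your database uses), and the returned map can then be passed along when the scraping session is invoked.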

Hope this helps.

Kind regards,

Todd Wilson