Parse out Duplicates?

I have a piece of HTML that I need to parse.

onmouseover="ddrivetip('Herold Herold Assistant*
Jun 2009
7.17','#F5E7AF')" onmouseout="hideddrivetip()" />

I need to retrieve the data into two variables:

Company: Herold
Product: Herold Assistant*

Since "Herold" is a duplicate word, how do I go about extracting these two variables?

Even a human would have a hard time telling the company from the product there without already knowing the names. Are there any rules you can rely on, for example that companies are always a single word? Or do you have a static list of possible companies?

Companies are not always one word. However, I do have a list of the companies that will be in the dataset.

Okay, that is possible. First you will need a way to compare your list to what is scraped. If the list is small enough, I would just save it in an array, using a script that runs at the beginning of the scrape.

String[] companies = {"ABC Corp", "XYZ Ltd", "ACME"};
session.setVariable("COMPANIES", companies);

Then you would extract the company block when you find it.

onmouseover="ddrivetip('~@COMPANY_AND_PRODUCT@~
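If you want to check outside the scraper what that pattern should capture, here is a rough plain-Java sketch. It assumes the company and product always sit on the first line of the tooltip text, as in your sample. The class name TooltipExtractor and the method firstLine are made up for illustration and are not part of any scraping tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TooltipExtractor {
    // Captures the text after ddrivetip(' up to the first line break or quote.
    // Assumption: company and product are on the first line of the tooltip.
    private static final Pattern FIRST_LINE =
            Pattern.compile("ddrivetip\\('([^'\\r\\n]*)");

    /** Returns the first tooltip line, or null if no ddrivetip call is found. */
    public static String firstLine(String html) {
        Matcher m = FIRST_LINE.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String html = "onmouseover=\"ddrivetip('Herold Herold Assistant*\n"
                + "Jun 2009\n"
                + "7.17','#F5E7AF')\" onmouseout=\"hideddrivetip()\" />";
        System.out.println(firstLine(html));
    }
}
```

The date and version on the following lines are deliberately left out of the capture, so the later company check only has to deal with the "company product" text.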

Finally, you need a script that will check for the company in the extracted data, and this will use a lot of String manipulation.

// Local references to the variables
companies = session.getVariable("COMPANIES");
text = dataRecord.get("COMPANY_AND_PRODUCT");

// Iterate over the array of possible companies
for (i = 0; i < companies.length; i++)
{
    company = companies[i];
    if (text.startsWith(company))
    {
        // The scraped string begins with this company; the rest is the product
        foundCompany = company;
        product = text.substring(company.length()).trim();
        break;
    }
}

Of course, that's off the top of my head, and will need some refinement, but should get you on the right path.
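One refinement worth considering: if one company name in your list is a prefix of another (say "ABC" and "ABC Corp"), a plain startsWith check can pick the wrong one. Here is a rough, self-contained Java sketch of the same idea that prefers the longest matching company and only matches on a word boundary. The class CompanyProductSplitter and its split method are names I made up for illustration, and the company list is sample data.

```java
import java.util.Arrays;
import java.util.List;

public class CompanyProductSplitter {
    /**
     * Splits "Company Product" text using a list of known companies.
     * Returns {company, product}, or null if no known company starts the text.
     */
    public static String[] split(String text, List<String> companies) {
        String trimmed = text.trim();
        String best = null;
        for (String company : companies) {
            // The company must be followed by whitespace or the end of the text
            boolean wholeWord = trimmed.length() == company.length()
                    || (trimmed.length() > company.length()
                        && Character.isWhitespace(trimmed.charAt(company.length())));
            // Keep the longest company name that matches
            if (wholeWord && trimmed.startsWith(company)
                    && (best == null || company.length() > best.length())) {
                best = company;
            }
        }
        if (best == null) {
            return null;
        }
        // Everything after the company name is treated as the product
        String product = trimmed.substring(best.length()).trim();
        return new String[] { best, product };
    }

    public static void main(String[] args) {
        List<String> companies = Arrays.asList("ABC", "ABC Corp", "Herold");
        String[] result = split("Herold Herold Assistant*", companies);
        System.out.println("Company: " + result[0]);
        System.out.println("Product: " + result[1]);
    }
}
```

Returning null for an unknown company lets the calling script decide whether to skip the record or log it for review.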

Thank you! That worked perfectly.