I get a Page Not Found 404 error when using proxy server

Hi,

I am trying to scrape a site, and it gives me a Page Not Found 404 error whenever I use the proxy server. It works fine if I'm not using the proxy server.

Any idea of how I might get through this?

Here is the site:

http://www.icaac.org/
Then I click "here" at the link for July 30, 2007 in "Click Here to search the 47th ICAAC program", which takes me here:

http://www.abstractsonline.com/viewer/?mkey=%7BD52CF5B5-E7A0-40B1-B430-C...

At the next page, I click Browse, which takes me here:

http://www.abstractsonline.com/viewer/browseOptions.asp?MKey=%7BD52CF5B5...

But then when I click "poster session", the URL is:

http://www.abstractsonline.com/viewer/browseBySessionType.asp?BrowseQual......

but I get a page not found error.

Any thoughts?

Thanks,

Bhavesh

I get a Page Not Found 404 error when using proxy server

Bhavesh,

We found what was causing this. The culprits are the curly braces {} in the URL. Those characters are not valid in a URL unless they are percent-encoded; however, our favorite browsers have learned to be flexible and will allow such things. The HTTP client we use in screen-scraper is especially strict about well-formed URLs, so we're loosening it up a bit.
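
To illustrate what a well-formed version of such a URL looks like, here is a minimal Python sketch that percent-encodes literal curly braces ({ becomes %7B, } becomes %7D) and leaves the rest of the URL alone. The function name, the example.com URL, and the all-zero key are hypothetical placeholders, not values from the site in question, and this is not how screen-scraper handles the problem internally.

```python
def encode_braces(url: str) -> str:
    """Percent-encode literal curly braces, leaving the rest of the URL untouched."""
    # Plain string replacement avoids double-encoding any existing %XX escapes.
    return url.replace("{", "%7B").replace("}", "%7D")


# Hypothetical URL: the host and the key are placeholders for illustration only.
raw = "http://example.com/viewer/view.asp?CKey={00000000-0000-0000-0000-000000000000}"
encoded = encode_braces(raw)
print(encoded)
```

A URL that already contains %7B and %7D passes through unchanged, so the function is safe to apply more than once.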

A fix is available if you update your version of screen-scraper. Updates are available only for professional edition users. To get the fix, check the "Allow upgrading to unstable versions" box in the settings window, then update.

http://www.screen-scraper.com/support/faq/faq.php#NoUpdates

If you're not using the professional edition but would consider purchasing a license at a discount please private message me.

Thanks,
Scott

i'm glad it's not just me...

Scott,

Thank you very much for your help. I am definitely curious to see what you discover. I'm still trying to figure it out on my end too.

The following URL works without the proxy server, but not with it:

http://www.abstractsonline.com/viewer/viewAbstractPrintFriendly.asp?CKey={76D8F0F8-DB9F-400F-B881...

Is there a cookie issue perhaps?

Bhavesh

I get a Page Not Found 404 error when using proxy server

Bhavesh,

We're looking into this right now. I'm having the same issue on my end, which is good because that means we can replicate the problem and more easily find a solution to it.

Clicking on the "Browse" button does not actually use JavaScript; however, I did notice that some of the links that produce this same result do call JavaScript's window.open. A JavaScript call to window.open has no effect on screen-scraper's ability to proxy the transaction, though.

We'll let you know what we find...

Thanks,
Scott

javascript window

I was looking at the source HTML, and it is making a JavaScript call to open a window.

Is that what the problem is?

thanks,

Bhavesh