size of tidy.log 40 gigabytes

Hi,

Yesterday we had a problem on the server where I installed screen-scraper Professional Edition, running as a server. The tidy.log file had grown to 40 gigabytes, which left the server's disk with very little free space. Whenever I looked at this log file before, it was tiny: 6 bytes, empty, 10 bytes, 2 bytes, and so on. It seemed to be emptied often. How can it grow so large?

Thanks,
Tamara Vos

size of tidy.log 40 gigabytes

Tamara,

We've had a realization this morning. The tidy.log file was deprecated at some point between 3.0 and the current version. If you upgrade your version you'll find that tidy.log is no longer being created.

-Scott

size of tidy.log 40 gigabytes

Tamara,

There have been quite a few functional improvements and bug fixes since 3.0. I recommend that you make a backup of your current sessions (simply export them to a safe place), then go into the settings and check the box next to "Allow upgrading to unstable versions". Next, choose "Check for updates" under the options menu. Follow the instructions presented and you'll be upgraded to 3.0.36a.

Give your scrapes a try and you may find that the issue with tidying goes away. As I said before, it's unclear what's causing the tidy log to fill up the way it is, but because of the numerous advances made since 3.0 you may find overall improvement in your scrapes.

Please let us know how it goes.

Thanks,
Scott

size of tidy.log 40 gigabytes

Well, I cannot switch tidying off for that page only. I scrape hotel websites using a single scrapeable file with a variable URL. The problem occurred for only a few hotel websites, though, so I'll investigate further to identify when and how it happens.

I'm using screenscraper version 3.0. I'll give you more information about the situation that caused the bug.

Tamara

size of tidy.log 40 gigabytes

Tamara,

Wow. It's amazing the file was able to grow so big. We're guessing it's the result of a bug in the HTML Tidy program that is integrated into screen-scraper.

The solution for you would be to identify what page caused the Tidy errors and turn off tidy for that page. You can selectively turn off tidy by unchecking the box where it says, "Tidy HTML after scraping" under the advanced tab for that particular scrapeable file.
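If it helps to narrow down which page triggers the Tidy errors, one option is to sample the size of tidy.log while a scrape runs and note when it jumps. The following is only a rough sketch in Python, not part of screen-scraper: the log path, the sampling interval, and the 1 MB threshold are all assumptions you would adjust for your own install, and the function names are made up for illustration.

```python
import os
import time


def file_size(path):
    """Return the size of `path` in bytes, or 0 if the file does not exist."""
    try:
        return os.path.getsize(path)
    except FileNotFoundError:
        return 0


def sample_sizes(path, interval_s, count):
    """Take `count` (timestamp, size) samples of the file, `interval_s` apart."""
    samples = []
    for i in range(count):
        if i > 0:
            time.sleep(interval_s)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        samples.append((stamp, file_size(path)))
    return samples


def find_spikes(samples, threshold_bytes):
    """Return the labels of samples where the file grew by more than
    `threshold_bytes` compared with the previous sample."""
    spikes = []
    for (_, prev_size), (label, size) in zip(samples, samples[1:]):
        if size - prev_size > threshold_bytes:
            spikes.append(label)
    return spikes


if __name__ == "__main__":
    # Hypothetical location: point this at wherever tidy.log lives on your server.
    log_path = "tidy.log"
    observed = sample_sizes(log_path, interval_s=5.0, count=12)
    for stamp in find_spikes(observed, threshold_bytes=1_000_000):
        print("tidy.log grew sharply around", stamp)
```

Matching the reported timestamps against the times each hotel site was scraped should point to the URLs worth a closer look.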

I hope you were able to recover your hard drive space without too much trouble. We apologize for the difficulties this may have caused.

To help us troubleshoot any future occurrences of this bug, could you tell me what version of screen-scraper you are running? This is available at Help > About screen-scraper.

Please let us know if this works for you.

Thanks,
Scott