Tutorial 6: Page 4: Generating the XML Feed
Let's run a quick test just to make sure the scraping session works. After that, we'll add a few more bells and whistles.

Start up screen-scraper as a server. If you need help on that try this page. Once that's up, assuming you haven't altered the default "SOAP Server" port (which is also the web server port), and that you're running screen-scraper on your local machine, try entering this URL into your browser:

http://localhost:8779/ss/xmlfeed?scraping_session=Shopping+Site&SEARCH=bug

If all goes well the browser should take a little while to load, then you should see an XML document appear containing the extracted information. If you got an error message or the document didn't appear as you expected it to, check screen-scraper's log. Just as with scraping sessions run remotely, screen-scraper will create a log file in its "log" folder corresponding to each RSS/Atom scraping session.

Dealing with the URL directly can be a bit cryptic, what with the encoding and all. As such, let's make use of a little HTML file that will allow us to generate feeds using different search parameters and formats. You can access it here. Note that this HTML file assumes that you're running screen-scraper as a server on your local machine on port 8779. If any of that isn't the case you'll want to download the HTML file to your local machine, alter it with your settings, then open it back up in your browser.

Try experimenting with the form a bit. It gives you control over almost all of the features that are available, including the format of the feed. Also take a close look at the URL. screen-scraper simply converts the GET parameters in the URL to session variables in the scraping session, so SEARCH=bug in the URL above becomes a SEARCH session variable with the value "bug". If you'd like, you can even open the feed in your favorite RSS/Atom reader to ensure that the format is valid.
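If you'd rather not hand-encode the feed URL, a short script can do it for you. The following is a minimal sketch in Python, assuming the same host, port, path, and parameter names as the example URL in this tutorial. The function name build_feed_url is made up for illustration and is not part of screen-scraper.

```python
from urllib.parse import urlencode


def build_feed_url(scraping_session, host="localhost", port=8779, **session_variables):
    """Build an xmlfeed URL like the one shown in this tutorial.

    Hypothetical helper: the host, port, and /ss/xmlfeed path are taken from
    the tutorial's example URL and may differ on your installation.
    """
    # scraping_session names the session to run; every other GET parameter
    # is passed along and ends up as a session variable.
    params = {"scraping_session": scraping_session}
    params.update(session_variables)
    # urlencode handles the encoding, e.g. the space in "Shopping Site"
    # becomes "+".
    return f"http://{host}:{port}/ss/xmlfeed?{urlencode(params)}"


print(build_feed_url("Shopping Site", SEARCH="bug"))
# prints http://localhost:8779/ss/xmlfeed?scraping_session=Shopping+Site&SEARCH=bug
```

Paste the printed URL into your browser or feed reader as before. Changing the SEARCH value (or the host and port arguments) gives you a correctly encoded URL without editing it by hand.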