Tutorial 1: Page 3: Proxy Server Setup
An HTTP proxy server is a program that sits between a web browser and a web server, relaying traffic between the two. screen-scraper contains a proxy server that lets you view every request your web browser sends and the corresponding response each web server returns. As you surf, the proxy server records the pages your browser requests so that screen-scraper can easily scrape them later.
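To make the idea concrete, here is a minimal sketch of a recording proxy in Python. It is not how screen-scraper itself is implemented; the names `RecordingProxyHandler`, `recorded`, and `start_proxy` are hypothetical, and the sketch only relays plain-HTTP GET requests (no HTTPS, no form posts).

```python
import http.server
import threading
import urllib.error
import urllib.request


class RecordingProxyHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler: relays plain-HTTP GET requests and records them."""

    # Shared log of (requested URL, status code, response size in bytes).
    recorded = []

    def do_GET(self):
        # A browser configured to use a proxy puts the absolute URL in the
        # request line, so self.path already names the real web server.
        # The empty ProxyHandler makes the upstream fetch go out directly.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        try:
            with opener.open(self.path, timeout=10) as upstream:
                status = upstream.status
                headers = upstream.headers
                body = upstream.read()
        except urllib.error.HTTPError as err:
            # 4xx/5xx answers from the web server are relayed as well.
            status, headers, body = err.code, err.headers, err.read()
        except (urllib.error.URLError, ValueError):
            self.send_error(502, "Upstream fetch failed")
            return

        # Record the exchange, then pass the response back to the browser.
        self.recorded.append((self.path, status, len(body)))
        self.send_response(status)
        self.send_header("Content-Type", headers.get("Content-Type", "text/plain"))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet; the recorded list is our log


def start_proxy(port=8777):
    """Start the sketch proxy on 127.0.0.1 in a background thread."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), RecordingProxyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Every request that passes through ends up in `RecordingProxyHandler.recorded`, which is the same basic record-then-forward idea that screen-scraper's proxy session exposes through its interface.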
OK, enough talk; it's time to fire up screen-scraper. If you're running Windows, select the appropriate link from the "Start" menu. On Unix/Linux or Mac OS X, use the "screen-scraper" link that was created when you installed screen-scraper. Once screen-scraper has fully loaded you'll see a tree on the left, which will contain the objects we'll be creating.

Right now we need to set up screen-scraper's proxy server. In screen-scraper you'll generally use one proxy session for each web site you'd like to extract information from. A proxy session holds all of the HTTP requests and responses recorded from your browser while the session is running. Create a proxy session now by clicking the "New Proxy Session" button (it looks like a globe) or by selecting "New Proxy Session" from the "File" menu. screen-scraper should now look like this:
[Screenshot: new proxy session]

Give the proxy session a name by typing "Hello World" into the "Name" field. The "Port" field determines the port number that your web browser will use when communicating with screen-scraper's proxy server. The bottom checkbox causes the proxy server to ignore binary files (which are generally not very interesting when you're scraping text-based data). For now we're only concerned with the "Port" field, which you should be able to leave as 8777.

Next we need to set up your web browser so that it will use screen-scraper as a proxy server. If you have two web browsers installed on your computer, we recommend using one of them to continue through the tutorial and the other to interact with the proxy server. For example, if you have Internet Explorer and Firefox installed, you may want to view the tutorial pages using Firefox and use Internet Explorer with the proxy server.

Odds are you're using Internet Explorer as your primary browser, so we'll give detailed instructions on setting it up. If you're using a different web browser, try one of the following links: Firefox, Opera, Mozilla, or Netscape.

Open up Internet Explorer, then click on "Internet Options" from the "Tools" menu. You should get a dialog box like this:

[Screenshot: "Internet Options" dialog box]

From here click on the "Connections" tab, then on the "LAN Settings" button. Click on the checkbox beginning with "Use a proxy server for...", then on the "Advanced..." button. The dialog box should now look like this:

[Screenshot: advanced proxy settings dialog box]

In the "HTTP" and "Secure" fields type "localhost" under the "Proxy address to use" column, and "8777" under "Port" (assuming you haven't changed the default port number from 8777). Hit the "OK" button a few times until you get back to your web browser.

NOTE: Depending on your operating system, instead of "localhost" you may need to use either "127.0.0.1" or the IP address of the machine. If you have trouble connecting to screen-scraper's proxy with your web browser, please see this FAQ.
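The same two settings (proxy address and port for both the "HTTP" and "Secure" fields) can be expressed for a scripted client. The following is a minimal sketch using Python's standard `urllib.request`; the helper names `build_proxy_settings` and `build_proxied_opener` are hypothetical, and the defaults assume the proxy is on `localhost` at the default port 8777.

```python
import urllib.request


def build_proxy_settings(host="localhost", port=8777):
    """Map each URL scheme to the proxy, mirroring the "HTTP" and "Secure" fields."""
    proxy_url = f"http://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def build_proxied_opener(host="localhost", port=8777):
    """Return a urllib opener that sends its requests through the proxy."""
    handler = urllib.request.ProxyHandler(build_proxy_settings(host, port))
    return urllib.request.build_opener(handler)


# Swap in "127.0.0.1" if "localhost" does not work on your system.
opener = build_proxied_opener("127.0.0.1", 8777)
```

Calling `opener.open(...)` with a URL would then route that request through the proxy, which only succeeds once the proxy server is actually running.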
At this point your browser is set up so that any time you click on a link or submit a form, the request will first go to screen-scraper, where it will be recorded, and then be sent on to the web server it was intended for. The web server will respond to screen-scraper, which will record the response, then send it along to your web browser.

If you're running Mac OS X and are using screen-scraper Professional or Enterprise Edition, there's one more step you'll need to take. In screen-scraper, click the wrench icon to bring up the "Settings" dialog box. Click on the "Servers" button in the panel on the left, then remove any text contained in the "Hosts to allow to connect" text box. Because of the way Mac OS X handles IP addresses, we do this so that screen-scraper will accept connections from your web browser.

At this point we can get the proxy server running. Do this now in screen-scraper by clicking on the "Start Proxy Server" button for your proxy session. After this, click on the "Progress" tab, which will display all of the requests and responses recorded by the proxy server. You're now ready to have screen-scraper record a few pages for you...
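If your browser can't reach the proxy after you've started it, a quick check of whether anything is accepting connections on the proxy port can help narrow things down. This is a minimal sketch, not a screen-scraper feature; the function names are hypothetical and the defaults assume port 8777 and the two host names mentioned earlier.

```python
import socket


def proxy_is_listening(host="localhost", port=8777, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def first_reachable_host(port=8777, candidates=("localhost", "127.0.0.1")):
    """Return the first candidate host that accepts connections, or None."""
    for host in candidates:
        if proxy_is_listening(host, port):
            return host
    return None
```

A result of `None` from `first_reachable_host()` suggests the proxy server isn't running or is listening on a different port; otherwise the returned host is the one to enter in your browser's proxy settings.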