Getting data (Data Source)


Web pages by url generator


This method can be used to extract numbered pages like those you find in a Google search or on eBay. It lets you easily harvest pages that are typically generated by a search query, like searching for a Porsche on eBay. The resulting pages are typically reached through a URL like "http://www.myserver.com/results.asp?page=1". By splitting the URL around the page number, you can generate the address of every page. This works as follows (you will find the example in "generator example.hhp").

In the Data Source window, select Web pages by url generator. Under Url part 1, enter the part of the URL that comes before the page number, in this case "http://www.happyharvester.com/test/top1000/page". Under Part 2, enter ".html". For the number itself, choose Generate numbers, enter "1" in Start and "49" in Stop, and set Step to "1". HH can now generate page1.html to page49.html. Click the Preview button and the page is displayed.
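To make the settings above concrete, here is a minimal Python sketch of the URL list they describe. It is not how HH works internally; the function name generate_urls and its parameters are hypothetical, chosen to mirror the Url part 1, Part 2, Start, Stop and Step fields, and it assumes Stop is inclusive, as the page1.html to page49.html range suggests.

```python
def generate_urls(part1, part2, start, stop, step=1):
    """Build one URL per number from start to stop (inclusive).

    Hypothetical helper: part1/part2 mirror the "Url part 1" and
    "Part 2" fields, start/stop/step mirror "Generate numbers".
    """
    return [f"{part1}{n}{part2}" for n in range(start, stop + 1, step)]


# Values taken from the example in the text.
urls = generate_urls(
    "http://www.happyharvester.com/test/top1000/page",
    ".html",
    start=1,
    stop=49,
    step=1,
)

print(urls[0])    # first generated address
print(urls[-1])   # last generated address
print(len(urls))  # number of pages that would be harvested
```

Changing Step to 2 in this sketch would produce only the odd-numbered pages, which is one way to check that the three number fields behave as expected before harvesting.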

This is then an alternative way to harvest this type of web page.

To read about more ways to get data, go here.
To read about how to extract data, go here.