Discontinuation of GoogleScraper

Posted on December 24, 2018 in GoogleScraper • Tagged with discontinuation, GoogleScraper, scraping • 1 min read

Discontinuation of GoogleScraper in favor of https://www.npmjs.com/package/se-scraper


Continue reading

GoogleScraper Tutorial - How to scrape 1000 keywords with Google

Posted on September 05, 2018 in GoogleScraper • Tagged with tutorial, GoogleScraper, scraping • 3 min read

Tutorial that teaches how to use GoogleScraper to scrape 1000 keywords with 10 selenium browsers.


Continue reading

A lot of work to do for GoogleScraper in the future and request for comments!

Posted on March 01, 2015 in Googlescraper • Tagged with Software, Python, Programming, Googlescraper • 3 min read

Hello dear readers

I get a lot of mail with questions about GoogleScraper. I really appreciate it, but at some point I can no longer answer all of it. Over the last few weeks I haven't had much time (or, I must admit, motivation) to put into GoogleScraper.

The reason is that I am still uncomfortable with the architecture of GoogleScraper. There are basically two ways to use the tool:

  • As a command line tool
  • From another program over the API (programming approach)

and furthermore there are 3 very different modes GoogleScraper runs in:

  • http mode
  • selenium mode, which can again be divided into Firefox, Chrome and PhantomJS selenium browsers
  • asynchronous mode

Of these, I think selenium mode is the hardest to work with (very buggy and complex to program against). All of this leads to a complex software architecture, mainly because the two ways of using the tool (CLI tool and API) have different priorities for handling exceptions.

The CLI tool should be VERY robust and should do everything it can to continue scraping with the remaining resources (such as proxies, RAM when lots of selenium instances become an issue, network bandwidth, ...), because the user cannot handle these problems himself when he calls GoogleScraper …
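
To make the difference in exception-handling priorities concrete, here is a minimal Python sketch. It is not GoogleScraper's actual code: `ScrapeError`, `scrape_all`, `fake_fetch` and the `robust` flag are hypothetical names invented for illustration. The assumption is that a CLI-style run records a failure and keeps going, while an API-style call re-raises so the calling program can decide what to do.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class ScrapeError(Exception):
    """Hypothetical error raised when one keyword cannot be scraped."""


def scrape_all(
    keywords: Iterable[str],
    fetch: Callable[[str], str],
    robust: bool,
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Scrape every keyword with `fetch` (hypothetical helper).

    robust=True  (CLI-style): record the failure and keep going.
    robust=False (API-style): re-raise so the calling program decides.
    """
    results: Dict[str, str] = {}
    failures: List[Tuple[str, str]] = []
    for keyword in keywords:
        try:
            results[keyword] = fetch(keyword)
        except ScrapeError as exc:
            if not robust:
                raise  # API-style: surface the problem to the caller
            failures.append((keyword, str(exc)))  # CLI-style: note it, move on
    return results, failures


def fake_fetch(keyword: str) -> str:
    """Stand-in for a real request; fails for one keyword on purpose."""
    if keyword == "blocked":
        raise ScrapeError("proxy refused connection")
    return f"results for {keyword}"
```

Called with `robust=True`, `scrape_all(["a", "blocked", "b"], fake_fetch, robust=True)` returns the results for "a" and "b" plus a list of failures; with `robust=False` the same call stops at "blocked" and raises `ScrapeError`.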


Continue reading