Scraping and Extracting Links from any major Search Engine like Google, Yandex, Baidu, Bing and Duckduckgo

Posted on in Meta • Tagged with Scraping, Baidu, Extracting, Google, Programming, Python, Searchengine, Bing, Meta

Prelude

It's been quite a while since I worked on my projects. But recently I had some motivation and energy left, which is quite nice considering my full time university week and a programming job besides.

I have a little project on GitHub that I worked on every now and again in the last year or so. Recently it got a little bit bigger (I have 115 github stars now, would've never imagined that I ever achieve this) and I receive up to 2 mails with job offers every week (Sorry if I cannot accept any request :( ).

But unfortunately my progress with this project is not as good as I want it to be (that's probably a quite common feeling under us programmers). It's not a problem of missing ideas and features that I want to implement, the hard part is to extend the project without blowing legacy code up. GoogleScraper has grown evolutionary and I am waisting a lot of time to understand my old code. Mostly it's much better to just erease whole modules and reimplement things completely anew. This is essentially what I made with the parsing module.

Parsing SERP pages with many search engines

So I …


Continue reading