Battling incomplete information: Connect market demand with market supply by Google advertisement scraping and lead crawling

Posted on in Scraping, Crawling • Tagged with puppeteer, web scraping, headless chrome, marketing

This blog post explains how a lack of perfect information about the market allows a clever middleman to connect market supply with market demand through advertisement scraping and lead crawling.



Scraping 1 million keywords on the Google Search Engine

Posted on in Scraping • Tagged with puppeteer, web scraping, headless chrome, 1 million, queue, architecture

Scraping one million keywords is not an easy task. There are proxy problems, big data problems and reliability issues. This blog post shares the most valuable insights.



Scraping with puppeteer and headless chrome deployed to AWS Lambda

Posted on in Scraping • Tagged with puppeteer, web scraping, AWS lambda, headless chrome

In this blog post, we demonstrate how a web scraping function is deployed to the AWS cloud with puppeteer and headless chrome.



Struktur: A completely new approach to web scraping

Posted on in Scraping • Tagged with puppeteer, web scraping, CSS selectors, XPath queries

I will show an alternative approach to web scraping that does not rely on CSS selectors or XPath queries. It makes use of the fact that most web pages visually render the information of interest in a coherent, structured way. This technique requires a remotely controllable web browser that is capable of rendering web pages visually, for example headless Chrome controlled via puppeteer.
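
To give a rough idea of what "visually structured" can mean in practice, here is a minimal sketch. It is not the Struktur implementation, only an illustration under assumptions: the bounding boxes are assumed to have been collected already from the rendered page (for example via getBoundingClientRect in the browser), and the names Box and find_repeated_blocks as well as the tolerance and min_items parameters are hypothetical. The idea is that repeated items such as search results tend to share the same left edge and width, so grouping boxes by those two values surfaces them without any selector.

```python
from collections import defaultdict
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical record for one rendered element and its position in pixels."""
    text: str
    x: float
    y: float
    width: float
    height: float


def find_repeated_blocks(boxes, tolerance=2.0, min_items=3):
    """Group boxes that share roughly the same left edge and width.

    Returns the groups with at least `min_items` members, largest first,
    each group sorted from top to bottom. Bucketing by rounding is a
    simplification: two boxes near a bucket boundary may be split.
    """
    columns = defaultdict(list)
    for box in boxes:
        key = (round(box.x / tolerance), round(box.width / tolerance))
        columns[key].append(box)
    groups = [
        sorted(group, key=lambda b: b.y)
        for group in columns.values()
        if len(group) >= min_items
    ]
    return sorted(groups, key=len, reverse=True)


if __name__ == "__main__":
    # Made-up layout: a navigation bar, three stacked results, a footer.
    sample = [
        Box("Navigation", 0, 0, 1200, 60),
        Box("Result 2", 100, 300, 600, 80),
        Box("Result 1", 100, 200, 601, 80),
        Box("Result 3", 101, 400, 600, 80),
        Box("Footer", 0, 900, 1200, 40),
    ]
    for group in find_repeated_blocks(sample):
        print([box.text for box in group])
```

The grouping deliberately ignores the DOM entirely. It only looks at where things end up on screen, which is why a browser that actually renders the page is needed to obtain the input.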

