Avoid Puppeteer or Playwright for Web Scraping

Posted on May 20, 2021 in Scraping • Tagged with web scraping, crawling, puppeteer, playwright, CDP • 10 min read

In this blog post, I explain why it is best to avoid Puppeteer and Playwright for web scraping.


7 Common Mistakes in Professional Scraping

Posted on March 01, 2021 in Scraping • Tagged with web scraping, crawling, puppeteer, playwright • 13 min read

In this blog post, I talk about my several years of experience with web scraping and the common mistakes I made along the way. The more I dive into web scraping, the more I realize how easy it is to make wrong decisions when scraping a site. For that reason, I compiled a list of seven common web scraping mistakes.


Browser Red Pills: Why are you browsing my website from AWS Lambda?

Posted on January 17, 2021 in Security • Tagged with red pill, Bot, Advanced Bots, JavaScript, Puppeteer, Playwright • 6 min read

Advanced bots use modern browsers and automation frameworks such as Puppeteer and Playwright. It is becoming increasingly hard to distinguish bots from real human traffic, so new detection methods are required.

