7 Common Mistakes in Professional Scraping

Posted on March 01, 2021 in Scraping • Tagged with web scraping, crawling, puppeteer, playwright • 8 min read

In this blog post I talk about my year-long experience with web scraping. The more I dive into web scraping, the more I realize how easy it is to make mistakes. For that reason, I compiled a list of seven common web scraping mistakes.


Why does this Website know that I am sitting on the Toilet?

Posted on February 05, 2021 in Security • Tagged with JavaScript, deviceorientation, devicemotion • 4 min read

Android mobile devices expose device orientation and device motion data to any website. This data is quite sensitive and should not be handed to websites without explicit user consent.
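To make this concrete, here is a minimal sketch of how a page could read orientation data and guess how the phone is being held. The `deviceorientation` event and its `alpha`, `beta` and `gamma` fields are standard browser APIs; the `classifyPosture` helper and its thresholds are hypothetical and only illustrate the idea.

```javascript
// Hypothetical classifier: the thresholds are illustrative assumptions,
// not values taken from the post.
function classifyPosture(beta, gamma) {
  // Browsers report null when the device cannot provide a reading.
  if (beta === null || gamma === null) return 'unknown';
  const absBeta = Math.abs(beta);   // front-to-back tilt in degrees
  const absGamma = Math.abs(gamma); // left-to-right tilt in degrees
  if (absBeta < 10 && absGamma < 10) return 'flat';    // e.g. lying on a table
  if (absBeta > 60 && absBeta < 120) return 'upright'; // e.g. held up in portrait
  return 'tilted';
}

// Attach the listener only when running inside a browser that supports it.
if (typeof window !== 'undefined' && 'DeviceOrientationEvent' in window) {
  window.addEventListener('deviceorientation', (event) => {
    console.log(classifyPosture(event.beta, event.gamma), event.alpha, event.beta, event.gamma);
  });
}
```

A real page would likely combine many readings over time rather than classify a single event.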


Headful Google Chrome with Xvfb on AWS Lambda Container

Posted on January 23, 2021 in Tutorials • Tagged with AWS Lambda, Xvfb, Docker, Container • 4 min read

The following write-up is an attempt to launch headful Google Chrome with Xvfb in an AWS Lambda container.


Browser Red Pills: Why are you browsing my website from AWS Lambda?

Posted on January 17, 2021 in Security • Tagged with red pill, Bot, Advanced Bots, JavaScript, Puppeteer, Playwright • 10 min read

Advanced bots use modern browsers and automation frameworks such as Puppeteer and Playwright. It is becoming increasingly hard to distinguish bots from real human traffic, so new detection methods are required.


Browser based Port Scanning with JavaScript

Posted on January 10, 2021 in Security • Tagged with browser, port scanning, JavaScript • 13 min read

This article develops various techniques for port scanning from within the browser, using modern JavaScript.
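As a rough illustration of the general idea, one possible approach is to time a cross-origin `fetch` and infer the port state from how quickly it fails. This is a sketch under assumptions, not necessarily the technique from the article: `probePort`, `classifyProbe`, the labels and the 50 ms threshold are all hypothetical, and browsers refuse outright to connect to some well-known ports.

```javascript
// Interpret the outcome of one probe. Labels and heuristic are assumptions:
// a fast failure hints at a refused connection, a slower failure hints that
// something accepted the connection but did not speak HTTP.
function classifyProbe({ responded, timedOut, elapsedMs }, fastFailMs = 50) {
  if (responded) return 'open';
  if (timedOut) return 'filtered-or-slow';
  return elapsedMs < fastFailMs ? 'closed' : 'possibly-open';
}

// Probe a single port by timing a request that is aborted after timeoutMs.
async function probePort(host, port, timeoutMs = 2000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  const start = Date.now();
  let responded = false;
  try {
    // 'no-cors' lets the request leave the page cross-origin; the response is opaque.
    await fetch(`http://${host}:${port}/`, { mode: 'no-cors', signal: controller.signal });
    responded = true;
  } catch (err) {
    // Network error or abort: timing is the only signal left.
  } finally {
    clearTimeout(timer);
  }
  return classifyProbe({
    responded,
    timedOut: controller.signal.aborted,
    elapsedMs: Date.now() - start,
  });
}
```

Calling `probePort('127.0.0.1', 8080).then(console.log)` would print one of the four labels; timing thresholds vary between browsers and networks, so the result is a hint rather than a verdict.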


Breaking the Google Audio reCAPTCHA with Google's own Speech to Text API

Posted on January 02, 2021 in Security • Tagged with uncaptcha3, ReCaptcha, Google, Speech to Text API • 2 min read

In this project, I use a method from early 2019 that demonstrates how to solve the Audio reCAPTCHA with Google's own Speech to Text API. Astonishingly, this method still works.
