Posted on March 11, 2021 in Scraping • Tagged with detecting, scraping, security, fingerprint • 13 min read
In this blog post, I demonstrate how to detect several scraping services: luminati.io, ScrapingBee, scraperapi.com, scrapingrobot.com, and scrapfly.io.
Posted on March 01, 2021 in Scraping • Tagged with web scraping, crawling, puppeteer, playwright • 13 min read
In this blog post, I discuss my several years of experience with web scraping and the common mistakes I made along the way. The more I dive into web scraping, the more I realize how easy it is to make wrong decisions when scraping a site. For that reason, I compiled a list of seven common web scraping mistakes.
Posted on February 05, 2021 in Security • Tagged with JavaScript, deviceorientation, devicemotion • 3 min read
Android mobile devices expose device orientation and device motion data to any website. This data is sensitive and should not be shared with websites without explicit user consent.
Posted on January 23, 2021 in Tutorials • Tagged with AWS Lambda, Xvfb, Docker, Container • 4 min read
The following write-up is an attempt to launch headful Google Chrome with Xvfb in an AWS Lambda container.
Posted on January 17, 2021 in Security • Tagged with red pill, Bot, Advanced Bots, JavaScript, Puppeteer, Playwright • 6 min read
Advanced bots use modern browsers and automation frameworks such as Puppeteer and Playwright. It is becoming increasingly hard to distinguish bots from real human traffic, so new detection methods are required.
Posted on January 10, 2021 in Security • Tagged with browser, port scanning, JavaScript • 10 min read
This article develops various techniques for port scanning from within the browser, using modern JavaScript.