Building a modern and production-ready Node.js API: my boring setup As a founder of ScrapeNinja.net and a big fan of API-first products, I build and deploy a lot of APIs. Since I am a "sprint" guy, I am energized by the quick
GPT "summarize this url" prompt is broken, and I fixed it I have recently noticed that if I feed the URL of a random article into GPT, and ask it to summarize, using "please summarize <url>" prompt, it gives a pretty good
How to web scrape Zillow using ScrapeNinja and JavaScript Web scraping is a popular technique that allows developers to quickly and easily extract data from websites. It's especially useful for extracting real estate information, such as property listings and median home prices.
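To make this concrete, here is a minimal sketch of fetching a Zillow page through a ScrapeNinja-style scraping endpoint. The endpoint URL, header names, and payload shape below are assumptions for illustration, not the documented API; check the actual ScrapeNinja docs before using.

```javascript
// Hypothetical sketch: fetching a Zillow page through a ScrapeNinja-style API.
// Endpoint URL, header names, and payload shape are assumptions, not the documented API.
const fetch = require('node-fetch');

async function scrape(targetUrl) {
  const res = await fetch('https://scrapeninja.p.rapidapi.com/scrape', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-RapidAPI-Key': process.env.RAPIDAPI_KEY, // your key, via env var
    },
    body: JSON.stringify({ url: targetUrl }),
  });
  return res.json(); // assumed: the API responds with JSON containing the page body
}

scrape('https://www.zillow.com/homes/for_sale/')
  .then((data) => console.log(data))
  .catch(console.error);
```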
Low Code Web Scraping Recipe: track Apple.com for refurbished iPhones and get a push alert on a specific model There are a number of projects which allow website monitoring, but I needed a pretty custom one - I wanted to check Apple.com's refurbished section for iPhone 12 models and get push
node.js How to set a proxy in Puppeteer: 3 ways Puppeteer is an incredibly useful tool for automating web browsers. It allows you to run headless (or non-headless) Chrome instances, automatically interacting with websites and pages in ways that would normally require manual input
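The most common of the three ways is passing Chrome's --proxy-server flag at launch; a minimal sketch, with a placeholder proxy address and credentials:

```javascript
// Launch Chrome through a proxy via the --proxy-server flag (placeholder address).
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    args: ['--proxy-server=http://12.34.56.78:3128'],
  });
  const page = await browser.newPage();
  // For proxies that require auth, Puppeteer exposes page.authenticate():
  await page.authenticate({ username: 'user', password: 'pass' });
  await page.goto('https://example.com');
  console.log(await page.title());
  await browser.close();
})();
```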
node.js How to set a proxy in node-fetch Executing HTTP(S) requests via a proxy is helpful in a lot of cases: it makes your HTTP request look like it was executed from a different country or location. Setting
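A minimal sketch of the proxy-agent approach, assuming node-fetch v2 (CommonJS) and the named export of https-proxy-agent v7+; the proxy address is a placeholder:

```javascript
// Route a node-fetch request through an HTTP(S) proxy using https-proxy-agent.
const fetch = require('node-fetch');
const { HttpsProxyAgent } = require('https-proxy-agent');

const agent = new HttpsProxyAgent('http://user:pass@12.34.56.78:3128');

fetch('https://api.ipify.org?format=json', { agent })
  .then((res) => res.json())
  .then((json) => console.log('exit IP:', json.ip)) // should print the proxy's IP
  .catch(console.error);
```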
node.js Web scraping in JavaScript: node-fetch vs axios vs got vs superagent There are a number of ways to perform web requests in Node.js: node-fetch, axios, got, superagent. Node.js can also perform HTTP requests without additional packages. While I don't ever use this approach because of
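For reference, the package-free approach looks like this with Node's built-in https module:

```javascript
// Dependency-free HTTP GET using Node's built-in https module.
const https = require('https');

https.get('https://example.com/', (res) => {
  let body = '';
  res.on('data', (chunk) => (body += chunk)); // response arrives in chunks
  res.on('end', () => console.log(res.statusCode, body.length, 'bytes'));
}).on('error', console.error);
```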
cURL examples: requests with proxy, set user agent, send POST JSON request, and more cURL is a small *nix utility for performing network requests. This is a quick cheat sheet on how cURL can be used for web scraping or in any other case when you need to
I have tested out Zapier, Make.com and Pipedream.com from a developer perspective A few days ago, I took a deep dive into integrating my ScrapeNinja web scrapers into Zapier, Pipedream.com, and Integromat (Make.com) to better understand the market situation among low-code and no-code
Web scraping in Google Sheets: ImportXML & alternatives In case you want to import some random website data into Google Sheets, the obvious way to start this exciting adventure is to use the IMPORTXML() function. The main advantage is that this function
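For reference, the formula takes a URL and an XPath query; a tiny example in a cell (the URL and XPath are placeholders):

```
=IMPORTXML("https://example.com", "//h1")
```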
Running untrusted JavaScript in Node.js The ScrapeNinja Scraping API recently got an exciting feature called Extractors. Extractors are pieces of user-supplied JavaScript code which are executed on the ScrapeNinja backend, so ScrapeNinja returns pure JSON with data, from any HTML
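To illustrate the general idea (this is not necessarily how ScrapeNinja implements Extractors), Node's built-in vm module can run a string of user code against a controlled context, though vm alone is not a real security boundary:

```javascript
// Evaluate user-supplied code against a controlled context with Node's built-in
// vm module. Note: vm is NOT a security boundary on its own; real sandboxing
// needs process isolation or a dedicated runtime.
const vm = require('vm');

const userCode = 'result = input.split(",").map(Number).reduce((a, b) => a + b);';
const sandbox = { input: '1,2,3', result: null };

vm.runInNewContext(userCode, sandbox, { timeout: 100 }); // kill runaway loops
console.log(sandbox.result); // 6
```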
Cheerio: parse HTML in JavaScript. Playground and cheatsheet Cheerio is the de facto standard for parsing HTML in server-side JavaScript (Node.js) now. It is a fast, flexible, and lean implementation of jQuery-like syntax designed specifically for the server. GitHub: https:
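The core pattern in a few lines:

```javascript
// Parse an HTML string and query it with jQuery-like selectors.
const cheerio = require('cheerio');

const html = '<ul><li class="item">First</li><li class="item">Second</li></ul>';
const $ = cheerio.load(html);

$('.item').each((i, el) => {
  console.log(i, $(el).text()); // 0 First / 1 Second
});
```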
crypto What are USDT transfer fees now in BEP20, ERC20, TRC20? Have you ever wondered which blockchain is the best for sending USDT, why the Ethereum blockchain is so expensive (is it?) in terms of gas fees, and what the best alternative is? I
How to remove background from a signature: 3 tools Let's say you want to sign some PDF with your "real" human signature. Of course, you can draw your signature using your mouse or touchpad, but this "fully digital" signature usually turns out
puppeteer Puppeteer: click an element and get raw JSON from the XHR/AJAX response This has lately become a pretty popular question when scraping with Puppeteer: let's say you want to interact with the page (e.g. click a button) and retrieve the raw AJAX response (usually JSON)
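A minimal sketch of the usual answer, using page.waitForResponse(); the selector and URL substring are placeholders for your page:

```javascript
// Click an element and capture the JSON body of the XHR it triggers.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');

  // Start waiting for the response BEFORE clicking, to avoid a race.
  const [response] = await Promise.all([
    page.waitForResponse((res) => res.url().includes('/api/')),
    page.click('#load-more'),
  ]);

  console.log(await response.json()); // the raw AJAX payload
  await browser.close();
})();
```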
puppeteer Puppeteer API service for web scraping Okay, let's admit it - scraping via Puppeteer and Playwright is the most versatile and flexible way of web scraping nowadays. Unfortunately, it's also the most cumbersome, time-consuming way of scraping,
sports Morning sports is my happiness magic pill I am an indie hacker and CTO in my mid-thirties, and my life improved so much a few years ago when the magic of morning activity clicked for me. After a period of burnout, reduced
php How to do web scraping in PHP Web scraping is a big and hot topic now, and PHP is a pretty fast language which is convenient for rapid prototyping and wildly popular among web developers. I have pretty extensive
scrapeninja ScrapeNinja: never handle retries and proxies in your code again I am glad to announce that the ScrapeNinja scraping solution just received a major update and got new features. Retries: retries are a must-have for every scraping project. Proxies fail to process your request, the target
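For a rough idea of what this saves you from writing client-side, here is a generic retry-with-backoff sketch. This is not ScrapeNinja's actual implementation; fetchWithRetries is a hypothetical helper, and node-fetch v2's non-standard timeout option is assumed.

```javascript
// Generic retry-with-backoff wrapper (a sketch, not ScrapeNinja's internals).
const fetch = require('node-fetch'); // assumes node-fetch v2 and its timeout option

async function fetchWithRetries(url, attempts = 3) {
  for (let i = 1; i <= attempts; i++) {
    try {
      const res = await fetch(url, { timeout: 10000 });
      if (res.ok) return res;
      throw new Error(`bad status: ${res.status}`);
    } catch (err) {
      if (i === attempts) throw err;
      const delay = 500 * 2 ** i; // exponential backoff
      console.warn(`attempt ${i} failed (${err.message}), retrying in ${delay}ms`);
      await new Promise((r) => setTimeout(r, delay));
    }
  }
}
```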
Simple proxy checker script via cURL While working on the ScrapeNinja scraping solution, I often need to verify that a particular proxy is alive and performing well. Since I don't want to use various online services, especially
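A sketch of such a check, driven from Node.js rather than a shell script (curl must be installed; the proxy and target URLs are placeholders):

```javascript
// Minimal curl-based proxy check driven from Node.js.
const { execFile } = require('child_process');

const PROXY = 'http://user:pass@12.34.56.78:3128'; // placeholder proxy
const TARGET = 'https://example.com/';

execFile('curl', [
  '-x', PROXY,             // route the request through the proxy
  '-s', '-o', '/dev/null', // silence the response body
  '--max-time', '10',      // fail fast on dead proxies
  '-w', '%{http_code} %{time_total}', // print status code and total time
  TARGET,
], (err, stdout) => {
  if (err) return console.error('proxy check failed:', err.message);
  const [status, seconds] = stdout.trim().split(' ');
  console.log(`status=${status} time=${seconds}s`);
});
```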
Sending Requests in Web Scraping: cURL, Chrome, Firefox, REST.client, netcat Contents: Chrome Dev Tools; Copy as cURL; cURL options: proxy, show only headers; Firefox: edit & resend, multi-account containers; cURL to Python scraper converter; VS Code REST.client extension; HTTP server one-liner for debugging. While working with scraping, I have
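The last item on that list can be as small as a few lines of Node.js:

```javascript
// Tiny HTTP server that logs incoming requests, handy for inspecting exactly
// what your scraper or curl command sends. Run it, then point requests at :8080.
require('http')
  .createServer((req, res) => {
    console.log(req.method, req.url, req.headers);
    res.end('ok');
  })
  .listen(8080);
```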
Making PDF look like scanned. Top 4 tools to apply scanner effect, reviewed. Some bigger companies still require wet signatures on documents, which was a source of constant hassle for me in recent years. My workflow was: receive an email with the PDF document, download the document, print the
scraping How to bypass CloudFlare 403 (code:1020) errors [UPDATED 2023] I've recently started getting Cloudflare 1020 (403) errors when scraping some random e-commerce website. At first, I thought that the website didn't like my scraper IP address, but changing IP addresses to clean
VS Code Remote for Node.js caveat: dealing with a detached nodemon process I develop all my new projects on a remote Hetzner Cloud machine, using the wonderful and almost too-good-to-be-true VS Code Remote. I recommend this setup for everyone who does not like the spinning fans of
Clickhouse as an alternative to ElasticSearch and MySQL, for log storage and analysis, in 2021 In 2018, I wrote an article about Clickhouse; that piece of content is still pretty popular across the internet and has even been translated a few times. More than two years have passed since,