cURL examples: requests with proxy, set user agent, send POST JSON request, and more

cURL is a small *nix utility to perform network requests.

This is a quick cheat sheet on using cURL for web scraping, or for any other case where your web requests need to appear to come from another IP address.

cURL set proxy

Setting the proxy URL for a cURL request:

curl --proxy 'http://login:pw@proxy-hostname.com:port' https://lumtest.com/myip.json

The short form of --proxy is -x, so this command is exactly equivalent:

curl -x 'http://login:pw@proxy-hostname.com:port' https://lumtest.com/myip.json

cURL supports HTTP, HTTPS, and SOCKS (SOCKS4, SOCKS4a, SOCKS5) proxies. The scheme at the start of the proxy URL selects the proxy type.
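A minimal sketch of how the scheme prefix picks the proxy type. The helper name proxy_url, the credentials, the host, and the port numbers are all placeholders made up for illustration, not values from a real provider:

```shell
# Hypothetical helper: assemble a proxy URL for curl's -x/--proxy option.
# Arguments: scheme, login, password, host, port (all placeholders).
proxy_url() {
  printf '%s://%s:%s@%s:%s' "$1" "$2" "$3" "$4" "$5"
}

# The scheme decides which proxy protocol curl speaks.
# socks5h:// additionally asks the proxy to resolve hostnames.
http_proxy_url=$(proxy_url http login pw proxy-hostname.com 8080)
socks_proxy_url=$(proxy_url socks5h login pw proxy-hostname.com 1080)

printf '%s\n' "$http_proxy_url"
printf '%s\n' "$socks_proxy_url"
```

Either value can then be passed to cURL, for example: curl -x "$socks_proxy_url" https://lumtest.com/myip.json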

For a simple proxy checker script, powered by cURL and available in your terminal, see this post.

Setting headers in cURL

It is always a good idea to set a browser User-Agent along with the proxy:

curl --proxy 'http://login:pw@proxy-hostname.com:port' -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' https://lumtest.com/myip.json
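To avoid retyping the header on every call, the User-Agent can be kept in a variable and applied by a small wrapper. This is only a sketch: the wrapper name browser_curl and the extra Accept-Language header are assumptions added for illustration. It uses -A, which is cURL's dedicated option for the User-Agent header.

```shell
# Browser-like User-Agent string, copied from the example above.
UA='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36'

# Hypothetical wrapper: send every request with the browser User-Agent
# and an Accept-Language header (an illustrative extra header).
# Any further arguments, such as -x and the target URL, pass through.
browser_curl() {
  curl -A "$UA" -H 'Accept-Language: en-US,en;q=0.9' "$@"
}
```

Usage would then look like: browser_curl -x 'http://login:pw@proxy-hostname.com:port' https://lumtest.com/myip.json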

Send POST request with JSON in cURL

curl -X POST https://apiroad.net/post-json.php -H 'Content-Type: application/json' \
   -d '{"login":"my_login","password":"my_password"}'

Note how the -X option specifies the POST method. Do not confuse the two: lowercase -x sets the proxy, uppercase -X sets the HTTP method! Because -d already makes cURL send a POST, -X POST is optional in this command.
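When the login and password come from shell variables, the JSON body can be assembled before it is handed to -d. A minimal sketch, assuming a hypothetical helper named login_payload and values that contain no double quotes or backslashes (printf does no JSON escaping):

```shell
# Hypothetical helper: build the JSON body from a login and a password.
# Assumes neither value contains double quotes or backslashes,
# because printf performs no JSON escaping.
login_payload() {
  printf '{"login":"%s","password":"%s"}' "$1" "$2"
}

payload=$(login_payload my_login my_password)
printf '%s\n' "$payload"
```

The result can be sent with: curl -X POST https://apiroad.net/post-json.php -H 'Content-Type: application/json' -d "$payload"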

Get your machine's IP address via cURL:

curl https://lumtest.com/myip.json

Sample output:

{"ip":"176.132.26.163","country":"TR","asn":{"asnum":35984,"org_name":"Superonline A.S."},"geo":{"city":"Antalya","region":"07","region_name":"Antalya","postal_code":"07070","latitude":36.2409,"longitude":30.3219,"tz":"Europe/Istanbul","lum_city":"antalya","lum_region":"07"}}

Of course, the same trick shows the location of a proxy:

curl -x http://login:pw@proxyhost:port https://lumtest.com/myip.json
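To compare the direct and the proxied address, only the ip and country fields of the response are needed. The sketch below uses a hypothetical helper, json_field, built on grep and cut. It assumes the simple one-line response shape shown in the sample output and runs here against a shortened copy of that sample. A real JSON parser such as jq would be more robust.

```shell
# Hypothetical helper: print the value of the first string field with
# the given name from a one-line JSON document read on stdin.
# A grep/cut sketch, not a real JSON parser.
json_field() {
  grep -o "\"$1\":\"[^\"]*\"" | head -n 1 | cut -d '"' -f 4
}

# Shortened copy of the sample response from lumtest.com.
sample='{"ip":"176.132.26.163","country":"TR","asn":{"asnum":35984,"org_name":"Superonline A.S."}}'

printf '%s\n' "$sample" | json_field ip
printf '%s\n' "$sample" | json_field country
```

Against the live endpoint this becomes: curl -s https://lumtest.com/myip.json | json_field ip (add -x with your proxy URL to see the proxied address; -s hides the progress meter).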

View the HTTP response code and more details in cURL

curl -v https://lumtest.com/myip.json 

The -v flag stands for verbose: cURL prints connection, TLS, and request/response header details along with the response body.

Sample output:

*   Trying 3.94.40.55:443...
* Connected to lumtest.com (3.94.40.55) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (OUT), TLS handshake, Client hello (1):
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=lumtest.com
*  start date: Oct 12 00:00:00 2022 GMT
*  expire date: Oct 18 23:59:59 2023 GMT
*  subjectAltName: host "lumtest.com" matched cert's "lumtest.com"
*  issuer: C=GB; ST=Greater Manchester; L=Salford; O=Sectigo Limited; CN=Sectigo RSA Domain Validation Secure Server CA
*  SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x125811400)
> GET /myip.json HTTP/2
> Host: lumtest.com
> user-agent: curl/7.79.1
> accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
< HTTP/2 200 
< server: nginx
< date: Mon, 12 Dec 2022 16:21:47 GMT
< content-type: application/json; charset=utf-8
< content-length: 296
< cache-control: no-store
< access-control-allow-origin: *
< 
* Connection #0 to host lumtest.com left intact
{"ip":"176.132.26.163","country":"TR","asn":{"asnum":35984,"org_name":"Superonline A.S."},"geo":{"city":"Antalya","region":"07","region_name":"Antalya","postal_code":"07070","latitude":36.2409,"longitude":30.3219,"tz":"Europe/Istanbul","lum_city":"antalya","lum_region":"07"}}
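In the verbose output, lines starting with < are response headers, and the first of them carries the status code. A minimal sketch of pulling that code out of a saved log follows. The helper name status_from_verbose is made up for illustration, and the input is three lines copied from the sample output above.

```shell
# Hypothetical helper: read curl -v output on stdin and print the
# status code from the last response status line, e.g. "< HTTP/2 200".
status_from_verbose() {
  awk '/^< HTTP\// { code = $3 } END { if (code != "") print code }'
}

# Three lines copied from the sample verbose output.
sample_log='> GET /myip.json HTTP/2
< HTTP/2 200
< server: nginx'

printf '%s\n' "$sample_log" | status_from_verbose
```

cURL writes the verbose details to stderr, so a live call needs a redirect: curl -sv https://lumtest.com/myip.json 2>&1 >/dev/null | status_from_verbose. If only the code is wanted, curl -s -o /dev/null -w '%{http_code}\n' https://lumtest.com/myip.json prints it directly.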

Generate Python web scraper from cURL

Converting cURL commands to Python is possible with the ScrapeNinja cURL converter.

scrapeninja.net does even more: not only will it convert your cURL command to Python Requests code, it will also use a special scraping cloud API which will automatically:

  1. Use rotating residential proxies (and you can choose multiple proxy locations)
  2. Present the TLS fingerprint of a Chrome browser instead of the TLS fingerprint of Python, which helps to unblock a lot of websites
  3. Leverage smart retry strategies