Our Scraping API is currently in beta. Features and endpoints may change before the final release.

Scraping API Documentation

Comprehensive documentation to help you integrate and use our Scraping API in your applications.

Getting Started

The Proxylite Scraping API allows you to extract data from any website while we handle all the complex parts like proxy management, JavaScript rendering, and CAPTCHA solving. Follow the steps below to integrate our API into your application.

Step 1: Sign Up & Get API Key

Create an account on Proxylite and subscribe to the Scraping API plan that suits your needs. You can find your API key in your account dashboard.

Step 2: Make Your First Request

Use your API key to authenticate and make a simple request to the API endpoint.

curl -X POST https://api.proxylite.io/v1/scrape \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
  "url": "https://example.com",
  "render_js": true,
  "proxy_type": "residential"
}'

Step 3: Parse the Response

The API will return the scraped content in JSON format. Parse this response to extract the data you need.

{
  "success": true,
  "request_id": "req_12345abcdef",
  "url": "https://example.com",
  "status_code": 200,
  "html": "<!DOCTYPE html><html>...</html>",
  "text": "Example Domain Example Domain...",
  "headers": {
    "content-type": "text/html; charset=UTF-8",
    ...
  },
  "performance": {
    "total_time": 0.875,
    "time_to_first_byte": 0.324
  }
}
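
If you are calling the API from Python, Steps 2 and 3 look like the following. This is a minimal sketch using the third-party requests library (an assumption; any HTTP client works), with the endpoint, headers, and response fields taken from the examples above.

import requests

API_KEY = "YOUR_API_KEY"

response = requests.post(
    "https://api.proxylite.io/v1/scrape",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    json={
        "url": "https://example.com",
        "render_js": True,
        "proxy_type": "residential",
    },
    timeout=60,  # client-side timeout; adjust to taste
)

data = response.json()
if data.get("success"):
    print(data["status_code"])                 # HTTP status of the scraped page
    print(data["html"][:200])                  # first 200 characters of the raw HTML
    print(data["performance"]["total_time"])   # total request time in seconds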

Authentication

All API requests must include your API key in the request headers for authentication.

API Key Authentication

Include your API key in the Authorization header using the Bearer token format:

Authorization: Bearer YOUR_API_KEY
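
In code, this means attaching the header to every request. A minimal Python sketch using the requests library, with a session so the header is set once:

import requests

session = requests.Session()
session.headers["Authorization"] = "Bearer YOUR_API_KEY"

# Every request made through this session now carries the API key;
# requests sets the JSON Content-Type automatically when json= is used.
resp = session.post(
    "https://api.proxylite.io/v1/scrape",
    json={"url": "https://example.com"},
)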

API Endpoints

The primary endpoint for the Scraping API is:

POST /v1/scrape

Submits a URL to be scraped. Requires authentication.

https://api.proxylite.io/v1/scrape

Request Parameters

The API accepts the following parameters in the JSON request body:

  • url (string, required): The URL of the page you want to scrape.
  • render_js (boolean): Whether to render JavaScript before capturing the page, as shown in the example above.
  • proxy_type (string): The proxy pool to route the request through, e.g. "datacenter" or "residential".
  • custom_headers (object): Headers to forward to the target site, such as session cookies or authentication tokens (see the FAQs below).
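
As an illustration, here is a payload combining the first three parameters, sketched in Python with the requests library (the target URL is a placeholder):

import requests

payload = {
    "url": "https://example.com/products",  # placeholder target page
    "render_js": True,                      # render JavaScript before capturing
    "proxy_type": "datacenter",             # or "residential" for harder targets
}

resp = requests.post(
    "https://api.proxylite.io/v1/scrape",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
)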

Error Handling

The API uses standard HTTP status codes to indicate the success or failure of a request. A 2xx status code indicates success, while 4xx and 5xx codes indicate errors.

Common Error Codes

  • 400 Bad Request: Invalid request parameters (e.g., missing URL, malformed JSON).
  • 401 Unauthorized: Invalid or missing API key.
  • 403 Forbidden: API key does not have permission for the requested action or resource.
  • 429 Too Many Requests: Rate limit exceeded.
  • 500 Internal Server Error: An unexpected error occurred on our end.
  • 503 Service Unavailable: The service is temporarily unavailable.

Error Response Format

Error responses will include a JSON body with details about the error:

{
  "success": false,
  "error": {
    "code": "INVALID_API_KEY",
    "message": "The provided API key is not valid."
  }
}
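
One way to handle both the status codes and the error body in client code, sketched in Python; the retry-with-backoff policy for 429/5xx responses is a client-side choice, not something the API prescribes:

import time
import requests

def scrape(url: str, api_key: str, max_retries: int = 3) -> dict:
    for attempt in range(max_retries):
        resp = requests.post(
            "https://api.proxylite.io/v1/scrape",
            headers={"Authorization": f"Bearer {api_key}"},
            json={"url": url},
            timeout=60,
        )
        if resp.ok:
            return resp.json()
        if resp.status_code in (429, 500, 503):
            # Transient errors: back off exponentially and retry.
            time.sleep(2 ** attempt)
            continue
        # 400/401/403 and other client errors: surface the API's error body.
        err = resp.json().get("error", {})
        raise RuntimeError(f"{resp.status_code} {err.get('code')}: {err.get('message')}")
    raise RuntimeError("Retries exhausted")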

Rate Limits

To ensure fair usage and stability, the API enforces rate limits based on your subscription plan. Exceeding the rate limit will result in a 429 Too Many Requests error.

Checking Your Limits

The following headers are included in the API response to help you track your usage:

  • X-RateLimit-Limit: The total number of requests allowed in the current window.
  • X-RateLimit-Remaining: The number of requests remaining in the current window.
  • X-RateLimit-Reset: The Unix timestamp (in seconds) when the rate limit window resets.

Refer to your plan details in the dashboard for specific rate limit values.
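
A client can read these headers to pace itself. A Python sketch; sleeping until the window resets is one reasonable policy, not the only one:

import time
import requests

resp = requests.post(
    "https://api.proxylite.io/v1/scrape",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"url": "https://example.com"},
)

remaining = int(resp.headers.get("X-RateLimit-Remaining", 1))
reset_at = int(resp.headers.get("X-RateLimit-Reset", 0))

if remaining == 0:
    # Quota exhausted for this window: wait until the Unix reset timestamp.
    time.sleep(max(0, reset_at - time.time()))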

Frequently Asked Questions (FAQs)

How do I handle CAPTCHAs?

Proxylite automatically attempts to solve CAPTCHAs for you when using residential proxies or when JavaScript rendering is enabled. For complex cases, specific parameters might be available (check advanced options).

Can I scrape sites that require login?

Yes, you can pass session cookies or authentication tokens via the custom_headers parameter to scrape authenticated pages.
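
For example, to scrape a page behind a login, forward the session cookie you already hold. A Python sketch; the cookie name and value are placeholders:

import requests

payload = {
    "url": "https://example.com/account",  # placeholder authenticated page
    "render_js": True,
    "custom_headers": {
        # Forwarded to the target site with the scrape request.
        "Cookie": "sessionid=YOUR_SESSION_COOKIE",
    },
}

resp = requests.post(
    "https://api.proxylite.io/v1/scrape",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
)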

What's the difference between datacenter and residential proxies?

Datacenter proxies are faster and cheaper but easier to detect. Residential proxies use real user IP addresses, making them harder to block but slightly slower and more expensive. Choose based on your target website's sensitivity.