Our Scraping API is currently in beta. Features and endpoints may change before the final release.
Scraping API Documentation
Comprehensive documentation to help you integrate and use our Scraping API in your applications.
Getting Started
The Proxylite Scraping API allows you to extract data from any website while we handle all the complex parts like proxy management, JavaScript rendering, and CAPTCHA solving. Follow the steps below to integrate our API into your application.
Step 1: Sign Up & Get API Key
Create an account on Proxylite and subscribe to the Scraping API plan that suits your needs. You can find your API key in your account dashboard.
Step 2: Make Your First Request
Use your API key to authenticate and make a simple request to the API endpoint.
curl -X POST https://api.proxylite.io/v1/scrape \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"url": "https://example.com",
"render_js": true,
"proxy_type": "residential"
}'
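The same request can be sent from Python using only the standard library. This is a minimal sketch rather than an official client: the helper names `build_scrape_request` and `scrape` and the 60-second timeout are assumptions, and the payload fields simply mirror the curl example above.

```python
import json
import urllib.request

API_URL = "https://api.proxylite.io/v1/scrape"  # endpoint from the curl example


def build_scrape_request(api_key, url, render_js=True, proxy_type="residential"):
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {"url": url, "render_js": render_js, "proxy_type": proxy_type}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def scrape(api_key, url, timeout=60):
    """Send the request and return the decoded JSON response body."""
    request = build_scrape_request(api_key, url)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `scrape("YOUR_API_KEY", "https://example.com")` should return the parsed JSON body as a dictionary, provided the key is valid.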
Step 3: Parse the Response
The API will return the scraped content in JSON format. Parse this response to extract the data you need.
{
"success": true,
"request_id": "req_12345abcdef",
"url": "https://example.com",
"status_code": 200,
"html": "<!DOCTYPE html><html>...</html>",
"text": "Example Domain Example Domain...",
"headers": {
"content-type": "text/html; charset=UTF-8",
...
},
"performance": {
"total_time": 0.875,
"time_to_first_byte": 0.324
}
}
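As an illustration of the parsing step, the sketch below pulls the most commonly used fields out of a response shaped like the sample above. The `ScrapeResult` class and `parse_scrape_response` function are hypothetical names, and the fallbacks for missing fields are assumptions, since the documentation does not say which fields are always present.

```python
import json
from dataclasses import dataclass


@dataclass
class ScrapeResult:
    request_id: str
    status_code: int
    html: str
    text: str
    total_time: float


def parse_scrape_response(body):
    """Extract selected fields from a successful response body (str or bytes)."""
    data = json.loads(body)
    if not data.get("success"):
        raise ValueError("response does not report success")
    return ScrapeResult(
        request_id=data["request_id"],
        status_code=data["status_code"],
        html=data["html"],
        # Assumed optional: fall back to neutral values if absent.
        text=data.get("text", ""),
        total_time=data.get("performance", {}).get("total_time", 0.0),
    )


sample = """{
  "success": true,
  "request_id": "req_12345abcdef",
  "url": "https://example.com",
  "status_code": 200,
  "html": "<!DOCTYPE html><html>...</html>",
  "text": "Example Domain Example Domain...",
  "performance": {"total_time": 0.875, "time_to_first_byte": 0.324}
}"""
result = parse_scrape_response(sample)
print(result.request_id, result.status_code, result.total_time)
```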
Authentication
All API requests must include your API key in the request headers for authentication.
API Key Authentication
Include your API key in the Authorization header using the Bearer token format:
Authorization: Bearer YOUR_API_KEY
API Endpoints
The primary endpoint for the Scraping API is:
POST /v1/scrape
Submits a URL to be scraped. Requires authentication.
https://api.proxylite.io/v1/scrape
Request Parameters
The API accepts the following parameters in the JSON request body:
url (string): The URL of the page you want to scrape.
Error Handling
The API uses standard HTTP status codes to indicate the success or failure of a request. A 2xx status code indicates success, while 4xx and 5xx codes indicate errors.
Common Error Codes
400 Bad Request: Invalid request parameters (e.g., missing URL, malformed JSON).
401 Unauthorized: Invalid or missing API key.
403 Forbidden: API key does not have permission for the requested action or resource.
429 Too Many Requests: Rate limit exceeded.
500 Internal Server Error: An unexpected error occurred on our end.
503 Service Unavailable: The service is temporarily unavailable.
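One way a client might act on these codes is to separate errors worth retrying from errors that need a fix on the caller's side. The grouping below is an assumption for illustration, not documented Proxylite behavior: it treats 429, 500, and 503 as retryable and every other non-2xx code as fatal. The function name `classify_status` is hypothetical.

```python
# Assumed grouping: rate limiting and server-side failures may succeed later.
RETRYABLE_STATUS_CODES = {429, 500, 503}


def classify_status(status_code):
    """Return 'success', 'retryable', or 'fatal' for an HTTP status code."""
    if 200 <= status_code < 300:
        return "success"
    if status_code in RETRYABLE_STATUS_CODES:
        return "retryable"
    # 400, 401, 403 and anything unexpected: retrying unchanged will not help.
    return "fatal"


for code in (200, 400, 401, 403, 429, 500, 503):
    print(code, classify_status(code))
```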
Error Response Format
Error responses will include a JSON body with details about the error:
{
"success": false,
"error": {
"code": "INVALID_API_KEY",
"message": "The provided API key is not valid."
}
}
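An error body in this format can be turned into a Python exception so that callers can branch on the error code. This is a sketch under stated assumptions: `ScrapeApiError` and `error_from_response` are hypothetical names, and the `UNKNOWN` fallback code is invented here for bodies that are not valid JSON or lack an `error` object.

```python
import json


class ScrapeApiError(Exception):
    """Carries the HTTP status plus the code and message from the error body."""

    def __init__(self, status_code, code, message):
        super().__init__(f"{status_code} {code}: {message}")
        self.status_code = status_code
        self.code = code
        self.message = message


def error_from_response(status_code, body):
    """Build a ScrapeApiError from an HTTP status and a raw response body."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers json.JSONDecodeError and bad encodings
        payload = {}
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        error = {}
    return ScrapeApiError(
        status_code,
        error.get("code", "UNKNOWN"),  # fallback value is an assumption
        error.get("message", ""),
    )


body = '{"success": false, "error": {"code": "INVALID_API_KEY", "message": "The provided API key is not valid."}}'
print(error_from_response(401, body))
```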
Rate Limits
To ensure fair usage and stability, the API enforces rate limits based on your subscription plan. Exceeding the rate limit will result in a 429 Too Many Requests error.
Checking Your Limits
The following headers are included in the API response to help you track your usage:
X-RateLimit-Limit: The total number of requests allowed in the current window.
X-RateLimit-Remaining: The number of requests remaining in the current window.
X-RateLimit-Reset: The Unix timestamp (in seconds) when the rate limit window resets.
Refer to your plan details in the dashboard for specific rate limit values.
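These headers can be used to decide how long to pause before sending the next request. The sketch below assumes the header values arrive as strings and that a missing header means no limit information is available; `seconds_until_reset` is a hypothetical helper name.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before the next request is allowed.

    Returns 0.0 while requests remain in the current window.
    """
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", 1))
    reset_at = float(lowered.get("x-ratelimit-reset", now))
    if remaining > 0:
        return 0.0
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - now)


example_headers = {
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1700000060",
}
print(seconds_until_reset(example_headers, now=1700000000))
```

A caller would typically pass the result to `time.sleep` before retrying a request that returned 429.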
Frequently Asked Questions (FAQs)
How do I handle CAPTCHAs?
Proxylite automatically attempts to solve CAPTCHAs for you when using residential proxies or when JavaScript rendering is enabled. For complex cases, specific parameters might be available (check advanced options).
Can I scrape sites that require login?
Yes, you can pass session cookies or authentication tokens via the custom_headers parameter to scrape authenticated pages.
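The exact shape of custom_headers is not specified here, so the following sketch assumes it is a JSON object mapping header names to values. Under that assumption, a request body carrying a session cookie or a token for the target site could be built like this (`build_authenticated_payload` is a hypothetical helper name):

```python
import json


def build_authenticated_payload(url, cookies=None, bearer_token=None):
    """Build a JSON request body that forwards credentials to the target site.

    Assumes custom_headers is an object of header name -> value pairs.
    """
    custom_headers = {}
    if cookies:
        # Standard Cookie header syntax: name=value pairs joined by "; ".
        custom_headers["Cookie"] = "; ".join(
            f"{name}={value}" for name, value in cookies.items()
        )
    if bearer_token:
        # This token is for the target site, not the Proxylite API key.
        custom_headers["Authorization"] = f"Bearer {bearer_token}"
    payload = {"url": url}
    if custom_headers:
        payload["custom_headers"] = custom_headers
    return json.dumps(payload)


print(build_authenticated_payload(
    "https://example.com/account",
    cookies={"sessionid": "abc123"},
))
```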
What's the difference between datacenter and residential proxies?
Datacenter proxies are faster and cheaper but easier to detect. Residential proxies use real user IP addresses, making them harder to block but slightly slower and more expensive. Choose based on your target website's sensitivity.