Error Handling

Understanding error codes and implementing retry strategies

Learn how to handle errors from Anyhunt APIs and implement robust retry strategies.

Error Response Format

All API errors follow RFC 7807 (Problem Details):

{
  "type": "https://anyhunt.app/errors/ERROR_CODE",
  "title": "Error Title",
  "status": 400,
  "detail": "Human-readable error message",
  "code": "ERROR_CODE",
  "requestId": "req_123",
  "details": {}
}
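
A minimal sketch of turning this body into a typed error, assuming only the fields shown above. The names `AnyhuntError` and `toAnyhuntError` are hypothetical and not part of any official SDK, and the `'UNKNOWN'` fallback code is an illustrative placeholder.

```javascript
// Hypothetical error class wrapping a Problem Details body.
class AnyhuntError extends Error {
  constructor(problem, status) {
    super(problem.detail || problem.title || `Request failed (${status})`);
    this.name = 'AnyhuntError';
    this.status = problem.status ?? status;
    this.code = problem.code ?? 'UNKNOWN'; // placeholder, not an API code
    this.requestId = problem.requestId;
    this.details = problem.details ?? {};
  }
}

// Build an AnyhuntError from a parsed body and the HTTP status.
// Falls back to generic values when the body is not an object.
function toAnyhuntError(body, status) {
  const problem = body && typeof body === 'object' ? body : {};
  return new AnyhuntError(problem, status);
}
```

Callers can then branch on `error.code` rather than matching on message text.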

Common Error Codes

Client Errors (4xx)

| Code | HTTP Status | Description |
| --- | --- | --- |
| INVALID_URL | 400 | URL is malformed or uses unsupported protocol |
| URL_NOT_ALLOWED | 400 | URL blocked by SSRF protection (localhost, private IPs) |
| INVALID_PARAMETER | 400 | Request parameter validation failed |
| SELECTOR_NOT_FOUND | 400 | CSS selector not found on page |
| UNAUTHORIZED | 401 | Missing or invalid API key |
| FORBIDDEN | 403 | API key lacks required permissions |
| NOT_FOUND | 404 | Resource (job, scrape) not found |
| RATE_LIMITED | 429 | Too many requests - slow down |
| QUOTA_EXCEEDED | 429 | Monthly quota exhausted |

Server Errors (5xx)

| Code | HTTP Status | Description |
| --- | --- | --- |
| PAGE_TIMEOUT | 504 | Page load exceeded timeout |
| BROWSER_ERROR | 500 | Browser crashed or failed |
| NETWORK_ERROR | 500 | Network request failed |
| INTERNAL_ERROR | 500 | Unexpected server error |
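
The tables above suggest a simple retry rule: server errors and rate limits are transient, while other client errors are not. The sketch below encodes that rule. The grouping is an assumption drawn from the descriptions, not an official list, and `isRetryable` is a hypothetical helper name.

```javascript
// Decide whether a failed request is worth retrying.
// Assumption: 5xx codes and RATE_LIMITED are transient; everything else is not.
function isRetryable(status, code) {
  // Shares HTTP 429 with RATE_LIMITED, but waiting a few seconds will not help.
  if (code === 'QUOTA_EXCEEDED') return false;
  if (code === 'RATE_LIMITED') return true;
  return status >= 500;
}
```

Checking `code` before `status` matters here because both 429 errors need different handling.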

Handling Specific Errors

Rate Limiting

When rate limited, the response includes retry information:

{
  "type": "https://anyhunt.app/errors/RATE_LIMITED",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "Too many requests",
  "code": "RATE_LIMITED",
  "details": {
    "retryAfter": 60,
    "limit": 100,
    "remaining": 0,
    "resetAt": "2024-01-15T11:00:00.000Z"
  }
}

Headers:

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Requests allowed per window |
| X-RateLimit-Remaining | Requests remaining |
| X-RateLimit-Reset | Window reset timestamp |
| Retry-After | Seconds to wait (when limited) |
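
One way to pick a wait time from these values is sketched below, assuming `Retry-After` carries a number of seconds as described in the table. `retryDelayMs` is a hypothetical helper; `headers` can be any object with a `get` method, such as the `Headers` object on a fetch response.

```javascript
// Return the wait in milliseconds: prefer the Retry-After header,
// then details.retryAfter from the body, then a default.
function retryDelayMs(headers, body, defaultSeconds = 60) {
  const fromHeader = Number(headers.get('Retry-After'));
  if (Number.isFinite(fromHeader) && fromHeader > 0) {
    return fromHeader * 1000;
  }
  const fromBody = Number(body?.details?.retryAfter);
  if (Number.isFinite(fromBody) && fromBody > 0) {
    return fromBody * 1000;
  }
  return defaultSeconds * 1000;
}
```

A missing header yields `null` or `undefined`, which converts to `0` or `NaN` and falls through to the next source.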

Quota Exceeded

{
  "type": "https://anyhunt.app/errors/QUOTA_EXCEEDED",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "Monthly quota exceeded",
  "code": "QUOTA_EXCEEDED",
  "details": {
    "quota": 10000,
    "used": 10000,
    "resetAt": "2024-02-01T00:00:00.000Z"
  }
}
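
Because the quota only resets at `resetAt`, retrying in a loop is not useful here. A small sketch of computing how long remains, assuming `resetAt` is an ISO 8601 timestamp as in the example above (`msUntilReset` is a hypothetical helper):

```javascript
// Milliseconds until the quota resets, or null if resetAt is missing or invalid.
// `now` is injectable to keep the function easy to test.
function msUntilReset(problem, now = Date.now()) {
  const resetAt = Date.parse(problem?.details?.resetAt);
  if (Number.isNaN(resetAt)) return null;
  return Math.max(0, resetAt - now);
}
```

The result can be used to pause a job queue or to show users when requests will succeed again.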

Page Timeout

{
  "type": "https://anyhunt.app/errors/PAGE_TIMEOUT",
  "title": "Gateway Timeout",
  "status": 504,
  "detail": "Page load timed out after 30000ms",
  "code": "PAGE_TIMEOUT",
  "details": {
    "url": "https://slow-website.com",
    "timeout": 30000
  }
}

Retry Strategies

Exponential Backoff

Implement exponential backoff for transient errors:

async function scrapeWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    let response;
    let data;

    try {
      response = await fetch('https://server.anyhunt.app/api/v1/scrape', {
        method: 'POST',
        headers: {
          'Authorization': 'Bearer ah_your_api_key',
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ url, ...options }),
      });

      data = await response.json();
    } catch (error) {
      // Network failure or unparseable body: retry with exponential backoff
      if (attempt === maxRetries) {
        throw error;
      }
      await sleep(Math.pow(2, attempt) * 1000);
      continue;
    }

    if (response.ok) {
      return data;
    }

    const code = data.code;
    const detail = data.detail || `Request failed (${response.status})`;

    // Don't retry client errors (except rate limits).
    // Thrown outside the try block so it is not caught and retried.
    if (response.status >= 400 && response.status < 500 && code !== 'RATE_LIMITED') {
      throw new Error(`Client error: ${detail}`);
    }

    // No attempts left: skip the wait and fail below
    if (attempt === maxRetries) {
      break;
    }

    // Handle rate limiting
    if (code === 'RATE_LIMITED') {
      const retryAfter = data.details?.retryAfter || 60;
      console.log(`Rate limited. Waiting ${retryAfter}s...`);
      await sleep(retryAfter * 1000);
      continue;
    }

    // Retry server errors with exponential backoff
    const delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s
    console.log(`Attempt ${attempt + 1} failed. Retrying in ${delay}ms...`);
    await sleep(delay);
  }

  throw new Error('Max retries exceeded');
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
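
When many clients back off on the same schedule, their retries can arrive together. A common mitigation is to add random jitter to the delay. The sketch below uses the "full jitter" variant, which is a swapped-in technique rather than something the API requires; `backoffDelayMs` and its parameters are hypothetical.

```javascript
// Exponential backoff with full jitter: pick a random delay between 0 and
// min(maxMs, baseMs * 2^attempt). `random` is injectable for testing.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000, random = Math.random) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

The result can replace the fixed `Math.pow(2, attempt) * 1000` delay passed to `sleep`.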

Python Implementation

import time
import requests
from typing import Optional

def scrape_with_retry(
    url: str,
    options: Optional[dict] = None,
    max_retries: int = 3
) -> dict:
    options = options or {}

    for attempt in range(max_retries + 1):
        try:
            response = requests.post(
                'https://server.anyhunt.app/api/v1/scrape',
                headers={
                    'Authorization': 'Bearer ah_your_api_key',
                    'Content-Type': 'application/json',
                },
                json={'url': url, **options},
            )

            data = response.json()

            if response.ok:
                return data

            error_code = data.get('code')
            detail = data.get('detail') or f"Request failed ({response.status_code})"

            # Don't retry client errors (except rate limits)
            if 400 <= response.status_code < 500 and error_code != 'RATE_LIMITED':
                raise Exception(f"Client error: {detail}")

            # Handle rate limiting
            if error_code == 'RATE_LIMITED':
                retry_after = data.get('details', {}).get('retryAfter', 60)
                print(f"Rate limited. Waiting {retry_after}s...")
                time.sleep(retry_after)
                continue

            # Retry server errors with exponential backoff
            if attempt < max_retries:
                delay = (2 ** attempt)
                print(f"Attempt {attempt + 1} failed. Retrying in {delay}s...")
                time.sleep(delay)

        except requests.RequestException as e:
            if attempt == max_retries:
                raise
            delay = (2 ** attempt)
            time.sleep(delay)

    raise Exception('Max retries exceeded')

Error Recovery Patterns

Circuit Breaker

Prevent cascading failures with a circuit breaker:

class CircuitBreaker {
  constructor(threshold = 5, resetTimeout = 60000) {
    this.failures = 0;
    this.threshold = threshold;
    this.resetTimeout = resetTimeout;
    this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
    this.nextAttempt = 0;
  }

  async execute(fn) {
    if (this.state === 'OPEN') {
      if (Date.now() < this.nextAttempt) {
        throw new Error('Circuit breaker is OPEN');
      }
      this.state = 'HALF_OPEN';
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  onSuccess() {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  onFailure() {
    this.failures++;
    if (this.failures >= this.threshold) {
      this.state = 'OPEN';
      this.nextAttempt = Date.now() + this.resetTimeout;
    }
  }
}

// Usage
const breaker = new CircuitBreaker(5, 60000);

async function safeScrape(url) {
  return breaker.execute(() => scrapeWithRetry(url));
}

Graceful Degradation

Handle partial failures in batch operations:

async function scrapeBatchWithFallback(urls) {
  const results = [];
  const errors = [];

  for (const url of urls) {
    try {
      const result = await scrapeWithRetry(url);
      results.push({ url, success: true, data: result });
    } catch (error) {
      errors.push({ url, success: false, error: error.message });
      // Continue processing other URLs
    }
  }

  return {
    results,
    errors,
    // Avoid NaN when the input list is empty
    successRate: urls.length ? results.length / urls.length : 0,
  };
}
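
The loop above processes one URL at a time. For larger batches, a variant that runs a few requests in parallel while still collecting failures can be sketched with `Promise.allSettled`. The function name, the `scrapeFn` parameter, and the default concurrency of 3 are assumptions for illustration; keep the concurrency within your rate limit.

```javascript
// Process URLs in chunks of `concurrency`, collecting successes and failures.
// `scrapeFn` is any async function taking a URL, e.g. scrapeWithRetry.
async function scrapeBatchConcurrent(urls, scrapeFn, concurrency = 3) {
  const results = [];
  const errors = [];

  for (let i = 0; i < urls.length; i += concurrency) {
    const chunk = urls.slice(i, i + concurrency);
    // allSettled never rejects, so one failure cannot abort the chunk
    const settled = await Promise.allSettled(chunk.map((url) => scrapeFn(url)));

    settled.forEach((outcome, j) => {
      const url = chunk[j];
      if (outcome.status === 'fulfilled') {
        results.push({ url, success: true, data: outcome.value });
      } else {
        const message = outcome.reason?.message ?? String(outcome.reason);
        errors.push({ url, success: false, error: message });
      }
    });
  }

  return {
    results,
    errors,
    successRate: urls.length ? results.length / urls.length : 0,
  };
}
```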

Best Practices

  1. Check HTTP status - Use response.ok and parse RFC 7807 error bodies
  2. Implement retries - Use exponential backoff for transient errors
  3. Respect rate limits - Use Retry-After header values
  4. Log errors - Keep records for debugging
  5. Set timeouts - Don't wait indefinitely for responses
  6. Handle partial failures - In batch operations, process what you can
  7. Monitor error rates - Track errors to detect issues early
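
For the timeout recommendation, a client-side limit can be sketched with the standard `AbortController`. The helper name `fetchWithTimeout`, the 30-second default, and the injectable `fetchFn` parameter are assumptions for illustration.

```javascript
// Abort the request if it has not completed within timeoutMs.
// `fetchFn` defaults to the global fetch and is injectable for testing.
async function fetchWithTimeout(url, init = {}, timeoutMs = 30000, fetchFn = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchFn(url, { ...init, signal: controller.signal });
  } finally {
    // Always clear the timer so it cannot fire after the request settles
    clearTimeout(timer);
  }
}
```

With the global `fetch`, an aborted request rejects with an `AbortError`, which a retry loop can treat like any other network failure.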

Debugging Tips

Enable Verbose Logging

const response = await fetch('https://server.anyhunt.app/api/v1/scrape', {
  // ...
});

console.log('Status:', response.status);
console.log('Headers:', Object.fromEntries(response.headers));
console.log('Body:', await response.text());

Check Request ID

Every response includes a request ID for support:

X-Request-Id: req_abc123xyz

Include this ID when reporting issues.
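
A small sketch of capturing the ID for logs, assuming it may appear in the `X-Request-Id` header or in the `requestId` field of an error body. `getRequestId` and `describeFailure` are hypothetical helpers; `headers` can be any object with a `get` method.

```javascript
// Read the request ID from the header first, then from the error body.
function getRequestId(headers, body) {
  return headers.get('X-Request-Id') ?? body?.requestId ?? null;
}

// Build a single log line that is easy to paste into a support request.
function describeFailure(status, headers, body) {
  const id = getRequestId(headers, body) ?? 'unknown';
  const code = body?.code ?? 'unknown';
  return `status=${status} code=${code} requestId=${id}`;
}
```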