Anyhunt
API Reference

Map API

Discover all URLs on a website via sitemap or browser crawling

Map API

The Map API discovers all URLs on a website. It first attempts to parse the sitemap, then falls back to browser-based crawling if needed. Returns a list of URLs without content.

Endpoints

MethodPathDescription
POST/api/v1/mapGet website URLs (synchronous)

Discover URLs

POST /api/v1/map

Request Body

ParameterTypeDefaultDescription
urlstringrequiredWebsite base URL
searchstring-Filter URLs by keyword
limitnumber5000Maximum URLs to return (1-10000)
ignoreSitemapbooleanfalseSkip sitemap, use browser crawl only
includeSubdomainsbooleanfalseInclude subdomain URLs

Example Request

curl -X POST https://server.anyhunt.app/api/v1/map \
  -H "Authorization: Bearer ah_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "search": "blog",
    "limit": 100
  }'

Response

{
  "links": [
    "https://example.com/blog",
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-2",
    "https://example.com/blog/archive"
  ],
  "count": 4
}

How It Works

  1. Sitemap Parsing (default)

    • Fetches /sitemap.xml and /sitemap_index.xml
    • Parses XML to extract all <loc> URLs
    • Recursively parses sitemap index files
  2. Browser Crawling (fallback or when ignoreSitemap: true)

    • Starts from the homepage
    • Extracts all links from each page
    • Limited to configured MAP_MAX_CRAWL_PAGES (default 100)
  3. Filtering

    • Applies search keyword filter if specified
    • Deduplicates URLs
    • Filters out external domains (unless includeSubdomains: true)

Use Cases

Site Discovery Before Crawling

# First, discover all URLs
curl -X POST https://server.anyhunt.app/api/v1/map \
  -H "Authorization: Bearer ah_your_api_key" \
  -d '{"url": "https://docs.example.com"}'

# Then, crawl specific sections
curl -X POST https://server.anyhunt.app/api/v1/crawl \
  -H "Authorization: Bearer ah_your_api_key" \
  -d '{
    "url": "https://docs.example.com",
    "includePaths": ["/guides/*"]
  }'

Find Specific Content

# Find all pricing-related pages
curl -X POST https://server.anyhunt.app/api/v1/map \
  -H "Authorization: Bearer ah_your_api_key" \
  -d '{
    "url": "https://example.com",
    "search": "pricing"
  }'

Include Subdomains

# Get URLs from all subdomains
curl -X POST https://server.anyhunt.app/api/v1/map \
  -H "Authorization: Bearer ah_your_api_key" \
  -d '{
    "url": "https://example.com",
    "includeSubdomains": true
  }'

Notes

  • This endpoint is synchronous - the response includes all discovered URLs
  • Large sites may take longer to map (up to 30 seconds)
  • Use limit to control response size for very large sites
  • Results are deduplicated automatically