Quickstart
Create an account and make your first API request
This guide walks you through setting up Anyhunt and scraping your first web page in minutes.
Step 1: Create an account
- Visit console.anyhunt.app
- Sign up with your email or a GitHub account
- Verify your email address
Step 2: Get an API key
- Navigate to API Keys in the sidebar
- Click Create API Key
- Give the key a name (for example, "Development")
- Copy the API key - it is shown only once!
Keep your API key secure. Never expose it in client-side code or public repositories.
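The key-handling advice above can be sketched in Python. This is a minimal illustration; the environment-variable name ANYHUNT_API_KEY is an assumption for this example, not an official Anyhunt convention:

```python
import os

# Minimal sketch: load the API key from an environment variable so it
# never appears in source code or version control. The variable name
# ANYHUNT_API_KEY is an assumption, not an official convention.
def get_api_key() -> str:
    key = os.environ.get("ANYHUNT_API_KEY")
    if not key:
        raise RuntimeError("Set the ANYHUNT_API_KEY environment variable")
    return key
```

Failing fast when the variable is unset makes a missing key obvious at startup rather than surfacing later as a 401 response.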
Step 3: Make your first request
Scrape a web page with curl or your preferred HTTP client:
curl -X POST https://server.anyhunt.app/api/v1/scrape \
-H "Authorization: Bearer ah_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"formats": ["markdown", "screenshot"],
"onlyMainContent": true
}'
Response
{
"id": "scrape_abc123",
"url": "https://example.com",
"markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
"screenshot": {
"url": "https://cdn.anyhunt.app/scraper/scrape_abc123.png",
"width": 1280,
"height": 800,
"format": "png"
},
"metadata": {
"title": "Example Domain",
"description": "Example Domain for illustrative examples"
}
}
Step 4: Use the results
The response contains the extracted content for each format you requested:
- markdown - clean, readable text extracted from the page
- screenshot - CDN URL where the screenshot is hosted
- html - cleaned HTML content
- links - all links discovered on the page
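As a rough sketch of consuming these fields, the helper below (hypothetical, not part of any Anyhunt SDK) picks the requested formats out of a parsed response:

```python
# Hypothetical helper: given a parsed scrape response and the list of
# formats that was requested, return only the format fields the response
# actually contains (field names follow the sample response above).
def extract_formats(data: dict, formats: list[str]) -> dict:
    return {name: data[name] for name in formats if name in data}
```

For example, `extract_formats(response_json, ["markdown", "links"])` would return just those two fields, skipping any format the server did not include.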
Code examples
Node.js
const response = await fetch('https://server.anyhunt.app/api/v1/scrape', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.ANYHUNT_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
url: 'https://example.com',
formats: ['markdown', 'links'],
onlyMainContent: true,
}),
});
const data = await response.json();
console.log(data.markdown);
console.log(data.links);
Python
import os
import requests
response = requests.post(
'https://server.anyhunt.app/api/v1/scrape',
headers={
'Authorization': f'Bearer {os.environ["ANYHUNT_API_KEY"]}',
'Content-Type': 'application/json',
},
json={
'url': 'https://example.com',
'formats': ['markdown', 'links'],
'onlyMainContent': True,
}
)
data = response.json()
print(data['markdown'])
print(data['links'])
Go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	payload := map[string]interface{}{
		"url":             "https://example.com",
		"formats":         []string{"markdown", "links"},
		"onlyMainContent": true,
	}
	body, err := json.Marshal(payload)
	if err != nil {
		panic(err)
	}
	req, err := http.NewRequest("POST", "https://server.anyhunt.app/api/v1/scrape", bytes.NewBuffer(body))
	if err != nil {
		panic(err)
	}
	// Read the API key from the environment instead of hard-coding it.
	req.Header.Set("Authorization", "Bearer "+os.Getenv("ANYHUNT_API_KEY"))
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	result, _ := io.ReadAll(resp.Body)
	fmt.Println(string(result))
}
Common use cases
Web scraping
Extract any web page's content as clean markdown:
{
"url": "https://blog.example.com/article",
"formats": ["markdown"],
"onlyMainContent": true
}
Screenshot capture
Capture full-page screenshots for visual testing or archiving:
{
"url": "https://example.com",
"formats": ["screenshot"],
"screenshotOptions": {
"fullPage": true,
"format": "webp",
"quality": 90
}
}
Link discovery
Discover all links on a page for SEO analysis or crawling:
{
"url": "https://example.com",
"formats": ["links"]
}
Multi-page crawling
Crawl an entire site to extract content from multiple pages:
curl -X POST https://server.anyhunt.app/api/v1/crawl \
-H "Authorization: Bearer ah_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"url": "https://docs.example.com",
"maxDepth": 2,
"limit": 50
}'
See the Crawl API documentation for details.
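The same crawl request can be issued from Python. This is a sketch mirroring the curl call above; the API key is assumed to live in an ANYHUNT_API_KEY environment variable, and the build/send split is only a convenience for inspecting the request without hitting the network:

```python
import json
import os

# Build the crawl request shown above. Separating the build step from
# the send step makes the request easy to inspect or test offline.
def build_crawl_request(url: str, max_depth: int = 2, limit: int = 50) -> dict:
    return {
        "endpoint": "https://server.anyhunt.app/api/v1/crawl",
        "headers": {
            "Authorization": f"Bearer {os.environ.get('ANYHUNT_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"url": url, "maxDepth": max_depth, "limit": limit}),
    }

# To actually send it:
# import requests
# req = build_crawl_request("https://docs.example.com")
# resp = requests.post(req["endpoint"], headers=req["headers"], data=req["body"])
```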