API Reference

Base URL: https://api.maango.io

Beta: This API is in active development. Endpoints and response shapes may change. If something looks off, let us know at contact@maango.io.

All endpoints except /v1/auth/signup require an Authorization: Bearer YOUR_KEY header.

Quick Start

Get up and running in under a minute.

Get your API key

Request a key with curl:

curl -X POST https://api.maango.io/v1/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"email": "you@example.com"}'

Make your first call

curl -H "Authorization: Bearer YOUR_KEY" \
  https://api.maango.io/v1/domain/nytimes.com

Response:

{
  "domain": "nytimes.com",
  "found": true,
  "stance": "blocks_all_ai",
  "use_cases": {
    "training": "blocked",
    "search": "blocked",
    "inference": "blocked"
  },
  "bots": { "blocked": ["GPTBot", "ClaudeBot", "...29 total"], "allowed": [] },
  "signals": { "robots_txt": true, "llms_txt": false, "ai_txt": false, "tdmrep": false, "content_signals": false, "agents_json": false }
}

Integrate into your agent

import httpx

MAANGO_KEY = "maango_sk_xxxxx"

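async def check_before_visit(url: str) -> bool:
    # Take the hostname from the URL and drop a leading "www."
    domain = url.split("//")[-1].split("/")[0].removeprefix("www.")
    # httpx.get is synchronous; awaited requests need an AsyncClient
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"https://api.maango.io/v1/domain/{domain}",
            headers={"Authorization": f"Bearer {MAANGO_KEY}"}
        )
    policy = r.json()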
    if not policy.get("found"):
        return True  # Unknown domain, proceed with caution
    if policy["stance"] == "blocks_all_ai":
        return False  # Blocked, don't visit
    return True

Endpoints

POST /v1/auth/signup

Get an API key. No authentication required.

Parameters

Name	Type	Required	Description
email	string	Yes	Your email address

Example Request

curl -X POST https://api.maango.io/v1/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"email": "developer@example.com"}'

Example Response

{
  "api_key": "maango_sk_a1b2c3d4e5f6...",
  "email": "developer@example.com",
  "limits": {
    "per_minute": 200,
    "per_day": 10000,
    "per_month": 100000
  },
  "message": "Store this key securely. It will not be shown again."
}

Errors

400 Invalid email format

429 Too many signups from this IP (limit: 3 per day)
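The limits object in the signup response can drive a client-side throttle so that calls rarely reach a 429. The sketch below is one way to do that. MinuteLimiter is a hypothetical helper, not part of the Maango API, and it assumes the per_minute limit is counted over a rolling 60-second window, which this reference does not state.

```python
import time
from collections import deque


class MinuteLimiter:
    """Client-side sliding-window limiter (hypothetical helper)."""

    def __init__(self, per_minute: int, clock=time.monotonic):
        self.per_minute = per_minute
        self.clock = clock      # injectable so the limiter can be tested without sleeping
        self.sent = deque()     # timestamps of calls made in the last 60 seconds

    def wait_time(self) -> float:
        """Seconds to wait before the next call fits the per-minute budget."""
        now = self.clock()
        # Drop timestamps that have left the 60-second window
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.per_minute:
            return 0.0
        return 60 - (now - self.sent[0])

    def record(self) -> None:
        """Note that a call was just made."""
        self.sent.append(self.clock())
```

Typical use would be to sleep for wait_time() seconds before each request and call record() right after sending it, with per_minute taken from limits["per_minute"].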

GET /v1/domain/{domain}

Check any domain's AI policy. This is the primary endpoint.

Requires authentication

Parameters

Name	Type	Required	Description
domain	string	Yes	The domain to look up (e.g. nytimes.com)

Example Request

curl https://api.maango.io/v1/domain/nytimes.com \
  -H "Authorization: Bearer maango_sk_xxxxx"

Example Response

{
  "domain": "nytimes.com",
  "found": true,
  "stance": "blocks_all_ai",
  "detail": "blocks_all_ai",
  "use_cases": {
    "training": "blocked",
    "search": "blocked",
    "inference": "blocked"
  },
  "bots": {
    "blocked": [
      "CCBot", "GPTBot", "Scrapy", "YouBot", "BLEXBot",
      "Diffbot", "Timpibot", "ClaudeBot", "Omgilibot",
      "cohere-ai", "omgilibot", "Bytespider", "Claude-Web",
      "Claude-User", "FacebookBot", "TurnitinBot", "ChatGPT-User",
      "ImagesiftBot", "anthropic-ai", "DataForSeoBot",
      "DuckAssistBot", "OAI-SearchBot", "PerplexityBot",
      "magpie-crawler", "Google-Extended", "Claude-SearchBot",
      "Applebot-Extended", "meta-externalagent",
      "meta-externalfetcher"
    ],
    "allowed": []
  },
  "signals": {
    "robots_txt": true,
    "llms_txt": false,
    "ai_txt": false,
    "tdmrep": false,
    "content_signals": false,
    "agents_json": false
  },
  "site": {
    "tranco_rank": 155,
    "cdn": "fastly",
    "cms": "wordpress",
    "behind_cloudflare": false,
    "has_paywall": false,
    "has_tos": true,
    "tos_url": "https://nytimes.com/terms"
  },
  "last_updated": "2026-02-27T03:19:29Z"
}

Errors

400 Invalid domain format

401 Missing or invalid API key

429 Rate limit exceeded

Domain not found? If the domain isn't in the registry, the response contains "found": false and "queued": true. The domain is queued for analysis and will be available shortly. In your code, treat "found": false as "unknown": either allow with caution or block, depending on your use case.
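The choice between allowing and blocking unknown domains can be made explicit with a single flag. This is a minimal sketch: decide and allow_unknown are illustrative names, and the function only reads the found, queued, and stance fields shown in the responses above.

```python
def decide(policy: dict, allow_unknown: bool = True) -> tuple:
    """Turn a /v1/domain response into (allowed, reason)."""
    if not policy.get("found"):
        # Not in the registry; the API reports "queued": true while it is analyzed
        reason = "queued" if policy.get("queued") else "unknown"
        return allow_unknown, reason
    if policy.get("stance") == "blocks_all_ai":
        return False, "blocks_all_ai"
    return True, policy.get("stance", "unknown")
```

A crawler that must never touch an unvetted site would pass allow_unknown=False; a research tool might keep the default and log the reason.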

Data freshness: The last_updated field shows when the domain was last crawled. Maango re-crawls the full dataset regularly. If last_updated is more than 30 days old, consider rechecking, since policies can change as publishers update their robots.txt or adopt new standards.
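A staleness check against last_updated might look like the sketch below. is_stale is a hypothetical helper; the 30-day default mirrors the guidance above, and the timestamp is assumed to be ISO 8601 in UTC, as in the example response.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(last_updated: str, max_age_days: int = 30,
             now: Optional[datetime] = None) -> bool:
    """Return True if the policy is older than max_age_days."""
    # Older Python versions reject a trailing "Z" in fromisoformat
    parsed = datetime.fromisoformat(last_updated.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return now - parsed > timedelta(days=max_age_days)
```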

Stance → Detail mapping

Each stance can have multiple detail values. Use stance for broad decisions and detail for fine-grained logic:

Stance	Possible detail values
blocks_all_ai	blocks_all_ai, wildcard_block
blocks_training	blocks_training
selective	selective
allows_all	allows_all, has_llms_txt
no_policy	no_ai_rules, no_robots, has_signals, robots_blocked, unreachable

The `detail` field

A more granular classification than stance. Possible values:

Value	Description
blocks_all_ai	Explicitly blocks all AI bots
blocks_training	Blocks training use specifically
selective	Allows some bots/use-cases, blocks others
allows_all	Explicitly allows AI access
no_ai_rules	Has robots.txt but no AI-specific directives
has_llms_txt	No AI bot rules but provides an llms.txt file
has_signals	No AI bot rules but has other signals (content signals, meta tags)
no_robots	No robots.txt file found at all
robots_blocked	robots.txt fetch was blocked (e.g. Cloudflare 403)
wildcard_block	Blocks all bots with a wildcard rule, not AI-specific
unreachable	Domain couldn't be reached during crawl
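The two tables above can be combined into a small lookup that checks the stance/detail pair and then picks an action. This is only a sketch: the action labels ("block", "check_use_cases", "allow", "retry_later", "no_rules", "unexpected") and the choice to treat robots_blocked and unreachable as retryable are assumptions for illustration, not values the API returns.

```python
# Stance -> detail values, copied from the mapping table above
STANCE_DETAILS = {
    "blocks_all_ai": {"blocks_all_ai", "wildcard_block"},
    "blocks_training": {"blocks_training"},
    "selective": {"selective"},
    "allows_all": {"allows_all", "has_llms_txt"},
    "no_policy": {"no_ai_rules", "no_robots", "has_signals",
                  "robots_blocked", "unreachable"},
}

# Details where the crawl failed, as opposed to finding no rules (assumed grouping)
CRAWL_FAILED = {"robots_blocked", "unreachable"}


def classify(stance: str, detail: str) -> str:
    """Map a (stance, detail) pair to an illustrative action label."""
    if detail not in STANCE_DETAILS.get(stance, set()):
        return "unexpected"  # the API is in beta, so new values may appear
    if stance == "blocks_all_ai":
        return "block"
    if stance in ("blocks_training", "selective"):
        return "check_use_cases"  # inspect use_cases for the specific purpose
    if stance == "allows_all":
        return "allow"
    # Remaining case is no_policy
    return "retry_later" if detail in CRAWL_FAILED else "no_rules"
```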

GET /v1/domain/{domain}/full

Returns the full raw crawl data for a domain: all stored JSONB fields, parsed, including raw robots.txt content, meta tags, headers, and ToS snippets.

Requires authentication

Parameters

Name	Type	Required	Description
domain	string	Yes	The domain to look up

Example Request

curl https://api.maango.io/v1/domain/nytimes.com/full \
  -H "Authorization: Bearer maango_sk_xxxxx"

Example Response

{
  "id": 3561,
  "domain": "nytimes.com",
  "robots_txt_exists": true,
  "llms_txt_exists": false,
  "ai_txt_exists": false,
  "tdmrep_exists": false,
  "agents_json_exists": false,
  "ai_stance": "blocks_all_ai",
  "policy_detail": "blocks_all_ai",
  "ai_openness_score": 10.0,
  "policy_completeness": "moderate",
  "confidence": 0.7,
  "ai_bots_blocked_count": 29,
  "ai_bots_allowed_count": 0,
  "ai_bots": {
    "GPTBot": "blocked",
    "ClaudeBot": "blocked",
    "CCBot": "blocked"
    // ... all 29 bots
  },
  "all_bots": {
    "GPTBot": {
      "status": "blocked",
      "is_ai_bot": true,
      "allowed_paths": [],
      "disallowed_paths": ["/"]
    }
    // ... all bots including non-AI
  },
  "crawl_rules": {
    "crawl_delays": {},
    "blocked_paths": [],
    "sitemaps_count": 10,
    "directives_count": 325,
    "has_wildcard_block": false
  },
  "use_case_policies": {
    "training": "blocked",
    "search": "blocked",
    "inference": "blocked"
  },
  "conflicts": ["robots_blocks_ai_but_tos_allows_or_silent"],
  "tos_url": "https://nytimes.com/terms",
  "tos_ai_stance": "silent",
  "cdn_provider": "fastly",
  "cms": "wordpress",
  "has_paywall": false,
  "behind_cloudflare": false,
  "markdown_for_agents": false,
  "parsed_at": "2026-02-12T23:33:52.922662+00:00"
  // ... additional fields
}

Errors

400 Invalid domain format

401 Missing or invalid API key

429 Rate limit exceeded

The confidence field (0.0–1.0) indicates how reliably Maango could determine the domain's AI policy. Low values (below 0.5) typically mean the site was partially unreachable, had ambiguous signals, or conflicting directives. Use it to decide whether to trust the stance or fall back to a default policy.
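One way to apply the confidence field is to fall back to a default stance whenever the score is missing or low. In this sketch, trusted_stance is a hypothetical helper, the 0.5 threshold comes from the guidance above, and the fallback value is whatever default your application prefers.

```python
def trusted_stance(record: dict, min_confidence: float = 0.5,
                   fallback: str = "no_policy") -> str:
    """Return ai_stance from a /full record, or fallback if confidence is too low."""
    confidence = record.get("confidence")
    if confidence is None or confidence < min_confidence:
        return fallback
    return record.get("ai_stance", fallback)
```

A conservative agent could pass fallback="blocks_all_ai" so that low-confidence records are treated as blocked.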

GET /v1/search

Search domains by name with optional stance filter.

Requires authentication

Parameters

Name	Type	Required	Description
q	string	Yes	Search query (min 2 characters)
stance	string	No	Filter by stance: blocks_all_ai, selective, allows_all, no_policy, blocks_training
limit	integer	No	Results per page, 1-100 (default: 20)
offset	integer	No	Pagination offset (default: 0)

Example Request

curl "https://api.maango.io/v1/search?q=news&stance=blocks_all_ai&limit=5" \
  -H "Authorization: Bearer maango_sk_xxxxx"

Example Response

{
  "results": [
    {
      "domain": "news.com.au",
      "stance": "blocks_all_ai",
      "detail": "blocks_all_ai",
      "tranco_rank": 842
    }
  ],
  "limit": 5,
  "offset": 0,
  "has_more": true
}

Errors

400 Query too short (min 2 characters) or invalid parameters

401 Missing or invalid API key

429 Rate limit exceeded
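To walk through every page of search results, advance offset by limit until has_more is false. The sketch below is one possible shape for that loop. iter_search and make_fetcher are hypothetical names, max_results is a safety cap added here, and the fetcher uses the standard library's urllib instead of httpx only to keep the example free of dependencies.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional

API = "https://api.maango.io/v1/search"


def make_fetcher(api_key: str, q: str,
                 stance: Optional[str] = None) -> Callable[[int, int], dict]:
    """Build a fetch_page(limit, offset) callable backed by the live API."""
    def fetch_page(limit: int, offset: int) -> dict:
        params = {"q": q, "limit": limit, "offset": offset}
        if stance:
            params["stance"] = stance
        req = urllib.request.Request(
            f"{API}?{urllib.parse.urlencode(params)}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.load(resp)
    return fetch_page


def iter_search(fetch_page: Callable[[int, int], dict], limit: int = 20,
                max_results: int = 500) -> Iterator[dict]:
    """Yield results page by page until has_more is false or the cap is hit."""
    offset = 0
    yielded = 0
    while yielded < max_results:
        page = fetch_page(limit, offset)
        results = page.get("results", [])
        for item in results:
            yield item
            yielded += 1
            if yielded >= max_results:
                return
        # Stop on the last page, or if a page comes back empty
        if not page.get("has_more") or not results:
            return
        offset += limit
```

Usage would be along the lines of iter_search(make_fetcher("maango_sk_xxxxx", "news", "blocks_all_ai"), limit=100). Passing the fetcher in as a callable also makes the loop easy to exercise with canned pages.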

POST /v1/batch

Look up policies for 2 to 25 domains in a single request.

Requires authentication

Parameters

Name	Type	Required	Description
domains	string[]	Yes	Array of 2-25 domains to look up

Example Request

curl -X POST https://api.maango.io/v1/batch \
  -H "Authorization: Bearer maango_sk_xxxxx" \
  -H "Content-Type: application/json" \
  -d '{"domains": ["nytimes.com", "github.com", "wikipedia.org"]}'

Example Response

{
  "domains": [
    {
      "domain": "nytimes.com",
      "found": true,
      "stance": "blocks_all_ai",
      "use_cases": {
        "training": "blocked",
        "search": "blocked",
        "inference": "blocked"
      },
      "bots": {
        "blocked": ["GPTBot", "ClaudeBot", "..."],
        "allowed": []
      }
    },
    {
      "domain": "github.com",
      "found": true,
      "stance": "selective",
      "use_cases": {
        "training": "blocked",
        "search": "allowed",
        "inference": "allowed"
      },
      "bots": {
        "blocked": ["CCBot", "GPTBot"],
        "allowed": ["Googlebot"]
      }
    },
    {
      "domain": "wikipedia.org",
      "found": true,
      "stance": "allows_all",
      "use_cases": {
        "training": "allowed",
        "search": "allowed",
        "inference": "allowed"
      },
      "bots": {
        "blocked": [],
        "allowed": []
      }
    }
  ],
  "not_found": []
}

Errors

400 Fewer than 2 or more than 25 domains provided

401 Missing or invalid API key

429 Rate limit exceeded

Note: /v1/compare is a backward-compatible alias for this endpoint and works identically.

Batching more than 25 domains? Split into chunks and run concurrently:

import httpx, asyncio

MAANGO_KEY = "maango_sk_xxxxx"

async def batch_lookup(domains: list[str], chunk_size: int = 25):
    results = []
    async with httpx.AsyncClient() as client:
        chunks = [domains[i:i+chunk_size] for i in range(0, len(domains), chunk_size)]
        tasks = [
            client.post(
                "https://api.maango.io/v1/batch",
                headers={"Authorization": f"Bearer {MAANGO_KEY}", "Content-Type": "application/json"},
                json={"domains": chunk}
            )
            for chunk in chunks
        ]
        responses = await asyncio.gather(*tasks)
        for r in responses:
            results.extend(r.json().get("domains", []))
    return results

Integration Examples

Python: Pre-flight check for a LangChain agent

Add Maango as a policy checker before every web fetch in a LangChain tool. The agent won't visit blocked domains.

import httpx
from langchain.tools import tool

MAANGO_KEY = "maango_sk_xxxxx"

async def is_domain_allowed(url: str) -> bool:
    """Check if the domain allows AI access before visiting."""
    domain = url.split("//")[-1].split("/")[0].removeprefix("www.")
    # httpx.get is synchronous, so awaited requests go through an AsyncClient
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"https://api.maango.io/v1/domain/{domain}",
            headers={"Authorization": f"Bearer {MAANGO_KEY}"}
        )
    policy = r.json()

    if not policy.get("found"):
        return True  # Not in registry, proceed with caution

    # Block if the domain blocks all AI
    if policy["stance"] == "blocks_all_ai":
        return False

    # Check specific use case
    use_cases = policy.get("use_cases", {})
    if use_cases.get("inference") == "blocked":
        return False

    return True

@tool
async def fetch_webpage(url: str) -> str:
    """Fetch a webpage, respecting AI policies."""
    if not await is_domain_allowed(url):
        return f"Cannot access {url}: domain blocks AI agents."

    async with httpx.AsyncClient() as client:
        r = await client.get(url, follow_redirects=True)
        return r.text[:5000]

JavaScript: Check policy in a Node.js agent

A simple wrapper for any Node.js agent that needs to check domain policies.

const MAANGO_KEY = "maango_sk_xxxxx";

async function checkPolicy(url) {
  // Strip only a leading "www." so hostnames that merely contain it are untouched
  const domain = new URL(url).hostname.replace(/^www\./, "");

  const res = await fetch(
    `https://api.maango.io/v1/domain/${domain}`,
    { headers: { Authorization: `Bearer ${MAANGO_KEY}` } }
  );

  if (!res.ok) {
    console.warn(`Maango API error: ${res.status}`);
    return { allowed: true, reason: "api_error" };
  }

  const policy = await res.json();

  if (!policy.found) {
    return { allowed: true, reason: "not_in_registry" };
  }

  if (policy.stance === "blocks_all_ai") {
    return { allowed: false, reason: policy.stance, domain };
  }

  if (policy.use_cases?.inference === "blocked") {
    return { allowed: false, reason: "inference_blocked", domain };
  }

  return { allowed: true, reason: policy.stance, domain };
}

// Usage
const result = await checkPolicy("https://nytimes.com/article/...");
if (!result.allowed) {
  console.log(`Skipping ${result.domain}: ${result.reason}`);
} else {
  // Proceed with fetch
}

Python: Batch check domains from a list

Loop through a list of URLs, check each domain, and sort the URLs into allowed, blocked, and unknown.

import httpx
import asyncio

MAANGO_KEY = "maango_sk_xxxxx"

async def batch_check(urls: list[str]) -> dict:
    """Check a list of URLs and categorize by policy."""
    allowed = []
    blocked = []
    unknown = []

    async with httpx.AsyncClient() as client:
        for url in urls:
            domain = url.split("//")[-1].split("/")[0].removeprefix("www.")
            try:
                r = await client.get(
                    f"https://api.maango.io/v1/domain/{domain}",
                    headers={"Authorization": f"Bearer {MAANGO_KEY}"}
                )
                policy = r.json()

                if not policy.get("found"):
                    unknown.append({"url": url, "domain": domain})
                elif policy["stance"] == "blocks_all_ai":
                    blocked.append({"url": url, "domain": domain, "stance": policy["stance"]})
                else:
                    allowed.append({"url": url, "domain": domain, "stance": policy["stance"]})
            except Exception as e:
                unknown.append({"url": url, "domain": domain, "error": str(e)})

    return {"allowed": allowed, "blocked": blocked, "unknown": unknown}

# Usage
urls = [
    "https://nytimes.com/article/example",
    "https://github.com/some/repo",
    "https://wikipedia.org/wiki/AI",
    "https://reddit.com/r/tech",
]

results = asyncio.run(batch_check(urls))
print(f"Allowed: {len(results['allowed'])}")
print(f"Blocked: {len(results['blocked'])}")
print(f"Unknown: {len(results['unknown'])}")

curl: Quick domain lookup

For people who just want to check a domain from the terminal.

# Check a single domain
curl -s https://api.maango.io/v1/domain/nytimes.com \
  -H "Authorization: Bearer maango_sk_xxxxx" | python3 -m json.tool

# Just get the stance
# (the Python code is single-quoted so the shell leaves its double quotes alone)
curl -s https://api.maango.io/v1/domain/nytimes.com \
  -H "Authorization: Bearer maango_sk_xxxxx" | python3 -c '
import json, sys
d = json.load(sys.stdin)
uc = d["use_cases"]
print("{}: {}".format(d["domain"], d["stance"]))
print("  Training:  {}".format(uc["training"]))
print("  Search:    {}".format(uc["search"]))
print("  Inference: {}".format(uc["inference"]))
'

# Search for news sites that block AI
curl -s "https://api.maango.io/v1/search?q=news&stance=blocks_all_ai&limit=10" \
  -H "Authorization: Bearer maango_sk_xxxxx" | python3 -m json.tool

Batch compare policies across competitors

Use the /v1/batch endpoint to see how different sites in the same industry handle AI (up to 25 domains per request).

import httpx

MAANGO_KEY = "maango_sk_xxxxx"

def compare_industry(domains: list[str]):
    """Compare AI policies across a set of competing domains."""
    r = httpx.post(
        "https://api.maango.io/v1/batch",
        headers={
            "Authorization": f"Bearer {MAANGO_KEY}",
            "Content-Type": "application/json"
        },
        json={"domains": domains}
    )
    data = r.json()

    print(f"{'Domain':<25} {'Stance':<20} {'Training':<12} {'Search':<12} {'Inference'}")
    print("-" * 80)

    for d in data["domains"]:
        uc = d.get("use_cases", {})
        print(f"{d['domain']:<25} {d['stance']:<20} {uc.get('training', '?'):<12} {uc.get('search', '?'):<12} {uc.get('inference', '?')}")

    if data.get("not_found"):
        print(f"\nNot in registry: {', '.join(data['not_found'])}")

# Compare major news sites
compare_industry([
    "nytimes.com",
    "washingtonpost.com",
    "theguardian.com",
    "bbc.com",
    "reuters.com"
])

# Compare social media platforms
compare_industry([
    "twitter.com",
    "facebook.com",
    "reddit.com",
    "linkedin.com",
    "tiktok.com"
])

Error Format

All errors return a consistent JSON format. For example, a rate limit error:

{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Try again in 23 seconds.",
  "retry_after": 23,
  "limit_type": "minute"
}

Code	Error	Description
400	invalid_domain	The domain format is invalid
400	invalid_params	Request parameters are invalid
401	unauthorized	Missing or invalid API key
429	rate_limit_exceeded	Rate limit hit (includes retry_after)
500	internal_error	Server error
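A client can use retry_after to wait out a 429 before trying again. The sketch below shows one way to structure that. call_with_retry is a hypothetical helper, send is any zero-argument callable you supply that performs the request and returns (status_code, body_dict), and the exponential fallback delay is an assumption for the case where retry_after is absent.

```python
import time


def call_with_retry(send, max_attempts: int = 3, sleep=time.sleep):
    """Call send() and retry on 429, waiting retry_after seconds between attempts."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Return on success, on non-rate-limit errors, or when attempts run out
        if status != 429 or attempt == max_attempts:
            return status, body
        # Prefer the server's hint; otherwise back off 2, 4, 8... seconds
        delay = body.get("retry_after", 2 ** attempt)
        sleep(delay)
```

Injecting sleep keeps the helper testable, and in production the default time.sleep applies.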

Questions? Reach out to contact@maango.io
