Getting Started

Set up Krawl locally and make your first research request

Prerequisites

  • Python 3.12+
  • PostgreSQL (for result persistence, memory, lookouts)
  • API keys for search providers

Local Development

# Clone and install
git clone https://github.com/phdowling/krawl.git
cd krawl
pip install -e ".[dev]"

Environment Variables

Create a .env file in the project root. Krawl uses pydantic-settings and loads from .env automatically.

Required

Variable            Description
EXA_API_KEY         Exa search API key (primary web search provider)
GITHUB_TOKEN        GitHub personal access token (repo/code search)
COINGECKO_API_KEY   CoinGecko Pro API key (token data, market cap, charts)

Required for LLM (at least one)

Variable                        Description
AWS_BEDROCK_ACCESS_KEY_ID       AWS access key for Bedrock (primary LLM provider)
AWS_BEDROCK_SECRET_ACCESS_KEY   AWS secret key for Bedrock
AWS_BEDROCK_REGION              AWS region (default: us-east-1)
ANTHROPIC_API_KEY               Anthropic direct API key (fallback provider, or primary if no Bedrock)
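
Before starting the server, it can help to confirm that the .env file covers the required keys and at least one LLM provider. The following is a minimal standard-library sketch, not part of Krawl: the helper names parse_env and check_env are hypothetical, the parser handles only simple KEY=VALUE lines, and treating Bedrock as configured only when both of its keys are present is an assumption.

```python
from pathlib import Path

# Variable names come from the tables above.
REQUIRED = ("EXA_API_KEY", "GITHUB_TOKEN", "COINGECKO_API_KEY")

# Assumption: a provider counts as configured only if all its keys are set.
LLM_GROUPS = (
    ("AWS_BEDROCK_ACCESS_KEY_ID", "AWS_BEDROCK_SECRET_ACCESS_KEY"),
    ("ANTHROPIC_API_KEY",),
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_env(values: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the basics are set."""
    problems = [f"missing {name}" for name in REQUIRED if not values.get(name)]
    if not any(all(values.get(k) for k in group) for group in LLM_GROUPS):
        problems.append("no LLM provider configured (Bedrock or Anthropic)")
    return problems


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    for problem in check_env(parse_env(text)):
        print(problem)
```

Run from the project root, the script prints one line per missing item and prints nothing when the required settings are in place.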

Optional

Variable             Default               Description
MOGRA_API_KEY        ""                    API key for endpoint auth. If empty, auth is disabled.
XAI_API_KEY          ""                    xAI API key for X/Twitter search via Grok
XAI_BASE_URL         https://api.x.ai/v1   xAI API base URL
X_BEARER_TOKEN       ""                    X API v2 bearer token (direct, pay-per-use)
FIRECRAWL_API_KEY    ""                    Firecrawl for JS-heavy site scraping
TAVILY_API_KEY       ""                    Tavily search
SERPER_API_KEY       ""                    Serper (Google search)
BRAVE_API_KEY        ""                    Brave search
NANSEN_API_KEY       ""                    Nansen on-chain analytics
DUNE_API_KEY         ""                    Dune Analytics queries
LUNARCRUSH_API_KEY   ""                    LunarCrush social metrics
COINGLASS_API_KEY    ""                    Coinglass derivatives data
MESSARI_API_KEY      ""                    Messari research data
DATABASE_URL         ""                    PostgreSQL connection string
OPENAI_API_KEY       ""                    OpenAI API key (unused by default)

Tuning Parameters

Variable                       Default    Description
MAX_STEPS                      40         Maximum agent tool-calling steps
DEFAULT_BREADTH                4          Parallel queries per depth level
DEFAULT_DEPTH                  3          Recursion depth levels
MAX_BREADTH                    10         Maximum breadth per level
MAX_DEPTH                      5          Maximum recursion depth
RATE_LIMIT                     5/minute   API rate limit (slowapi format)
SYNTHESIS_TIMEOUT              120.0      Synthesis LLM call timeout (seconds)
SYNTHESIS_MAX_SOURCES          25         Max sources included in synthesis
VERIFY_CITATIONS               true       Enable retrieve-then-cite verification
AUDIT_TRAIL                    true       Enable audit trail logging
LOOKOUT_MAX_PER_USER           10         Max active lookouts per API key
LOOKOUT_MIN_INTERVAL_MINUTES   60         Minimum time between lookout runs
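
To illustrate how the default and maximum settings relate, here is a small client-side sketch that fills in missing breadth and depth values from the defaults and caps them at the maxima. This page does not say whether the server clamps or rejects out-of-range values, so the capping behavior is an assumption, and the names setting and resolve are hypothetical.

```python
from collections.abc import Mapping
from typing import Optional

# Defaults copied from the tuning parameters table above.
DEFAULTS = {
    "DEFAULT_BREADTH": 4,
    "DEFAULT_DEPTH": 3,
    "MAX_BREADTH": 10,
    "MAX_DEPTH": 5,
}


def setting(env: Mapping[str, str], name: str) -> int:
    """Read an integer setting from env, falling back to the documented default."""
    raw = env.get(name, "").strip()
    return int(raw) if raw else DEFAULTS[name]


def resolve(
    env: Mapping[str, str],
    breadth: Optional[int] = None,
    depth: Optional[int] = None,
) -> tuple[int, int]:
    """Apply defaults, then cap at the maxima (assumed, not confirmed, behavior)."""
    b = breadth if breadth is not None else setting(env, "DEFAULT_BREADTH")
    d = depth if depth is not None else setting(env, "DEFAULT_DEPTH")
    return min(b, setting(env, "MAX_BREADTH")), min(d, setting(env, "MAX_DEPTH"))
```

Passing os.environ as env checks a request against the local configuration before sending it.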

Model Overrides

You can override the default models via env vars:

Variable                    Default
MODEL_PLANNING              bedrock/us.anthropic.claude-sonnet-4-6
MODEL_RESEARCH              bedrock/us.anthropic.claude-sonnet-4-6
MODEL_SYNTHESIS             bedrock/us.anthropic.claude-opus-4-7
MODEL_QUERY_GEN             bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0
MODEL_X_SEARCH              grok-4.20-0309-non-reasoning
LEARNING_EXTRACTION_MODEL   bedrock/us.anthropic.claude-sonnet-4-6
GAP_ANALYSIS_MODEL          bedrock/us.anthropic.claude-sonnet-4-6

Run the Server

uvicorn app.main:app --host 0.0.0.0 --port 8080 --reload

Health Check

curl http://localhost:8080/health
{"status": "ok", "version": "0.2.0"}
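
The same check can be scripted, for example to wait for the server in a setup script. This is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes the response body has the shape shown above.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if the body is a JSON object whose status is "ok"."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def fetch_health(base_url: str = "http://localhost:8080", timeout: float = 5.0) -> str:
    """GET /health and return the raw response body as text."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the server running, is_healthy(fetch_health()) should evaluate to True.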

First Research Request

curl -N -X POST http://localhost:8080/research \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the latest developments in AI agents?",
    "mode": "deep",
    "breadth": 4,
    "depth_levels": 3
  }'

The -N (--no-buffer) flag disables curl's output buffering, so server-sent events (SSE) are printed as they arrive.
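
To consume the stream from Python instead of curl, a standard-library sketch might look like the following. It parses the generic SSE wire format (event: and data: fields, with a blank line ending each event). The event names and payload shapes Krawl emits are not documented on this page, so the code leaves data as raw text; the function names and the Accept header are assumptions.

```python
import json
import urllib.request
from collections.abc import Iterable, Iterator


def iter_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {"event": name, "data": text} for each complete SSE event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def stream_research(query: str, base_url: str = "http://localhost:8080") -> Iterator[dict]:
    """POST to /research and yield SSE events as they arrive."""
    body = json.dumps(
        {"query": query, "mode": "deep", "breadth": 4, "depth_levels": 3}
    ).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/research",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # The response object iterates line by line as bytes.
        yield from iter_sse(line.decode("utf-8") for line in resp)
```

Iterating over stream_research("What are the latest developments in AI agents?") then yields one dictionary per event while the research is still running.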

First Search Request

For a simple single-source search without the full research pipeline:

curl -X POST http://localhost:8080/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "AI agents 2025",
    "source": "exa"
  }'

Authentication

When MOGRA_API_KEY is set, all endpoints require the X-API-Key header:

curl -H "X-API-Key: your-key" http://localhost:8080/health

When MOGRA_API_KEY is empty or unset, authentication is disabled and all endpoints are publicly accessible. The server logs a warning on startup when auth is disabled.
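
A client that works in both modes can attach the header only when a key is configured. The helper below is a hypothetical sketch using urllib from the standard library; build_request is not part of Krawl.

```python
import json
import urllib.request
from typing import Optional


def build_request(
    base_url: str,
    path: str,
    payload: Optional[dict] = None,
    api_key: str = "",
) -> urllib.request.Request:
    """Build a GET (no payload) or JSON POST request, adding X-API-Key if set."""
    headers = {}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if api_key:
        headers["X-API-Key"] = api_key
    method = "POST" if payload is not None else "GET"
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=data, headers=headers, method=method
    )
```

For example, build_request("http://localhost:8080", "/search", {"query": "AI agents 2025", "source": "exa"}, api_key="your-key") produces an authenticated search request that can be passed to urllib.request.urlopen.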

CORS

Default allowed origins:

  • https://krawl.sh
  • http://localhost:5173
  • http://localhost:3000

Configure them via the CORS_ORIGINS environment variable, formatted as a JSON array.
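
As an illustration of the JSON array format, a value such as ["https://app.example.com","http://localhost:5173"] parses cleanly, while a comma-separated string does not. The sketch below validates a candidate value before it goes into .env; the function name is hypothetical, and falling back to the default list when the variable is unset is an assumption about Krawl's behavior.

```python
import json
from typing import Optional

# Defaults copied from the list above.
DEFAULT_ORIGINS = [
    "https://krawl.sh",
    "http://localhost:5173",
    "http://localhost:3000",
]


def parse_cors_origins(raw: Optional[str]) -> list[str]:
    """Parse a CORS_ORIGINS value; raises ValueError if it is not a JSON array of strings."""
    if raw is None or not raw.strip():
        return list(DEFAULT_ORIGINS)
    origins = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(origins, list) or not all(isinstance(o, str) for o in origins):
        raise ValueError("CORS_ORIGINS must be a JSON array of strings")
    return origins
```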
