---
name: agentic-search-api
description: |
  Use when an agent needs live web search, URL scraping/extraction, or
  cited research from the open web. Backed by an OpenAI-compatible chat
  endpoint that auto-routes to a multi-source search (web + Wikipedia +
  Hacker News + StackOverflow + GitHub + Bluesky), fuses and reranks the
  hits, scrapes the top URLs, and writes a cited LLM summary.
  Triggers: "search the web", "look up X", "find the latest on", "what
  changed in", "current events", "scrape this URL", "extract content
  from <url>", "cite sources for", "find videos/images/news about",
  "summarize https://...". Returns OpenAI-shape responses plus
  `search_results[]` and `agentic_search.{path, duration_seconds,
  balance_before, usage.steps[], trace, program, evidence, contradictions,
  comparison, freshness}` extensions.
version: 1.2.4
license: MIT
allowed-tools:
  - Bash
  - Read
  - WebFetch
author: Traylinx <https://traylinx.com>
metadata:
  hermes:
    tags: [Search, Web, Scraping, Research, Chat, OpenAI-Compatible, Citations, Traylinx]
    related_skills: [duckduckgo-search, browse, sherlock, polymarket]
    category: research
    upstream: https://docs.traylinx.com
    production_url: https://search.traylinx.com
---

# agentic-search-api

OpenAI-compatible API that does **multi-source web search + URL scraping +
cited LLM summaries**. One call fans out across the live web plus Wikipedia,
Hacker News, StackOverflow, GitHub, and Bluesky, fuses and reranks the
results, then writes a cited answer. Drop-in `base_url` replacement for any
OpenAI SDK — change the URL, send `model="agentic-search"`, get cited answers.

| | |
|---|---|
| **Base URL** | `https://search.traylinx.com` |
| **Auth** | `Authorization: Bearer $TRAYLINX_API_KEY` (get one at <https://traylinx.com>) |
| **Primary endpoint** | `POST /v1/chat/completions` |
| **Compatibility** | OpenAI Chat Completions (any SDK works) |
| **Response** | OpenAI envelope + `search_results[]` + `agentic_search.*` |

---

## When to use this skill

Use it when the user asks an agent to:

- Answer a question that needs **current** information (news, releases, prices, scores).
- **Extract content** from a known URL (HTML / markdown / plain text).
- Do **cited web research** with sources and a per-step token breakdown.
- Find **videos**, **images**, or **news** about a topic.
- **Summarize a page** and return the summary with citations.
- Replace `WebSearch` / `WebFetch` with something that does both + LLM
  synthesis in a single call.

## When NOT to use it

- The data is local → read the file, don't hit the network.
- A plain `curl` or `WebFetch` is enough (public static HTML, no JS) → skip
  the API and save a round-trip.
- The user wants sub-second real-time data (live trades, ticking scores) →
  this API has a 22s deadline and is "current" not "real-time".
- The user wants a free no-auth DuckDuckGo search → use the
  `duckduckgo-search` skill instead.
- The page is JS-heavy / behind login → use the `browse` skill (real
  Chrome via CDP).

---

## Authentication

Two modes — pick one per request.

### User bearer (default — use this unless you know you need A2A)

```
Authorization: Bearer <TRAYLINX_API_KEY>
```

Get a token at <https://traylinx.com> → API Keys. Format:
`sk-lf-XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX`.

Multi-tenant: pass **each end-user's own** key — it's validated per request and
billed to that user's wallet. Works from any HTTPS client (browser, server,
script, or a Tytus pod); nothing host-specific is required.

Store in env var, never in code:
```bash
export TRAYLINX_API_KEY="sk-lf-XXXX-XXXX-XXXX-XXXX-XXXX"
```

### Agent-to-Agent (Traylinx Sentinel)

For service-to-service calls. Three headers required:

```
X-Agent-Secret-Token: <agent_secret_token>     # 1h lifetime, get from Sentinel /oauth/token
X-Agent-User-Id:     <agent_user_id>           # UUID of the agent owner
X-LLM-API-Key:       <user_llm_key>            # downstream user's LLM key
```

Get an `agent_secret_token`:
```bash
curl -X POST "https://api.makakoo.com/ma-authentication-ms/v1/api/oauth/token" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  -d "client_id=$CLIENT_ID" \
  -d "client_secret=$CLIENT_SECRET" \
  -d "scope=a2a"
# response: { agent_secret_token: "...", expires_in: 7200 }
# agent_secret_token expires in 1h; re-fetch.
```

---

## Endpoint 1: `POST /v1/chat/completions` — **primary, use this**

OpenAI-compatible. The API reads the **last `role:"user"` message** in
`messages`, figures out the intent, and runs the right pipeline. The
SDK doesn't need to know about search/scrape at all.

### Request

| Field | Type | Required | Notes |
|---|---|---|---|
| `model` | string | yes | Send `"agentic-search"`. Value is **ignored** — the router picks the backend. |
| `messages` | array | yes | OpenAI shape. Only the last `role:"user"` is inspected for intent. |
| `stream` | bool | no | Accepted but **ignored** — full answer always returned in one shot today. |
| `max_tokens` | number | no | Honored by the summarization model. Default ~1000. |

### Intent detection (automatic)

| Last user message contains... | API does |
|---|---|
| a URL (`https?://...`) | scrape that URL + summarize |
| `video` / `videos` / `watch` | video search |
| `image` / `images` / `photo` / `picture` | image search |
| `news` | news-filter web search |
| `latest` / `current` / `today` / `this week` / `this month` / `this year` / `now` / `newest` / `recent` / `2025` / `2026` / `president` / `prime minister` | **freshness fallback** (faster, tuned for very recent topics) |
| anything else | **orchestrator** (classify → multi-source fan-out → fuse + rerank → scrape top URLs → cited LLM summary, 22s cap) |

Inspect `response.agentic_search.path` to know which path fired.

> **Sources (orchestrator path):** the engine fans out across the live web
> plus Wikipedia, Hacker News, StackOverflow, GitHub, and Bluesky — selected
> by intent (code/tech queries pull Hacker News, StackOverflow, and GitHub;
> every query gets web + Wikipedia + Bluesky). The hits are fused
> (reciprocal-rank fusion, deduped by domain) and reranked by relevance
> before scraping and summary, so `search_results[]` spans all of them with
> the strongest results on top. No client changes needed — it's the same
> `search_results[]` shape, just richer.

### Response (OpenAI envelope + extensions)

```json
{
  "id": "chatcmpl-1777016549490",
  "object": "chat.completion",
  "created": 1777016549,
  "model": "agentic-search",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Kansas City Chiefs won Super Bowl LVIII 25–22 in OT..."
    },
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 3233, "completion_tokens": 584, "total_tokens": 3817},

  "search_results": [
    {
      "title": "List of Super Bowl champions - Wikipedia",
      "url":   "https://en.wikipedia.org/wiki/List_of_Super_Bowl_champions",
      "snippet": "Kansas City entered the 2023 NFL season as defending...",
      "date": "2 weeks ago",
      "scraped": true
    }
  ],

  "agentic_search": {
    "path": "orchestrator",                 // or "fallback"
    "duration_seconds": 4.37,
    "balance_before": 63.66,                // wallet credits before this call (100 = $1 USD)
    "usage": {
      "prompt_tokens": 3233, "completion_tokens": 584, "total_tokens": 3817,
      "steps": [
        {"name":"classification", "prompt_tokens":272,  "completion_tokens":170, "total_tokens":442},
        {"name":"query_building", "prompt_tokens":190,  "completion_tokens":111, "total_tokens":301},
        {"name":"search",  "calls": 2},
        {"name":"scrape",  "calls": 2},
        {"name":"summary", "prompt_tokens":2771, "completion_tokens":303, "total_tokens":3074}
      ]
    }
  }
}
```

`response.usage` (top-level) = sum of `response.agentic_search.usage.steps`.

> **v2 trace (additive):** the response *additionally* carries
> `agentic_search.trace` — the search `mode` (`fast` / `evidence` / `pro_sync`),
> query variants, per-stream status/timings, the `fusion` and `scrape` counts,
> a `rerank` summary, and per-provider `extra_providers` status for Wikipedia /
> Hacker News / StackOverflow / GitHub / Bluesky. It's bounded and
> redaction-safe and never changes the answer or `search_results`, so treat it
> as optional observability. Full schema:
> <https://github.com/traylinx/agentic-search-engines-ts/blob/main/docs/agentic-search-v2.md>.

> **v2 Search-as-Code extensions (additive):** on the orchestrator path the
> response also carries these structured, bounded, redaction-safe fields — none
> of them change the answer or `search_results`, and every one is optional
> (absent on the freshness/fast path, or when a feature has nothing to report):
>
> - `agentic_search.program` — the run as a reproducible spec: `intent[]`,
>   `sources[]`, `query_variants[]`, the `fusion` + `rerank` recipe, `budget`,
>   and a ready-to-run `reproduce` block (the exact `POST /v1/chat/completions`
>   call that reproduces this search).
> - `agentic_search.evidence[]` — ranked, source-attributed findings, each
>   `{rank, text, url, title, source, age, age_days, fresh, from_scrape}`.
>   `from_scrape=true` means a deep-scraped span; `false` means the engine snippet.
> - `agentic_search.contradictions[]` — `{a, b, issue}` where `a`/`b` are
>   `evidence` ranks the model flagged as factually conflicting (present only
>   when a real conflict is found).
> - `agentic_search.comparison` — `{entities[], rows:[{dimension, cells[]}]}`,
>   auto-filled for "X vs Y" queries; each row's `cells` align with `entities`.
> - `agentic_search.freshness` — `{window_days, fresh_count, dated_count, total,
>   freshest_age_days}`; when present, `evidence` is recency-boosted (fresh first).
>
> These are live on the hosted API today. Read whichever you need and ignore the
> rest — strict OpenAI clients never see them.

### Examples

**curl:**
```bash
curl -X POST https://search.traylinx.com/v1/chat/completions \
  -H "Authorization: Bearer $TRAYLINX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "agentic-search",
    "messages": [{"role": "user", "content": "Who won the last Super Bowl?"}]
  }'
```

**Python (OpenAI SDK):**
```python
from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["TRAYLINX_API_KEY"],
    base_url="https://search.traylinx.com/v1",
)

r = client.chat.completions.create(
    model="agentic-search",
    messages=[{"role": "user", "content": "Who won the last Super Bowl?"}],
)
print(r.choices[0].message.content)
# For the extensions (search_results, agentic_search):
data = r.model_dump()
print(f"{len(data.get('search_results', []))} sources, path={data['agentic_search']['path']}")
```

**JavaScript (openai SDK):**
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.TRAYLINX_API_KEY,
  baseURL: "https://search.traylinx.com/v1",
});

const r = await client.chat.completions.create({
  model: "agentic-search",
  messages: [{ role: "user", content: "What changed in TypeScript 5.6?" }],
});
console.log(r.choices[0].message.content);
console.log(`${r.search_results?.length ?? 0} sources`);
```

**Scrape a specific URL:**
```bash
curl -X POST https://search.traylinx.com/v1/chat/completions \
  -H "Authorization: Bearer $TRAYLINX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "agentic-search",
    "messages": [{"role": "user", "content": "scrape https://example.com and summarize"}]
  }'
```

**Find videos:**
```bash
curl -X POST .../v1/chat/completions \
  -H "Authorization: Bearer $TRAYLINX_API_KEY" -H "Content-Type: application/json" \
  -d '{"model":"agentic-search","messages":[{"role":"user","content":"videos of golden retriever puppies"}]}'
```

---

## Endpoint 2: `POST /v1/search` — explicit flow

Use this when you want to **pick the pipeline yourself** instead of
letting the chat-completions router decide.

### Request

| Field | Type | Default | Notes |
|---|---|---|---|
| `query` | string | required | the question or directive |
| `flow` | enum | `"agentic"` | `agentic` \| `search_only` \| `scrap_only` |
| `result_filter` | string[] | `["web"]` | `web` \| `news` \| `images` \| `videos` |
| `items` | number | `10` | max results |
| `country` | string | `"US"` | ISO country code |
| `params` | object | `{}` | flow-specific overrides — see below |

### Flow types

| Flow | What runs | Needs `params.url`? | Notes |
|---|---|---|---|
| `agentic` | classify → search → scrape → summarize | no | Rich citations. 22s hard deadline. |
| `search_only` | multiple search backends in parallel, dedup, return raw hits | no | Cheapest. No LLM. |
| `scrap_only` | Scrape one URL | **yes** | `params.url` mandatory. |

### `params` for `scrap_only`

| Key | Type | Default | Notes |
|---|---|---|---|
| `url` | string | **required** | absolute URL to scrape |
| `selector` | string | `"body"` | CSS selector to extract |
| `formats` | string[] | `["markdown"]` | `markdown` \| `html` \| `text` |
| `timeout` | number | `30000` | ms |

### `params` for `agentic` (advanced overrides)

| Key | Type | Default | Notes |
|---|---|---|---|
| `use_search` | bool | `true` | run the web-search step |
| `use_scrap` | bool | `true` | scrape the top URLs |
| `use_summary` | bool | `true` | run the final LLM summary |
| `result_filter` | string[] | `["news","general"]` | filter search results |
| `items` | number | `15` | max results per search |
| `max_tokens` | number | `800` | summary length |
| `url` | string | — | optional starting URL |
| `selector` | string | — | optional CSS selector for the scrape step |
| `timeout` | number | `25000` | per-step timeout in ms |
| `formats` | string[] | `["markdown","html"]` | scrape output formats |
| `max_chars_per_url` | number | `8000` | truncate scraped pages at this length |
| `max_urls` | number | `5` | max URLs to scrape |
| `search_context_size` | string | `"high"` | `"low"` \| `"medium"` \| `"high"` |
| `temperature` | number | `0.2` | summary creativity (0 = deterministic) |
| `stream` | bool | `false` | accepted but ignored |

### Example

```bash
curl -X POST https://search.traylinx.com/v1/search \
  -H "Authorization: Bearer $TRAYLINX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the latest AI developments?",
    "flow": "agentic",
    "params": {"items": 15, "max_tokens": 800}
  }'
```

### Images & videos (render-ready media)

Use `flow:"search_only"` with `params.result_filter` (the filter lives under
`params`, not at the top level) to get a media gallery. Each result is
ready to drop into a UI:

```bash
curl -X POST https://search.traylinx.com/v1/search \
  -H "Authorization: Bearer $TRAYLINX_API_KEY" -H "Content-Type: application/json" \
  -d '{"query":"super bowl LX","flow":"search_only","params":{"result_filter":["images"],"items":12}}'
```

Each item: `{title, url, thumbnail, image, favicon, hostname, age, source}`
— `thumbnail` is a small preview, `image` the full-res original. For videos
(`result_filter:["videos"]`), append `youtube` to the query to surface
embeddable YouTube hits (`https://www.youtube.com/embed/<id>`).

---

## Endpoint 3: `POST /v1/scrap` — direct URL scrape

Skip the router. Scrape one URL.

### Request

| Field | Type | Default | Notes |
|---|---|---|---|
| `url` | string | required | absolute URL |
| `selector` | string | `"body"` | CSS selector to extract |
| `formats` | string[] | `["markdown"]` | `markdown` \| `html` \| `text` |
| `timeout` | number | `30000` | ms |

### Response

A **flat** object — the content is at the top level, not wrapped in
`data`/`success`:

```json
{
  "content": "# Example Domain\n\nThis domain is for use in documentation examples...",
  "statusCode": 200,
  "metadata": {
    "url": "https://example.com",
    "format": "markdown",
    "contentLength": 167
  },
  "url": "https://example.com"
}
```

Read the scraped text from `response.content` (and `response.metadata.format`
/ `contentLength` for the rest). On failure the endpoint returns
`{ "error": "<message>" }` with a 4xx/5xx status — check the HTTP status, not
a `success` flag.

### Example

```bash
curl -X POST https://search.traylinx.com/v1/scrap \
  -H "Authorization: Bearer $TRAYLINX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'
```

---

## Endpoint 4: `GET /trends` — live trending searches (no auth)

Localized, real-time trending searches. **No API key required.** Pass any
ISO country code via `?geo=` (defaults to `US`, or the caller's edge geo).

```bash
curl https://search.traylinx.com/trends?geo=US
```

### Response

```json
{
  "geo": "US",
  "trends": [
    {
      "term": "world cup standings",
      "traffic": "20,000+",
      "headline": "...",
      "url": "https://...",
      "picture": "https://...",
      "source": "Yahoo Sports"
    }
  ]
}
```

`term` is the trending search query (feed it straight back into
`/v1/chat/completions`); `traffic` is approximate search volume. Great for
"what's hot right now" UIs or seeding an agent with current topics.

---

## Common patterns

### Cited Q&A
```python
r = client.chat.completions.create(
    model="agentic-search",
    messages=[{"role": "user", "content": "What's the latest on Wayland 1.24?"}],
)
data = r.model_dump()
answer = data["choices"][0]["message"]["content"]
sources = data.get("search_results", [])
print(answer)
for s in sources[:5]:
    print(f"  {s['title']} — {s['url']}")
```

### Scrape one URL, no chat
```python
import requests, os
r = requests.post(
    "https://search.traylinx.com/v1/scrap",
    headers={"Authorization": f"Bearer {os.environ['TRAYLINX_API_KEY']}"},
    json={"url": "https://example.com", "formats": ["markdown"]},
    timeout=30,
).json()
print(r["content"])          # flat response — content is top-level
```

### Cost / usage observability
```python
data = r.model_dump()
path = data["agentic_search"]["path"]
duration = data["agentic_search"]["duration_seconds"]
balance_before = data["agentic_search"]["balance_before"]
for step in data["agentic_search"]["usage"]["steps"]:
    print(f"  {step['name']:18s}", end="")
    if "calls" in step:
        print(f"  {step['calls']} calls")
    else:
        print(f"  {step['total_tokens']} tokens")
```

### Force the freshness path
Use words like `latest`, `current`, `today`, `this week`, or a year
(`2025` / `2026`) in your question. The router picks the freshness
fallback automatically — it's tuned to be more up-to-date for
current events.

---

## Errors

| Status | Body | Cause | Fix |
|---|---|---|---|
| 400 | `Messages array is required` | empty body | send `messages: [...]` |
| 400 | `No user message found` | no `role:"user"` entry | add one |
| 401 | `No authentication credentials provided` | missing `Authorization` header | add it |
| 401 | `Invalid token` | bad / expired token | re-issue at traylinx.com |
| 401 | `X-LLM-API-Key header required for agent authentication` | A2A without LLM key | add `X-LLM-API-Key` |
| 401 | `Invalid agent token` | Sentinel rejected the signature | re-fetch agent_secret_token |
| 500 | `Failed to route query` | classifier crash | retry; file a bug if persistent |
| 200 | `"I couldn't find an answer for \"<query>\"."` | both paths failed | rephrase; if you have a URL, try `/v1/scrap` directly |

The endpoint never streams today (`stream: true` accepted, ignored,
full answer in one response).

---

## Rate limits (per token)

| Tier | Requests / minute |
|---|---|
| Free | 60 |
| Pro | 600 |
| Enterprise | unlimited |

Response headers: `X-RateLimit-Limit`, `X-RateLimit-Remaining`,
`X-RateLimit-Reset`. The orchestrator counts as 1 request regardless
of internal steps.

---

## Health check

```bash
curl -fsS https://search.traylinx.com/health
# {"status":"ok","timestamp":"2026-06-02T10:19:04.534Z","version":"1.0.0"}
```

No auth required. Use to confirm the host is up before doing real work.

---

## Hard rules for any agent using this skill

1. **Always pass the token in `Authorization`, never in the URL.** Tokens
   in URLs leak through server logs and referer headers.
2. **Never hardcode the base URL.** Read it from
   `AGENTIC_SEARCH_API_BASE_URL` env var, default to
   `https://search.traylinx.com`.
3. **Never substitute a raw `curl` for this API** when the user asked
   for the search API — you lose the intent router and the LLM summary.
4. **Multi-turn is not supported.** The router only inspects the **last
   user message**. Concatenate earlier turns into the last user message
   yourself if you need context.
5. **The 22s deadline is real.** Don't wrap the call in a 5s client
   timeout — the orchestrator path needs the full budget.
6. **Inspect `agentic_search.path`** to know whether you got the rich
   orchestrator answer or the faster freshness fallback.

---

## Links

- **Production API:** <https://search.traylinx.com>
- **Docs:** <https://docs.traylinx.com>
- **Traylinx (token issuer):** <https://traylinx.com>
- **Sentinel A2A docs:** <https://docs.traylinx.com/sentinel>