Sales Associate API
Last updated: May 5, 2026
v1 (Production)
A REST endpoint that lets external clients call Omi’s recommendation engine programmatically and render the results in their own UI. Each call is one turn of the conversation; the caller owns layout, theming, and product display.
Versioning: Breaking changes will ship behind /v2. The current /v1 endpoint will keep working for at least 6 months after any v2 launch. Additive changes (new optional request fields, new response fields) do not require a major version bump. Removals or behavioral changes are announced via email with at least 30 days' notice.
Base URLs
| Environment | Base URL |
|---|---|
| Production | https://api.youromi.com/v1 (primary) |
| Production | https://nxgsc3wl3l.execute-api.us-east-1.amazonaws.com/prod/v1 (direct fallback) |
| Staging | https://api.youromi.com/staging/v1 (primary) |
| Staging | https://nxgsc3wl3l.execute-api.us-east-1.amazonaws.com/staging/v1 (direct fallback) |
api.youromi.com is a Regional API Gateway custom domain pinned to us-east-1 with TLS 1.2. Prefer the custom domain in client code so a future infrastructure migration doesn’t require a release on your side.
Staging mirrors production but is re-deployed continuously and shares the same rate-limit table. Use staging for contract validation and integration smoke tests; do not load-test against it.
Append the endpoint path to any of the base URLs above: /concierge/chat
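As a minimal sketch of how the base URLs and the endpoint path combine, the helper below lists the full URLs for an environment with the primary custom domain first. The dictionary layout and the name `endpoint_urls` are illustrative assumptions, not part of the API.

```python
# Base URLs copied from the table above. The structure and helper name are
# hypothetical; only the URL strings come from this document.
BASE_URLS = {
    "production": [
        "https://api.youromi.com/v1",
        "https://nxgsc3wl3l.execute-api.us-east-1.amazonaws.com/prod/v1",
    ],
    "staging": [
        "https://api.youromi.com/staging/v1",
        "https://nxgsc3wl3l.execute-api.us-east-1.amazonaws.com/staging/v1",
    ],
}
ENDPOINT_PATH = "/concierge/chat"


def endpoint_urls(environment: str) -> list[str]:
    """Return the full endpoint URLs for an environment, primary first."""
    return [base + ENDPOINT_PATH for base in BASE_URLS[environment]]


print(endpoint_urls("production")[0])
```

Keeping the custom domain first matches the recommendation to prefer it in client code; whether and when to try the direct fallback is left to the integrator.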
Authentication
Every request must include an x-api-key header. Omi issues one key per client. The key is bound to a Shopify storefront domain (used to scope catalog lookups), one or more allowed origins (used for browser CORS), and a usage tier (sets rate-limit caps).
Key format
ak_concierge_<48 hex characters>
Sending the key
x-api-key: ak_concierge_7db7c4bc1bc1849cec4884543df6879dcdb56304609d7aef
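A client can catch copy-paste mistakes before making a call by checking the key against the documented format. The sketch below is one way to do that; the helper names are hypothetical, and treating the hex digits as lowercase-only is an assumption based on the sample key above.

```python
import re

# "ak_concierge_" followed by 48 hex characters, per the documented format.
# Lowercase-only hex is an assumption drawn from the sample key.
KEY_PATTERN = re.compile(r"ak_concierge_[0-9a-f]{48}")


def looks_like_api_key(value: str) -> bool:
    """Return True if the value matches the documented key format."""
    return KEY_PATTERN.fullmatch(value) is not None


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the request headers, rejecting obviously malformed keys."""
    if not looks_like_api_key(api_key):
        raise ValueError("value does not match ak_concierge_<48 hex characters>")
    return {"x-api-key": api_key, "Content-Type": "application/json"}
```

A format check only catches truncated or mistyped values; an unknown or disabled key still returns 401 from the server.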
Rotating keys
Email accounts@youromi.com and we will issue a new key alongside the old one (both valid for the rollover window you specify, default 7 days), then disable the old key when you confirm the cut-over.
Authentication errors
| Status | error | When |
|---|---|---|
| 401 | invalid_api_key | Header missing, key unknown, or key disabled |
| 500 | misconfigured_api_key | Key is enabled but server-side metadata is incomplete (contact us) |
Storing the key
Treat it as a secret: never check it into a public repository, never expose it in client-side JavaScript on a public page (use a server-side proxy for browser-side integrations), and rotate immediately if leaked.
Rate Limits
Rate limits are scoped per API key across three rolling windows. The caps depend on the tier attached to your key:
| Tier | Per minute | Per hour | Per day |
|---|---|---|---|
| trial | 30 | 500 | 2,000 |
| production | 120 | 5,000 | 50,000 |
| internal | unlimited | unlimited | unlimited |
Counters increment on every accepted request, including ones that fail upstream. The limiter is keyed on the API key, not the session.
Inspecting your usage
Every successful response carries these headers:
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | Cap for the hour window |
| X-RateLimit-Remaining | Remaining requests in the current hour bucket |
| X-RateLimit-Window | Always hour |
| X-RateLimit-Tier | Your current tier name |
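The usage headers above can be read into a small structure for logging or alerting. This is a sketch under assumptions: the class and function names are hypothetical, and because this document does not say what the numeric headers contain for the internal tier, non-integer values are mapped to `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitStatus:
    limit: Optional[int]      # None when the header is missing or not an integer
    remaining: Optional[int]
    window: str
    tier: str


def _to_int(value: Optional[str]) -> Optional[int]:
    """Convert a header value to int, or None if it is not a plain integer."""
    if value is not None and value.strip().isdigit():
        return int(value)
    return None


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitStatus:
    """Read the X-RateLimit-* headers; header names are matched case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=_to_int(lowered.get("x-ratelimit-limit")),
        remaining=_to_int(lowered.get("x-ratelimit-remaining")),
        window=lowered.get("x-ratelimit-window", "hour"),
        tier=lowered.get("x-ratelimit-tier", ""),
    )
```

Note that these headers describe only the hour window, so a client can still hit the minute or day cap while `remaining` is well above zero.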
When you hit the cap
HTTP/1.1 429 Too Many Requests
Retry-After: 18
Content-Type: application/json

{
  "error": "rate_limited",
  "message": "Rate limit exceeded for window 'minute' (cap 30). Retry after 18s.",
  "request_id": "1f4e2d6c-...",
  "limit_window": "minute",
  "limit": 30,
  "tier": "trial"
}
The Retry-After value is the number of seconds until the breached bucket rolls over. Clients should wait at least that long before retrying, apply jitter when retrying from multiple workers, and treat repeated 429s as a configuration issue rather than retrying indefinitely.
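The retry guidance above can be sketched as a small loop that waits at least Retry-After seconds, adds jitter, and gives up after a fixed number of attempts. Everything named here is an assumption for illustration: the `send` callable (which performs one request and returns status, headers, and body), the 3-attempt cap, the 2-second jitter ceiling, and the 60-second default used when the header is absent.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

# (status code, response headers, decoded JSON body)
Response = Tuple[int, Mapping[str, str], dict]


def wait_seconds(retry_after: Optional[str], max_jitter: float = 2.0,
                 rng: Callable[[], float] = random.random) -> float:
    """Retry-After plus additive jitter, so the wait is never shorter than asked."""
    base = float(retry_after) if retry_after else 60.0  # 60 s default is an assumption
    return base + rng() * max_jitter


def send_with_429_retry(send: Callable[[], Response], max_attempts: int = 3,
                        sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it stops returning 429 or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts:
            sleep(wait_seconds(headers.get("Retry-After")))
    # Repeated 429s point at a tier or call-volume problem, not a transient one.
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Additive jitter is used so that workers sharing one key spread out their retries without ever retrying earlier than the server requested.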
Endpoint
POST /concierge/chat takes a free-text prompt plus optional structured context and returns a recommendation message plus 0–N products.
Request
Headers
| Header | Required | Value |
|---|---|---|
| x-api-key | Yes | Your API key |
| Content-Type | Yes | application/json |
| Origin | Optional | Browser-set; used for CORS evaluation |
Body
{
  "prompt": "Birthday gift for my mom, fashion-leaning, budget around $250",
  "session_id": "9b1f4e2d-6cb2-4f10-9ec0-1234567890ab",
  "context": {
    "occasion": "birthday",
    "recipient": "mother",
    "budget_max": 250,
    "page_url": "https://www.example.com/pages/gifting-concierge"
  }
}
| Field | Type | Required | Notes |
|---|---|---|---|
| prompt | string | Yes | The user’s natural-language ask. 1–2,000 characters. |
| session_id | string | No | Conversation identifier. If absent, a UUIDv4 is generated and echoed back. Send the same session_id for every turn of one conversation so the engine keeps state. |
| context.occasion | string | No | Free-text occasion (“birthday”, “anniversary”, “housewarming”, etc.). |
| context.recipient | string | No | Free-text recipient hint (“mother”, “friend”, “self”). |
| context.budget_max | number | No | Soft upper budget in store currency (USD by default). |
| context.page_url | string | No | Current page URL on the storefront; used for context heuristics. |
Additional ad-hoc keys inside context are forwarded to the engine as hints; unknown keys are ignored, never rejected.
Limits
| Constraint | Cap |
|---|---|
| Request body size | 8 KiB |
| prompt length | 2,000 characters |
Exceeding either returns 400 (body_too_large or prompt_too_long).
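Checking these limits on the client avoids spending a rate-limited request on a call that would be rejected anyway. The sketch below builds the request body and applies both caps. It assumes the server counts prompt length in Unicode characters and body size in UTF-8 bytes, which this document does not specify; the function name is hypothetical.

```python
import json
from typing import Optional

MAX_PROMPT_CHARS = 2000      # documented prompt cap
MAX_BODY_BYTES = 8 * 1024    # documented body cap (8 KiB)


def build_body(prompt: str, session_id: Optional[str] = None, **context) -> bytes:
    """Return the encoded JSON body, raising ValueError if a limit is exceeded."""
    if not prompt:
        raise ValueError("missing_prompt: prompt must be non-empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt_too_long: prompt exceeds 2,000 characters")
    body: dict = {"prompt": prompt}
    if session_id:
        body["session_id"] = session_id
    if context:
        body["context"] = context
    encoded = json.dumps(body).encode("utf-8")
    if len(encoded) > MAX_BODY_BYTES:
        raise ValueError("body_too_large: request body exceeds 8 KiB")
    return encoded
```

The error strings reuse the documented machine codes so that local failures and server-side 400s can be handled by the same code path.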
Response
Headers
Content-Type: application/json
Access-Control-Allow-Origin: <echoed if origin is allow-listed for your key>
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 487
X-RateLimit-Window: hour
X-RateLimit-Tier: trial
Body (200 OK)
{
  "session_id": "9b1f4e2d-6cb2-4f10-9ec0-1234567890ab",
  "message": "Three picks based on what you described.\n\nLean fashion-forward with a coffee-table format.",
  "products": [
    {
      "id": "gid://shopify/Product/8529335779428",
      "handle": "abu-dhabi",
      "title": "Abu Dhabi Bright",
      "vendor": "Assouline",
      "description": "Glossy travel monograph.",
      "price": 105.0,
      "currency": "USD",
      "image": "https://cdn.shopify.com/.../AbuDhabi01.jpg",
      "url": "https://www.example.com/products/abu-dhabi",
      "available": true
    }
  ],
  "intent": "GIFTING",
  "follow_up": "Want more interiors-leaning options?",
  "refinement_options": [
    { "text": "Under $200", "type": "price_pivot" }
  ],
  "request_id": "1f4e2d6c-aaaa-bbbb-cccc-001122334455"
}
Response fields
| Field | Type | Notes |
|---|---|---|
| session_id | string | Echo of the request’s session_id (or auto-generated). Send back on the next turn. |
| message | string | The assistant’s prose response. May contain \n\n paragraph breaks; render each paragraph in its own bubble for the best UX. |
| products | array | 0–N recommended products. Order is meaningful (best fit first). |
| products[].id | string | Shopify product GID. |
| products[].handle | string | URL handle (/products/{handle} on the storefront). |
| products[].title | string | Display title. |
| products[].vendor | string or null | Brand. |
| products[].description | string or null | Product description (may be HTML stripped of tags). |
| products[].price | number or null | Price in currency. Null if the engine could not resolve a price. |
| products[].currency | string | ISO 4217 code (defaults to USD). |
| products[].image | string or null | CDN URL of the primary product image. |
| products[].url | string or null | Product page URL on the public storefront. |
| products[].available | boolean or null | Whether at least one variant is in stock. Null if unknown. |
| intent | string or null | Detected user intent (GIFTING, STYLING, BROWSING, etc.). |
| follow_up | string | A short follow-up question. Render as a separate assistant bubble after the products. |
| refinement_options | array | Optional structured chips you may render as quick-replies. Each has {text, type}. |
| request_id | string | UUID for this request. Include it when filing support tickets. |
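The session_id round trip (omit it on the first turn, then send back whatever the response returned) can be sketched with a small wrapper. The `Conversation` class and its injected `post` callable are hypothetical; `post` stands in for whatever function sends the body to the endpoint and returns the decoded JSON reply.

```python
from typing import Callable, Optional


class Conversation:
    """Track session_id across turns so the engine keeps conversation state."""

    def __init__(self, post: Callable[[dict], dict]):
        self._post = post                       # sends one body, returns the reply
        self.session_id: Optional[str] = None   # unknown until the first reply

    def ask(self, prompt: str, **context) -> dict:
        body: dict = {"prompt": prompt}
        if self.session_id is not None:
            body["session_id"] = self.session_id
        if context:
            body["context"] = context
        reply = self._post(body)
        # Store the echoed (or server-generated) id for the next turn.
        self.session_id = reply["session_id"]
        return reply
```

Letting the server generate the first session_id is one option; generating a UUIDv4 on the client, as the examples later in this document do, works equally well.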
Recommended rendering order
When showing the response in a chat-style UI, render in this order:
- Each paragraph of message (separate bubble per \n\n)
- The products cards
- The follow_up bubble
- The refinement_options chips (if any)
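The rendering order above can be sketched as a function that flattens a response into an ordered list of UI items. The item kinds ("message", "product", "follow_up", "chip") and the function name are illustrative assumptions; a real UI would map them to its own components.

```python
def render_plan(reply: dict) -> list[tuple[str, object]]:
    """Return (kind, payload) pairs in the recommended rendering order."""
    items: list[tuple[str, object]] = []
    # 1. One bubble per paragraph of the message.
    for paragraph in reply.get("message", "").split("\n\n"):
        if paragraph.strip():
            items.append(("message", paragraph.strip()))
    # 2. Product cards, keeping the engine's best-fit-first order.
    for product in reply.get("products") or []:
        items.append(("product", product))
    # 3. The follow-up question as its own bubble.
    if reply.get("follow_up"):
        items.append(("follow_up", reply["follow_up"]))
    # 4. Quick-reply chips, if any were returned.
    for option in reply.get("refinement_options") or []:
        items.append(("chip", option["text"]))
    return items
```

Applied to the sample 200 response, this yields two message bubbles, one product card, the follow-up bubble, and one chip, in that order.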
Error Responses
All errors share the same envelope:
{
  "error": "<machine_code>",
  "message": "<human-readable explanation>",
  "request_id": "<uuid>"
}
| HTTP | error | When | Recovery |
|---|---|---|---|
| 400 | invalid_json_body | Body is not valid JSON | Fix the JSON |
| 400 | body_must_be_object | Body is valid JSON but not an object | Wrap in {} |
| 400 | body_too_large | Request body exceeds 8 KiB | Trim payload |
| 400 | missing_prompt | prompt field absent or empty | Send a non-empty prompt |
| 400 | prompt_too_long | prompt exceeds 2,000 characters | Trim prompt |
| 401 | invalid_api_key | Header missing, unknown, or disabled | Check header; contact us |
| 405 | method_not_allowed | HTTP method other than POST/OPTIONS | Use POST |
| 429 | rate_limited | Per-key cap reached | Wait Retry-After seconds |
| 500 | misconfigured_api_key | Server-side metadata incomplete | Contact us |
| 502 | upstream_invoke_error | Could not reach recommendation engine | Retry with backoff |
| 502 | upstream_function_error | Engine raised an unhandled error | Retry; report request_id |
| 502 | upstream_error | Engine returned a structured error | Inspect detail; retry |
| 502 | upstream_empty_response | Engine returned no content | Retry |
Any 5xx response should be retried at most 2–3 times with exponential backoff. If the issue persists, capture the request_id and contact support.
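A client can wrap the shared error envelope in one exception type and derive its retry decision from the status code. The sketch below is illustrative: the class name, the 0.5-second base delay, and the choice to skip retries for misconfigured_api_key (whose listed recovery is to contact Omi) are assumptions rather than documented behavior.

```python
class OmiApiError(Exception):
    """An error response decoded from the shared envelope."""

    def __init__(self, status: int, payload: dict):
        self.status = status
        self.error = payload.get("error", "unknown")
        self.message = payload.get("message", "")
        self.request_id = payload.get("request_id", "")
        super().__init__(
            f"{status} {self.error} ({self.request_id}): {self.message}"
        )

    @property
    def retryable(self) -> bool:
        # 5xx responses are worth retrying; a misconfigured key is unlikely
        # to fix itself, so this sketch treats it as non-retryable.
        return self.status >= 500 and self.error != "misconfigured_api_key"


def backoff_delays(max_retries: int = 3, base: float = 0.5) -> list[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ..."""
    return [base * 2 ** attempt for attempt in range(max_retries)]
```

With the defaults, a caller would wait 0.5 s, 1 s, and 2 s between attempts, then surface the error along with its request_id.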
CORS
The API serves standard CORS headers. For a given API key, the Access-Control-Allow-Origin header is echoed only if the request’s Origin matches the public_origin or one of the additional_origins configured for your key.
Unknown origins still get a 200 response (so server-to-server callers work), but the browser will block the response from reaching the page.
If you need an additional origin allow-listed (e.g. a staging domain), email accounts@youromi.com.
Browser-side integrations
For production browser-side use, do not embed the API key in client-side JavaScript. Instead:
- Stand up a thin server-side proxy (Cloudflare Worker, Lambda, Vercel function, etc.) that holds the API key and forwards requests
- Optionally add per-end-user rate limiting at that layer
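A thin proxy of the kind described above could look like the following standard-library sketch. It is a starting point under stated assumptions, not a production implementation: the `OMI_CONCIERGE_KEY` variable name follows the examples in this document, while the handler name, the local port, the 30-second timeout, and the `proxy_unreachable` error code are hypothetical. It also omits end-user authentication, per-user rate limiting, and forwarding of response headers such as Retry-After.

```python
import os
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM_URL = "https://api.youromi.com/v1/concierge/chat"


def build_upstream_request(body: bytes, api_key: str) -> urllib.request.Request:
    """Attach the secret key server-side; the browser never sees it."""
    return urllib.request.Request(
        UPSTREAM_URL,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", "0"))
        body = self.rfile.read(length)
        upstream = build_upstream_request(body, os.environ["OMI_CONCIERGE_KEY"])
        try:
            with urllib.request.urlopen(upstream, timeout=30) as resp:
                status, payload = resp.status, resp.read()
        except urllib.error.HTTPError as err:
            # Pass API errors (4xx/5xx) through with their JSON envelope.
            status, payload = err.code, err.read()
        except urllib.error.URLError:
            # Network failure between the proxy and the API.
            status, payload = 502, b'{"error": "proxy_unreachable"}'
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Run the proxy on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), ProxyHandler).serve_forever()
```

Calling `serve()` starts the proxy locally; the browser then posts the same JSON body to the proxy instead of to the API, and the key stays on the server.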
Examples
curl
curl -X POST "https://api.youromi.com/v1/concierge/chat" \
  -H "x-api-key: ak_concierge_..." \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Birthday gift for my mom, budget $250",
    "context": {
      "occasion": "birthday",
      "recipient": "mother",
      "budget_max": 250
    }
  }'
JavaScript (fetch)
// Server-side example (Node 20+, ES module). The API key is read from the
// environment, so this code must not be shipped to the browser.
// previousSessionId is the session_id returned by the previous turn, if any;
// keep it wherever your server stores per-conversation state.
let previousSessionId = null;

const res = await fetch(
  "https://api.youromi.com/v1/concierge/chat",
  {
    method: "POST",
    headers: {
      "x-api-key": process.env.OMI_CONCIERGE_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      prompt: "Birthday gift for my mom, budget $250",
      session_id: previousSessionId ?? crypto.randomUUID(),
      context: {
        occasion: "birthday",
        recipient: "mother",
        budget_max: 250,
      },
    }),
  }
);

if (res.status === 429) {
  const wait = Number(res.headers.get("Retry-After") ?? 60);
  console.warn(`Rate limited, retrying in ${wait}s`);
  // implement retry logic here
}

// Any non-2xx response (including an unhandled 429) carries the error envelope.
if (!res.ok) {
  const err = await res.json();
  throw new Error(
    `Omi error ${err.error} (${err.request_id}): ${err.message}`
  );
}

const {
  session_id, message, products,
  follow_up, refinement_options
} = await res.json();

// Reuse the same session_id on the next turn of this conversation.
previousSessionId = session_id;
Python (requests)
import os
import uuid

import requests

BASE = "https://api.youromi.com"


# The "str | None" annotation requires Python 3.10 or later.
def ask_omi(prompt: str, session_id: str | None = None, **context):
    """Send one conversation turn and return the decoded JSON reply."""
    body = {
        "prompt": prompt,
        # Pass the previous reply's session_id to continue a conversation.
        "session_id": session_id or str(uuid.uuid4()),
    }
    if context:
        body["context"] = context
    resp = requests.post(
        f"{BASE}/v1/concierge/chat",
        headers={
            "x-api-key": os.environ["OMI_CONCIERGE_KEY"],
            "Content-Type": "application/json",
        },
        json=body,
        timeout=30,
    )
    if resp.status_code == 429:
        retry_after = int(resp.headers.get("Retry-After", "60"))
        raise RuntimeError(f"rate_limited; retry in {retry_after}s")
    # Raise requests.HTTPError for any other 4xx/5xx response.
    resp.raise_for_status()
    return resp.json()


reply = ask_omi(
    "Birthday gift for my mom, budget $250",
    occasion="birthday",
    recipient="mother",
    budget_max=250,
)
print(reply["message"])
for p in reply["products"]:
    print(f"  - {p['title']} ({p['price']} {p['currency']}) {p['url']}")
Latency
| Scenario | p50 | p95 |
|---|---|---|
| Cached prompt, warm Lambda | ~1.0 s | ~2.5 s |
| Cold cache, warm Lambda | ~3.0 s | ~6.0 s |
Cold Lambda starts are not expected (warmed every 5 min). Latency scales with catalog size and the number of brand/collection filters in the prompt.
The recommendation engine streams its first token within ~600 ms, but this API returns a single JSON document once the full result is ready. Streaming responses are on the roadmap.
Changelog
2026-05-02 — v1.2
Custom domain api.youromi.com is live (Regional API Gateway, TLS 1.2) for both production and staging. Raw execute-api URLs still work but new integrations should use the custom domain.
2026-05-02 — v1.1
- Per-API-key rate limits with tier-based caps (trial / production / internal)
- Retry-After, X-RateLimit-* headers
- request_id returned on every response
- New 4xx codes: body_too_large, prompt_too_long, body_must_be_object
- Per-key CORS via additional_origins
- Structured error envelope
2026-04-30 — v1.0
Initial release. POST /v1/concierge/chat over x-api-key auth.
Support
Include in every report: the request_id from the response, approximate timestamp (UTC), and the prompt that triggered the issue.
General: accounts@youromi.com
Incidents: incidents@youromi.com (24h response)