# Operator SDK integration playbook — for an AI coding agent

You are an AI coding agent helping a developer integrate **`@audin.ai/operator-sdk`**
into THEIR application. This file is your playbook. It is written for you, the
agent — not as a script to paste, and not as end-user documentation.

`@audin.ai/operator-sdk` is a **headless browser SDK** that lets a partner's human
operators make and receive phone calls from a web app. Audin is the telephony
gateway (network, routing, recording, billing); the SDK handles the microphone,
audio, signalling, reconnection and heartbeats. **It ships no UI** — the developer
builds the interface; the SDK does the plumbing.

This playbook covers **SDK `0.3.0`** / **wire protocol `v1`**. Package name on npm:
`@audin.ai/operator-sdk`.

---

## How to use this guide (meta-instructions)

Read these before you write any code.

1. **Do NOT assume the developer's stack.** There is no single "right"
   integration — it adapts to their framework, backend, and how they manage
   secrets and authenticate users.
2. **Do NOT just dump code.** The snippets below are *illustrative and
   adaptable*, not a copy-paste solution. Your job is to translate the
   *principles* into the developer's actual stack.
3. **Interview first.** Ask the questions in the next section BEFORE
   implementing. Wait for the answers. The integration is shaped by them.
4. **Respect the security model.** It is non-negotiable (see "Hard security
   constraints"). If a proposed approach would leak the Account API Key into the
   browser, stop and correct it.
5. **Verify as you go.** After each meaningful step, confirm with the developer
   that it matches their app's conventions, and that it runs.

---

## Questions to ask the developer FIRST

Ask these (adapt wording to the conversation). Do not proceed to implementation
until you have enough answers to choose a concrete path.

### 1. Frontend
- Which **framework / environment** is the operator UI in? (React, Vue, Angular,
  Svelte, SolidJS, vanilla JS, a mobile webview, an Electron app…)
- Should the call controls live in an **existing screen** (e.g. a CRM contact
  page, an agent dashboard) or a new dedicated view?
- Is the site served over **HTTPS** in every environment operators will use it
  (including local dev)? Microphone access requires a secure context (HTTPS or
  `localhost`).

### 2. Backend (the token-minting endpoint)
- What **language / framework** hosts the backend that will mint session tokens?
  (Node/Express, a Next.js route handler, Python/FastAPI or Flask, Ruby on Rails,
  PHP/Laravel, Go, a serverless function…)
- Where is the **Account API Key** stored and how are secrets managed? (env vars,
  a secret manager / vault, platform-injected config) — it must stay server-side.

### 3. Identity & authorization
- How do they **authenticate their own end-users**, so the token endpoint can be
  protected? (session cookie, bearer JWT, an API gateway…) Only authenticated
  operators should be able to mint a token.
- How is an **operator identified** in their system? You need a *stable operator
  reference* (`operatorRef`) and a *display name* (`displayName`) for each
  operator.

### 4. Numbers & call direction
- Which **phone numbers** should operators be available on?
  - a **fixed set** assigned per operator (the backend decides), or
  - **let the operator choose** from the numbers the account owns, fetched at
    runtime via `listPhoneNumbers()`?
- Do they need **inbound**, **outbound**, or **both**?

### 5. UX & lifecycle
- What should happen on **incoming call** (auto-popup, ring sound in their own UI,
  manual accept/reject)?
- When does an operator **go offline** (explicit toggle, on logout, on tab close)?

Use the answers to pick: where `getToken` lives, what the backend endpoint sends
in the token request body, whether the UI calls `listPhoneNumbers()`, and which
events you wire up.

---

## Hard security constraints (non-negotiable)

- The **Account API Key MUST stay server-side** and **NEVER reach the browser**,
  a bundle, a build artifact, a query string, or client-side config.
- The browser / SDK only ever uses the **short-lived session token** (~1h). It
  never sees the Account API Key.
- The developer's **only backend responsibility** is **one endpoint** that:
  1. authenticates the calling operator (using their existing auth),
  2. POSTs to the Audin token endpoint `https://api.audin.ai/operator-sessions/token`
     with the header `X-API-Key: <Account API Key>` (server-side secret), and
  3. returns `{ token, expiresAt? }` to the browser.
- Reference the key only as an environment variable in examples
  (`process.env.AUDIN_ACCOUNT_API_KEY` or the stack's equivalent). Never write a
  literal key value anywhere.

If you ever find yourself putting the API key, or a direct call to
`api.audin.ai/operator-sessions/token`, in client-side code — stop. That call
belongs on the backend only.
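One way to honor the environment-variable rule on the server is a fail-fast lookup at startup. This is a minimal sketch that assumes a Node-style environment object; `readAccountApiKey` and `EnvLike` are illustrative names, not part of the SDK.

```ts
// Hypothetical server-side helper (not an SDK API).
// EnvLike mirrors the shape of Node's process.env.
type EnvLike = Record<string, string | undefined>;

function readAccountApiKey(env: EnvLike): string {
  const key = env["AUDIN_ACCOUNT_API_KEY"];
  if (key === undefined || key.trim() === "") {
    // Fail at startup instead of sending an empty X-API-Key header later.
    throw new Error("AUDIN_ACCOUNT_API_KEY is not set on the server");
  }
  return key;
}
```

On a Node backend this would be called once as `readAccountApiKey(process.env)`. Keep it in server-only modules so a bundler never pulls it into client code.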

---

## Integration principles (apply to the chosen stack)

These are principles, not a fixed recipe. The snippets are **adaptable examples**
— rewrite them idiomatically for the developer's framework.

### Principle 1 — One backend endpoint mints the token

The backend authenticates the operator, then calls the Audin token endpoint with
the Account API Key and returns the session token to the browser.

```ts
// ADAPTABLE EXAMPLE — Node/Express. Port the idea to the developer's backend.
// The API key stays here, server-side. Never ship it to the browser.
app.post("/api/operator/token", requireAuth, async (req, res) => {
  const r = await fetch("https://api.audin.ai/operator-sessions/token", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": process.env.AUDIN_ACCOUNT_API_KEY, // server-side secret
    },
    body: JSON.stringify({
      operatorRef: req.user.id,        // YOUR stable operator identifier
      displayName: req.user.name,      // shown in Audin / CRM
      phoneNumberIds: req.user.numbers // optional: which numbers this operator may use
    }),
  });
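  if (!r.ok) {
    // Audin refused or failed: report a generic error, never echo the key or the raw upstream body.
    return res.status(502).json({ error: "could not mint session token" });
  }
  const data = await r.json();
  res.json({ token: data.token, expiresAt: data.expiresAt }); // return at least { token }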
});
```

If `listPhoneNumbers()` will let the operator pick at runtime, `phoneNumberIds`
in the token request can be omitted or broadened; if the backend assigns a fixed
set, send exactly those ids.
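The two cases can be sketched as one body builder. The field names `operatorRef`, `displayName` and `phoneNumberIds` come from the example above; `OperatorIdentity`, `assignedNumberIds` and `buildTokenRequestBody` are hypothetical names standing in for the developer's own user model.

```ts
// Hypothetical shape of the developer's authenticated user record.
interface OperatorIdentity {
  id: string;
  name: string;
  assignedNumberIds?: string[]; // only meaningful when the backend assigns numbers
}

interface TokenRequestBody {
  operatorRef: string;
  displayName: string;
  phoneNumberIds?: string[];
}

// fixedSet = true: the backend decides the numbers and sends exactly those ids.
// fixedSet = false: the operator picks at runtime, so phoneNumberIds is omitted.
function buildTokenRequestBody(operator: OperatorIdentity, fixedSet: boolean): TokenRequestBody {
  const body: TokenRequestBody = {
    operatorRef: operator.id,
    displayName: operator.name,
  };
  if (fixedSet) {
    body.phoneNumberIds = (operator.assignedNumberIds ?? []).slice();
  }
  return body;
}
```

Pass the result to `JSON.stringify` in the token request. Which branch applies follows from the developer's answer to the numbers question.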

### Principle 2 — Implement `getToken` to call THAT backend

On the client, `getToken` is the single seam through which credentials enter the
SDK. It calls the developer's backend (with their normal auth — cookies, bearer,
etc.) and returns `{ token, expiresAt? }`. **Always fetch a fresh token** — the
SDK calls `getToken` again on every (re)connect, so never cache an expired one.

```ts
// ADAPTABLE EXAMPLE — client side.
const getToken = async () => {
  const r = await fetch("/api/operator/token", {
    method: "POST",
    credentials: "include", // or your bearer header — match your app's auth
  });
  if (!r.ok) throw new Error("token endpoint failed");
  return r.json(); // { token, expiresAt? }
};
```
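If the developer wants `getToken` to fail loudly on a malformed response, a factory with an injectable fetch keeps it testable. This is a sketch under assumptions: `makeGetToken`, `FetchLike` and `SessionToken` are illustrative names, and the endpoint path is whatever the developer's backend exposes.

```ts
interface SessionToken {
  token: string;
  expiresAt?: string;
}

// Minimal structural type for the parts of fetch used here,
// so a fake can be substituted in tests.
type FetchLike = (
  url: string,
  init: { method: string; credentials: "include" },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function makeGetToken(fetchImpl: FetchLike, endpoint: string): () => Promise<SessionToken> {
  // No caching: every call goes back to the backend for a fresh token.
  return async () => {
    const r = await fetchImpl(endpoint, { method: "POST", credentials: "include" });
    if (!r.ok) throw new Error(`token endpoint failed with status ${r.status}`);
    const raw: unknown = await r.json();
    if (typeof raw !== "object" || raw === null) {
      throw new Error("token endpoint returned a non-object body");
    }
    const data = raw as { token?: unknown; expiresAt?: unknown };
    if (typeof data.token !== "string" || data.token === "") {
      throw new Error("token endpoint returned no token");
    }
    return typeof data.expiresAt === "string"
      ? { token: data.token, expiresAt: data.expiresAt }
      : { token: data.token };
  };
}
```

In the browser this could be wired as `makeGetToken(fetch, "/api/operator/token")`. Swap `credentials: "include"` for a bearer header if that is how the app authenticates.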

### Principle 3 — Construct the operator and subscribe BEFORE going online

Create `new AudinOperator({ coreUrl, getToken })` and register your event
listeners *before* calling `goOnline`, so you never miss an early `incomingCall`
or `error`.

```ts
// ADAPTABLE EXAMPLE.
import { AudinOperator } from "@audin.ai/operator-sdk";

const op = new AudinOperator({
  coreUrl: "https://core.audin.ai", // http(s) is upgraded to ws(s) internally
  getToken,
});

op.on("presenceStateChanged", (s) => updatePresenceUi(s));
op.on("availabilityChanged", ({ accepted, rejected }) => { /* reflect in UI */ });
op.on("incomingCall", (call) => showIncomingUi(call)); // then call.accept()/reject()
op.on("callStarted", (call) => showInCallUi(call));
op.on("callEnded", (call) => clearCallUi(call));       // inspect call.endReason
op.on("error", (e) => console.error(e.code, e.message));
```

In React/Vue/etc., construct the operator once (e.g. a ref / module singleton /
provider), attach listeners in an effect, and unsubscribe + `goOffline()` on
teardown. `on(...)` returns an unsubscribe function.
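The subscribe-then-teardown pattern can be sketched without committing to a framework. `OperatorLike` is a structural stand-in for the two `AudinOperator` members used here, and `attachOperator` is a hypothetical helper; the SDK's real `on` is typed per event, so adapt the signature rather than copying it.

```ts
type Unsubscribe = () => void;

// Structural stand-in for the parts of AudinOperator this sketch touches.
interface OperatorLike {
  on(event: string, listener: (payload: unknown) => void): Unsubscribe;
  goOffline(): Promise<void>;
}

function attachOperator(
  op: OperatorLike,
  handlers: Record<string, (payload: unknown) => void>,
): () => Promise<void> {
  // Register every listener up front and keep the unsubscribe functions.
  const unsubscribers = Object.keys(handlers).map((event) => op.on(event, handlers[event]));
  let tornDown = false;
  return () => {
    if (tornDown) return Promise.resolve(); // idempotent: cleanup may run twice
    tornDown = true;
    unsubscribers.forEach((off) => off());
    return op.goOffline();
  };
}
```

In React the returned function would be the cleanup of a `useEffect`; in Vue it would run in `onUnmounted`. The operator instance itself should still be created once, outside the effect.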

### Principle 4 — Pick a number, then go online

List the numbers the account owns and go online on one (or more) ids. Whether the
developer's UI shows a picker or uses a backend-assigned set depends on their
answer to the numbers question.

```ts
// ADAPTABLE EXAMPLE.
const numbers = await op.listPhoneNumbers();
// → [{ id, phoneNumber, displayName }, ...]
const mine = numbers[0]; // or the one the operator selected in your UI
await op.goOnline([mine.id]);
```

The SDK fetches `listPhoneNumbers()` with the same session token it uses for its
WebSocket connections, so there is still **no API key in the browser**.
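When the backend assigns a fixed set, it can help to intersect that set with what `listPhoneNumbers()` returns before calling `goOnline`; when the UI shows a picker, each entry needs a label. Both helpers below are hypothetical (`resolveNumberIds`, `labelFor`); only the `OperatorPhoneNumber` shape comes from this playbook.

```ts
interface OperatorPhoneNumber {
  id: string;
  phoneNumber: string;
  displayName: string | null;
}

// Keep only the wanted ids that the account actually owns, preserving order.
function resolveNumberIds(owned: OperatorPhoneNumber[], wantedIds: string[]): string[] {
  return wantedIds.filter((id) => owned.some((n) => n.id === id));
}

// displayName is nullable, so fall back to the E.164 number in a picker.
function labelFor(n: OperatorPhoneNumber): string {
  return n.displayName ?? n.phoneNumber;
}
```

The result of `resolveNumberIds` would be passed to `op.goOnline(...)`. The `availabilityChanged` event still has the final word on which ids the server accepted.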

### Principle 5 — Outbound: `dial(to, { callerId })`

For outbound calls, dial an E.164 number and pass a `callerId` that is an active
number the account owns (typically a `phoneNumber` from
`listPhoneNumbers()`).

```ts
// ADAPTABLE EXAMPLE — must run from a user gesture so the browser can start audio.
const call = await op.dial("+39021234567", { callerId: mine.phoneNumber });
// call.callSid, call.direction === "outbound", call.state transitions to "active"
```
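A client-side pre-check before `dial` can give the operator a clearer message than a rejected promise. This sketch assumes the general E.164 shape (a `+`, a non-zero leading digit, at most 15 digits in total); the SDK and the network may apply their own, stricter validation. `isE164` and `checkDialArgs` are hypothetical helpers.

```ts
// General E.164 shape: "+", then 2 to 15 digits, the first one non-zero.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

function isE164(value: string): boolean {
  return E164_PATTERN.test(value);
}

// Returns a human-readable problem, or null when the arguments look dialable.
// ownedNumbers is expected to hold phoneNumber values from listPhoneNumbers().
function checkDialArgs(to: string, callerId: string, ownedNumbers: string[]): string | null {
  if (!isE164(to)) return "destination is not in E.164 format";
  if (ownedNumbers.indexOf(callerId) === -1) {
    return "callerId is not one of the account's numbers";
  }
  return null;
}
```

Run the check inside the same click handler that calls `op.dial(...)`, so the user gesture is preserved when the arguments pass.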

### Principle 6 — Inbound: `incomingCall` → `accept()` / `reject()`

On an inbound offer, the `incomingCall` event hands you an `OperatorCall`. Show
your own UI, then `call.accept()` or `call.reject()`. If accepted, a `callStarted`
event follows once audio is up.

```ts
// ADAPTABLE EXAMPLE.
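op.on("incomingCall", (call) => {
  // Render your ring UI using call.from; the operator's click decides later.
  // showIncomingUi and its callback options are placeholders for your own UI code.
  showIncomingUi(call, {
    onAnswer: () => call.accept(),
    onDecline: () => call.reject(),
  });
});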
```
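Because the operator answers some time after the event fires, the UI usually needs somewhere to hold the ringing call. The class below is one hypothetical way to do that (`IncomingCallSlot` and `RingingCall` are not SDK names). It assumes `callEnded` also fires for an offer that stops ringing, for example when another operator takes it; confirm that against the SDK before relying on it.

```ts
// Structural subset of OperatorCall used by this sketch.
interface RingingCall {
  readonly callSid: string;
  readonly from?: string;
  accept(): void;
  reject(): void;
}

class IncomingCallSlot {
  private pending: RingingCall | null = null;

  // Call from the incomingCall listener.
  offer(call: RingingCall): void {
    this.pending = call;
  }

  // Call from the callEnded listener so a stale offer stops ringing in the UI.
  clear(callSid: string): void {
    if (this.pending !== null && this.pending.callSid === callSid) this.pending = null;
  }

  get ringingFrom(): string | undefined {
    return this.pending === null ? undefined : this.pending.from;
  }

  // Both return false when there is nothing left to act on.
  answer(): boolean {
    return this.settle((c) => c.accept());
  }

  decline(): boolean {
    return this.settle((c) => c.reject());
  }

  private settle(action: (call: RingingCall) => void): boolean {
    const call = this.pending;
    if (call === null) return false;
    this.pending = null;
    action(call);
    return true;
  }
}
```

The Answer and Decline buttons would call `slot.answer()` and `slot.decline()` directly from their click handlers, which also satisfies the user-gesture requirement.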

### Principle 7 — In-call controls

`call.mute(true|false)` toggles the operator microphone. `call.hangup()` ends the
call. `call.sendDtmf(digit)` exists but is **not yet forwarded to the network**
(see pitfalls).
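A mute button usually wants a toggle rather than an explicit boolean, and a keypad wants to reject invalid keys before calling `sendDtmf`. Both helpers below are hypothetical; they rely only on `muted`, `mute(on)` and the documented digit set `"0"`-`"9"`, `"*"`, `"#"`.

```ts
// Structural subset of OperatorCall used by the toggle.
interface MutableCall {
  readonly muted: boolean;
  mute(on: boolean): void;
}

// Flips the microphone state and returns the state that was requested.
function toggleMute(call: MutableCall): boolean {
  const next = !call.muted;
  call.mute(next);
  return next;
}

// Exactly one character from the documented DTMF set.
const DTMF_DIGIT = /^[0-9*#]$/;

function isDtmfDigit(value: string): boolean {
  return DTMF_DIGIT.test(value);
}
```

Use the return value of `toggleMute` to update the button label, and read `call.muted` again if the UI needs the confirmed state.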

### Principle 8 — Microphone needs HTTPS + a user gesture

`getUserMedia` requires a secure context (HTTPS or `localhost`). The first
call may need a **user gesture** (a click) to resume the browser audio context
under autoplay policies. Trigger `dial()` / `accept()` from a real click handler.
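A pre-flight check lets the UI explain the problem before the operator ever clicks Dial. The sketch takes the environment as a parameter so it can be tested outside a browser; `micPrecheck` and `MicEnvironment` are hypothetical names, while `isSecureContext` and `navigator.mediaDevices.getUserMedia` are standard browser APIs.

```ts
// The subset of the browser's window object that the check reads.
interface MicEnvironment {
  isSecureContext?: boolean;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

type MicPrecheck = "ok" | "insecure-context" | "unsupported";

function micPrecheck(env: MicEnvironment): MicPrecheck {
  // Outside HTTPS or localhost the browser will not grant microphone access.
  if (env.isSecureContext !== true) return "insecure-context";
  const mediaDevices = env.navigator === undefined ? undefined : env.navigator.mediaDevices;
  if (mediaDevices === undefined || typeof mediaDevices.getUserMedia !== "function") {
    return "unsupported";
  }
  return "ok";
}
```

In the browser, pass `window` as the environment. A result of `"ok"` only means capture is possible; the operator can still deny the permission prompt, which surfaces as `MIC_PERMISSION_DENIED`.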

### Principle 9 — Clean up on logout

Call `op.goOffline()` when the operator logs out, toggles off, or the view
unmounts. It drops availability, ends any active call, and stops auto-reconnect.

---

## Compact API surface (accurate to SDK 0.3.0)

### `new AudinOperator(config)`

| Option | Type | Default | Notes |
|---|---|---|---|
| `coreUrl` | `string` | — | Audin operator service base URL, e.g. `https://core.audin.ai`. `http(s)` is upgraded to `ws(s)` internally. |
| `getToken` | `() => Promise<{ token: string; expiresAt?: string }>` | — | Fetches a fresh session token from YOUR backend. |
| `heartbeatIntervalMs` | `number` | `25000` | Presence keep-alive interval. |
| `reconnectBackoffMs` | `number[]` | `[1000,2000,5000,10000,30000]` | Presence reconnect backoff schedule. |
| `audioConstraints` | `MediaTrackConstraints` | echo cancel + noise suppress + AGC | Passed to `getUserMedia({ audio })`. |
| `logger` | `OperatorLogger` | `console` | Diagnostic sink (`debug/info/warn/error`). |

### Methods
- `listPhoneNumbers(): Promise<OperatorPhoneNumber[]>` — numbers the account owns
  (`{ id, phoneNumber, displayName }`). Use `id` for `goOnline`, `phoneNumber`
  (E.164) for the `dial` `callerId`. Throws with `code: "UNAUTHORIZED"` on a
  persistent `401`, and with `code: "REQUEST_FAILED"` on any other failure.
- `goOnline(phoneNumberIds: string[]): Promise<void>` — connect presence and
  announce availability. Call again to change the number set.
- `goOffline(): Promise<void>` — drop availability, end any active call, close
  presence (stops auto-reconnect).
- `dial(to: string, { callerId }): Promise<OperatorCall>` — place an outbound call.
- `get state: PresenceState` — `"offline" | "connecting" | "online" | "reconnecting"`.
- `get currentCall: OperatorCall | null`.
- `on / off / once(event, listener)` — typed subscription; `on` returns an
  unsubscribe function.

### Events
| Event | Payload | When |
|---|---|---|
| `presenceStateChanged` | `PresenceState` | presence channel state changes |
| `availabilityChanged` | `{ accepted: string[]; rejected: string[] }` | server confirms the numbers you went online on |
| `incomingCall` | `OperatorCall` | an inbound call is ringing |
| `callStarted` | `OperatorCall` | audio is established (after accept / dial) |
| `callEnded` | `OperatorCall` | a call terminated (inspect `endReason`) |
| `error` | `{ code, message, cause? }` | a non-fatal error |

### `OperatorCall`
```ts
interface OperatorCall {
  readonly callSid: string;
  readonly direction: "inbound" | "outbound";
  readonly from?: string;   // remote party (E.164) when known
  readonly to?: string;     // local/called number (E.164) when known
  readonly state: "ringing" | "connecting" | "active" | "ended";
  readonly endReason?: CallEndReason;
  readonly muted: boolean;

  accept(): void;                 // answer an inbound offer (no-op unless ringing)
  reject(): void;                 // decline an inbound offer (no-op unless ringing)
  mute(on: boolean): void;
  sendDtmf(digit: string): void;  // "0"-"9", "*", "#" — not yet forwarded (see pitfalls)
  hangup(): void;
}
```

### Key types
- `OperatorPhoneNumber`: `{ id: string; phoneNumber: string; displayName: string | null }`
- `CallDirection`: `"inbound" | "outbound"`
- `CallState`: `"ringing" | "connecting" | "active" | "ended"`
- `CallEndReason`: `"hangup" | "remote_hangup" | "taken_by_other" | "rejected" | "no_answer" | "failed" | "offline"`
- `PresenceState`: `"offline" | "connecting" | "online" | "reconnecting"`
- `OperatorError`: `{ code: string; message: string; cause?: unknown }`
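In a `callEnded` handler, an exhaustive switch over `CallEndReason` makes the compiler flag any reason added in a later SDK version. The union is copied from the list above; the message wording and the reading of each reason are assumptions inferred from the names, and `describeEndReason` is a hypothetical helper.

```ts
type CallEndReason =
  | "hangup"
  | "remote_hangup"
  | "taken_by_other"
  | "rejected"
  | "no_answer"
  | "failed"
  | "offline";

// Maps an end reason to operator-facing text (wording is illustrative).
function describeEndReason(reason: CallEndReason | undefined): string {
  switch (reason) {
    case "hangup":
      return "You ended the call";
    case "remote_hangup":
      return "The other party hung up";
    case "taken_by_other":
      return "Answered by another operator";
    case "rejected":
      return "Call declined";
    case "no_answer":
      return "No answer";
    case "failed":
      return "Call failed";
    case "offline":
      return "Call ended because you went offline";
    case undefined:
      return "Call ended";
    default: {
      // Compile-time exhaustiveness check: this branch is unreachable today.
      const unhandled: never = reason;
      return `Call ended (${String(unhandled)})`;
    }
  }
}
```

Call it as `describeEndReason(call.endReason)` and adjust the copy to the product's tone.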

### Error codes you'll encounter
- `MIC_PERMISSION_DENIED` — operator denied / blocked the microphone.
- `WS_ERROR` — a WebSocket-level failure.
- `UNAUTHORIZED` — token rejected (e.g. expired) — surfaced from `listPhoneNumbers`.
- `REQUEST_FAILED` — a non-401 request failure.

(Codes are stable strings; new ones may be added in minor releases — match on the
ones you handle and log the rest.)
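"Match on the ones you handle and log the rest" can be sketched as a small classifier. The four codes are the ones listed above; the action names and `classifyError` itself are hypothetical and should map to whatever the developer's UI can actually do.

```ts
interface OperatorError {
  code: string;
  message: string;
  cause?: unknown;
}

// Hypothetical UI reactions; rename to fit the app.
type ErrorAction = "prompt-mic-permission" | "refresh-session" | "retry-later" | "log-only";

function classifyError(e: OperatorError): ErrorAction {
  switch (e.code) {
    case "MIC_PERMISSION_DENIED":
      return "prompt-mic-permission";
    case "UNAUTHORIZED":
      return "refresh-session";
    case "WS_ERROR":
    case "REQUEST_FAILED":
      return "retry-later";
    default:
      // Unknown or future codes: keep working and record them.
      return "log-only";
  }
}
```

Wire it as `op.on("error", (e) => react(classifyError(e), e))`, where `react` is the app's own dispatcher, and always log `e.code` and `e.message` whatever the action.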

---

## Pitfalls & MVP limits (0.3.0)

- **One active call per operator.** While a call is live, incoming offers are
  auto-declined (no `incomingCall` event) and `dial()` rejects. Design the UI so
  the operator finishes / hangs up before the next call.
- **`sendDtmf` is not yet forwarded to the network.** The digit is validated and
  sent on the call channel but the far end does not hear it today — do not rely on
  it for IVR navigation. The method is kept so enabling it later needs no SDK
  change.
- **No ringback on outbound.** While an outbound call is connecting, the operator
  hears silence (no dial tone). Reflect "connecting…" in your own UI.
- **No mid-call re-routing.** A live call is never moved to another operator; if
  the connection drops, the only recovery is a grace period in which to reconnect.
- **Microphone / secure context.** Mic capture fails outside HTTPS/`localhost`;
  handle `MIC_PERMISSION_DENIED` and prompt the operator to allow the mic. The
  first audio may need a user gesture.
- **Always fetch a fresh token.** Don't cache and reuse an expired token — the
  SDK calls `getToken` on every (re)connect for exactly this reason. A stale
  token surfaces as `UNAUTHORIZED`.
- **Caller ID must be an owned, active number.** `dial`'s `callerId` must be a
  number the account owns (use a `phoneNumber` from `listPhoneNumbers()`).
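For the one-active-call limit, the UI can disable its Dial button instead of letting `dial()` reject. This sketch assumes `currentCall` is `null` when idle and treats a lingering `"ended"` call as idle too; `canStartCall` and `CallSnapshot` are hypothetical names.

```ts
// Structural subset of OperatorCall used by the guard.
interface CallSnapshot {
  readonly state: "ringing" | "connecting" | "active" | "ended";
}

// True when no call is in progress, so a new outbound call may be attempted.
function canStartCall(currentCall: CallSnapshot | null): boolean {
  return currentCall === null || currentCall.state === "ended";
}
```

Evaluate `canStartCall(op.currentCall)` whenever `callStarted`, `callEnded` or `incomingCall` fires, and bind the result to the button's disabled state.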

---

## References

- Operator SDK overview — https://doc.audin.ai/docs/operator-sdk
- Token flow & backend setup — https://doc.audin.ai/docs/operator-sdk/token-flow
- Quickstart — https://doc.audin.ai/docs/operator-sdk/quickstart
- Numbers & presence — https://doc.audin.ai/docs/operator-sdk/numeri-presenza
- Inbound calls — https://doc.audin.ai/docs/operator-sdk/chiamate-in-entrata
- Outbound calls — https://doc.audin.ai/docs/operator-sdk/chiamate-in-uscita
- Audio & microphone — https://doc.audin.ai/docs/operator-sdk/audio-microfono
- API reference — https://doc.audin.ai/docs/operator-sdk/api-reference
- Troubleshooting — https://doc.audin.ai/docs/operator-sdk/troubleshooting
- npm package — https://www.npmjs.com/package/@audin.ai/operator-sdk

Covers SDK **0.3.0** / wire protocol **v1**. Pin a compatible range (`^0.3`, which
while pre-1.0 resolves to `0.3.x` only) and read the changelog before upgrading to
a new minor or major version.
