Remove pipeline todo checklist

Add pipeline integration tests
Add stale RAG provider cleanup
2026-05-18 22:43:51 +03:00 · 2026-05-18 22:09:44 +03:00 · 2026-05-18 21:27:41 +03:00 · 2026-05-18 20:58:19 +03:00 · 2026-05-18 20:43:35 +03:00 · 2026-05-18 20:31:04 +03:00
73 changed files with 3826 additions and 1090 deletions
@@ -1,155 +0,0 @@
-# User Request Pipeline TODO
-
-This checklist describes the remaining tasks for bringing the pipeline to a clean architecture. The current state already works: there is `UserRequestPipeline`, stage audit, `ai_requests`, internal artifacts, a unified size gate, RAG/STT/final/error/tool-result artifacts, and a response pipeline. The tasks listed below still need to be done to remove the remaining architectural compromises.
-
-## 1. Normalize storage of attachments, artifacts, and audit
-
- [x] Create a separate `attachments` table.
- [x] `attachments` fields: `id`, `messageChatId`, `messageId`, `direction`, `scope`, `kind`, `artifactKind`, `fileId`, `fileUniqueId`, `fileName`, `mimeType`, `cachePath`, `sizeBytes`, `sha256`, `metadata`, `createdAt`.
- [x] Create a separate `artifacts` table.
- [x] `artifacts` fields: `id`, `requestId`, `messageChatId`, `messageId`, `kind`, `stage`, `attachmentId`, `payload`, `createdAt`.
- [x] Create a separate `request_audit` table.
- [x] `request_audit` fields: `id`, `requestId`, `messageChatId`, `messageId`, `stage`, `status`, `startedAt`, `finishedAt`, `durationMs`, `provider`, `model`, `details`, `error`.
- [x] Keep backward compatibility with the current JSON fields `messages.attachments` and `messages.pipelineAudit`.
- [x] Add a migration that moves existing `messages.attachments` into the new `attachments` table.
- [x] Add a migration that moves existing `messages.pipelineAudit` into the new `request_audit` table.
- [x] Update backup/export/import so that the new tables are included in the JSON and SQL dumps.
- [x] Add a DAO/store layer: `AttachmentStore`, `ArtifactStore`, `RequestAuditStore`.
- [x] Switch new writes to the normalized tables.
- [x] Keep reading legacy JSON only as a fallback.
-
-## 2. Provide a single ArtifactStore API
-
- [x] Introduce `ArtifactStore.put(...)`.
- [x] Introduce `ArtifactStore.getByRequestId(requestId)`.
- [x] Introduce `ArtifactStore.getByMessage(chatId, messageId)`.
- [x] Introduce `ArtifactStore.getLatestRagForReplyChain(chatId, messageId)`.
- [x] Introduce `ArtifactStore.getTranscriptForMessage(chatId, messageId)`.
- [x] Migrate `rag-artifact-store.ts` to `ArtifactStore`.
- [x] Migrate `transcript-artifact-store.ts` to `ArtifactStore`.
- [x] Migrate `final-response-artifact-store.ts` to `ArtifactStore`.
- [x] Migrate `tool-result-artifact-store.ts` to `ArtifactStore`.
- [x] Keep physical JSON files as the storage backend for payloads, but register them in the DB.
- [x] Add a unified size gate for artifact payloads before the file is written.
- [x] Add a cleanup policy for temporary/stale artifact files.
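
The `ArtifactStore` API listed in section 2 can be pictured with a small in-memory sketch. This is not the store from this change: the class name `InMemoryArtifactStore`, the character-based size limit, and the reduced record shape are assumptions made for illustration, and only `put`, `getByRequestId`, and `getByMessage` are covered.

```typescript
// Hypothetical record shape: a subset of the `artifacts` fields named in the checklist.
type ArtifactRecord = {
    id: number;
    requestId: string;
    messageChatId: number;
    messageId: number;
    kind: string;
    stage: string;
    payload: unknown;
    createdAt: string;
};

type ArtifactInput = Omit<ArtifactRecord, "id" | "createdAt">;

class InMemoryArtifactStore {
    private readonly records: ArtifactRecord[] = [];
    private readonly maxPayloadChars: number;
    private nextId = 1;

    constructor(maxPayloadChars: number) {
        this.maxPayloadChars = maxPayloadChars;
    }

    // Size gate runs before anything is stored (assumption: size is measured in serialized characters).
    put(input: ArtifactInput): ArtifactRecord {
        const serialized = JSON.stringify(input.payload ?? null);
        if (serialized.length > this.maxPayloadChars) {
            throw new Error(`artifact payload too large: ${serialized.length} chars`);
        }
        const record: ArtifactRecord = {
            ...input,
            id: this.nextId++,
            createdAt: new Date().toISOString(),
        };
        this.records.push(record);
        return record;
    }

    getByRequestId(requestId: string): ArtifactRecord[] {
        return this.records.filter(record => record.requestId === requestId);
    }

    getByMessage(chatId: number, messageId: number): ArtifactRecord[] {
        return this.records.filter(
            record => record.messageChatId === chatId && record.messageId === messageId,
        );
    }
}
```

A store like this is mostly useful as the in-memory fixture that section 9 asks for; the real implementation would register payload files in the DB instead of keeping them in an array.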
-
-## 3. Extend RAG artifact content
-
- [x] Extend the shared `RagArtifact` type.
- [x] For Ollama, save extracted documents.
- [x] For Ollama, save selected chunks.
- [x] For Ollama, save chunk scores.
- [x] For Ollama, save skipped documents and the reasons they were skipped.
- [x] For Ollama, save the embedding model, `topK`, `chunkSize`, `chunkOverlap`, `maxContextChars`.
- [x] For OpenAI, save `vectorStoreIds`.
- [x] For OpenAI, save the source file mapping: local attachment -> uploaded/vector store file.
- [x] For Mistral, save `libraryId`.
- [x] For Mistral, save uploaded document ids.
- [x] For Mistral, save the source file mapping: local attachment -> Mistral document id.
- [ ] Add a single `providerState` schema for all providers.
- [ ] Add tests for `RagArtifact` serialization.
- [ ] Add tests that internal RAG artifacts do not leak back into the user document context.
-
-## 4. Move provider runners into an adapter layer
-
- [ ] Introduce the `AiProviderAdapter` interface.
- [ ] Adapter methods: `mapMessages`, `rankTools`, `callModel`, `extractTextDelta`, `extractToolCalls`, `appendToolResults`, `finalize`.
- [ ] Implement `OpenAiProviderAdapter`.
- [ ] Implement `MistralProviderAdapter`.
- [ ] Implement `OllamaProviderAdapter`.
- [ ] Move provider-specific tool schema mapping into the adapters.
- [ ] Move provider-specific streaming parsing into the adapters.
- [ ] Move provider-specific tool result append into the adapters.
- [ ] Simplify `runOpenAi`, `runMistral`, `runOllama` or replace them with an adapter-driven runner.
- [ ] Keep compatibility wrappers for the current imports.
- [ ] Add tests for the adapter contract without real APIs.
-
-## 5. Make tool-ranker a full pipeline stage
-
- [ ] Move the `ToolRanker.selectTools(...)` call out of the provider runners.
- [ ] Add a `tool_rank` stage that works through the provider adapter.
- [ ] Add a `filter_tools` stage that filters provider-specific tools based on the ranker result.
- [ ] Store `ToolRankDecision` in `UserRequestPipelineState.toolRankDecisions`.
- [ ] Save `ToolRankDecision` to `request_audit.details`.
- [ ] Remove the duplicate manual `tool-rank-audit.ts` if the stage fully replaces it.
- [ ] Keep the status UX: `🧩 Выбираю подходящие инструменты...`.
- [ ] Guarantee `clearStatus()` after ranker success/failure.
- [ ] Add a fallback via `PipelineFallbackExecutor`: main model, all tools, no tools.
- [ ] Add tests for the ranker fallback policy.
-
-## 6. Make model_call and tool_loop physically separate stages
-
- [ ] The `model_call` stage must make only one model request.
- [ ] The `model_call` stage must return normalized model output.
- [ ] The `tool_loop` stage must decide whether there are tool calls.
- [ ] The `tool_loop` stage must execute tools through the shared `executeToolBatch`.
- [ ] The `tool_loop` stage must add tool results to the provider adapter.
- [ ] The `tool_loop` stage must manage max rounds.
- [ ] The `tool_loop` stage must save tool result artifacts.
- [ ] The `tool_loop` stage must be able to finish without tools as `skipped`.
- [ ] Remove the tool loop from `runOpenAi`.
- [ ] Remove the tool loop from `runMistral`.
- [ ] Remove the tool loop from `runOllama`.
- [ ] Add tests with a multi-round fake adapter.
-
-## 7. Bring fallback notifications to a centralized UX
-
- [ ] Add `PipelineFallbackNotifier`.
- [ ] For `notify_user`, send the user a clear message.
- [ ] For `continue_without_stage`, write a short debug/audit entry without a user notification.
- [ ] For `use_alternate_target`, log the original and alternate targets.
- [ ] For `fail_request`, end the request through a single error path.
- [ ] Add localization for fallback messages.
- [ ] Add separate texts for RAG failure, STT failure, TTS failure, and tool failure.
- [ ] Do not spam the user with multiple fallback notifications per request.
- [ ] Save the fallback notification to `request_audit.details`.
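
Two of the items in section 7, per-stage texts and at most one notification per request, can be sketched together. The locale keys below are the `pipelineFallback.*` keys this change adds to the locale files; everything else (the stage names, `fallbackMessageKey`, and `OncePerRequestNotifier`) is hypothetical and only illustrates the idea, not the actual `PipelineFallbackNotifier`.

```typescript
// Assumption: stage names match the suffixes of the locale keys added in this change.
const FALLBACK_KEYS = new Map<string, string>([
    ["documentRag", "pipelineFallback.documentRag"],
    ["speechToText", "pipelineFallback.speechToText"],
    ["textToSpeech", "pipelineFallback.textToSpeech"],
    ["toolLoop", "pipelineFallback.toolLoop"],
]);

// Unknown stages fall back to the generic text.
function fallbackMessageKey(stage: string): string {
    return FALLBACK_KEYS.get(stage) ?? "pipelineFallback.generic";
}

// Hypothetical notifier: sends the first fallback message of a request and drops the rest.
class OncePerRequestNotifier {
    private readonly notifiedRequests = new Set<string>();
    private readonly send: (localeKey: string) => void;

    constructor(send: (localeKey: string) => void) {
        this.send = send;
    }

    // Returns true when a message was actually sent.
    notify(requestId: string, stage: string): boolean {
        if (this.notifiedRequests.has(requestId)) return false;
        this.notifiedRequests.add(requestId);
        this.send(fallbackMessageKey(stage));
        return true;
    }
}
```

In a long-running bot the set of notified request ids would need to be cleared when a request finishes; that bookkeeping is left out here.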
-
-## 8. Improve reply-chain behavior with documents
-
- [ ] Explicitly describe the merge strategy: current user attachments + reply-chain user attachments.
- [ ] Always exclude `scope: internal_artifact`.
- [ ] Exclude `scope: bot_output` unless it is a user-provided file.
- [ ] If the user replies with a new document to a bot answer that involved a previous document, use both documents.
- [ ] If the user replies with text to a bot answer, use the documents from the reply chain.
- [ ] If the user explicitly says "этот файл" ("this file"), prioritize the new attachment.
- [ ] If there are several documents, add their names to the prompt/RAG context.
- [ ] Add tests for a follow-up with a new document.
- [ ] Add tests for a follow-up without a new document.
- [ ] Add tests that RAG internal JSON does not become a user document.
-
-## 9. Integration tests without real Telegram/AI APIs
-
- [ ] Create a fake `TelegramStreamMessage`.
- [ ] Create a fake provider adapter.
- [ ] Create a fake message store or in-memory DB fixture.
- [ ] Test: oversized input attachment rejected before download.
- [ ] Test: document input creates RAG artifact.
- [ ] Test: voice input creates transcript artifact.
- [ ] Test: final answer creates final_text artifact.
- [ ] Test: thrown error creates error artifact.
- [ ] Test: tool call creates tool_result artifact.
- [ ] Test: generated file creates generated_file artifact.
- [ ] Test: TTS requested creates tts_audio artifact.
- [ ] Test: fallback `continue_without_stage` continues request.
- [ ] Test: fallback `fail_request` stops request.
-
-## 10. Operational cleanup and observability
-
- [ ] Add retention policy for `data/cache/internal-artifacts`.
- [ ] Add retention policy for stale RAG vector/library provider state.
- [ ] Add command or admin view for recent `ai_requests`.
- [ ] Add command or admin view for request audit by message id.
- [ ] Add command to inspect artifacts for a message.
- [ ] Add log correlation by `requestId` across AI logs, tool logs and DB audit.
- [ ] Add metrics counters: requests, failures, fallbacks, tool calls, RAG runs, TTS runs.
- [ ] Add startup migration logs for `ai_requests`, `attachments`, `artifacts`, `request_audit`.
-
-## Suggested order
-
- [x] 1. Normalize DB tables: `attachments`, `artifacts`, `request_audit`.
- [ ] 2. Build `ArtifactStore` and migrate current artifact helpers to it.
- [ ] 3. Add fake integration tests for reply-chain documents and artifacts.
- [ ] 4. Introduce provider adapter interface.
- [ ] 5. Move `tool_rank` into pipeline stage.
- [ ] 6. Split `model_call` and `tool_loop` physically.
- [ ] 7. Add centralized fallback user notifications.
@@ -8,6 +8,13 @@
  },
  "providerChoice.default": "Default",
  "errorText": "⚠️ An error occurred.",
+  "pipelineFallback.generic": "⚠️ I had to skip part of the request, but I can continue.",
+  "pipelineFallback.notifyUser": "⚠️ I hit a problem and need to continue with a fallback.",
+  "pipelineFallback.failRequest": "⚠️ I could not finish this request.",
+  "pipelineFallback.documentRag": "⚠️ Document retrieval failed, so I will answer without RAG.",
+  "pipelineFallback.speechToText": "⚠️ Speech transcription failed, so I will continue without the audio transcript.",
+  "pipelineFallback.textToSpeech": "⚠️ Text-to-speech failed, so I will continue without audio output.",
+  "pipelineFallback.toolLoop": "⚠️ Tool execution failed, so I will continue without that tool.",
  "waitThinkText": "⏳ Let me think...",
  "analyzingPictureText": "🔍 Analyzing the image...",
  "analyzingPicturesText": "🔍 Analyzing the images...",
@@ -176,6 +183,9 @@
  "getWhenPluralUnitText": "{unit}s",
  "getWhenDurationText": "{prefix}{value} {unit}",
  "commandDescriptions": {
+    "aiAudit": "Inspect AI request audit and artifacts",
+    "aiMetrics": "Show AI observability counters",
+    "aiRequests": "Show recent AI requests",
    "ae": "evaluation",
    "adminsAdd": "Add user to admins",
    "adminsRemove": "Remove user from admins",
@@ -8,6 +8,13 @@
  },
  "providerChoice.default": "По умолчанию",
  "errorText": "⚠️ Произошла ошибка.",
+  "pipelineFallback.generic": "⚠️ Мне пришлось пропустить часть запроса, но я могу продолжить.",
+  "pipelineFallback.notifyUser": "⚠️ Возникла проблема, и я продолжу с запасным вариантом.",
+  "pipelineFallback.failRequest": "⚠️ Я не смог завершить этот запрос.",
+  "pipelineFallback.documentRag": "⚠️ Не удалось получить документы, поэтому я отвечу без RAG.",
+  "pipelineFallback.speechToText": "⚠️ Не удалось распознать речь, поэтому я продолжу без расшифровки аудио.",
+  "pipelineFallback.textToSpeech": "⚠️ Не удалось выполнить синтез речи, поэтому я продолжу без аудио.",
+  "pipelineFallback.toolLoop": "⚠️ Не удалось выполнить инструменты, поэтому я продолжу без них.",
  "waitThinkText": "⏳ Дайте-ка подумать...",
  "analyzingPictureText": "🔍 Анализирую изображение...",
  "analyzingPicturesText": "🔍 Анализирую изображения...",
@@ -202,6 +209,9 @@
  "getWhenPluralUnitText": "{unit}",
  "getWhenDurationText": "{prefix}{value} {unit}",
  "commandDescriptions": {
+    "aiRequests": "Показать последние AI-запросы",
+    "aiAudit": "Показать аудит AI-запроса и артефакты",
+    "aiMetrics": "Показать счётчики AI-обсервабилити",
    "ae": "вычисление",
    "adminsAdd": "Добавить пользователя в администраторы",
    "adminsRemove": "Удалить пользователя из администраторов",
@@ -8,6 +8,13 @@
  },
  "providerChoice.default": "За замовчуванням",
  "errorText": "⚠️ Сталася помилка.",
+  "pipelineFallback.generic": "⚠️ Мені довелося пропустити частину запиту, але я можу продовжити.",
+  "pipelineFallback.notifyUser": "⚠️ Виникла проблема, і я продовжу із запасним варіантом.",
+  "pipelineFallback.failRequest": "⚠️ Я не зміг завершити цей запит.",
+  "pipelineFallback.documentRag": "⚠️ Не вдалося отримати документи, тому я відповім без RAG.",
+  "pipelineFallback.speechToText": "⚠️ Не вдалося розпізнати мовлення, тому я продовжу без розшифровки аудіо.",
+  "pipelineFallback.textToSpeech": "⚠️ Не вдалося виконати синтез мовлення, тому я продовжу без аудіо.",
+  "pipelineFallback.toolLoop": "⚠️ Не вдалося виконати інструменти, тому я продовжу без них.",
  "waitThinkText": "⏳ Дайте-но подумати...",
  "analyzingPictureText": "🔍 Аналізую зображення...",
  "analyzingPicturesText": "🔍 Аналізую зображення...",
@@ -201,6 +208,9 @@
  "getWhenPluralUnitText": "{unit}",
  "getWhenDurationText": "{prefix}{value} {unit}",
  "commandDescriptions": {
+    "aiRequests": "Показати останні AI-запити",
+    "aiAudit": "Показати аудит AI-запиту та артефакти",
+    "aiMetrics": "Показати лічильники AI-спостережуваності",
    "help": "Показати список команд",
    "settings": "Налаштування користувача",
    "start": "Запустити бота",
@@ -1,9 +1,9 @@
 import {Mistral} from "@mistralai/mistralai";
 import {Ollama} from "ollama";
 import {OpenAI} from "openai";
-import {Environment} from "../common/environment";
-import {AiModelCapabilities} from "../model/ai-model-capabilities";
-import {AiProvider} from "../model/ai-provider";
+import {Environment} from "../common/environment.js";
+import {AiModelCapabilities} from "../model/ai-model-capabilities.js";
+import {AiProvider} from "../model/ai-provider.js";

 export type AiCapabilityName = keyof AiModelCapabilities;
 export type AiRuntimePurpose = AiCapabilityName | "chat";
@@ -32,6 +32,7 @@ export type ConversationTurn = {
    content: string;
    deletedByBotAt?: number | null;
    attachments: ConversationAttachment[];
+    documentNames?: string[];
 };

 export type ConversationSnapshot = {
@@ -123,6 +124,13 @@ function attachmentSummary(attachments: ConversationAttachment[]): string {
    return ["[attachments]:", ...lines].join("\n");
 }

+function namesSummary(kind: string, names: string[]): string {
+    const filtered = names.map(name => name.trim()).filter(Boolean);
+    if (!filtered.length) return "";
+
+    return [`[${kind}]:`, ...filtered.map(name => `- ${name}`)].join("\n");
+}
+
 function supportedAttachmentKinds(provider: AiProvider, bot: boolean): Set<AttachmentKind> {
    if (bot) return new Set<AttachmentKind>();

@@ -160,6 +168,10 @@ function renderContentText(
        parts.push("[message_state]: deleted_by_bot");
    }

+    if (turn.documentNames?.length) {
+        parts.push(namesSummary("documents", turn.documentNames));
+    }
+
    if (unsupported.length) {
        parts.push(attachmentSummary(unsupported));
    }
@@ -291,6 +303,7 @@ export async function buildConversationSnapshot(
            content: part.content,
            deletedByBotAt: part.deletedByBotAt,
            attachments: buildConversationAttachments(part),
+            documentNames: part.documentNames,
        }));

    const imageCount = turns.reduce((sum, turn) => {
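
To make the effect of the new `documentNames` field concrete, here is a standalone copy of the `namesSummary` helper from the hunk above together with a sample call. The file names are invented for the example.

```typescript
// Copied from the diff: renders a labelled bullet list, skipping blank names.
function namesSummary(kind: string, names: string[]): string {
    const filtered = names.map(name => name.trim()).filter(Boolean);
    if (!filtered.length) return "";

    return [`[${kind}]:`, ...filtered.map(name => `- ${name}`)].join("\n");
}

// Hypothetical input: one padded name, one empty entry, one clean name.
const rendered = namesSummary("documents", [" report.pdf ", "", "notes.txt"]);
console.log(rendered);
// [documents]:
// - report.pdf
// - notes.txt
```

When every name is blank the helper returns an empty string. Note that `renderContentText` only checks `turn.documentNames?.length`, so a list of blank names would still push that empty string into `parts`.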
@@ -0,0 +1,5 @@
+export async function runSingleModelRequest<T>(params: {
+    execute: () => Promise<T>;
+}): Promise<T> {
+    return await params.execute();
+}
@@ -0,0 +1,112 @@
+import type {ToolCallData} from "./unified-ai-runner.shared.js";
+import type {ResponseStreamEvent} from "openai/resources/responses/responses";
+
+function isRecord(value: unknown): value is Record<string, unknown> {
+    return !!value && typeof value === "object" && !Array.isArray(value);
+}
+
+function normalizeToolCallId(value: unknown, fallback: string): string {
+    return typeof value === "string" && value.trim().length > 0 ? value : fallback;
+}
+
+function normalizeToolArguments(value: unknown): string {
+    if (typeof value === "string") return value;
+    return JSON.stringify(value ?? {});
+}
+
+export function extractOpenAiToolCalls(response: unknown): ToolCallData[] {
+    const output = isRecord(response) && Array.isArray(response.output) ? response.output : [];
+
+    return output
+        .filter(item => isRecord(item) && item.type === "function_call" && (typeof item.call_id === "string" || typeof item.name === "string"))
+        .map((item, index) => ({
+            id: normalizeToolCallId(item.call_id, `openai_${index}`),
+            name: typeof item.name === "string" ? item.name : "",
+            argumentsText: normalizeToolArguments(item.arguments),
+        }))
+        .filter(call => call.name.length > 0);
+}
+
+export function extractOpenAiTextDelta(input: unknown): string {
+    const event = input as ResponseStreamEvent | undefined;
+    return event?.type === "response.output_text.delta" ? event.delta ?? "" : "";
+}
+
+export function extractOpenAiStreamingToolCalls(input: unknown): ToolCallData[] {
+    const event = input as ResponseStreamEvent | undefined;
+    if (event?.type === "response.output_item.added" && isRecord(event.item) && event.item.type === "function_call") {
+        return extractOpenAiToolCalls({
+            output: [{
+                type: "function_call",
+                call_id: event.item.call_id ?? event.item.id,
+                name: event.item.name,
+                arguments: event.item.arguments,
+            }],
+        });
+    }
+
+    return [];
+}
+
+export function extractMistralToolCalls(calls: unknown): ToolCallData[] {
+    const normalized = Array.isArray(calls)
+        ? calls
+        : isRecord(calls) && (Array.isArray(calls.toolCalls) || Array.isArray(calls.tool_calls))
+            ? (calls.toolCalls ?? calls.tool_calls)
+            : [];
+
+    if (!Array.isArray(normalized)) return [];
+
+    return normalized
+        .map((item, index) => {
+            const call = isRecord(item) ? item : {};
+            const fn = isRecord(call.function) ? call.function : undefined;
+            const name = typeof fn?.name === "string" ? fn.name : typeof call.name === "string" ? call.name : "";
+            return {
+                id: normalizeToolCallId(call.id, `mistral_${index}`),
+                name,
+                argumentsText: normalizeToolArguments(fn?.arguments ?? call.arguments),
+            };
+        })
+        .filter(call => call.name.length > 0);
+}
+
+export function extractMistralTextDelta(input: unknown): string {
+    const delta = isRecord(input) ? input : {};
+    const content = delta.content;
+    if (typeof content === "string") return content;
+    if (Array.isArray(content)) {
+        return content
+            .map(part => isRecord(part) && typeof part.text === "string" ? part.text : "")
+            .join("");
+    }
+    return "";
+}
+
+export function extractOllamaToolCalls(calls: unknown): ToolCallData[] {
+    const normalized = Array.isArray(calls)
+        ? calls
+        : isRecord(calls) && Array.isArray(calls.tool_calls)
+            ? calls.tool_calls
+            : [];
+
+    if (!Array.isArray(normalized)) return [];
+
+    return normalized
+        .map((item, index) => {
+            const call = isRecord(item) ? item : {};
+            const fn = isRecord(call.function) ? call.function : undefined;
+            const name = typeof fn?.name === "string" ? fn.name : typeof call.name === "string" ? call.name : "";
+            return {
+                id: normalizeToolCallId(call.id, `ollama_${index}`),
+                name,
+                argumentsText: normalizeToolArguments(fn?.arguments ?? call.arguments),
+            };
+        })
+        .filter(call => call.name.length > 0);
+}
+
+export function extractOllamaTextDelta(input: unknown): string {
+    const chunk = isRecord(input) ? input.message : undefined;
+    return isRecord(chunk) && typeof chunk.content === "string" ? chunk.content : "";
+}
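
`extractMistralToolCalls` and `extractOllamaToolCalls` above differ only in the id prefix and in which property holds the call list. The sketch below shows how that shared normalization behaves on fake payloads, without any provider SDK. The helper `normalizeFunctionStyleToolCalls` and the local `ToolCallData` shape are assumptions (the real type lives in `unified-ai-runner.shared.js`, which this diff does not show).

```typescript
// Assumed shape, inferred from the fields the extractors in the diff populate.
type ToolCallData = {id: string; name: string; argumentsText: string};

function isRecord(value: unknown): value is Record<string, unknown> {
    return !!value && typeof value === "object" && !Array.isArray(value);
}

const asString = (value: unknown): string | undefined =>
    typeof value === "string" ? value : undefined;

// Hypothetical shared helper: accepts either a bare array or an object holding the list.
function normalizeFunctionStyleToolCalls(
    calls: unknown,
    idPrefix: string,
    listKeys: string[],
): ToolCallData[] {
    let list: unknown[] = [];
    if (Array.isArray(calls)) {
        list = calls;
    } else if (isRecord(calls)) {
        for (const key of listKeys) {
            const candidate = calls[key];
            if (Array.isArray(candidate)) {
                list = candidate;
                break;
            }
        }
    }

    return list
        .map((item, index) => {
            const call: Record<string, unknown> = isRecord(item) ? item : {};
            const fn = isRecord(call.function) ? call.function : undefined;
            const id = asString(call.id);
            const rawArguments = fn?.arguments ?? call.arguments;
            return {
                // Missing or blank ids get a positional fallback, as in the diff.
                id: id && id.trim().length > 0 ? id : `${idPrefix}_${index}`,
                name: asString(fn?.name) ?? asString(call.name) ?? "",
                // Object arguments are serialized; string arguments pass through.
                argumentsText: typeof rawArguments === "string"
                    ? rawArguments
                    : JSON.stringify(rawArguments ?? {}),
            };
        })
        .filter(call => call.name.length > 0);
}

// Fake Ollama-style payload: object arguments, no id, plus one nameless call that is dropped.
const ollamaCalls = normalizeFunctionStyleToolCalls(
    {tool_calls: [{function: {name: "search", arguments: {q: "x"}}}, {function: {}}]},
    "ollama",
    ["tool_calls"],
);
```

Payloads like these are the kind of input an adapter contract test could feed to the extractors without calling a real API.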
@@ -0,0 +1,196 @@
+import {AiProvider} from "../model/ai-provider.js";
+import type {BoundaryValue} from "../common/boundary-types.js";
+import type {RuntimeConfigSnapshot, ToolCallData} from "./unified-ai-runner.shared.js";
+import {getMistralTools, getOllamaTools, getOpenAIResponsesTools, getOpenAICodeInterpreterTool} from "./tool-mappers.js";
+import type {MistralChatMessage as MistralMessageType} from "./mistral-chat-message.js";
+import type {OpenAIChatMessage as OpenAiMessageType} from "./openai-chat-message.js";
+import type {Message as OllamaMessage} from "ollama";
+import {
+    extractMistralTextDelta,
+    extractMistralToolCalls,
+    extractOllamaTextDelta,
+    extractOllamaToolCalls,
+    extractOpenAiTextDelta,
+    extractOpenAiStreamingToolCalls,
+    extractOpenAiToolCalls,
+} from "./provider-adapter-contract.js";
+
+export type ProviderRankToolOptions = {
+    forCreator?: boolean;
+    vectorStoreIds?: string[];
+};
+
+export interface AiProviderAdapter {
+    readonly provider: AiProvider;
+    mapMessages(messages: readonly unknown[]): unknown[];
+    rankTools(config: RuntimeConfigSnapshot, options?: ProviderRankToolOptions): readonly BoundaryValue[];
+    callModel<T>(request: unknown, execute: () => Promise<T>): Promise<T>;
+    extractTextDelta(input: unknown): string;
+    extractToolCalls(input: unknown): ToolCallData[];
+    extractStreamingToolCalls(input: unknown): ToolCallData[];
+    appendToolResults(messages: unknown[], calls: ToolCallData[], results: string[]): void;
+    finalize(): Promise<void>;
+}
+
+function appendOllamaToolResults(messages: unknown[], calls: ToolCallData[], results: string[]): void {
+    for (const [index, call] of calls.entries()) {
+        messages.push({
+            role: "tool",
+            content: results[index] ?? "",
+            tool_name: call.name,
+        });
+    }
+}
+
+class OpenAiProviderAdapter implements AiProviderAdapter {
+    readonly provider = AiProvider.OPENAI;
+
+    mapMessages(messages: readonly unknown[]): unknown[] {
+        return messages as OpenAiMessageType[];
+    }
+
+    rankTools(config: RuntimeConfigSnapshot, options?: ProviderRankToolOptions): readonly BoundaryValue[] {
+        const tools: BoundaryValue[] = [
+            ...getOpenAIResponsesTools(options?.forCreator) as BoundaryValue[],
+            getOpenAICodeInterpreterTool() as BoundaryValue,
+            {
+                type: "image_generation",
+                model: config.openAiImageTarget.model,
+                size: "auto",
+                moderation: "low",
+                output_format: "png",
+                partial_images: 3,
+            },
+            {type: "web_search"},
+        ];
+
+        if (options?.vectorStoreIds?.length) {
+            tools.unshift({
+                type: "file_search",
+                vector_store_ids: options.vectorStoreIds,
+            });
+        }
+
+        return tools;
+    }
+
+    async callModel<T>(_request: unknown, execute: () => Promise<T>): Promise<T> {
+        return execute();
+    }
+
+    extractTextDelta(input: unknown): string {
+        return extractOpenAiTextDelta(input);
+    }
+
+    extractToolCalls(input: unknown): ToolCallData[] {
+        return extractOpenAiToolCalls(input);
+    }
+
+    extractStreamingToolCalls(input: unknown): ToolCallData[] {
+        return extractOpenAiStreamingToolCalls(input);
+    }
+
+    appendToolResults(messages: unknown[], calls: ToolCallData[], results: string[]): void {
+        for (const [index, call] of calls.entries()) {
+            messages.push({
+                type: "function_call_output",
+                call_id: call.id,
+                output: results[index] ?? "",
+            });
+        }
+    }
+
+    async finalize(): Promise<void> {
+        return;
+    }
+}
+
+class MistralProviderAdapter implements AiProviderAdapter {
+    readonly provider = AiProvider.MISTRAL;
+
+    mapMessages(messages: readonly unknown[]): unknown[] {
+        return messages as MistralMessageType[];
+    }
+
+    rankTools(_config: RuntimeConfigSnapshot, options?: ProviderRankToolOptions): readonly BoundaryValue[] {
+        return getMistralTools(options?.forCreator) as BoundaryValue[];
+    }
+
+    async callModel<T>(_request: unknown, execute: () => Promise<T>): Promise<T> {
+        return execute();
+    }
+
+    extractTextDelta(input: unknown): string {
+        return extractMistralTextDelta(input);
+    }
+
+    extractToolCalls(input: unknown): ToolCallData[] {
+        return extractMistralToolCalls(input);
+    }
+
+    extractStreamingToolCalls(input: unknown): ToolCallData[] {
+        return this.extractToolCalls(input);
+    }
+
+    appendToolResults(messages: unknown[], calls: ToolCallData[], results: string[]): void {
+        for (const [index, call] of calls.entries()) {
+            messages.push({
+                role: "tool",
+                name: call.name,
+                toolCallId: call.id,
+                content: results[index] ?? "",
+            });
+        }
+    }
+
+    async finalize(): Promise<void> {
+        return;
+    }
+}
+
+class OllamaProviderAdapter implements AiProviderAdapter {
+    readonly provider = AiProvider.OLLAMA;
+
+    mapMessages(messages: readonly unknown[]): unknown[] {
+        return messages as OllamaMessage[];
+    }
+
+    rankTools(_config: RuntimeConfigSnapshot, options?: ProviderRankToolOptions): readonly BoundaryValue[] {
+        return getOllamaTools(options?.forCreator) as BoundaryValue[];
+    }
+
+    async callModel<T>(_request: unknown, execute: () => Promise<T>): Promise<T> {
+        return execute();
+    }
+
+    extractTextDelta(input: unknown): string {
+        return extractOllamaTextDelta(input);
+    }
+
+    extractToolCalls(input: unknown): ToolCallData[] {
+        return extractOllamaToolCalls(input);
+    }
+
+    extractStreamingToolCalls(input: unknown): ToolCallData[] {
+        return this.extractToolCalls(input);
+    }
+
+    appendToolResults(messages: unknown[], calls: ToolCallData[], results: string[]): void {
+        appendOllamaToolResults(messages, calls, results);
+    }
+
+    async finalize(): Promise<void> {
+        return;
+    }
+}
+
+export function getProviderAdapter(provider: AiProvider): AiProviderAdapter {
+    switch (provider) {
+        case AiProvider.OPENAI:
+            return new OpenAiProviderAdapter();
+        case AiProvider.MISTRAL:
+            return new MistralProviderAdapter();
+        case AiProvider.OLLAMA:
+            return new OllamaProviderAdapter();
+    }
+}
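
The adapter surface above is what a provider-agnostic tool loop would build on. The following is a minimal sketch of such a loop driven by a fake adapter, in the spirit of the multi-round fake adapter test from the removed checklist. `runToolLoop`, `LoopAdapter`, the scripted responses, and the local `ToolCallData` shape are all hypothetical; the real `tool_loop` stage is not part of the lines shown here.

```typescript
// Assumed shape, inferred from how the adapters in the diff use it.
type ToolCallData = {id: string; name: string; argumentsText: string};

// Reduced, hypothetical subset of `AiProviderAdapter` that the loop needs.
type LoopAdapter = {
    extractToolCalls(response: unknown): ToolCallData[];
    appendToolResults(messages: unknown[], calls: ToolCallData[], results: string[]): void;
};

async function runToolLoop(params: {
    adapter: LoopAdapter;
    messages: unknown[];
    callModel: (messages: unknown[]) => Promise<unknown>;
    executeTool: (call: ToolCallData) => Promise<string>;
    maxRounds: number;
}): Promise<{response: unknown; rounds: number}> {
    let response = await params.callModel(params.messages);
    let rounds = 0;

    while (rounds < params.maxRounds) {
        const calls = params.adapter.extractToolCalls(response);
        if (!calls.length) break; // no tool calls: the loop ends

        const results = await Promise.all(calls.map(call => params.executeTool(call)));
        params.adapter.appendToolResults(params.messages, calls, results);
        rounds += 1;
        response = await params.callModel(params.messages);
    }

    return {response, rounds};
}

// Fake adapter and scripted model: one round with a tool call, then a final answer.
async function demoToolLoop() {
    const scripted: Array<{calls: ToolCallData[]; text: string}> = [
        {calls: [{id: "c1", name: "lookup", argumentsText: "{}"}], text: ""},
        {calls: [], text: "done"},
    ];
    const adapter: LoopAdapter = {
        extractToolCalls: response =>
            (response as {calls?: ToolCallData[]} | undefined)?.calls ?? [],
        appendToolResults: (messages, calls, results) => {
            for (const [index, call] of calls.entries()) {
                messages.push({role: "tool", content: results[index] ?? "", tool_name: call.name});
            }
        },
    };
    const messages: unknown[] = [];
    const result = await runToolLoop({
        adapter,
        messages,
        callModel: async () => scripted.shift(),
        executeTool: async call => `result of ${call.name}`,
        maxRounds: 3,
    });
    return {...result, messages};
}
```

Keeping `callModel` as an injected function mirrors the `callModel(request, execute)` shape in the interface: the loop itself never needs to know which SDK sits behind it.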
@@ -0,0 +1,77 @@
+import type {AiProvider} from "../model/ai-provider";
+
+export type RagArtifactSource = {
+    fileId: string;
+    fileName: string;
+    mimeType?: string;
+    sizeBytes?: number;
+    sha256?: string;
+    uploadedFileId?: string;
+    documentId?: string;
+};
+
+export type RagArtifactPayload = {
+    artifactKind: "rag";
+    provider: AiProvider;
+    createdAt: string;
+    sources: RagArtifactSource[];
+    providerState:
+        | {
+            provider: AiProvider.OPENAI;
+            vectorStoreIds: string[];
+            uploadedFileIds: string[];
+        }
+        | {
+            provider: AiProvider.MISTRAL;
+            libraryId?: string;
+            documentCount: number;
+        }
+        | {
+            provider: AiProvider.OLLAMA;
+            prepared: boolean;
+            embeddingModel?: string;
+            topK?: number;
+            chunkSize?: number;
+            chunkOverlap?: number;
+            maxContextChars?: number;
+            extractedDocuments: Array<{
+                documentIndex: number;
+                fileName: string;
+                textChars: number;
+            }>;
+            selectedChunks: Array<{
+                sourceId: string;
+                documentIndex: number;
+                documentName: string;
+                chunkIndex: number;
+                chunkCount: number;
+                textChars: number;
+                score?: number;
+            }>;
+            skippedDocuments: Array<{
+                documentIndex: number;
+                fileName: string;
+                reason: string;
+            }>;
+            query: string;
+            minScore: number;
+            maxArchiveFiles: number;
+            maxArchiveBytes: number;
+            maxArchiveDepth: number;
+        };
+};
+
+export function buildRagArtifactPayload(params: {
+    provider: AiProvider;
+    createdAt?: string;
+    sources: RagArtifactSource[];
+    providerState: RagArtifactPayload["providerState"];
+}): RagArtifactPayload {
+    return {
+        artifactKind: "rag",
+        provider: params.provider,
+        createdAt: params.createdAt ?? new Date().toISOString(),
+        sources: params.sources,
+        providerState: params.providerState,
+    };
+}
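
Because `providerState` is now a union discriminated by `provider`, consumers can narrow it with a `switch` instead of probing optional fields. The sketch below uses a trimmed-down copy of the union; the string literals stand in for the `AiProvider` enum members (whose values are not shown in this diff), and `describeProviderState` is a hypothetical consumer.

```typescript
// Trimmed copy of the union from the diff; literals replace the AiProvider enum (assumption).
type ProviderState =
    | {provider: "openai"; vectorStoreIds: string[]; uploadedFileIds: string[]}
    | {provider: "mistral"; libraryId?: string; documentCount: number}
    | {
        provider: "ollama";
        prepared: boolean;
        selectedChunks: Array<{documentName: string; score?: number}>;
    };

// Each case sees only the fields of its own variant.
function describeProviderState(state: ProviderState): string {
    switch (state.provider) {
        case "openai":
            return `openai: ${state.vectorStoreIds.length} vector store(s), ${state.uploadedFileIds.length} uploaded file(s)`;
        case "mistral":
            return `mistral: library ${state.libraryId ?? "n/a"}, ${state.documentCount} document(s)`;
        case "ollama":
            return `ollama: prepared=${state.prepared}, ${state.selectedChunks.length} chunk(s)`;
    }
}
```

With the previous all-optional shape, a field such as `vectorStoreIds` could be read from an Ollama artifact without any complaint from the compiler; with the union that access is a type error.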
@@ -4,75 +4,39 @@ import type {AiDownloadedFile} from "./telegram-attachments";
 import type {PreparedDocumentRag} from "./document-rag-pipeline";
 import type {OllamaRagArtifactDetails} from "./ollama-rag";
 import {persistInternalJsonArtifactAttachment} from "./internal-artifact-store";
-
-type RagArtifactPayload = {
-    artifactKind: "rag";
-    provider: AiProvider;
-    createdAt: string;
-    sources: Array<{
-        fileId: string;
-        fileName: string;
-        mimeType?: string;
-        sizeBytes?: number;
-        sha256?: string;
-        uploadedFileId?: string;
-        documentId?: string;
-    }>;
-    providerState: {
-        vectorStoreIds?: string[];
-        libraryId?: string;
-        documentCount?: number;
-        prepared?: boolean;
-        uploadedFileIds?: string[];
-        embeddingModel?: string;
-        topK?: number;
-        chunkSize?: number;
-        chunkOverlap?: number;
-        maxContextChars?: number;
-        extractedDocuments?: Array<{
-            documentIndex: number;
-            fileName: string;
-            textChars: number;
-        }>;
-        selectedChunks?: Array<{
-            sourceId: string;
-            documentIndex: number;
-            documentName: string;
-            chunkIndex: number;
-            chunkCount: number;
-            textChars: number;
-            score?: number;
-        }>;
-        skippedDocuments?: Array<{
-            documentIndex: number;
-            fileName: string;
-            reason: string;
-        }>;
-        query?: string;
-        ollama?: OllamaRagArtifactDetails["providerState"];
-    };
-};
+import {buildRagArtifactPayload, type RagArtifactPayload} from "./rag-artifact-payload";

 function providerState(prepared: PreparedDocumentRag, details?: NonNullable<Parameters<typeof persistRagArtifactAttachment>[0]["details"]>): RagArtifactPayload["providerState"] {
    switch (prepared.provider) {
        case AiProvider.OPENAI:
            return {
+                provider: AiProvider.OPENAI,
                vectorStoreIds: prepared.vectorStoreIds,
                uploadedFileIds: prepared.uploadedFileIds,
            };
        case AiProvider.MISTRAL:
            return {
+                provider: AiProvider.MISTRAL,
                libraryId: prepared.libraryId,
                documentCount: prepared.documents.length,
            };
        case AiProvider.OLLAMA:
            return {
+                provider: AiProvider.OLLAMA,
                prepared: prepared.prepared,
                embeddingModel: details?.embeddingModel,
                topK: details?.topK,
                chunkSize: details?.chunkSize,
                chunkOverlap: details?.chunkOverlap,
                maxContextChars: details?.maxContextChars,
+                extractedDocuments: details?.artifact?.extractedDocuments ?? [],
+                selectedChunks: details?.artifact?.selectedChunks ?? [],
+                skippedDocuments: details?.artifact?.skippedDocuments ?? [],
+                query: details?.artifact?.query ?? "",
+                minScore: details?.artifact?.providerState?.minScore ?? 0,
+                maxArchiveFiles: details?.artifact?.providerState?.maxArchiveFiles ?? 0,
+                maxArchiveBytes: details?.artifact?.providerState?.maxArchiveBytes ?? 0,
+                maxArchiveDepth: details?.artifact?.providerState?.maxArchiveDepth ?? 0,
            };
    }
 }
@@ -117,22 +81,11 @@ export async function persistRagArtifactAttachment(params: {

    if (!sources.length) return Promise.resolve(undefined);

-    const payload: RagArtifactPayload = {
-        artifactKind: "rag",
+    const payload = buildRagArtifactPayload({
        provider: params.provider,
-        createdAt: new Date().toISOString(),
        sources,
-        providerState: {
-            ...providerState(params.prepared, params.details),
-            ...(params.details?.artifact ? {
-                extractedDocuments: params.details.artifact.extractedDocuments,
-                selectedChunks: params.details.artifact.selectedChunks,
-                skippedDocuments: params.details.artifact.skippedDocuments,
-                query: params.details.artifact.query,
-                ollama: params.details.artifact.providerState,
-            } : {}),
-        },
-    };
+        providerState: providerState(params.prepared, params.details),
+    });
    return await persistInternalJsonArtifactAttachment({
        artifactKind: "rag",
        fileNamePrefix: "rag",
@@ -140,14 +93,8 @@ export async function persistRagArtifactAttachment(params: {
        messageId: params.messageId,
        payload,
        metadata: {
-            provider: params.provider,
            sourceFileNames: sources.map(source => source.fileName),
            ...payload.providerState,
-            embeddingModel: params.details?.embeddingModel,
-            topK: params.details?.topK,
-            chunkSize: params.details?.chunkSize,
-            chunkOverlap: params.details?.chunkOverlap,
-            maxContextChars: params.details?.maxContextChars,
        },
    });
 }
@@ -0,0 +1,75 @@
+import type {RagArtifactPayload} from "./rag-artifact-payload";
+
+export type ArtifactLike = {
+    id: string;
+    createdAt: string;
+    payload: string;
+};
+
+export type RagCleanupTarget = {
+    artifactId: string;
+    createdAt: string;
+    provider: RagArtifactPayload["providerState"]["provider"];
+    vectorStoreIds?: string[];
+    uploadedFileIds?: string[];
+    libraryId?: string;
+};
+
+export type RagCleanupPlan = {
+    cutoffAt: string;
+    targets: RagCleanupTarget[];
+};
+
+function parseRagArtifactPayload(payload: string): RagArtifactPayload | null {
+    try {
+        const parsed = JSON.parse(payload) as Partial<RagArtifactPayload>;
+        if (!parsed || parsed.artifactKind !== "rag" || !parsed.providerState) return null;
+        return parsed as RagArtifactPayload;
+    } catch {
+        return null;
+    }
+}
+
+export function buildStaleRagCleanupPlan(
+    artifacts: ArtifactLike[],
+    retentionDays = 14,
+    now = new Date(),
+): RagCleanupPlan {
+    const cutoffAt = new Date(now.getTime() - retentionDays * 24 * 60 * 60 * 1000).toISOString();
+    const targets: RagCleanupTarget[] = [];
+
+    for (const artifact of artifacts) {
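+        // Lexicographic comparison: chronological only while createdAt is an ISO-8601 UTC string like cutoffAt.
+        if (artifact.createdAt > cutoffAt) continue;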
+
+        const payload = parseRagArtifactPayload(artifact.payload);
+        if (!payload) continue;
+
+        switch (payload.providerState.provider) {
+            case "OPENAI":
+                if (payload.providerState.vectorStoreIds.length || payload.providerState.uploadedFileIds.length) {
+                    targets.push({
+                        artifactId: artifact.id,
+                        createdAt: artifact.createdAt,
+                        provider: payload.providerState.provider,
+                        vectorStoreIds: [...payload.providerState.vectorStoreIds],
+                        uploadedFileIds: [...payload.providerState.uploadedFileIds],
+                    });
+                }
+                break;
+            case "MISTRAL":
+                if (payload.providerState.libraryId) {
+                    targets.push({
+                        artifactId: artifact.id,
+                        createdAt: artifact.createdAt,
+                        provider: payload.providerState.provider,
+                        libraryId: payload.providerState.libraryId,
+                    });
+                }
+                break;
+            case "OLLAMA":
+                break;
+        }
+    }
+
+    return {cutoffAt, targets};
+}
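
Review note: a minimal, self-contained sketch of the retention cutoff that `buildStaleRagCleanupPlan` applies above. It assumes `createdAt` is always an ISO-8601 UTC string of the form produced by `Date.prototype.toISOString()`, which is what makes the plain string comparison chronological. `SketchArtifact` and `staleArtifactIds` are hypothetical names used for illustration only and are not part of this commit.

```typescript
// Hypothetical stand-in for ArtifactLike; only the fields the cutoff check needs.
type SketchArtifact = { id: string; createdAt: string };

// Returns ids of artifacts created at or before (now - retentionDays).
// Relies on fixed-width ISO-8601 UTC strings sorting chronologically.
function staleArtifactIds(
    artifacts: readonly SketchArtifact[],
    retentionDays: number,
    now: Date,
): string[] {
    const cutoffAt = new Date(now.getTime() - retentionDays * 24 * 60 * 60 * 1000).toISOString();
    return artifacts
        .filter(artifact => artifact.createdAt <= cutoffAt)
        .map(artifact => artifact.id);
}

const sketchNow = new Date("2026-05-18T00:00:00.000Z");
const staleIds = staleArtifactIds(
    [
        {id: "old", createdAt: "2026-05-01T12:00:00.000Z"},
        {id: "boundary", createdAt: "2026-05-04T00:00:00.000Z"},
        {id: "fresh", createdAt: "2026-05-10T08:30:00.000Z"},
    ],
    14,
    sketchNow,
);
console.log(staleIds);
```

An artifact created exactly at the cutoff is treated as stale, which matches the `>` check in the planner. If `createdAt` were ever stored in another format (for example with a space instead of `T`), the string comparison would no longer be reliable.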
@@ -0,0 +1,117 @@
+import {appLogger} from "../logging/logger.js";
+import {DatabaseManager} from "../db/database-manager.js";
+import {AiProvider} from "../model/ai-provider.js";
+import {createOpenAiClient, resolveAiRuntimeTarget} from "./ai-runtime-target.js";
+import {deleteMistralLibrary} from "./unified-ai-runner.shared.js";
+import {buildStaleRagCleanupPlan} from "./rag-retention-planner.js";
+
+const logger = appLogger.child("rag-retention");
+
+function unique(values: string[]): string[] {
+    return [...new Set(values.filter(Boolean))];
+}
+
+async function cleanupOpenAiRag(vectorStoreIds: string[], uploadedFileIds: string[]): Promise<void> {
+    const target = resolveAiRuntimeTarget(AiProvider.OPENAI, "documents");
+    const client = createOpenAiClient(target);
+
+    for (const vectorStoreId of unique(vectorStoreIds)) {
+        const startedAt = Date.now();
+        logger.info("openai.vector_store.cleanup.start", {vectorStoreId});
+        try {
+            await client.vectorStores.delete(vectorStoreId);
+            logger.success("openai.vector_store.cleanup.done", {vectorStoreId, duration: `${Date.now() - startedAt}ms`});
+        } catch (error) {
+            logger.warn("openai.vector_store.cleanup.failed", {
+                vectorStoreId,
+                duration: `${Date.now() - startedAt}ms`,
+                error: error instanceof Error ? error : String(error),
+            });
+        }
+    }
+
+    for (const fileId of unique(uploadedFileIds)) {
+        const startedAt = Date.now();
+        logger.info("openai.file.cleanup.start", {fileId});
+        try {
+            await client.files.delete(fileId);
+            logger.success("openai.file.cleanup.done", {fileId, duration: `${Date.now() - startedAt}ms`});
+        } catch (error) {
+            logger.warn("openai.file.cleanup.failed", {
+                fileId,
+                duration: `${Date.now() - startedAt}ms`,
+                error: error instanceof Error ? error : String(error),
+            });
+        }
+    }
+}
+
+async function cleanupMistralRag(libraryId: string): Promise<void> {
+    const target = resolveAiRuntimeTarget(AiProvider.MISTRAL, "documents");
+    const startedAt = Date.now();
+    logger.info("mistral.library.cleanup.start", {libraryId});
+    try {
+        await deleteMistralLibrary(libraryId, target);
+        logger.success("mistral.library.cleanup.done", {libraryId, duration: `${Date.now() - startedAt}ms`});
+    } catch (error) {
+        logger.warn("mistral.library.cleanup.failed", {
+            libraryId,
+            duration: `${Date.now() - startedAt}ms`,
+            error: error instanceof Error ? error : String(error),
+        });
+    }
+}
+
+export async function cleanupStaleRagProviderState(retentionDays = 14): Promise<{
+    scannedArtifacts: number;
+    cleanupTargets: number;
+    openaiTargets: number;
+    mistralTargets: number;
+}> {
+    const startedAt = Date.now();
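+    const artifacts = await DatabaseManager.getAllArtifacts().catch(error => {
+        // Keep the cleanup run alive, but leave a trace instead of silently scanning nothing.
+        logger.warn("cleanup.artifacts.load_failed", {error: error instanceof Error ? error : String(error)});
+        return [];
+    });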
+    const plan = buildStaleRagCleanupPlan(artifacts, retentionDays);
+
+    logger.info("cleanup.start", {
+        retentionDays,
+        scannedArtifacts: artifacts.length,
+        cleanupTargets: plan.targets.length,
+        cutoffAt: plan.cutoffAt,
+    });
+
+    let openaiTargets = 0;
+    let mistralTargets = 0;
+
+    for (const target of plan.targets) {
+        switch (target.provider) {
+            case "OPENAI":
+                openaiTargets += 1;
+                await cleanupOpenAiRag(target.vectorStoreIds ?? [], target.uploadedFileIds ?? []);
+                break;
+            case "MISTRAL":
+                mistralTargets += 1;
+                if (target.libraryId) {
+                    await cleanupMistralRag(target.libraryId);
+                }
+                break;
+            case "OLLAMA":
+                break;
+        }
+    }
+
+    logger.success("cleanup.done", {
+        retentionDays,
+        scannedArtifacts: artifacts.length,
+        cleanupTargets: plan.targets.length,
+        openaiTargets,
+        mistralTargets,
+        duration: `${Date.now() - startedAt}ms`,
+    });
+
+    return {
+        scannedArtifacts: artifacts.length,
+        cleanupTargets: plan.targets.length,
+        openaiTargets,
+        mistralTargets,
+    };
+}
@@ -0,0 +1,39 @@
+import type {AiDownloadedFile} from "./telegram-attachments.js";
+
+function downloadKey(download: AiDownloadedFile): string {
+    return [
+        download.kind,
+        download.fileId,
+        download.sha256 ?? "",
+        download.fileName,
+    ].join(":");
+}
+
+export function mergeReplyChainDownloads(
+    currentDownloads: readonly AiDownloadedFile[],
+    replyChainDownloads: readonly AiDownloadedFile[],
+): AiDownloadedFile[] {
+    const result: AiDownloadedFile[] = [];
+    const seen = new Set<string>();
+
+    for (const download of [...currentDownloads, ...replyChainDownloads]) {
+        const key = downloadKey(download);
+        if (seen.has(key)) continue;
+        seen.add(key);
+        result.push(download);
+    }
+
+    return result;
+}
+
+export function shouldPreferCurrentDownloads(text: string, currentDownloads: readonly AiDownloadedFile[]): boolean {
+    if (!currentDownloads.length) return false;
+
+    const normalized = text.trim().toLowerCase();
+    if (!normalized) return false;
+
+    return normalized.includes("this file")
+        || normalized.includes("this document")
+        || normalized.includes("этот файл") // Russian: "this file"
+        || normalized.includes("этот документ"); // Russian: "this document"
+}
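
Review note: a small sketch of how the merge above resolves duplicates, assuming only the four fields that `downloadKey` reads. `SketchDownload`, `sketchDownloadKey`, `mergeSketchDownloads` and the sample file ids are hypothetical; the real `AiDownloadedFile` type carries more fields.

```typescript
// Hypothetical subset of AiDownloadedFile used only for this illustration.
type SketchDownload = { kind: string; fileId: string; sha256?: string; fileName: string };

// Same key shape as downloadKey in the diff: kind, fileId, sha256 (or ""), fileName.
function sketchDownloadKey(download: SketchDownload): string {
    return [download.kind, download.fileId, download.sha256 ?? "", download.fileName].join(":");
}

// Current downloads are visited first, so they win over reply-chain duplicates.
function mergeSketchDownloads(
    current: readonly SketchDownload[],
    replyChain: readonly SketchDownload[],
): SketchDownload[] {
    const seen = new Set<string>();
    const merged: SketchDownload[] = [];
    for (const download of [...current, ...replyChain]) {
        const key = sketchDownloadKey(download);
        if (seen.has(key)) continue;
        seen.add(key);
        merged.push(download);
    }
    return merged;
}

const mergedDownloads = mergeSketchDownloads(
    [{kind: "document", fileId: "A", fileName: "report.pdf"}],
    [
        {kind: "document", fileId: "A", fileName: "report.pdf"},
        {kind: "document", fileId: "B", fileName: "notes.txt"},
    ],
);
console.log(mergedDownloads.map(download => download.fileId));
```

Because the key includes `fileName` and `sha256`, the same `fileId` with a different name or hash is kept as a separate entry.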
@@ -0,0 +1,19 @@
+import type {TelegramOutputAttachmentRecord, TelegramToolExecutionRecord} from "./telegram-stream-message.js";
+
+export type NormalizedModelOutput = {
+    text: string;
+    toolExecutions: TelegramToolExecutionRecord[];
+    outputAttachments: TelegramOutputAttachmentRecord[];
+};
+
+export function summarizeModelOutput(params: {
+    text: string;
+    toolExecutions: readonly TelegramToolExecutionRecord[];
+    outputAttachments: readonly TelegramOutputAttachmentRecord[];
+}): NormalizedModelOutput {
+    return {
+        text: params.text.trim(),
+        toolExecutions: [...params.toolExecutions],
+        outputAttachments: [...params.outputAttachments],
+    };
+}
@@ -5,6 +5,7 @@ import {Environment} from "../common/environment";
 import {MessageStore} from "../common/message-store";
 import {createQueuedFunction} from "../util/async-lock";
 import {enqueueTelegramApiCall} from "../util/telegram-api-queue";
+import {appLogger} from "../logging/logger";
 import fs from "node:fs";
 import path from "node:path";
 import {StoredAttachment, StoredAttachmentKind} from "../model/stored-attachment";
@@ -13,11 +14,13 @@ import {prepareTelegramMarkdownV2} from "../util/markdown-v2-renderer";
 import {AiProvider} from "../model/ai-provider";
 import {AI_IMAGE_OUTPUT_MODE_DOCUMENT, UserAiImageOutputMode} from "../common/user-ai-settings";
 import {PIPELINE_ATTACHMENT_LIMIT_BYTES} from "./user-request-pipeline";
+import {recordToolCall} from "../common/ai-observability.js";

 const TELEGRAM_LIMIT = 4096;
 const TELEGRAM_CAPTION_LIMIT = 1024;
 const TELEGRAM_PHOTO_LIMIT_BYTES = 10 * 1024 * 1024;
 const EDIT_INTERVAL_MS = 4500;
+const logger = appLogger.child("telegram-stream-message");

 export type TelegramArtifactFile = {
    kind: "image" | "file";
@@ -238,6 +241,13 @@ export class TelegramStreamMessage {

    recordToolExecution(record: TelegramToolExecutionRecord): void {
        this.toolExecutions.push(record);
+        recordToolCall();
+        logger.debug("tool.execution.recorded", {
+            requestId: this.cancelRequestId,
+            toolName: record.toolName,
+            callId: record.callId,
+            resultChars: record.resultChars,
+        });
    }

    getToolExecutions(): TelegramToolExecutionRecord[] {
@@ -246,6 +256,13 @@ export class TelegramStreamMessage {

    recordOutputAttachment(record: TelegramOutputAttachmentRecord): void {
        this.outputAttachments.push(record);
+        logger.debug("output_attachment.recorded", {
+            requestId: this.cancelRequestId,
+            artifactKind: record.artifactKind,
+            fileName: record.fileName,
+            sizeBytes: record.sizeBytes,
+            messageId: record.messageId,
+        });
    }

    getOutputAttachments(): TelegramOutputAttachmentRecord[] {
@@ -0,0 +1,28 @@
+import type {AiProviderAdapter} from "./provider-adapters.js";
+import {executeToolBatch, type ToolCallData, type ToolExecutionMemory} from "./unified-ai-runner.shared.js";
+import type {TelegramStreamMessage} from "./telegram-stream-message.js";
+import type {ToolRuntimeContext} from "./tools/runtime.js";
+
+export async function executeToolBatchWithAdapter(params: {
+    userId: number | undefined | null;
+    toolCalls: ToolCallData[];
+    streamMessage: TelegramStreamMessage;
+    toolContext: ToolRuntimeContext;
+    toolMemory: ToolExecutionMemory;
+    adapter: AiProviderAdapter;
+    appendTargets?: unknown[][];
+}): Promise<string[]> {
+    const results = await executeToolBatch(
+        params.userId,
+        params.toolCalls,
+        params.streamMessage,
+        params.toolContext,
+        params.toolMemory,
+    );
+
+    for (const target of params.appendTargets ?? []) {
+        params.adapter.appendToolResults(target, params.toolCalls, results);
+    }
+
+    return results;
+}
@@ -0,0 +1,39 @@
+import type {StoredAttachment} from "../model/stored-attachment";
+import type {TelegramOutputAttachmentRecord, TelegramToolExecutionRecord} from "./telegram-stream-message.js";
+import {persistInternalJsonArtifactAttachment} from "./internal-artifact-store";
+
+export async function persistToolLoopSummaryArtifactAttachment(params: {
+    chatId: number;
+    messageId: number;
+    text: string;
+    executions: readonly TelegramToolExecutionRecord[];
+    outputAttachments: readonly TelegramOutputAttachmentRecord[];
+}): Promise<StoredAttachment | undefined> {
+    if (!params.executions.length) return undefined;
+
+    return await persistInternalJsonArtifactAttachment({
+        artifactKind: "tool_result",
+        fileNamePrefix: "tool-loop-summary",
+        chatId: params.chatId,
+        messageId: params.messageId,
+        payload: {
+            stage: "tool_loop",
+            text: params.text.trim(),
+            executions: params.executions.map(execution => ({
+                toolName: execution.toolName,
+                callId: execution.callId,
+                argumentsText: execution.argumentsText,
+                resultChars: execution.resultChars,
+                startedAt: execution.startedAt,
+                finishedAt: execution.finishedAt,
+            })),
+            outputAttachments: params.outputAttachments,
+        },
+        metadata: {
+            stage: "tool_loop",
+            toolExecutions: params.executions.length,
+            outputAttachments: params.outputAttachments.length,
+            textChars: params.text.trim().length,
+        },
+    });
+}
@@ -0,0 +1,38 @@
+import type {ToolCallData} from "./unified-ai-runner.shared.js";
+
+export type ToolLoopStopReason = "no_tool_calls" | "max_rounds_reached";
+
+export type ToolLoopContinuation = {
+    continue: boolean;
+    reason?: ToolLoopStopReason;
+    remainingRounds: number;
+};
+
+export function decideToolLoopContinuation(params: {
+    round: number;
+    maxRounds: number;
+    toolCalls: readonly ToolCallData[];
+}): ToolLoopContinuation {
+    const remainingRounds = Math.max(params.maxRounds - params.round - 1, 0);
+
+    if (!params.toolCalls.length) {
+        return {
+            continue: false,
+            reason: "no_tool_calls",
+            remainingRounds,
+        };
+    }
+
+    if (remainingRounds === 0) {
+        return {
+            continue: false,
+            reason: "max_rounds_reached",
+            remainingRounds,
+        };
+    }
+
+    return {
+        continue: true,
+        remainingRounds,
+    };
+}
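
Review note: a compact trace of the continuation rule above, written as a standalone sketch. `sketchDecide` is a hypothetical condensed copy of `decideToolLoopContinuation` that takes a tool-call count instead of the `ToolCallData[]` array, so that it runs without the repository types.

```typescript
type SketchStopReason = "no_tool_calls" | "max_rounds_reached";
type SketchContinuation = { continue: boolean; reason?: SketchStopReason; remainingRounds: number };

// Mirrors the decision order in the diff: an empty tool-call list stops first,
// then exhaustion of the round budget, otherwise the loop continues.
function sketchDecide(round: number, maxRounds: number, toolCallCount: number): SketchContinuation {
    const remainingRounds = Math.max(maxRounds - round - 1, 0);
    if (toolCallCount === 0) return {continue: false, reason: "no_tool_calls", remainingRounds};
    if (remainingRounds === 0) return {continue: false, reason: "max_rounds_reached", remainingRounds};
    return {continue: true, remainingRounds};
}

// A model that requests one tool on every round, with a budget of three rounds.
const sketchTrace = [0, 1, 2].map(round => sketchDecide(round, 3, 1));
// A model that requests no tools on the first round.
const sketchNoTools = sketchDecide(0, 3, 0);
console.log(sketchTrace, sketchNoTools);
```

Rounds are zero-based, so with `maxRounds = 3` the last round that may still execute tools is index 2, and that is also the round on which `max_rounds_reached` is reported.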
@@ -0,0 +1,22 @@
+export type ToolLoopRoundOutcome = {
+    shouldContinue: boolean;
+    maxRoundsReached?: boolean;
+};
+
+export async function runToolLoopRounds(params: {
+    maxRounds: number;
+    onRound: (round: number) => Promise<ToolLoopRoundOutcome>;
+    onMaxRoundsReached?: (round: number) => Promise<void> | void;
+}): Promise<void> {
+    for (let round = 0; round < params.maxRounds; round++) {
+        const outcome = await params.onRound(round);
+        if (!outcome.shouldContinue) {
+            if (outcome.maxRoundsReached) {
+                await params.onMaxRoundsReached?.(round);
+            }
+            return;
+        }
+    }
+
+    await params.onMaxRoundsReached?.(params.maxRounds - 1);
+}
@@ -0,0 +1,56 @@
+import type {PipelineArtifact} from "./user-request-pipeline/types.js";
+import type {TelegramOutputAttachmentRecord, TelegramToolExecutionRecord} from "./telegram-stream-message.js";
+import {summarizeModelOutput} from "./response-model-output.js";
+
+export type ToolLoopSummary = {
+    status: "succeeded" | "skipped";
+    fallbackAction?: "continue_without_stage";
+    details: {
+        modelOutput: ReturnType<typeof summarizeModelOutput>;
+        count: number;
+        tools: Array<{
+            toolName: string;
+            callId: string;
+            resultChars: number;
+        }>;
+    };
+    artifacts?: PipelineArtifact[];
+};
+
+export function summarizeToolLoop(params: {
+    text: string;
+    executions: readonly TelegramToolExecutionRecord[];
+    outputAttachments: readonly TelegramOutputAttachmentRecord[];
+}): ToolLoopSummary {
+    const count = params.executions.length;
+    const tools = params.executions.map(execution => ({
+        toolName: execution.toolName,
+        callId: execution.callId,
+        resultChars: execution.resultChars,
+    }));
+
+    return {
+        status: count ? "succeeded" : "skipped",
+        fallbackAction: count ? undefined : "continue_without_stage",
+        details: {
+            modelOutput: summarizeModelOutput({
+                text: params.text,
+                toolExecutions: params.executions,
+                outputAttachments: params.outputAttachments,
+            }),
+            count,
+            tools,
+        },
+        artifacts: count ? [{
+            kind: "tool_result",
+            stage: "tool_loop",
+            createdAt: new Date().toISOString(),
+            toolName: "summary",
+            callId: "tool_loop_summary",
+            resultText: JSON.stringify({
+                count,
+                tools,
+            }),
+        }] : undefined,
+    };
+}
@@ -1,8 +1,8 @@
 import {AiTool} from "./tool-types";
-import {AiProvider} from "../model/ai-provider";
-import {getTools} from "./tools/registry";
-import {WEB_SEARCH_TOOL_NAME} from "./tools/web-search";
-import {PYTHON_INTERPRETER_TOOL_NAME} from "./tools/python-interpretator";
+import {AiProvider} from "../model/ai-provider.js";
+import {getTools} from "./tools/registry.js";
+import {WEB_SEARCH_TOOL_NAME} from "./tools/web-search.js";
+import {PYTHON_INTERPRETER_TOOL_NAME} from "./tools/python-interpretator.js";

 export type AiProviderName = "ollama" | "openai" | "mistral";

@@ -1,32 +0,0 @@
-import {AiProvider} from "../model/ai-provider";
-import type {TelegramStreamMessage} from "./telegram-stream-message";
-import type {PipelineAuditEvent} from "./user-request-pipeline";
-import {logError} from "../util/utils";
-
-export async function storeToolRankAudit(params: {
-    streamMessage: TelegramStreamMessage;
-    provider: AiProvider;
-    model: string;
-    round: number;
-    startedAt: number;
-    startedAtIso: string;
-    selectedTools?: string[];
-    error?: unknown;
-}): Promise<void> {
-    const event: PipelineAuditEvent = {
-        stage: "tool_rank",
-        status: params.error ? "failed" : "succeeded",
-        startedAt: params.startedAtIso,
-        finishedAt: new Date().toISOString(),
-        durationMs: Date.now() - params.startedAt,
-        provider: params.provider,
-        model: params.model,
-        details: {
-            round: params.round,
-            selectedTools: params.selectedTools ?? [],
-        },
-        error: params.error instanceof Error ? params.error.message : params.error ? String(params.error) : undefined,
-    };
-
-    await params.streamMessage.storePipelineAudit([event]).catch(logError);
-}
@@ -0,0 +1,146 @@
+import {AiProvider} from "../model/ai-provider.js";
+import type {BoundaryValue} from "../common/boundary-types.js";
+import type {TelegramStreamMessage} from "./telegram-stream-message.js";
+import type {RuntimeConfigSnapshot} from "./unified-ai-runner.shared.js";
+import {allToolSchemaNames, toolSchemaNames} from "./tool-schema-utils.js";
+import type {ToolRanker} from "./unified-ai-runner.tool-ranker.js";
+import type {PipelineAuditEvent} from "./user-request-pipeline/types.js";
+
+function latestUserText(messages: readonly { role?: string; content?: unknown }[]): string {
+    for (let i = messages.length - 1; i >= 0; i--) {
+        const message = messages[i];
+        if (message?.role !== "user") continue;
+        if (typeof message.content === "string") return message.content;
+        if (Array.isArray(message.content)) {
+            return message.content
+                .map(part => typeof part === "object" && part !== null && "text" in part && typeof (part as { text?: unknown }).text === "string"
+                    ? (part as { text: string }).text
+                    : "")
+                .filter(Boolean)
+                .join("\n");
+        }
+    }
+
+    return "";
+}
+
+export async function runToolRankStage(params: {
+    provider: AiProvider;
+    model: string;
+    round: number;
+    config: RuntimeConfigSnapshot;
+    availableTools: readonly BoundaryValue[];
+    messages: readonly { role?: string; content?: unknown }[];
+    streamMessage: TelegramStreamMessage;
+    signal: AbortSignal;
+    toolRanker?: ToolRanker;
+    storeAudit?: (params: {
+        streamMessage: TelegramStreamMessage;
+        provider: AiProvider;
+        model: string;
+        round: number;
+        startedAt: number;
+        startedAtIso: string;
+        availableTools: string[];
+        selectedTools?: string[];
+        usedRanker?: boolean;
+        error?: unknown;
+    }) => Promise<void>;
+}): Promise<{
+    filteredTools: BoundaryValue[];
+    selectedToolNames: string[];
+    usedRanker: boolean;
+}> {
+    const toolRanker = params.toolRanker ?? new (await import("./unified-ai-runner.tool-ranker.js")).ToolRanker(params.config);
+    const startedAt = Date.now();
+    const startedAtIso = new Date().toISOString();
+    const filterSelectedTools = (selectedToolNames: readonly string[]): BoundaryValue[] => {
+        const selected = new Set(selectedToolNames);
+        return params.availableTools.filter(tool => toolSchemaNames(tool).some(name => selected.has(name)));
+    };
+    const storeAudit = params.storeAudit ?? (async (auditParams: {
+        streamMessage: TelegramStreamMessage;
+        provider: AiProvider;
+        model: string;
+        round: number;
+        startedAt: number;
+        startedAtIso: string;
+        availableTools: string[];
+        selectedTools?: string[];
+        usedRanker?: boolean;
+        error?: unknown;
+    }) => {
+        const event: PipelineAuditEvent = {
+            stage: "tool_rank",
+            status: auditParams.error ? "failed" : "succeeded",
+            startedAt: auditParams.startedAtIso,
+            finishedAt: new Date().toISOString(),
+            durationMs: Date.now() - auditParams.startedAt,
+            provider: auditParams.provider,
+            model: auditParams.model,
+            details: {
+                round: auditParams.round,
+                availableTools: auditParams.availableTools,
+                selectedTools: auditParams.selectedTools ?? [],
+                usedRanker: auditParams.usedRanker ?? false,
+                toolRankDecision: {
+                    provider: auditParams.provider,
+                    round: auditParams.round,
+                    availableTools: auditParams.availableTools,
+                    selectedTools: auditParams.selectedTools ?? [],
+                    usedRanker: auditParams.usedRanker ?? false,
+                },
+            },
+            error: auditParams.error instanceof Error ? auditParams.error.message : auditParams.error ? String(auditParams.error) : undefined,
+        };
+
+        await auditParams.streamMessage.storePipelineAudit([event]);
+    });
+
+    params.streamMessage.setStatus("🧩 Выбираю подходящие инструменты..."); // Russian: "Selecting suitable tools..."
+    await params.streamMessage.flush();
+
+    try {
+        const selection = await toolRanker.selectTools({
+            provider: params.provider,
+            userQuery: latestUserText(params.messages),
+            availableTools: params.availableTools,
+            round: params.round,
+            signal: params.signal,
+        });
+
+        params.streamMessage.clearStatus();
+        await params.streamMessage.flush();
+        await storeAudit({
+            streamMessage: params.streamMessage,
+            provider: params.provider,
+            model: params.model,
+            round: params.round,
+            startedAt,
+            startedAtIso,
+            availableTools: allToolSchemaNames(params.availableTools),
+            selectedTools: selection.toolNames,
+            usedRanker: selection.usedRanker,
+        });
+
+        return {
+            filteredTools: filterSelectedTools(selection.toolNames),
+            selectedToolNames: selection.toolNames,
+            usedRanker: selection.usedRanker,
+        };
+    } catch (error) {
+        params.streamMessage.clearStatus();
+        await params.streamMessage.flush();
+        await storeAudit({
+            streamMessage: params.streamMessage,
+            provider: params.provider,
+            model: params.model,
+            round: params.round,
+            startedAt,
+            startedAtIso,
+            availableTools: allToolSchemaNames(params.availableTools),
+            error,
+        });
+        throw error;
+    }
+}
@@ -0,0 +1,56 @@
+import {ToolRankerFallbackPolicy} from "../common/policies.js";
+import {decidePipelineFallback, type PipelineFallbackDecision} from "./user-request-pipeline/fallback-executor.js";
+
+export type ToolRankerFallbackSelection = {
+    toolNames: string[];
+    usedRanker: boolean;
+};
+
+export type ToolRankerFallbackDecision = PipelineFallbackDecision & ToolRankerFallbackSelection;
+
+function fallbackActionForPolicy(policy: ToolRankerFallbackPolicy) {
+    return policy === ToolRankerFallbackPolicy.MAIN_MODEL
+        ? "use_alternate_target"
+        : "continue_without_stage";
+}
+
+export function decideToolRankerFallback(params: {
+    fallbackPolicy: ToolRankerFallbackPolicy;
+    availableToolNames: readonly string[];
+    reason: "unavailable" | "failed";
+}): ToolRankerFallbackDecision {
+    const action = fallbackActionForPolicy(params.fallbackPolicy);
+    const decision = decidePipelineFallback({
+        stage: "tool_rank",
+        reason: params.reason,
+        policies: [{
+            stage: "tool_rank",
+            onUnavailable: action,
+            onFailed: action,
+        }],
+    });
+
+    return {
+        ...decision,
+        toolNames: params.fallbackPolicy === ToolRankerFallbackPolicy.NO_TOOLS
+            ? []
+            : [...params.availableToolNames],
+        usedRanker: false,
+    };
+}
+
+export function resolveToolRankerFallbackSelection(params: {
+    fallbackPolicy: ToolRankerFallbackPolicy;
+    availableToolNames: readonly string[];
+}): ToolRankerFallbackSelection {
+    const decision = decideToolRankerFallback({
+        fallbackPolicy: params.fallbackPolicy,
+        availableToolNames: params.availableToolNames,
+        reason: "failed",
+    });
+
+    return {
+        toolNames: decision.toolNames,
+        usedRanker: decision.usedRanker,
+    };
+}
@@ -1,12 +1,12 @@
-import type {BoundaryValue} from "../common/boundary-types";
-import type {AiRuntimeTarget} from "./ai-runtime-target";
-import {AiProvider} from "../model/ai-provider";
-import {RuntimeConfigSnapshot, toolSchemaNames} from "./unified-ai-runner.shared";
+import type {BoundaryValue} from "../common/boundary-types.js";
+import type {AiRuntimeTarget} from "./ai-runtime-target.js";
+import {AiProvider} from "../model/ai-provider.js";
+import {RuntimeConfigSnapshot, toolSchemaNames} from "./unified-ai-runner.shared.js";
 import {
    buildToolRankerSystemPrompt,
    getToolRankerAvailableToolInfos,
    type ToolRankerToolInfo,
-} from "./tool-ranker-metadata";
+} from "./tool-ranker-metadata.js";

 export type ToolRankerMessage = {
    role?: string;
@@ -0,0 +1,33 @@
+import type {BoundaryValue} from "../common/boundary-types.js";
+
+function isRecord(value: BoundaryValue): value is Record<string, BoundaryValue> {
+    return value !== null && typeof value === "object" && !Array.isArray(value);
+}
+
+function asOptionalString(value: BoundaryValue): string | undefined {
+    return typeof value === "string" && value.trim().length > 0 ? value.trim() : undefined;
+}
+
+export function toolSchemaName(tool: BoundaryValue): string | undefined {
+    if (!isRecord(tool)) return undefined;
+    const fn = isRecord(tool.function) ? tool.function : undefined;
+    const directName = fn?.name ?? tool.name ?? (typeof tool.type === "string" && tool.type !== "function" ? tool.type : undefined);
+    return asOptionalString(directName);
+}
+
+export function toolSchemaNames(tool: BoundaryValue): string[] {
+    if (!isRecord(tool)) return [];
+
+    if (Array.isArray(tool.functionDeclarations)) {
+        return tool.functionDeclarations
+            .map(declaration => isRecord(declaration) ? asOptionalString(declaration.name) : undefined)
+            .filter((name): name is string => !!name);
+    }
+
+    const name = toolSchemaName(tool);
+    return name ? [name] : [];
+}
+
+export function allToolSchemaNames(tools: readonly BoundaryValue[]): string[] {
+    return [...new Set(tools.flatMap(toolSchemaNames))];
+}
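
Review note: a sketch of the tool schema shapes that the name helpers above appear to accept, assuming `BoundaryValue` behaves like `unknown` at this boundary. `isSketchRecord`, `sketchToolNames` and the sample tool names are hypothetical and exist only to make the example runnable on its own.

```typescript
function isSketchRecord(value: unknown): value is Record<string, unknown> {
    return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Condensed copy of the lookup order in the diff:
// functionDeclarations[] first, then function.name, then name, then a non-"function" type.
function sketchToolNames(tool: unknown): string[] {
    if (!isSketchRecord(tool)) return [];

    const declarations = tool.functionDeclarations;
    if (Array.isArray(declarations)) {
        return declarations
            .map(declaration => {
                const declaredName = isSketchRecord(declaration) ? declaration.name : undefined;
                return typeof declaredName === "string" ? declaredName.trim() : "";
            })
            .filter(declaredName => declaredName.length > 0);
    }

    const fn = isSketchRecord(tool.function) ? tool.function : undefined;
    const fallbackType = typeof tool.type === "string" && tool.type !== "function" ? tool.type : undefined;
    const candidate = fn?.name ?? tool.name ?? fallbackType;
    return typeof candidate === "string" && candidate.trim().length > 0 ? [candidate.trim()] : [];
}

const sketchNames = [
    {type: "function", function: {name: "web_search"}},          // nested function schema
    {type: "function", name: "python_interpreter"},              // flat schema with a direct name
    {type: "code_interpreter"},                                   // tool identified only by its type
    {functionDeclarations: [{name: "lookup"}, {name: "fetch"}]},  // one entry declaring several functions
    "not a tool",                                                 // non-object input yields no names
].flatMap(tool => sketchToolNames(tool));
console.log(sketchNames);
```

A tool entry with `functionDeclarations` can contribute several names, which is why `runToolRankStage` filters with `toolSchemaNames(tool).some(...)` rather than comparing a single name.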
@@ -2,11 +2,11 @@ import {spawn} from "node:child_process";
 import {copyFile, lstat, mkdir, readdir, rm, writeFile} from "node:fs/promises";
 import os from "node:os";
 import path from "node:path";
-import {AiTool} from "../tool-types";
-import {Environment} from "../../common/environment";
-import {toolsLogger} from "./tool-logger";
+import {AiTool} from "../tool-types.js";
+import {Environment} from "../../common/environment.js";
+import {toolsLogger} from "./tool-logger.js";
 import {randomUUID} from "node:crypto";
-import {AiJsonObject} from "../tool-types";
+import {AiJsonObject} from "../tool-types.js";

 const logger = toolsLogger.child("python-interpreter");

@@ -1,3 +1,3 @@
-import {appLogger} from "../../logging/logger";
+import {appLogger} from "../../logging/logger.js";

 export const toolsLogger = appLogger.child("ai-tools");
@@ -2,7 +2,10 @@ import {AiProvider} from "../model/ai-provider";
 import {AI_VOICE_MODE_TRANSCRIPT, DEFAULT_AI_RESPONSE_LANGUAGE} from "../common/user-ai-settings";
 import {Environment} from "../common/environment";
 import {UserRequestPipeline, type UserRequestPipelineState, type UserRequestPipelineStage} from "./user-request-pipeline";
-import type {AiDownloadedFile} from "./telegram-attachments";
+import {PipelineFallbackNotifier} from "./user-request-pipeline/fallback-notifier";
+import {buildToolRankFallbackTargetDetails} from "./user-request-pipeline/fallback-target-details";
+import {mergeReplyChainDownloads, shouldPreferCurrentDownloads} from "./reply-chain-downloads";
+import {attachmentsToDownloadedFiles, type AiDownloadedFile} from "./telegram-attachments";
 import type {TelegramStreamMessage} from "./telegram-stream-message";
 import type {ChatMessage} from "./chat-messages-types";
 import type {OpenAIChatMessage} from "./openai-chat-message";
@@ -12,6 +15,7 @@ import {prepareDocumentRag} from "./document-rag-pipeline";
 import {persistRagArtifactAttachment} from "./rag-artifact-store";
 import {persistTranscriptArtifactAttachment} from "./transcript-artifact-store";
 import type {ToolRuntimeContext} from "./tools/runtime";
+import {recordPipelineFallback, recordRagRun} from "../common/ai-observability.js";
 import {
    appendTranscriptToChatMessages,
    collectTextMessages,
@@ -21,6 +25,7 @@ import {
    stripAudioFromRunnerMessages,
    toolRuntimeContextFromDownloads,
    transcribeAudioIfNeeded,
+    collectStoredReplyChainAttachments,
    UnifiedRunOptions,
 } from "./unified-ai-runner.shared";
 import {aiLog} from "../logging/ai-logger";
@@ -60,7 +65,7 @@ function runtimeTargetFor(options: UnifiedRunOptions, config: RuntimeConfigSnaps

 function createAiRequestPipelineState(options: UnifiedRunOptions): UserRequestPipelineState {
    return {
-        requestId: `ai:${options.msg.chat.id}:${options.msg.message_id}:${Date.now()}`,
+        requestId: options.requestId ?? `ai:${options.msg.chat.id}:${options.msg.message_id}:${Date.now()}`,
        chatId: options.msg.chat.id,
        messageId: options.msg.message_id,
        replyToMessageId: options.msg.reply_to_message?.message_id,
@@ -90,6 +95,12 @@ export async function prepareUnifiedAiRequestPipeline(params: {
    controller: AbortController;
 }): Promise<PreparedUnifiedAiRequest> {
    const {options, config, downloads, streamMessage, controller} = params;
+    const replyChainDownloads = shouldPreferCurrentDownloads(options.text, downloads)
+        ? downloads
+        : mergeReplyChainDownloads(
+            downloads,
+            attachmentsToDownloadedFiles(await collectStoredReplyChainAttachments(options.msg)),
+        );
    const prepared: MutablePreparedContext = {
        chatMessages: [],
        imageCount: 0,
@@ -109,7 +120,7 @@ export async function prepareUnifiedAiRequestPipeline(params: {
                    details: {
                        phase: "ai_request_prepare",
                        provider: options.provider,
-                        downloads: downloads.map(download => ({
+                        downloads: replyChainDownloads.map(download => ({
                            kind: download.kind,
                            fileName: download.fileName,
                            mimeType: download.mimeType,
@@ -126,15 +137,15 @@ export async function prepareUnifiedAiRequestPipeline(params: {
                    options.msg,
                    options.text,
                    options.provider,
-                    downloads,
+                    replyChainDownloads,
                    config,
                    runtimeTargetFor(options, config),
                    options.responseLanguage ?? DEFAULT_AI_RESPONSE_LANGUAGE,
                );
                prepared.chatMessages = collected.chatMessages as typeof prepared.chatMessages;
                prepared.imageCount = collected.imageCount;
-                prepared.firstRoundStatus = initialStatus(downloads, prepared.imageCount);
-                prepared.toolContext = toolRuntimeContextFromDownloads(downloads);
+                prepared.firstRoundStatus = initialStatus(replyChainDownloads, prepared.imageCount);
+                prepared.toolContext = toolRuntimeContextFromDownloads(replyChainDownloads);

                return {
                    stage: "collect_conversation_context",
@@ -169,11 +180,11 @@ export async function prepareUnifiedAiRequestPipeline(params: {
                prepared.transcript = await transcribeAudioIfNeeded(
                    options.provider,
                    options.msg.from?.id,
-                    downloads,
+                    replyChainDownloads,
                    streamMessage,
                    controller.signal,
                ).catch(error => {
-                    if (downloads.some(isTranscribableAudioDownload)) throw error;
+                    if (replyChainDownloads.some(isTranscribableAudioDownload)) throw error;
                    return "";
                });

@@ -188,7 +199,7 @@ export async function prepareUnifiedAiRequestPipeline(params: {
                const transcriptArtifact = await persistTranscriptArtifactAttachment({
                    provider: options.provider,
                    transcript,
-                    downloads,
+                    downloads: replyChainDownloads,
                    chatId: options.msg.chat.id,
                    messageId: options.msg.message_id,
                });
@@ -233,7 +244,7 @@ export async function prepareUnifiedAiRequestPipeline(params: {

                prepared.preparedDocumentRag = await prepareDocumentRag(
                    options.provider,
-                    downloads,
+                    replyChainDownloads,
                    prepared.chatMessages,
                    streamMessage,
                    config,
@@ -244,7 +255,7 @@ export async function prepareUnifiedAiRequestPipeline(params: {
                const ragArtifact = await persistRagArtifactAttachment({
                    provider: options.provider,
                    prepared: prepared.preparedDocumentRag,
-                    downloads,
+                    downloads: replyChainDownloads,
                    chatId: options.msg.chat.id,
                    messageId: options.msg.message_id,
                    details: prepared.preparedDocumentRag?.provider === AiProvider.OPENAI
@@ -264,6 +275,10 @@ export async function prepareUnifiedAiRequestPipeline(params: {
                    await streamMessage.storeInternalAttachment(ragArtifact);
                }

+                if (prepared.preparedDocumentRag) {
+                    recordRagRun();
+                }
+
                return {
                    stage: "document_rag",
                    status: prepared.preparedDocumentRag ? "succeeded" : "skipped",
@@ -290,6 +305,7 @@ export async function prepareUnifiedAiRequestPipeline(params: {
    ];

    const state = createAiRequestPipelineState(options);
+    const fallbackNotifier = new PipelineFallbackNotifier(options.msg, options.responseLanguage);
    const pipeline = new UserRequestPipeline({
        stages,
        stageNames: [
@@ -301,6 +317,44 @@ export async function prepareUnifiedAiRequestPipeline(params: {
            "document_rag",
            "audit_finish",
        ],
+        onFallback: async decision => {
+            recordPipelineFallback(decision.action);
+            if (decision.action === "use_alternate_target") {
+                aiLog("warn", "request.fallback.use_alternate_target", {
+                    provider: options.provider,
+                    stage: decision.stage,
+                    reason: decision.reason,
+                    requestId: state.requestId,
+                    ...buildToolRankFallbackTargetDetails(options.provider, config),
+                });
+            }
+
+            if (decision.action === "fail_request") {
+                aiLog("error", "request.fallback.fail_request", {
+                    provider: options.provider,
+                    stage: decision.stage,
+                    reason: decision.reason,
+                    requestId: state.requestId,
+                });
+            }
+
+            const notification = await fallbackNotifier.notify(state.requestId, decision);
+            state.audit.push({
+                stage: decision.stage,
+                status: "fallback",
+                startedAt: nowIso(),
+                finishedAt: nowIso(),
+                details: {
+                    fallbackAction: decision.action,
+                    fallbackNotification: notification.text,
+                    fallbackNotified: notification.notified,
+                    reason: decision.reason,
+                    ...(decision.action === "use_alternate_target"
+                        ? buildToolRankFallbackTargetDetails(options.provider, config)
+                        : {}),
+                },
+            });
+        },
    });
    await pipeline.run(state, controller.signal);
    await streamMessage.storePipelineAudit(state.audit);
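
Review note: the `onFallback` handler above builds a `"fallback"` audit entry and only attaches alternate-target details when the action is `use_alternate_target`. A minimal sketch of that shaping step, pulled out as a pure function, might look like the following. Every type and the `buildFallbackAuditEntry` helper are hypothetical stand-ins for illustration; only the field names and the three action strings come from this diff.

```typescript
// Hypothetical types; the project's real definitions are not shown in this diff.
type FallbackAction = "continue_without_stage" | "use_alternate_target" | "fail_request";

interface FallbackDecision {
    stage: string;
    action: FallbackAction;
    reason: string;
}

interface FallbackNotification {
    text: string;
    notified: boolean;
}

interface FallbackAuditEntry {
    stage: string;
    status: "fallback";
    startedAt: string;
    finishedAt: string;
    details: Record<string, unknown>;
}

// Builds the audit entry; target details are merged in only for use_alternate_target.
function buildFallbackAuditEntry(
    decision: FallbackDecision,
    notification: FallbackNotification,
    targetDetails: Record<string, unknown>,
    now: () => string = () => new Date().toISOString(),
): FallbackAuditEntry {
    const timestamp = now(); // one timestamp, so startedAt and finishedAt always agree
    return {
        stage: decision.stage,
        status: "fallback",
        startedAt: timestamp,
        finishedAt: timestamp,
        details: {
            fallbackAction: decision.action,
            fallbackNotification: notification.text,
            fallbackNotified: notification.notified,
            reason: decision.reason,
            ...(decision.action === "use_alternate_target" ? targetDetails : {}),
        },
    };
}

// Usage with made-up values.
const entry = buildFallbackAuditEntry(
    {stage: "tool_rank", action: "use_alternate_target", reason: "ranker timeout"},
    {text: "Switched to a backup model", notified: true},
    {alternateModel: "example-model"},
    () => "2026-05-18T00:00:00.000Z",
);
console.log(entry.details);
```

Extracting the entry construction this way would also let the request and response pipelines share one implementation instead of repeating the same object literal in both handlers.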
@@ -2,6 +2,7 @@ import {AiProvider} from "../model/ai-provider";
 import {Environment} from "../common/environment";
 import {ifTrue, logError} from "../util/utils";
 import {UserRequestPipeline, type UserRequestPipelineState, type UserRequestPipelineStage} from "./user-request-pipeline";
+import {getProviderAdapter} from "./provider-adapters";
 import type {AiDownloadedFile} from "./telegram-attachments";
 import type {TelegramStreamMessage} from "./telegram-stream-message";
 import type {PreparedUnifiedAiRequest} from "./unified-ai-request-pipeline";
@@ -9,15 +10,22 @@ import type {OpenAIChatMessage} from "./openai-chat-message";
 import type {MistralChatMessage} from "./mistral-chat-message";
 import type {ChatMessage} from "./chat-messages-types";
 import {
+    allToolSchemaNames,
    providerName,
    RuntimeConfigSnapshot,
    snapshotModel,
    TELEGRAM_LIMIT,
    UnifiedRunOptions,
 } from "./unified-ai-runner.shared";
+import {runToolRankStage} from "./tool-rank-stage";
 import {runOpenAi} from "./unified-ai-runner.openai";
 import {runOllama} from "./unified-ai-runner.ollama";
 import {runMistral} from "./unified-ai-runner.mistral";
+import {summarizeModelOutput} from "./response-model-output";
+import {summarizeToolLoop} from "./tool-loop-summary";
+import {persistToolLoopSummaryArtifactAttachment} from "./tool-loop-artifact-store";
+import {PipelineFallbackNotifier} from "./user-request-pipeline/fallback-notifier";
+import {buildToolRankFallbackTargetDetails} from "./user-request-pipeline/fallback-target-details";
 import {
    resolveTextToSpeechProviderForUser,
    sendSynthesizedSpeech,
@@ -26,6 +34,7 @@ import {
 } from "./text-to-speech";
 import {persistFinalTextArtifactAttachment} from "./final-response-artifact-store";
 import {aiLog} from "../logging/ai-logger";
+import {recordPipelineFallback, recordTtsRun} from "../common/ai-observability.js";

 function nowIso(): string {
    return new Date().toISOString();
@@ -33,7 +42,7 @@ function nowIso(): string {

 function createResponsePipelineState(options: UnifiedRunOptions): UserRequestPipelineState {
    return {
-        requestId: `ai-response:${options.msg.chat.id}:${options.msg.message_id}:${Date.now()}`,
+        requestId: options.requestId ?? `ai-response:${options.msg.chat.id}:${options.msg.message_id}:${Date.now()}`,
        chatId: options.msg.chat.id,
        messageId: options.msg.message_id,
        replyToMessageId: options.msg.reply_to_message?.message_id,
@@ -159,6 +168,10 @@ export async function runUnifiedAiResponsePipeline(params: {
 }): Promise<void> {
    const {options, config, downloads, prepared, streamMessage, controller} = params;
    const state = createResponsePipelineState(options);
+    const fallbackNotifier = new PipelineFallbackNotifier(options.msg, options.responseLanguage);
+    const adapter = getProviderAdapter(options.provider);
+    let selectedToolNames: string[] = [];
+    let filteredTools: unknown[] = [];

    const stages: UserRequestPipelineStage[] = [
        {
@@ -177,6 +190,62 @@ export async function runUnifiedAiResponsePipeline(params: {
                };
            },
        },
+        {
+            name: "tool_rank",
+            async run() {
+                const availableTools = adapter.rankTools(config, {
+                    forCreator: options.msg.from?.id === Environment.CREATOR_ID,
+                    vectorStoreIds: prepared.preparedDocumentRag?.provider === AiProvider.OPENAI
+                        ? prepared.preparedDocumentRag.vectorStoreIds
+                        : [],
+                });
+
+                const rankResult = await runToolRankStage({
+                    provider: options.provider,
+                    model: snapshotModel(options.provider, config),
+                    round: state.toolRankDecisions.length,
+                    config,
+                    availableTools,
+                    messages: prepared.chatMessages,
+                    streamMessage,
+                    signal: controller.signal,
+                });
+
+                selectedToolNames = rankResult.selectedToolNames;
+                filteredTools = rankResult.filteredTools;
+                state.toolRankDecisions.push({
+                    provider: options.provider,
+                    round: state.toolRankDecisions.length,
+                    availableTools: allToolSchemaNames(availableTools),
+                    selectedTools: selectedToolNames,
+                    usedRanker: rankResult.usedRanker,
+                });
+
+                return {
+                    stage: "tool_rank",
+                    status: "succeeded",
+                    details: {
+                        selectedTools: selectedToolNames,
+                        usedRanker: rankResult.usedRanker,
+                        availableTools: allToolSchemaNames(availableTools),
+                        toolRankDecision: state.toolRankDecisions.at(-1),
+                    },
+                };
+            },
+        },
+        {
+            name: "filter_tools",
+            async run() {
+                return {
+                    stage: "filter_tools",
+                    status: "succeeded",
+                    details: {
+                        selectedTools: selectedToolNames,
+                        filteredToolCount: filteredTools.length,
+                    },
+                };
+            },
+        },
        {
            name: "model_call",
            async run() {
@@ -192,6 +261,13 @@ export async function runUnifiedAiResponsePipeline(params: {
                return {
                    stage: "model_call",
                    status: "succeeded",
+                    details: {
+                        modelOutput: summarizeModelOutput({
+                            text: streamMessage.getText(),
+                            toolExecutions: streamMessage.getToolExecutions(),
+                            outputAttachments: streamMessage.getOutputAttachments(),
+                        }),
+                    },
                };
            },
        },
@@ -199,33 +275,31 @@ export async function runUnifiedAiResponsePipeline(params: {
            name: "tool_loop",
            async run() {
                const executions = streamMessage.getToolExecutions();
+                const outputAttachments = streamMessage.getOutputAttachments();
+                const summary = summarizeToolLoop({
+                    text: streamMessage.getText(),
+                    executions,
+                    outputAttachments,
+                });
+                const persisted = await persistToolLoopSummaryArtifactAttachment({
+                    chatId: options.msg.chat.id,
+                    messageId: options.msg.message_id,
+                    text: streamMessage.getText(),
+                    executions,
+                    outputAttachments,
+                });
+
+                if (persisted) {
+                    await streamMessage.storeInternalAttachment(persisted);
+                }
+
                return {
                    stage: "tool_loop",
-                    status: executions.length ? "succeeded" : "skipped",
-                    fallbackAction: executions.length ? undefined : "continue_without_stage",
+                    ...summary,
                    details: {
-                        count: executions.length,
-                        tools: executions.map(execution => ({
-                            toolName: execution.toolName,
-                            callId: execution.callId,
-                            resultChars: execution.resultChars,
-                        })),
+                        ...summary.details,
+                        persistedSummaryArtifact: !!persisted,
                    },
-                    artifacts: executions.length ? [{
-                        kind: "tool_result",
-                        stage: "tool_loop",
-                        createdAt: new Date().toISOString(),
-                        toolName: "summary",
-                        callId: "tool_loop_summary",
-                        resultText: JSON.stringify({
-                            count: executions.length,
-                            tools: executions.map(execution => ({
-                                toolName: execution.toolName,
-                                callId: execution.callId,
-                                resultChars: execution.resultChars,
-                            })),
-                        }),
-                    }] : undefined,
                };
            },
        },
@@ -284,6 +358,7 @@ export async function runUnifiedAiResponsePipeline(params: {
            name: "text_to_speech",
            async run() {
                const status = await synthesizeResponseIfRequested({options, config, streamMessage});
+                recordTtsRun(status);
                return {
                    stage: "text_to_speech",
                    status,
@@ -312,6 +387,8 @@ export async function runUnifiedAiResponsePipeline(params: {
        stages,
        stageNames: [
            "audit_start",
+            "tool_rank",
+            "filter_tools",
            "model_call",
            "tool_loop",
            "output_size_gate",
@@ -320,6 +397,44 @@ export async function runUnifiedAiResponsePipeline(params: {
            "persist_output_artifacts",
            "audit_finish",
        ],
+        onFallback: async decision => {
+            recordPipelineFallback(decision.action);
+            if (decision.action === "use_alternate_target") {
+                aiLog("warn", "response.fallback.use_alternate_target", {
+                    provider: options.provider,
+                    stage: decision.stage,
+                    reason: decision.reason,
+                    requestId: state.requestId,
+                    ...buildToolRankFallbackTargetDetails(options.provider, config),
+                });
+            }
+
+            if (decision.action === "fail_request") {
+                aiLog("error", "response.fallback.fail_request", {
+                    provider: options.provider,
+                    stage: decision.stage,
+                    reason: decision.reason,
+                    requestId: state.requestId,
+                });
+            }
+
+            const notification = await fallbackNotifier.notify(state.requestId, decision);
+            state.audit.push({
+                stage: decision.stage,
+                status: "fallback",
+                startedAt: nowIso(),
+                finishedAt: nowIso(),
+                details: {
+                    fallbackAction: decision.action,
+                    fallbackNotification: notification.text,
+                    fallbackNotified: notification.notified,
+                    reason: decision.reason,
+                    ...(decision.action === "use_alternate_target"
+                        ? buildToolRankFallbackTargetDetails(options.provider, config)
+                        : {}),
+                },
+            });
+        },
    });

    try {
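
Review note: the Mistral runner in the next file calls `decideToolLoopContinuation({round, maxRounds, toolCalls})` and inspects `continue` and `reason` on the result, but the helper itself is not part of this excerpt. The sketch below shows one way such a decision function could behave. The `"no_tool_calls"` reason, the exact round arithmetic, and the `simulateRounds` helper are assumptions made for illustration; only `continue`, `reason`, and `"max_rounds_reached"` appear in the diff.

```typescript
// Assumed result shape: a discriminated union keyed on `continue`.
type ToolLoopContinuation =
    | {continue: true}
    | {continue: false; reason: "no_tool_calls" | "max_rounds_reached"};

// Decides whether another model round should run after the current one.
function decideToolLoopContinuation(params: {
    round: number;
    maxRounds: number;
    toolCalls: readonly unknown[];
}): ToolLoopContinuation {
    // No tool calls means the model produced its final answer.
    if (params.toolCalls.length === 0) return {continue: false, reason: "no_tool_calls"};
    // Rounds are zero-based, so the last permitted round is maxRounds - 1.
    if (params.round + 1 >= params.maxRounds) return {continue: false, reason: "max_rounds_reached"};
    return {continue: true};
}

// Hypothetical driver: callsPerRound[i] is the number of tool calls the model makes in round i.
function simulateRounds(callsPerRound: number[], maxRounds: number): string {
    for (let round = 0; round < maxRounds; round++) {
        const toolCalls = new Array(callsPerRound[round] ?? 0).fill("call");
        const decision = decideToolLoopContinuation({round, maxRounds, toolCalls});
        if (!decision.continue) return `${decision.reason} after round ${round}`;
    }
    return "loop exhausted";
}

console.log(simulateRounds([1, 1, 0], 5)); // no_tool_calls after round 2
console.log(simulateRounds([1, 1, 1], 2)); // max_rounds_reached after round 1
```

Returning a reason alongside the boolean lets the caller log why the loop stopped, which is how the runner uses it for its `max_rounds_reached` warning.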
@@ -1,30 +1,27 @@
 import {Environment} from "../common/environment";
-import {getMistralTools} from "./tool-mappers";
 import {TelegramStreamMessage} from "./telegram-stream-message";
 import {ToolRuntimeContext} from "./tools/runtime";
 import {MistralChatMessage} from "./mistral-chat-message";
 import {createMistralClient} from "./ai-runtime-target";
 import {aiLog, aiLogDuration, aiLogProviderTarget, aiLogToolCall} from "../logging/ai-logger";
 import {AiProvider} from "../model/ai-provider";
-import {ToolRanker} from "./unified-ai-runner.tool-ranker";
+import {getProviderAdapter} from "./provider-adapters";
+import {runToolRankStage} from "./tool-rank-stage";

 import {
-    contentFromMistralDelta,
-    executeToolBatch,
    MAX_TOOL_ROUNDS,
-    MistralDeltaLike,
    MistralDocumentReference,
-    mistralToolCalls,
-    normalizeMistralToolCalls,
    roundStatus,
    RuntimeConfigSnapshot,
    StreamingToolCallAccumulator,
    ToolCallData,
    ToolExecutionMemory
 } from "./unified-ai-runner.shared";
+import {executeToolBatchWithAdapter} from "./tool-batch-runner";
+import {decideToolLoopContinuation} from "./tool-loop-control";
+import {runToolLoopRounds} from "./tool-loop-runner";
+import {runSingleModelRequest} from "./model-call-stage";
 import {Message} from "typescript-telegram-bot-api";
-import {filterRankedTools, latestUserTextFromMessages} from "./tool-ranker-pipeline";
-import {storeToolRankAudit} from "./tool-rank-audit";

 export async function runMistral(
    msg: Message,
@@ -39,8 +36,9 @@ export async function runMistral(
 ): Promise<void> {
    const runnerStartedAt = Date.now();
    const mistralAi = createMistralClient(config.mistralChatTarget);
-    const toolRanker = new ToolRanker(config);
-    const availableTools = getMistralTools(msg.from?.id === Environment.CREATOR_ID);
+    const adapter = getProviderAdapter(AiProvider.MISTRAL);
+    const availableTools = adapter.rankTools(config, {forCreator: msg.from?.id === Environment.CREATOR_ID});
+    const requestMessages = adapter.mapMessages([...messages]) as unknown as MistralChatMessage[];
    aiLog("info", "mistral.run.start", {
        stream,
        target: aiLogProviderTarget(config.mistralChatTarget),
@@ -50,142 +48,161 @@ export async function runMistral(
    });

    const toolMemory: ToolExecutionMemory = new Map();
+    try {
+        await runToolLoopRounds({
+            maxRounds: MAX_TOOL_ROUNDS,
+            onRound: async (round) => {
+            const roundStartedAt = Date.now();
+            aiLog("debug", "mistral.round.start", {round, messages: messages.length, stream});
+            if (signal.aborted) throw new Error("Aborted");

-    for (let round = 0; round < MAX_TOOL_ROUNDS; round++) {
-        const roundStartedAt = Date.now();
-        aiLog("debug", "mistral.round.start", {round, messages: messages.length, stream});
-        if (signal.aborted) throw new Error("Aborted");
-
-        streamMessage.setStatus(Environment.getSelectingToolsText());
-        await streamMessage.flush();
-        const toolRankStartedAt = Date.now();
-        const toolRankStartedAtIso = new Date().toISOString();
-        const rankerSelection = await toolRanker.selectTools({
+            const rankResult = await runToolRankStage({
                provider: AiProvider.MISTRAL,
-                userQuery: latestUserTextFromMessages(messages),
-                availableTools,
+                model: config.mistralChatTarget.model,
                round,
+                config,
+                availableTools,
+                messages,
+                streamMessage,
                signal,
-            })
-            .catch(async error => {
-                streamMessage.clearStatus();
-                await streamMessage.flush();
-                await storeToolRankAudit({
-                    streamMessage,
-                    provider: AiProvider.MISTRAL,
-                    model: config.mistralChatTarget.model,
-                    round,
-                    startedAt: toolRankStartedAt,
-                    startedAtIso: toolRankStartedAtIso,
-                    error,
-                });
-                throw error;
            });
-        streamMessage.clearStatus();
-        await streamMessage.flush();
-        await storeToolRankAudit({
-            streamMessage,
-            provider: AiProvider.MISTRAL,
-            model: config.mistralChatTarget.model,
-            round,
-            startedAt: toolRankStartedAt,
-            startedAtIso: toolRankStartedAtIso,
-            selectedTools: rankerSelection.toolNames,
-        });
-        const filteredTools = filterRankedTools(availableTools, rankerSelection.toolNames);
-        const requestTools = filteredTools.length ? filteredTools : undefined;
+            const filteredTools = rankResult.filteredTools;
+            const requestTools = filteredTools.length ? filteredTools : undefined;

-        streamMessage.setStatus(roundStatus(round, firstRoundStatus) ?? "");
-        await streamMessage.flush();
+            streamMessage.setStatus(roundStatus(round, firstRoundStatus) ?? "");
+            await streamMessage.flush();
+
+            if (!stream) {
+                const request = {
+                    model: config.mistralChatTarget.model,
+                    messages: requestMessages,
+                    tools: requestTools,
+                    documents: documents
+                } as Parameters<typeof mistralAi.chat.complete>[0];
+                const response = await runSingleModelRequest({
+                    execute: () => adapter.callModel(request, () => mistralAi.chat.complete(request, {signal})),
+                });
+                const message = response.choices?.[0]?.message;
+                const text = typeof message?.content === "string" ? message.content : message?.content != null ? JSON.stringify(message.content) : "";
+                streamMessage.append(text);
+                const calls = adapter.extractToolCalls(message);
+                aiLog(calls.length ? "info" : "success", calls.length ? "mistral.tool_calls" : "mistral.run.done", {
+                    round,
+                    duration: calls.length ? aiLogDuration(roundStartedAt) : aiLogDuration(runnerStartedAt),
+                    textChars: text.length,
+                    calls: calls.map(aiLogToolCall),
+                });
+                if (!calls.length) return {shouldContinue: false};
+                messages.push({
+                    role: "assistant",
+                    content: text,
+                    toolCalls: calls.map(call => ({
+                        id: call.id,
+                        function: {name: call.name, arguments: call.argumentsText},
+                    })),
+                });
+                requestMessages.push({
+                    role: "assistant",
+                    content: text,
+                    toolCalls: calls.map(call => ({
+                        id: call.id,
+                        function: {name: call.name, arguments: call.argumentsText},
+                    })),
+                });
+                await executeToolBatchWithAdapter({
+                    userId: msg.from?.id,
+                    toolCalls: calls,
+                    streamMessage,
+                    toolContext,
+                    toolMemory,
+                    adapter,
+                    appendTargets: [messages, requestMessages],
+                });
+                const continuation = decideToolLoopContinuation({
+                    round,
+                    maxRounds: MAX_TOOL_ROUNDS,
+                    toolCalls: calls,
+                });
+                if (!continuation.continue && continuation.reason === "max_rounds_reached") {
+                    aiLog("warn", "mistral.tool_loop.max_rounds_reached", {
+                        round,
+                        maxRounds: MAX_TOOL_ROUNDS,
+                    });
+                }
+                return {shouldContinue: continuation.continue};
+            }

-        if (!stream) {
            const request = {
                model: config.mistralChatTarget.model,
-                messages,
+                messages: requestMessages,
                tools: requestTools,
                documents: documents
-            } as Parameters<typeof mistralAi.chat.complete>[0];
-            const response = await mistralAi.chat.complete(request, {signal});
-            const message = response.choices?.[0]?.message;
-            const text = typeof message?.content === "string" ? message.content : JSON.stringify(message?.content ?? "");
-            streamMessage.append(text);
-            const calls = normalizeMistralToolCalls(mistralToolCalls(message));
+            } as Parameters<typeof mistralAi.chat.stream>[0];
+            const streamResponse = await runSingleModelRequest({
+                execute: () => adapter.callModel(request, () => mistralAi.chat.stream(request, {signal})),
+            });
+            aiLog("debug", "mistral.stream.open", {round});
+            let calls: ToolCallData[] = [];
+            const roundTextStart = streamMessage.getText().length;
+            const toolCallAccumulator = new StreamingToolCallAccumulator("mistral_stream", round);
+
+            for await (const event of streamResponse) {
+                if (signal.aborted) throw new Error("Aborted");
+
+                const choice = event.data?.choices?.[0];
+                const delta = choice?.delta;
+                const mistralDelta = delta;
+                streamMessage.append(adapter.extractTextDelta(mistralDelta));
+
+                const rawDeltaCalls = adapter.extractStreamingToolCalls(mistralDelta);
+                if (rawDeltaCalls.length) {
+                    calls = toolCallAccumulator.add(rawDeltaCalls);
+                    streamMessage.setStatus(Environment.getUseToolText(calls));
+                    await streamMessage.flush();
+                }
+            }
            aiLog(calls.length ? "info" : "success", calls.length ? "mistral.tool_calls" : "mistral.run.done", {
                round,
                duration: calls.length ? aiLogDuration(roundStartedAt) : aiLogDuration(runnerStartedAt),
-                textChars: text.length,
+                textChars: streamMessage.getText().slice(roundTextStart).length,
                calls: calls.map(aiLogToolCall),
            });
-            if (!calls.length) return;
+            if (!calls.length) return {shouldContinue: false};
+            const roundText = streamMessage.getText().slice(roundTextStart);
            messages.push({
                role: "assistant",
-                content: text,
-                toolCalls: calls.map(call => ({
-                    id: call.id,
-                    function: {name: call.name, arguments: call.argumentsText},
-                })),
+                content: roundText,
+                toolCalls: calls.map(c => ({id: c.id, function: {name: c.name, arguments: c.argumentsText}}))
            });
-            const toolResults = await executeToolBatch(msg.from?.id, calls, streamMessage, toolContext, toolMemory);
-            for (const [index, call] of calls.entries()) {
-                messages.push({
-                    role: "tool",
-                    name: call.name,
-                    toolCallId: call.id,
-                    content: toolResults[index] ?? "",
-                });
-            }
-            continue;
-        }
-
-        const request = {
-            model: config.mistralChatTarget.model,
-            messages,
-            tools: requestTools,
-            documents: documents
-        } as Parameters<typeof mistralAi.chat.stream>[0];
-        const streamResponse = await mistralAi.chat.stream(request, {signal});
-        aiLog("debug", "mistral.stream.open", {round});
-        let calls: ToolCallData[] = [];
-        const roundTextStart = streamMessage.getText().length;
-        const toolCallAccumulator = new StreamingToolCallAccumulator("mistral_stream", round);
-
-        for await (const event of streamResponse) {
-            if (signal.aborted) throw new Error("Aborted");
-
-            const choice = event.data?.choices?.[0];
-            const delta = choice?.delta;
-            const mistralDelta = delta as MistralDeltaLike;
-
-            streamMessage.append(contentFromMistralDelta(mistralDelta));
-
-            const rawDeltaCalls = mistralToolCalls(mistralDelta);
-            if (rawDeltaCalls.length) {
-                calls = toolCallAccumulator.add(rawDeltaCalls);
-                streamMessage.setStatus(Environment.getUseToolText(calls));
-                await streamMessage.flush();
-            }
-        }
-        aiLog(calls.length ? "info" : "success", calls.length ? "mistral.tool_calls" : "mistral.run.done", {
-            round,
-            duration: calls.length ? aiLogDuration(roundStartedAt) : aiLogDuration(runnerStartedAt),
-            textChars: streamMessage.getText().slice(roundTextStart).length,
-            calls: calls.map(aiLogToolCall),
-        });
-        if (!calls.length) return;
-        const roundText = streamMessage.getText().slice(roundTextStart);
-        messages.push({
-            role: "assistant",
-            content: roundText,
-            toolCalls: calls.map(c => ({id: c.id, function: {name: c.name, arguments: c.argumentsText}}))
-        });
-        const toolResults = await executeToolBatch(msg.from?.id, calls, streamMessage, toolContext, toolMemory);
-        for (const [index, call] of calls.entries()) {
-            messages.push({
-                role: "tool",
-                name: call.name,
-                toolCallId: call.id,
-                content: toolResults[index] ?? "",
+            requestMessages.push({
+                role: "assistant",
+                content: roundText,
+                toolCalls: calls.map(c => ({id: c.id, function: {name: c.name, arguments: c.argumentsText}}))
            });
-        }
+            await executeToolBatchWithAdapter({
+                userId: msg.from?.id,
+                toolCalls: calls,
+                streamMessage,
+                toolContext,
+                toolMemory,
+                adapter,
+                appendTargets: [messages, requestMessages],
+            });
+            const continuation = decideToolLoopContinuation({
+                round,
+                maxRounds: MAX_TOOL_ROUNDS,
+                toolCalls: calls,
+            });
+            if (!continuation.continue && continuation.reason === "max_rounds_reached") {
+                aiLog("warn", "mistral.tool_loop.max_rounds_reached", {
+                    round,
+                    maxRounds: MAX_TOOL_ROUNDS,
+                });
+            }
+            return {shouldContinue: true};
+        },
+        });
+    } finally {
+        await adapter.finalize().catch(() => undefined);
    }
 }
@@ -5,7 +5,6 @@ import {Environment} from "../common/environment";
 import type {BoundaryValue} from "../common/boundary-types";
 import {bot, notesDir} from "../index";
 import {clamp, logError} from "../util/utils";
-import {getOllamaTools} from "./tool-mappers";
 import {TelegramStreamMessage} from "./telegram-stream-message";
 import {ChatMessage} from "./chat-messages-types";
 import {ChatRequest, Tool} from "ollama";
@@ -14,20 +13,18 @@ import {enqueueTelegramApiCall} from "../util/telegram-api-queue";
 import {loadOllamaModel, unloadAllOllamaModels} from "./tools/utils";
 import {createOllamaClient} from "./ai-runtime-target";
 import {aiLog, aiLogDuration, aiLogMessageIdentity, aiLogProviderTarget, aiLogToolCall} from "../logging/ai-logger";
+import {getProviderAdapter} from "./provider-adapters";
+import {runToolRankStage} from "./tool-rank-stage";

 import {
    allToolSchemaNames,
-    appendOllamaToolResults,
    dedupeToolCalls,
    DEFAULT_OLLAMA_CONTEXT_SIZE,
-    executeToolBatch,
    isOllamaModelActive,
    isRecord,
    MAX_OLLAMA_CONTEXT_SIZE,
    MAX_TOOL_ROUNDS,
    MIN_OLLAMA_CONTEXT_SIZE,
-    normalizeOllamaToolCalls,
-    OllamaToolCallLike,
    roundStatus,
    RuntimeConfigSnapshot,
    safeJsonParseObject,
@@ -35,14 +32,15 @@ import {
    ToolCallData,
    ToolExecutionMemory
 } from "./unified-ai-runner.shared";
-import {ToolRanker} from "./unified-ai-runner.tool-ranker";
+import {executeToolBatchWithAdapter} from "./tool-batch-runner";
+import {decideToolLoopContinuation} from "./tool-loop-control";
+import {runToolLoopRounds} from "./tool-loop-runner";
+import {runSingleModelRequest} from "./model-call-stage";
 import {getToolPrompts} from "./tools/registry";
-import {filterRankedTools, latestUserTextFromMessages} from "./tool-ranker-pipeline";
 import {GetNoteFileResult, GetNoteFileResultSchema} from "./tools/notes";
 import {getModelCapabilities} from "./provider-model-runtime";
 import {AiProvider} from "../model/ai-provider";
 import {Message} from "typescript-telegram-bot-api";
-import {storeToolRankAudit} from "./tool-rank-audit";

 export async function runOllama(
    msg: Message,
@@ -157,9 +155,12 @@ export async function runOllama(
    }

    const toolMemory: ToolExecutionMemory = new Map();
+    const adapter = getProviderAdapter(AiProvider.OLLAMA);

    try {
-        for (let round = 0; round < MAX_TOOL_ROUNDS; round++) {
+        await runToolLoopRounds({
+            maxRounds: MAX_TOOL_ROUNDS,
+            onRound: async (round) => {
            const roundStartedAt = Date.now();
            aiLog("debug", "ollama.round.start", {
                round,
@@ -183,7 +184,7 @@ export async function runOllama(

            let activeToolNames: string[] = [];
            if ((await getModelCapabilities(AiProvider.OLLAMA, model, "tools"))?.tools?.supported) {
-                const availableOllamaTools: Tool[] = getOllamaTools(msg.from?.id === Environment.CREATOR_ID) as Tool[];
+                const availableOllamaTools: Tool[] = adapter.rankTools(config, {forCreator: msg.from?.id === Environment.CREATOR_ID}) as Tool[];

                aiLog("debug", "ollama.tools.available", {
                    round,
@@ -191,44 +192,18 @@ export async function runOllama(
                    rankerEnabled: !!config.ollamaToolRankerTarget,
                });

-                streamMessage.setStatus(Environment.getSelectingToolsText());
-                await streamMessage.flush();
-                const toolRankStartedAt = Date.now();
-                const toolRankStartedAtIso = new Date().toISOString();
-                const rankerSelection = await new ToolRanker(config).selectTools({
-                        provider: AiProvider.OLLAMA,
-                        userQuery: latestUserTextFromMessages(messages),
-                        availableTools: availableOllamaTools,
-                        round,
-                        signal,
-                    })
-                    .catch(async error => {
-                        streamMessage.clearStatus();
-                        await streamMessage.flush();
-                        await storeToolRankAudit({
-                            streamMessage,
-                            provider: AiProvider.OLLAMA,
-                            model,
-                            round,
-                            startedAt: toolRankStartedAt,
-                            startedAtIso: toolRankStartedAtIso,
-                            error,
-                        });
-                        throw error;
-                    });
-                streamMessage.clearStatus();
-                await streamMessage.flush();
-                await storeToolRankAudit({
-                    streamMessage,
+                const rankResult = await runToolRankStage({
                    provider: AiProvider.OLLAMA,
                    model,
                    round,
-                    startedAt: toolRankStartedAt,
-                    startedAtIso: toolRankStartedAtIso,
-                    selectedTools: rankerSelection.toolNames,
+                    config,
+                    availableTools: availableOllamaTools,
+                    messages,
+                    streamMessage,
+                    signal,
                });

-                const filteredTools = [...new Set(filterRankedTools(availableOllamaTools, rankerSelection.toolNames))];
+                const filteredTools = [...new Set(rankResult.filteredTools as Tool[])];
                activeToolNames = filteredTools.map(t => t.function.name ?? "");
                if (filteredTools.length > 0) {
                    request.tools = [...filteredTools];
@@ -256,24 +231,23 @@ export async function runOllama(
                    round,
                    tools: activeToolNames,
                    count: activeToolNames.length,
-                    usedRanker: rankerSelection.usedRanker,
+                    usedRanker: rankResult.usedRanker,
                });
            }

            if (!stream) {
-                const response = await ollama.chat({
-                    ...request,
-                    stream: false
+                const response = await runSingleModelRequest({
+                    execute: () => adapter.callModel(request, () => ollama.chat({
+                        ...request,
+                        stream: false
+                    })),
                });

                const message = response.message;
                const rawContent = message?.content ?? "";

                const nativeCalls = dedupeToolCalls(
-                    normalizeOllamaToolCalls(
-                        message?.tool_calls as readonly OllamaToolCallLike[] | undefined,
-                        round,
-                    ),
+                    adapter.extractToolCalls(message),
                );

                const responseText = rawContent;
@@ -298,10 +272,10 @@ export async function runOllama(

                if (!nativeCalls.length) {
                    aiLog("success", "ollama.run.done", {round, duration: aiLogDuration(runnerStartedAt)});
-                    break;
+                    return {shouldContinue: false};
                }

                 const calls = nativeCalls;

                aiLog("info", "ollama.tool_calls", {
                    round,
@@ -319,22 +293,40 @@ export async function runOllama(
                    })),
                });

-                appendOllamaToolResults(
-                    messages,
-                    calls,
-                    await executeToolBatch(msg.from?.id, calls, streamMessage, toolContext, toolMemory),
-                );
+                await executeToolBatchWithAdapter({
+                    userId: msg.from?.id,
+                    toolCalls: calls,
+                    streamMessage,
+                    toolContext,
+                    toolMemory,
+                    adapter,
+                    appendTargets: [messages],
+                });

-                continue;
+                const continuation = decideToolLoopContinuation({
+                    round,
+                    maxRounds: MAX_TOOL_ROUNDS,
+                    toolCalls: calls,
+                });
+                if (!continuation.continue && continuation.reason === "max_rounds_reached") {
+                    aiLog("warn", "ollama.tool_loop.max_rounds_reached", {
+                        round,
+                        maxRounds: MAX_TOOL_ROUNDS,
+                    });
+                }
+
+                return {shouldContinue: true};
            }

            aiLog("debug", "ollama.stream.messages", {
                round,
                messageCount: request.messages?.length ?? 0,
            });
-            const response = await ollama.chat({
-                ...request,
-                stream: true
+            const response = await runSingleModelRequest({
+                execute: () => adapter.callModel(request, () => ollama.chat({
+                    ...request,
+                    stream: true
+                })),
            });

            aiLog("debug", "ollama.stream.open", {round});
@@ -354,10 +346,7 @@ export async function runOllama(

                    const localToolCalls: ToolCallData[] = [];

-                    localToolCalls.push(...normalizeOllamaToolCalls(
-                        chunk.message.tool_calls as readonly OllamaToolCallLike[] | undefined,
-                        round,
-                    ));
+                    localToolCalls.push(...adapter.extractStreamingToolCalls(chunk.message));

                    const newStatus = roundStatus(round, firstRoundStatus, chunk.message.content, localToolCalls, !!chunk.message.thinking);
                    const previousStatus = streamMessage.getStatus();
@@ -377,13 +366,10 @@ export async function runOllama(
                    }

                    if (!(chunk.message?.thinking && streamMessage.getStatus() !== Environment.reasoningText)) {
-                        streamMessage.append(chunk.message?.content ?? "");
+                        streamMessage.append(adapter.extractTextDelta(chunk));
                    }

-                    calls.push(...normalizeOllamaToolCalls(
-                        chunk.message?.tool_calls as readonly OllamaToolCallLike[] | undefined,
-                        round,
-                    ));
+                    calls.push(...adapter.extractStreamingToolCalls(chunk.message));

                    if (chunk.done) {
                        aiLog("debug", "ollama.stream.done", {
@@ -416,7 +402,7 @@ export async function runOllama(
                    duration: aiLogDuration(runnerStartedAt),
                });

-                break;
+                return {shouldContinue: false};
            }

            calls.splice(0, calls.length, ...dedupeToolCalls(calls));
@@ -439,7 +425,27 @@ export async function runOllama(
                })),
            });

-            const toolResults = await executeToolBatch(msg.from?.id, calls, streamMessage, toolContext, toolMemory);
+            const toolResults = await executeToolBatchWithAdapter({
+                userId: msg.from?.id,
+                toolCalls: calls,
+                streamMessage,
+                toolContext,
+                toolMemory,
+                adapter,
+                appendTargets: [messages],
+            });
+
+            const continuation = decideToolLoopContinuation({
+                round,
+                maxRounds: MAX_TOOL_ROUNDS,
+                toolCalls: calls,
+            });
+            if (!continuation.continue && continuation.reason === "max_rounds_reached") {
+                aiLog("warn", "ollama.tool_loop.max_rounds_reached", {
+                    round,
+                    maxRounds: MAX_TOOL_ROUNDS,
+                });
+            }

            let successGetNoteFileResult: GetNoteFileResult | undefined = undefined;

@@ -471,9 +477,11 @@ export async function runOllama(
                }).catch(logError);
            }

-            appendOllamaToolResults(messages, calls, toolResults);
-        }
+            return {shouldContinue: true};
+        },
+        });
    } finally {
        if (interval) clearInterval(interval);
+        await adapter.finalize().catch(logError);
    }
 }
@@ -17,11 +17,8 @@ import {
    AsyncIterableStream,
    buildSystemInstruction,
    collectOpenAiResponseCodeInterpreterCalls,
-    collectOpenAiResponseFunctionCalls,
    collectOpenAiResponseImages,
    collectOpenAiResponseText,
-    executeToolBatch,
-    getOpenAIResponsesToolsWithImage,
    MAX_TOOL_ROUNDS,
    OPENAI_IMAGE_PARTIALS,
    openAiResponseItemCallId,
@@ -35,6 +32,10 @@ import {
    errorMessage,
    allToolSchemaNames
 } from "./unified-ai-runner.shared";
+import {executeToolBatchWithAdapter} from "./tool-batch-runner";
+import {decideToolLoopContinuation} from "./tool-loop-control";
+import {runToolLoopRounds} from "./tool-loop-runner";
+import {runSingleModelRequest} from "./model-call-stage";
 import {bot} from "../index";
 import fs from "node:fs";
 import path from "node:path";
@@ -42,10 +43,9 @@ import {logError} from "../util/utils";
 import {SendFileAttachmentResult, SendFileAttachmentResultSchema} from "./tools/files";
 import {DEFAULT_AI_RESPONSE_LANGUAGE} from "../common/user-ai-settings";
 import {AiDownloadedFile} from "./telegram-attachments";
-import {ToolRanker} from "./unified-ai-runner.tool-ranker";
 import {AiProvider} from "../model/ai-provider";
-import {filterRankedTools, latestUserTextFromMessages} from "./tool-ranker-pipeline";
-import {storeToolRankAudit} from "./tool-rank-audit";
+import {getProviderAdapter} from "./provider-adapters";
+import {runToolRankStage} from "./tool-rank-stage";

 export async function runOpenAi(
    msg: Message,
@@ -60,16 +60,15 @@ export async function runOpenAi(
    documentRag?: OpenAiDocumentRagContext,
 ): Promise<void> {
    const runnerStartedAt = Date.now();
-    let responseInput: Array<ResponseInputItem | OpenAiResponseOutputItem> = [...messages] as Array<ResponseInputItem | OpenAiResponseOutputItem>;
    const openAi = createOpenAiClient(config.openAiChatTarget);
    const ownsDocumentRag = !documentRag;
    const preparedDocumentRag = documentRag ?? await prepareOpenAiDocumentRag(openAi, downloads.filter(download => download.kind === "document"));
-    const toolRanker = new ToolRanker(config);
-    const availableTools = getOpenAIResponsesToolsWithImage(
-        config,
-        msg.from?.id === Environment.CREATOR_ID,
-        preparedDocumentRag?.vectorStoreIds ?? [],
-    );
+    const adapter = getProviderAdapter(AiProvider.OPENAI);
+    let responseInput: Array<ResponseInputItem | OpenAiResponseOutputItem> = adapter.mapMessages(messages) as unknown as Array<ResponseInputItem | OpenAiResponseOutputItem>;
+    const availableTools = adapter.rankTools(config, {
+        forCreator: msg.from?.id === Environment.CREATOR_ID,
+        vectorStoreIds: preparedDocumentRag?.vectorStoreIds ?? [],
+    });

    const systemPrompt = buildSystemInstruction(
        config,
@@ -90,78 +89,266 @@ export async function runOpenAi(
    const toolMemory: ToolExecutionMemory = new Map();

    try {
-        for (let round = 0; round < MAX_TOOL_ROUNDS; round++) {
-        const roundStartedAt = Date.now();
-        aiLog("debug", "openai.round.start", {round, inputItems: responseInput.length, stream});
-        streamMessage.setStatus(Environment.getSelectingToolsText());
-        await streamMessage.flush();
-        const toolRankStartedAt = Date.now();
-        const toolRankStartedAtIso = new Date().toISOString();
-        const rankerSelection = await toolRanker.selectTools({
+        await runToolLoopRounds({
+            maxRounds: MAX_TOOL_ROUNDS,
+            onRound: async (round) => {
+            const roundStartedAt = Date.now();
+            aiLog("debug", "openai.round.start", {round, inputItems: responseInput.length, stream});
+            const rankResult = await runToolRankStage({
                provider: AiProvider.OPENAI,
-                userQuery: latestUserTextFromMessages(messages),
-                availableTools,
+                model: config.openAiChatTarget.model,
                round,
+                config,
+                availableTools,
+                messages,
+                streamMessage,
                signal,
-            })
-            .catch(async error => {
-                streamMessage.clearStatus();
-                await streamMessage.flush();
-                await storeToolRankAudit({
-                    streamMessage,
-                    provider: AiProvider.OPENAI,
-                    model: config.openAiChatTarget.model,
-                    round,
-                    startedAt: toolRankStartedAt,
-                    startedAtIso: toolRankStartedAtIso,
-                    error,
-                });
-                throw error;
            });
-        streamMessage.clearStatus();
-        await streamMessage.flush();
-        await storeToolRankAudit({
-            streamMessage,
-            provider: AiProvider.OPENAI,
-            model: config.openAiChatTarget.model,
-            round,
-            startedAt: toolRankStartedAt,
-            startedAtIso: toolRankStartedAtIso,
-            selectedTools: rankerSelection.toolNames,
-        });
-        const filteredTools = filterRankedTools(availableTools, rankerSelection.toolNames);
-        const requestTools = preparedDocumentRag?.vectorStoreIds.length
-            ? (() => {
-                const tools = [...filteredTools];
-                const hasFileSearch = allToolSchemaNames(tools).includes("file_search");
-                if (!hasFileSearch) {
-                    const fileSearchTool = availableTools.find(tool => allToolSchemaNames([tool]).includes("file_search"));
-                    if (fileSearchTool) {
-                        tools.unshift(fileSearchTool);
+            const filteredTools = rankResult.filteredTools;
+            const requestTools = preparedDocumentRag?.vectorStoreIds.length
+                ? (() => {
+                    const tools = [...filteredTools];
+                    const hasFileSearch = allToolSchemaNames(tools).includes("file_search");
+                    if (!hasFileSearch) {
+                        const fileSearchTool = availableTools.find(tool => allToolSchemaNames([tool]).includes("file_search"));
+                        if (fileSearchTool) {
+                            tools.unshift(fileSearchTool);
+                        }
+                    }
+                    return tools.length ? tools : undefined;
+                })()
+                : (filteredTools.length ? filteredTools : undefined);
+
+            if (!stream) {
+                const request: ResponseCreateParamsNonStreaming = {
+                    model: config.openAiChatTarget.model,
+                    input: responseInput as ResponseInputItem[],
+                    tools: requestTools as ResponseCreateParamsNonStreaming["tools"],
+                    instructions: systemPrompt,
+                };
+                const response = await runSingleModelRequest({
+                    execute: () => adapter.callModel(request, () => openAi.responses.create(request, {signal})),
+                }) as OpenAiResponseLike;
+
+                const responseText = collectOpenAiResponseText(response);
+                streamMessage.append(responseText);
+                aiLog("debug", "openai.response.received", {
+                    round,
+                    duration: aiLogDuration(roundStartedAt),
+                    textChars: responseText.length,
+                    outputItems: response?.output?.length ?? 0,
+                });
+                const images = collectOpenAiResponseImages(response);
+                if (images.length) {
+                    await showOpenAiGeneratedImage(
+                        streamMessage,
+                        sourceMessage,
+                        images[images.length - 1],
+                        `final_${round}`,
+                        Environment.getImageGenDoneText(config.openAiImageTarget.model),
+                        true,
+                    );
+                }
+
+                const codeInterpreterCalls = collectOpenAiResponseCodeInterpreterCalls(response);
+                if (codeInterpreterCalls.length) {
+                    aiLog("info", "openai.code_interpreter_calls", {
+                        round,
+                        duration: aiLogDuration(roundStartedAt),
+                        calls: codeInterpreterCalls.map(call => ({
+                            id: call.id,
+                            status: call.status,
+                            containerId: call.containerId,
+                            codeChars: call.code?.length ?? 0,
+                            outputItems: call.outputs.length,
+                        })),
+                    });
+                }
+
+                const calls = adapter.extractToolCalls(response);
+                aiLog(calls.length ? "info" : "success", calls.length ? "openai.tool_calls" : "openai.run.done", {
+                    round,
+                    duration: calls.length ? aiLogDuration(roundStartedAt) : aiLogDuration(runnerStartedAt),
+                    calls: calls.map(call => ({
+                        id: call.id,
+                        name: call.name,
+                        arguments: safeJsonParseObject(call.argumentsText)
+                    })),
+                });
+                if (!calls.length) return {shouldContinue: false};
+
+                const toolCalls = calls.map(call => ({
+                    id: call.id,
+                    name: call.name,
+                    argumentsText: call.argumentsText,
+                }));
+                const toolOutputs: Array<{type: "function_call_output"; call_id: string; output: string}> = [];
+                const toolResults = await executeToolBatchWithAdapter({
+                    userId: msg.from?.id,
+                    toolCalls,
+                    streamMessage,
+                    toolContext,
+                    toolMemory,
+                    adapter,
+                    appendTargets: [toolOutputs],
+                });
+
+                const uploadFilesResult = await tryToUploadFiles(msg, toolResults);
+                if (uploadFilesResult.found) {
+                    if (!uploadFilesResult.uploaded) {
+                        const old = toolOutputs[uploadFilesResult.toolIndex];
+                        const callId = old?.call_id;
+                        if (uploadFilesResult.toolIndex >= 0) {
+                            toolOutputs.splice(uploadFilesResult.toolIndex, 1);
+                        }
+                        if (callId) {
+                            toolOutputs.push({
+                                type: "function_call_output" as const,
+                                call_id: callId,
+                                output: "Error: " + uploadFilesResult.error
+                            });
+                        }
                    }
                }
-                return tools.length ? tools : undefined;
-            })()
-            : (filteredTools.length ? filteredTools : undefined);

-        if (!stream) {
-            const request: ResponseCreateParamsNonStreaming = {
+                const continuation = decideToolLoopContinuation({
+                    round,
+                    maxRounds: MAX_TOOL_ROUNDS,
+                    toolCalls: calls,
+                });
+                if (!continuation.continue && continuation.reason === "max_rounds_reached") {
+                    aiLog("warn", "openai.tool_loop.max_rounds_reached", {
+                        round,
+                        maxRounds: MAX_TOOL_ROUNDS,
+                    });
+                }
+
+                responseInput = [...responseInput, ...(response.output ?? []), ...toolOutputs];
+                return {shouldContinue: true};
+            }
+
+            let completedResponse: OpenAiResponseLike | null = null;
+            const request: ResponseCreateParamsStreaming = {
                model: config.openAiChatTarget.model,
                input: responseInput as ResponseInputItem[],
-                tools: requestTools as ResponseCreateParamsNonStreaming["tools"],
-                instructions: systemPrompt,
+                stream: true,
+                tools: requestTools as ResponseCreateParamsStreaming["tools"],
+                parallel_tool_calls: true,
+                instructions: systemPrompt
            };
-            const response = await openAi.responses.create(request, {signal}) as OpenAiResponseLike;
+            const response = await runSingleModelRequest({
+                execute: () => adapter.callModel(request, () => openAi.responses.create(request, {signal})),
+            }) as AsyncIterableStream<ResponseStreamEvent>;

-            const responseText = collectOpenAiResponseText(response);
-            streamMessage.append(responseText);
-            aiLog("debug", "openai.response.received", {
+            aiLog("debug", "openai.stream.open", {round});
+
+            let localToolCalls: ToolCallData[] = [];
+            for await (const event of response) {
+                if (signal.aborted) throw new Error("Aborted");
+
+                switch (event.type) {
+                    case "response.output_text.delta":
+                        streamMessage.append(adapter.extractTextDelta(event));
+                        break;
+                    case "response.image_generation_call.in_progress":
+                        streamMessage.setStatus(Environment.startingImageGenText);
+                        await streamMessage.flush();
+                        break;
+                    case "response.image_generation_call.generating":
+                        streamMessage.setStatus(Environment.imageGenText);
+                        await streamMessage.flush();
+                        break;
+                    case "response.image_generation_call.partial_image": {
+                        const iteration = (event.partial_image_index ?? 0) + 1;
+                        await showOpenAiGeneratedImage(
+                            streamMessage,
+                            sourceMessage,
+                            event.partial_image_b64,
+                            `partial_${round}_${iteration}`,
+                            Environment.getPartialImageGenText(iteration, OPENAI_IMAGE_PARTIALS),
+                            false,
+                        );
+                        break;
+                    }
+                    case "response.image_generation_call.completed":
+                        streamMessage.setStatus(Environment.finalizingImageGenText);
+                        await streamMessage.flush();
+                        break;
+                    case "response.file_search_call.in_progress":
+                    case "response.file_search_call.searching":
+                        streamMessage.setStatus(Environment.getUseToolText(["file_search"]));
+                        await streamMessage.flush();
+                        break;
+                    case "response.file_search_call.completed":
+                        streamMessage.clearStatus();
+                        await streamMessage.flush();
+                        break;
+                    case "response.code_interpreter_call.in_progress":
+                    case "response.code_interpreter_call.interpreting":
+                        streamMessage.setStatus(Environment.getUseToolText(["code_interpreter"]));
+                        await streamMessage.flush();
+                        break;
+                    case "response.code_interpreter_call.completed":
+                        streamMessage.clearStatus();
+                        await streamMessage.flush();
+                        break;
+                    case "response.code_interpreter_call_code.delta":
+                    case "response.code_interpreter_call_code.done":
+                        break;
+                    case "response.output_item.added": {
+                        const streamedCalls = adapter.extractStreamingToolCalls(event);
+                        // Log and update the status only when this item actually carried tool calls.
+                        if (streamedCalls.length) {
+                            localToolCalls.push(...streamedCalls);
+                            aiLog("info", "openai.stream.tool_call.added", {
+                                round,
+                                toolCalls: localToolCalls.map(aiLogToolCall)
+                            });
+                            streamMessage.setStatus(Environment.getUseToolText(localToolCalls));
+                            await streamMessage.flush();
+                        }
+                        break;
+                    }
+                    case "response.output_item.done":
+                        if (event.item.type === "function_call" && event.item.name) {
+                            const item = event.item as OpenAiResponseOutputItem & { id?: string };
+                            const itemId = openAiResponseItemCallId(item);
+                            const index = localToolCalls.findIndex(c => c.id === itemId);
+                            if (index !== -1) {
+                                localToolCalls.splice(index, 1);
+                                if (localToolCalls.length === 0) {
+                                    streamMessage.clearStatus();
+                                } else {
+                                    streamMessage.setStatus(Environment.getUseToolText(localToolCalls));
+                                }
+                                await streamMessage.flush();
+                            }
+                        }
+                        break;
+                    case "response.function_call_arguments.delta":
+                        break;
+                    case "response.function_call_arguments.done":
+                        break;
+
+                    case "response.completed":
+                        completedResponse = event.response as OpenAiResponseLike;
+                        break;
+                    case "response.failed":
+                        throw new Error(event.response?.error?.message ?? "OpenAI response failed");
+                    case "error":
+                        throw new Error(event.message ?? "OpenAI stream error");
+                }
+            }
+
+            if (!completedResponse) throw new Error("OpenAI did not return the final response.completed event.");
+
+            aiLog("debug", "openai.stream.completed", {
                round,
                duration: aiLogDuration(roundStartedAt),
-                textChars: responseText.length,
-                outputItems: response?.output?.length ?? 0,
+                outputItems: completedResponse?.output?.length ?? 0,
            });
-            const images = collectOpenAiResponseImages(response);
+
+            const images = collectOpenAiResponseImages(completedResponse);
            if (images.length) {
                await showOpenAiGeneratedImage(
                    streamMessage,
@@ -173,7 +360,7 @@ export async function runOpenAi(
                );
            }

-            const codeInterpreterCalls = collectOpenAiResponseCodeInterpreterCalls(response);
+            const codeInterpreterCalls = collectOpenAiResponseCodeInterpreterCalls(completedResponse);
            if (codeInterpreterCalls.length) {
                aiLog("info", "openai.code_interpreter_calls", {
                    round,
@@ -188,29 +375,33 @@ export async function runOpenAi(
                });
            }

-            const calls = collectOpenAiResponseFunctionCalls(response);
+            const calls = adapter.extractToolCalls(completedResponse);
            aiLog(calls.length ? "info" : "success", calls.length ? "openai.tool_calls" : "openai.run.done", {
                round,
                duration: calls.length ? aiLogDuration(roundStartedAt) : aiLogDuration(runnerStartedAt),
                calls: calls.map(call => ({
-                    id: call.callId,
+                    id: call.id,
                    name: call.name,
                    arguments: safeJsonParseObject(call.argumentsText)
                })),
            });
-            if (!calls.length) return;
+            if (!calls.length) return {shouldContinue: false};

            const toolCalls = calls.map(call => ({
-                id: call.callId,
+                id: call.id,
                name: call.name,
                argumentsText: call.argumentsText,
            }));
-            const toolResults = await executeToolBatch(msg.from?.id, toolCalls, streamMessage, toolContext, toolMemory);
-            const toolOutputs = calls.map((call, index) => ({
-                type: "function_call_output" as const,
-                call_id: call.callId,
-                output: toolResults[index] ?? "",
-            }));
+            const toolOutputs: Array<{type: "function_call_output"; call_id: string; output: string}> = [];
+            const toolResults = await executeToolBatchWithAdapter({
+                userId: msg.from?.id,
+                toolCalls,
+                streamMessage,
+                toolContext,
+                toolMemory,
+                adapter,
+                appendTargets: [toolOutputs],
+            });

            const uploadFilesResult = await tryToUploadFiles(msg, toolResults);
            if (uploadFilesResult.found) {
@@ -230,207 +421,27 @@ export async function runOpenAi(
                }
            }

-            responseInput = [...responseInput, ...(response.output ?? []), ...toolOutputs];
-            continue;
-        }
-
-        let completedResponse: OpenAiResponseLike | null = null;
-        const request: ResponseCreateParamsStreaming = {
-            model: config.openAiChatTarget.model,
-            input: responseInput as ResponseInputItem[],
-            stream: true,
-            tools: requestTools as ResponseCreateParamsStreaming["tools"],
-            parallel_tool_calls: true,
-            instructions: systemPrompt
-        };
-        const response = await openAi.responses.create(request, {signal}) as AsyncIterableStream<ResponseStreamEvent>;
-
-        aiLog("debug", "openai.stream.open", {round});
-
-        let localToolCalls: ToolCallData[] = [];
-        for await (const event of response) {
-            if (signal.aborted) throw new Error("Aborted");
-
-            switch (event.type) {
-                case "response.output_text.delta":
-                    streamMessage.append(event.delta ?? "");
-                    break;
-                case "response.image_generation_call.in_progress":
-                    streamMessage.setStatus(Environment.startingImageGenText);
-                    await streamMessage.flush();
-                    break;
-                case "response.image_generation_call.generating":
-                    streamMessage.setStatus(Environment.imageGenText);
-                    await streamMessage.flush();
-                    break;
-                case "response.image_generation_call.partial_image": {
-                    const iteration = (event.partial_image_index ?? 0) + 1;
-                    await showOpenAiGeneratedImage(
-                        streamMessage,
-                        sourceMessage,
-                        event.partial_image_b64,
-                        `partial_${round}_${iteration}`,
-                        Environment.getPartialImageGenText(iteration, OPENAI_IMAGE_PARTIALS),
-                        false,
-                    );
-                    break;
-                }
-                case "response.image_generation_call.completed":
-                    streamMessage.setStatus(Environment.finalizingImageGenText);
-                    await streamMessage.flush();
-                    break;
-                case "response.file_search_call.in_progress":
-                case "response.file_search_call.searching":
-                    streamMessage.setStatus(Environment.getUseToolText(["file_search"]));
-                    await streamMessage.flush();
-                    break;
-                case "response.file_search_call.completed":
-                    streamMessage.clearStatus();
-                    await streamMessage.flush();
-                    break;
-                case "response.code_interpreter_call.in_progress":
-                case "response.code_interpreter_call.interpreting":
-                    streamMessage.setStatus(Environment.getUseToolText(["code_interpreter"]));
-                    await streamMessage.flush();
-                    break;
-                case "response.code_interpreter_call.completed":
-                    streamMessage.clearStatus();
-                    await streamMessage.flush();
-                    break;
-                case "response.code_interpreter_call_code.delta":
-                case "response.code_interpreter_call_code.done":
-                    break;
-                case "response.output_item.added":
-                    if (event.item.type === "function_call" && event.item.name) {
-                        const item = event.item as OpenAiResponseOutputItem & { id?: string };
-                        localToolCalls.push({
-                            id: openAiResponseItemCallId(item),
-                            name: item.name ?? "",
-                            argumentsText: item.arguments ?? "{}",
-                        });
-
-                        aiLog("info", "openai.stream.tool_call.added", {
-                            round,
-                            toolCalls: localToolCalls.map(aiLogToolCall)
-                        });
-                        streamMessage.setStatus(Environment.getUseToolText(localToolCalls));
-                        await streamMessage.flush();
-                    }
-                    break;
-                case "response.output_item.done":
-                    if (event.item.type === "function_call" && event.item.name) {
-                        const item = event.item as OpenAiResponseOutputItem & { id?: string };
-                        const itemId = openAiResponseItemCallId(item);
-                        const index = localToolCalls.findIndex(c => c.id === itemId);
-                        if (index !== -1) {
-                            localToolCalls.splice(index, 1);
-                            if (localToolCalls.length === 0) {
-                                streamMessage.clearStatus();
-                            } else {
-                                streamMessage.setStatus(Environment.getUseToolText(localToolCalls));
-                            }
-                            await streamMessage.flush();
-                        }
-                    }
-                    break;
-                case "response.function_call_arguments.delta":
-                    break;
-                case "response.function_call_arguments.done":
-                    break;
-
-                case "response.completed":
-                    completedResponse = event.response as OpenAiResponseLike;
-                    break;
-                case "response.failed":
-                    throw new Error(event.response?.error?.message ?? "OpenAI response failed");
-                case "error":
-                    throw new Error(event.message ?? event?.message ?? "OpenAI stream error");
-            }
-        }
-
-        if (!completedResponse) throw new Error("OpenAI did not return the final response.completed event.");
-
-        aiLog("debug", "openai.stream.completed", {
-            round,
-            duration: aiLogDuration(roundStartedAt),
-            outputItems: completedResponse?.output?.length ?? 0,
-        });
-
-        const images = collectOpenAiResponseImages(completedResponse);
-        if (images.length) {
-            await showOpenAiGeneratedImage(
-                streamMessage,
-                sourceMessage,
-                images[images.length - 1],
-                `final_${round}`,
-                Environment.getImageGenDoneText(config.openAiImageTarget.model),
-                true,
-            );
-        }
-
-        const codeInterpreterCalls = collectOpenAiResponseCodeInterpreterCalls(completedResponse);
-        if (codeInterpreterCalls.length) {
-            aiLog("info", "openai.code_interpreter_calls", {
+            const continuation = decideToolLoopContinuation({
                round,
-                duration: aiLogDuration(roundStartedAt),
-                calls: codeInterpreterCalls.map(call => ({
-                    id: call.id,
-                    status: call.status,
-                    containerId: call.containerId,
-                    codeChars: call.code?.length ?? 0,
-                    outputItems: call.outputs.length,
-                })),
+                maxRounds: MAX_TOOL_ROUNDS,
+                toolCalls: calls,
            });
-        }
-
-        const calls = collectOpenAiResponseFunctionCalls(completedResponse);
-        aiLog(calls.length ? "info" : "success", calls.length ? "openai.tool_calls" : "openai.run.done", {
-            round,
-            duration: calls.length ? aiLogDuration(roundStartedAt) : aiLogDuration(runnerStartedAt),
-            calls: calls.map(call => ({
-                id: call.callId,
-                name: call.name,
-                arguments: safeJsonParseObject(call.argumentsText)
-            })),
-        });
-        if (!calls.length) return;
-
-        const toolCalls = calls.map(call => ({
-            id: call.callId,
-            name: call.name,
-            argumentsText: call.argumentsText,
-        }));
-        const toolResults = await executeToolBatch(msg.from?.id, toolCalls, streamMessage, toolContext, toolMemory);
-        const toolOutputs = calls.map((call, index) => ({
-            type: "function_call_output",
-            call_id: call.callId,
-            output: toolResults[index] ?? "",
-        }));
-
-        const uploadFilesResult = await tryToUploadFiles(msg, toolResults);
-        if (uploadFilesResult.found) {
-            if (!uploadFilesResult.uploaded) {
-                const old = toolOutputs[uploadFilesResult.toolIndex];
-                const callId = old?.call_id;
-                if (uploadFilesResult.toolIndex >= 0) {
-                    delete toolOutputs[uploadFilesResult.toolIndex];
-                }
-                if (callId) {
-                    toolOutputs.push({
-                        type: "function_call_output" as const,
-                        call_id: callId,
-                        output: "Error: " + uploadFilesResult.error
-                    });
-                }
+            if (!continuation.continue && continuation.reason === "max_rounds_reached") {
+                aiLog("warn", "openai.tool_loop.max_rounds_reached", {
+                    round,
+                    maxRounds: MAX_TOOL_ROUNDS,
+                });
            }
-        }

-        responseInput = [...responseInput, ...(completedResponse.output ?? []), ...toolOutputs];
-        }
+            responseInput = [...responseInput, ...(completedResponse.output ?? []), ...toolOutputs];
+            return {shouldContinue: true};
+        },
+        });
    } finally {
        if (ownsDocumentRag) {
            await preparedDocumentRag?.cleanup().catch(logError);
        }
+        await adapter.finalize().catch(logError);
    }
 }

@@ -2,40 +2,39 @@ import {Message} from "typescript-telegram-bot-api";
 import * as fs from "node:fs";
 import path from "node:path";
 import type {BoundaryValue} from "../common/boundary-types";
-import {AiProvider} from "../model/ai-provider";
-import {ToolRankerFallbackPolicy} from "../common/policies";
-import {Environment} from "../common/environment";
-import {photoGenDir} from "../index";
-import {delay, logError, replyToMessage} from "../util/utils";
-import {MessageStore} from "../common/message-store";
-import type {OpenAiResponseTool} from "./tool-mappers";
-import {AiProviderName, getOpenAICodeInterpreterTool, getOpenAIResponsesTools} from "./tool-mappers";
-import {TelegramArtifactFile, TelegramStreamMessage} from "./telegram-stream-message";
-import {AiDownloadedFile} from "./telegram-attachments";
-import {getRuntimeCapabilities} from "./provider-model-runtime";
-import {StoredAttachment} from "../model/stored-attachment";
-import {AiChatMessage, ChatMessage} from "./chat-messages-types";
+import {AiProvider} from "../model/ai-provider.js";
+import {ToolRankerFallbackPolicy} from "../common/policies.js";
+import {Environment} from "../common/environment.js";
+import {delay, logError, replyToMessage} from "../util/utils.js";
+import {MessageStore} from "../common/message-store.js";
+import type {OpenAiResponseTool} from "./tool-mappers.js";
+import {AiProviderName, getOpenAICodeInterpreterTool, getOpenAIResponsesTools} from "./tool-mappers.js";
+import {TelegramArtifactFile, TelegramStreamMessage} from "./telegram-stream-message.js";
+import {AiDownloadedFile} from "./telegram-attachments.js";
+import {getRuntimeCapabilities} from "./provider-model-runtime.js";
+import {StoredAttachment} from "../model/stored-attachment.js";
+import {AiChatMessage, ChatMessage} from "./chat-messages-types.js";
 import {ListResponse, Ollama} from "ollama";
-import {executeToolCall, ToolRuntimeContext} from "./tools/runtime";
-import {MessageImagePart, MessagePart} from "../common/message-part";
-import {KeyedAsyncLock} from "../util/async-lock";
-import {type AiRequestQueueTarget} from "./provider-request-queue";
-import {PYTHON_INTERPRETER_TOOL_NAME, pythonInterpreterToolPrompt} from "./tools/python-interpretator";
-import {getResponseLanguageInstruction, UserAiResponseLanguage, UserAiVoiceMode} from "../common/user-ai-settings";
+import {executeToolCall, ToolRuntimeContext} from "./tools/runtime.js";
+import {MessageImagePart, MessagePart} from "../common/message-part.js";
+import {KeyedAsyncLock} from "../util/async-lock.js";
+import {type AiRequestQueueTarget} from "./provider-request-queue.js";
+import {PYTHON_INTERPRETER_TOOL_NAME, pythonInterpreterToolPrompt} from "./tools/python-interpretator.js";
+import {getResponseLanguageInstruction, UserAiResponseLanguage, UserAiVoiceMode} from "../common/user-ai-settings.js";
 import {
    isTranscribableAudioDownload,
    resolveSpeechToTextProviderForUser,
    transcribeSpeechDownloads
-} from "./speech-to-text";
+} from "./speech-to-text.js";
 import type {ChatCompletionMessageParam} from "openai/resources/chat/completions";
-import {MistralChatMessage} from "./mistral-chat-message";
-import {prepareTelegramMarkdownV2} from "../util/markdown-v2-renderer";
-import {AiRuntimeTarget, createMistralClient, resolveAiRuntimeTarget} from "./ai-runtime-target";
-import {aiLog, aiLogDuration, aiLogProviderTarget, aiLogToolCall} from "../logging/ai-logger";
-import {buildConversationSnapshot, serializeConversationSnapshot} from "./conversation-pipeline";
+import {MistralChatMessage} from "./mistral-chat-message.js";
+import {prepareTelegramMarkdownV2} from "../util/markdown-v2-renderer.js";
+import {AiRuntimeTarget, createMistralClient, resolveAiRuntimeTarget} from "./ai-runtime-target.js";
+import {aiLog, aiLogDuration, aiLogProviderTarget, aiLogToolCall} from "../logging/ai-logger.js";
+import {buildConversationSnapshot, serializeConversationSnapshot} from "./conversation-pipeline.js";
 import type {ResponseInputMessageContentList} from "openai/resources/responses/responses";
-import {persistToolResultArtifactAttachment} from "./tool-result-artifact-store";
-import {filterUserVisibleStoredAttachments} from "../common/stored-attachment-utils";
+import {persistToolResultArtifactAttachment} from "./tool-result-artifact-store.js";
+import {filterUserInputStoredAttachments} from "../common/attachment-visibility.js";

 export type {Message} from "typescript-telegram-bot-api";
 export type {AiRuntimeTarget} from "./ai-runtime-target";
@@ -72,9 +71,14 @@ export const MAX_OLLAMA_CONTEXT_SIZE = 262144;
 export const DEFAULT_OLLAMA_CONTEXT_SIZE = 32768;
 export const toolResourceLocks = new KeyedAsyncLock();

+function photoGenDir(): string {
+    return path.join(Environment.DATA_PATH, "cache", "photo", "gen");
+}
+
 export type UnifiedRunOptions = {
    provider: AiProvider;
    msg: Message;
+    requestId?: string;
    isGuestMsg?: boolean;
    text: string;
    stream?: boolean;
@@ -512,13 +516,13 @@ export function addMessageAttachmentKinds(msg: Message | undefined, kinds: Set<A
    if (msg.video) kinds.add("video");
 }

-export async function collectStoredReplyChainAttachments(msg: Message, limit: number = 1): Promise<StoredAttachment[]> {
+export async function collectStoredReplyChainAttachments(msg: Message, limit: number = 40): Promise<StoredAttachment[]> {
    const attachments: StoredAttachment[] = [];
    const seen = new Set<string>();
    let current = await MessageStore.get(msg.chat.id, msg.message_id);

    for (let i = 0; current && i < limit; i++) {
-        for (const attachment of filterUserVisibleStoredAttachments(current?.attachments ?? [])) {
+        for (const attachment of filterUserInputStoredAttachments(current?.attachments ?? [])) {
            const key = [
                attachment.kind,
                attachment.fileUniqueId || attachment.fileId,
@@ -1523,7 +1527,7 @@ export function writeOpenAiGeneratedImage(sourceMessage: Message, b64: string, l
 } {
    const buffer = Buffer.from(b64, "base64");
    const fileName = `${sourceMessage.chat.id}_${sourceMessage.message_id}_${Date.now()}_${label}.png`;
-    const cachePath = path.join(photoGenDir, fileName);
+    const cachePath = path.join(photoGenDir(), fileName);
    fs.writeFileSync(cachePath, buffer);
    return {buffer, cachePath, fileName};
 }
@@ -1,20 +1,21 @@
 import {ChatCompletionMessageParam} from "openai/resources/chat/completions";
 import {ChatRequest} from "ollama";
-import {BoundaryValue} from "../common/boundary-types";
-import {ToolRankerFallbackPolicy} from "../common/policies";
-import {AiProvider} from "../model/ai-provider";
-import {createMistralClient, createOllamaClient, createOpenAiClient, sameRuntimeEndpoint} from "./ai-runtime-target";
-import {aiLog, aiLogDuration, aiLogProviderTarget} from "../logging/ai-logger";
-import {providerChatTarget, RuntimeConfigSnapshot} from "./unified-ai-runner.shared";
+import {BoundaryValue} from "../common/boundary-types.js";
+import {ToolRankerFallbackPolicy} from "../common/policies.js";
+import {AiProvider} from "../model/ai-provider.js";
+import {createMistralClient, createOllamaClient, createOpenAiClient, sameRuntimeEndpoint} from "./ai-runtime-target.js";
+import {aiLog, aiLogDuration, aiLogProviderTarget} from "../logging/ai-logger.js";
+import {providerChatTarget, RuntimeConfigSnapshot} from "./unified-ai-runner.shared.js";
 import {
    buildRankerContext,
    buildRankerTarget,
    buildToolRankerPrompt,
    filterRankedTools,
    ToolRankerSelection,
-} from "./tool-ranker-pipeline";
-import {allToolSchemaNames} from "./unified-ai-runner.shared";
-import {sanitizeToolRankerResult} from "./tool-ranker-metadata";
+} from "./tool-ranker-pipeline.js";
+import {allToolSchemaNames} from "./unified-ai-runner.shared.js";
+import {sanitizeToolRankerResult} from "./tool-ranker-metadata.js";
+import {resolveToolRankerFallbackSelection} from "./tool-ranker-fallback.js";

 export class ToolRanker {
    constructor(private readonly config: RuntimeConfigSnapshot) {
@@ -27,8 +28,15 @@ export class ToolRanker {
        round: number;
        signal: AbortSignal;
        messages?: readonly { role?: string; content?: string | readonly { text?: string }[] }[];
+        runRanker?: (
+            provider: AiProvider,
+            target: NonNullable<ReturnType<typeof buildRankerTarget>>,
+            prompt: string,
+            userQuery: string,
+        ) => Promise<string>;
    }): Promise<ToolRankerSelection> {
        const {availableTools, provider, round, signal, userQuery} = args;
+        const runRanker = args.runRanker ?? this.runRanker.bind(this);
        const availableNames = allToolSchemaNames(availableTools);
        const fallbackPolicy = this.config.toolRankerFallbackPolicy;
        const configuredTarget = buildRankerTarget(this.config, provider);
@@ -41,11 +49,10 @@ export class ToolRanker {
        const target = configuredTarget ?? (fallbackPolicy === ToolRankerFallbackPolicy.MAIN_MODEL ? mainModelTarget : undefined);

        if (!target) {
-            if (fallbackPolicy === ToolRankerFallbackPolicy.NO_TOOLS) {
-                return {toolNames: [], usedRanker: false};
-            }
-
-            return {toolNames: availableNames, usedRanker: false};
+            return resolveToolRankerFallbackSelection({
+                fallbackPolicy,
+                availableToolNames: availableNames,
+            });
        }

        const startedAt = Date.now();
@@ -63,7 +70,7 @@ export class ToolRanker {

        try {
            if (signal.aborted) throw new Error("Aborted");
-            const raw = await this.runRanker(provider, target, ranker.prompt, userQuery);
+            const raw = await runRanker(provider, target, ranker.prompt, userQuery);
            if (signal.aborted) throw new Error("Aborted");
            const selectedNames = sanitizeToolRankerResult({
                raw,
@@ -106,7 +113,7 @@ export class ToolRanker {
                    const fallbackRanker = buildToolRankerPrompt(
                        buildRankerContext(this.config, provider, mainModelTarget, round, userQuery, availableTools),
                    );
-                    const raw = await this.runRanker(provider, mainModelTarget, fallbackRanker.prompt, userQuery);
+                    const raw = await runRanker(provider, mainModelTarget, fallbackRanker.prompt, userQuery);
                    const selectedNames = sanitizeToolRankerResult({
                        raw,
                        availableToolNames: availableNames,
@@ -151,14 +158,10 @@ export class ToolRanker {
                error: failureMessage,
            });

-            if (fallbackPolicy === ToolRankerFallbackPolicy.NO_TOOLS) {
-                return {toolNames: [], usedRanker: false};
-            }
-
-            return {
-                toolNames: availableNames,
-                usedRanker: false,
-            };
+            return resolveToolRankerFallbackSelection({
+                fallbackPolicy,
+                availableToolNames: availableNames,
+            });
        }
    }

@@ -35,6 +35,7 @@ import {persistErrorArtifactAttachment} from "./final-response-artifact-store";
 import {runUnifiedAiResponsePipeline} from "./unified-ai-response-pipeline";
 import {AiRequestStore} from "../common/ai-request-store";
 import type {StoredAiRequestStatus} from "../model/stored-ai-request";
+import {recordAiRequestFinish, recordAiRequestStart} from "../common/ai-observability.js";

 export type {ToolCallData} from "./unified-ai-runner.shared";
 export {snapshotModel, providerTargets, ollamaModelNames} from "./unified-ai-runner.shared";
@@ -49,6 +50,7 @@ async function executeUnifiedAiRequest(
    const requestStartedAt = Date.now();
    let preparedRequest: Awaited<ReturnType<typeof prepareUnifiedAiRequestPipeline>> | undefined;
    aiLog("info", "request.execute.start", {
+        requestId: options.requestId,
        provider: providerName(options.provider),
        stream: options.stream ?? true,
        think: options.think,
@@ -74,6 +76,7 @@ async function executeUnifiedAiRequest(
    if (preparedRequest.finishAfterTranscript) return;

    aiLog("debug", "request.messages.collected", {
+        requestId: options.requestId,
        provider: providerName(options.provider),
        chatMessages: preparedRequest.chatMessages.length,
        imageCount: preparedRequest.imageCount,
@@ -91,6 +94,7 @@ async function executeUnifiedAiRequest(
            controller,
        });
        aiLog("success", "request.execute.done", {
+            requestId: options.requestId,
            provider: providerName(options.provider),
            duration: aiLogDuration(requestStartedAt),
            responseChars: streamMessage.getText().length,
@@ -99,6 +103,7 @@ async function executeUnifiedAiRequest(
        return;
    } catch (e) {
        aiLog("error", "request.execute.failed", {
+            requestId: options.requestId,
            provider: providerName(options.provider),
            duration: aiLogDuration(requestStartedAt),
            error: e instanceof Error ? e : String(e),
@@ -117,6 +122,7 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
    const requestedAttachmentKinds = await collectRequestedAttachmentKinds(options.msg);

    aiLog("info", "run.start", {
+        requestId: options.requestId ?? `pending:${options.msg.chat.id}:${options.msg.message_id}`,
        provider: providerName(options.provider),
        model: snapshotModel(options.provider, config),
        message: aiLogMessageIdentity(options.msg),
@@ -133,6 +139,7 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {

    if (await rejectUnsupportedAttachments(options.provider, snapshotModel(options.provider, config), options.msg, config, requestedAttachmentKinds)) {
        aiLog("warn", "run.rejected.unsupported_attachment", {
+            requestId: options.requestId ?? `pending:${options.msg.chat.id}:${options.msg.message_id}`,
            provider: providerName(options.provider),
            requestedAttachmentKinds: [...requestedAttachmentKinds],
        });
@@ -150,6 +157,7 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
            text: Environment.getAttachmentMissingFromCacheText(cached.missing[0].fileName),
        }).catch(logError);
        aiLog("warn", "run.rejected.missing_attachment_cache", {
+            requestId: options.requestId ?? `pending:${options.msg.chat.id}:${options.msg.message_id}`,
            missing: cached.missing.map(a => ({kind: a.kind, fileName: a.fileName, cachePath: a.cachePath})),
        });
        return;
@@ -166,6 +174,8 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
        provider: providerName(options.provider),
        controller
    });
+    options.requestId ??= cancel.id;
+    const requestId = options.requestId;
    const streamMessage = new TelegramStreamMessage(
        options.msg,
        cancel.id,
@@ -180,10 +190,11 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
    );
    cancel.onCancel = () => streamMessage.cancel(cancel.provider);
    const queueTarget = resolveAiRequestQueueTarget(options, config, requestedAttachmentKinds);
-    aiLog("debug", "run.queue.target", {target: aiLogProviderTarget(queueTarget), cancelId: cancel.id});
+    aiLog("debug", "run.queue.target", {requestId, target: aiLogProviderTarget(queueTarget), cancelId: cancel.id});
    const aiRequestStartedAt = new Date().toISOString();
+    recordAiRequestStart();
    await AiRequestStore.put({
-        requestId: cancel.id,
+        requestId,
        chatId: options.msg.chat.id,
        messageId: options.msg.message_id,
        fromId: options.msg.from?.id ?? 0,
@@ -197,7 +208,7 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
        const queueMessage = await streamMessage.start(Environment.waitThinkText);
        responseMessageId = queueMessage.message_id;
        await AiRequestStore.put({
-            requestId: cancel.id,
+            requestId,
            chatId: options.msg.chat.id,
            messageId: options.msg.message_id,
            responseMessageId,
@@ -207,8 +218,9 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
            status: "running",
            startedAt: aiRequestStartedAt,
        }).catch(logError);
-        setAiCancelMessageId(cancel.id, queueMessage.message_id);
+        setAiCancelMessageId(requestId, queueMessage.message_id);
        aiLog("info", "run.queue.enter", {
+            requestId,
            cancelId: cancel.id,
            queueMessageId: queueMessage.message_id,
            target: aiLogProviderTarget(queueTarget),
@@ -217,15 +229,16 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
        await aiProviderRequestQueue.enqueue(queueTarget, {
            signal: controller.signal,
            onPositionChange: async requestsBefore => {
-                aiLog("debug", "run.queue.position", {cancelId: cancel.id, requestsBefore});
+                aiLog("debug", "run.queue.position", {requestId, cancelId: cancel.id, requestsBefore});
                streamMessage.setStatus(Environment.getAiQueueText(options.provider, requestsBefore));
                await streamMessage.flush();
            },
            run: async (): Promise<null> => {
                const queueWaitFinishedAt = Date.now();
-                aiLog("info", "run.queue.dequeued", {cancelId: cancel.id});
+                aiLog("info", "run.queue.dequeued", {requestId, cancelId: cancel.id});
                const downloads = attachmentsToDownloadedFiles(cached.attachments);
                aiLog("debug", "run.downloads.ready", {
+                    requestId,
                    count: downloads.length,
                    downloads: downloads.map(d => ({
                        kind: d.kind,
@@ -239,12 +252,13 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
                    await executeUnifiedAiRequest(options, config, downloads, controller, streamMessage);
                    aiRequestStatus = "succeeded";
                    aiLog("success", "run.queue.task.done", {
+                        requestId,
                        cancelId: cancel.id,
                        duration: aiLogDuration(queueWaitFinishedAt),
                    });
                } finally {
                    cleanupDownloads(downloads);
-                    aiLog("debug", "run.downloads.cleaned", {cancelId: cancel.id, count: downloads.length});
+                    aiLog("debug", "run.downloads.cleaned", {requestId, cancelId: cancel.id, count: downloads.length});
                }
                return null;
            },
@@ -253,13 +267,13 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
        if (controller.signal.aborted || isAbortError(e instanceof Error ? e : String(e))) {
            aiRequestStatus = "aborted";
            aiRequestError = e instanceof Error ? e.message : String(e);
-            aiLog("warn", "run.aborted", {cancelId: cancel.id, duration: aiLogDuration(startedAt), error: e instanceof Error ? e : String(e)});
+            aiLog("warn", "run.aborted", {requestId, cancelId: cancel.id, duration: aiLogDuration(startedAt), error: e instanceof Error ? e : String(e)});
            streamMessage.replaceText(streamMessage.getText());
            await streamMessage.finish();
        } else {
            aiRequestStatus = "failed";
            aiRequestError = e instanceof Error ? e.message : String(e);
-            aiLog("error", "run.failed", {cancelId: cancel.id, duration: aiLogDuration(startedAt), error: e instanceof Error ? e : String(e)});
+            aiLog("error", "run.failed", {requestId, cancelId: cancel.id, duration: aiLogDuration(startedAt), error: e instanceof Error ? e : String(e)});
            const errorMessage = e instanceof Error ? e.message : String(e);
            await streamMessage.fail(e instanceof Error ? e : String(e));
            try {
@@ -279,7 +293,7 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
    } finally {
        clearTimeout(timeout);
        await AiRequestStore.put({
-            requestId: cancel.id,
+            requestId,
            chatId: options.msg.chat.id,
            messageId: options.msg.message_id,
            responseMessageId,
@@ -291,8 +305,10 @@ export async function runUnifiedAi(options: UnifiedRunOptions): Promise<void> {
            finishedAt: new Date().toISOString(),
            error: aiRequestError,
        }).catch(logError);
-        finishAiRequest(cancel.id);
+        recordAiRequestFinish(aiRequestStatus);
+        finishAiRequest(requestId);
        aiLog("success", "run.finished", {
+            requestId,
            cancelId: cancel.id,
            provider: providerName(options.provider),
            duration: aiLogDuration(startedAt),
@@ -0,0 +1,12 @@
+import type {PipelineFallbackDecision} from "./fallback-executor.js";
+
+export class PipelineRequestFailure extends Error {
+    constructor(public readonly decision: PipelineFallbackDecision, message: string) {
+        super(message);
+        this.name = "PipelineRequestFailure";
+    }
+}
+
+export function raisePipelineRequestFailure(decision: PipelineFallbackDecision, stageName: string): never {
+    throw new PipelineRequestFailure(decision, `Pipeline send failed at stage ${stageName} with fallback action ${decision.action}`);
+}
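A minimal sketch of how a caller might tell this typed failure apart from an ordinary error. `Decision`, `RequestFailure`, `raiseRequestFailure`, and `describeFailure` are illustrative stand-ins, not names from the diff. The real `PipelineFallbackDecision` likely carries more fields than the two assumed here. The sketch also assumes an ES2015+ compile target, where `instanceof` works for `Error` subclasses.

```typescript
// Reduced stand-in for PipelineFallbackDecision (assumed fields only).
type Decision = {stage: string; action: string};

class RequestFailure extends Error {
    readonly decision: Decision;

    constructor(decision: Decision, message: string) {
        super(message);
        this.name = "RequestFailure";
        this.decision = decision;
    }
}

// Mirrors the helper in the diff: always throws, so the return type is `never`.
function raiseRequestFailure(decision: Decision, stageName: string): never {
    throw new RequestFailure(decision, `Pipeline failed at stage ${stageName} with fallback action ${decision.action}`);
}

// Hypothetical caller-side helper: branches on the error class to reach the decision.
function describeFailure(error: unknown): string {
    if (error instanceof RequestFailure) {
        return `${error.name}: stage=${error.decision.stage} action=${error.decision.action}`;
    }
    return error instanceof Error ? error.message : String(error);
}

let summary = "";
try {
    raiseRequestFailure({stage: "document_rag", action: "fail_request"}, "document_rag");
} catch (error) {
    summary = describeFailure(error);
}

const plainSummary = describeFailure(new Error("boom"));
console.log(summary);
console.log(plainSummary);
```

Note that the thrown message contains only the stage name and the fallback action. If the original stage error should stay reachable from the thrown object, the standard `cause` option of `Error` (ES2022) is one way to carry it; the diff does not do this.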
@@ -0,0 +1,16 @@
+import type {PipelineFallbackDecision} from "./fallback-executor.js";
+
+export function fallbackNotificationKey(requestId: string, decision: PipelineFallbackDecision): string {
+    return `${requestId}:${decision.stage}:${decision.action}`;
+}
+
+export class PipelineFallbackNotificationRegistry {
+    private readonly notifiedKeys = new Set<string>();
+
+    claim(requestId: string, decision: PipelineFallbackDecision): boolean {
+        const key = fallbackNotificationKey(requestId, decision);
+        if (this.notifiedKeys.has(key)) return false;
+        this.notifiedKeys.add(key);
+        return true;
+    }
+}
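A short usage sketch of the claim-once registry above. `Decision` is a reduced stand-in for `PipelineFallbackDecision` (only `stage` and `action` are assumed), and the request ids are made up for illustration.

```typescript
// Reduced stand-in for PipelineFallbackDecision; only the fields used by the key are assumed.
type Decision = {stage: string; action: string};

function notificationKey(requestId: string, decision: Decision): string {
    return `${requestId}:${decision.stage}:${decision.action}`;
}

class NotificationRegistry {
    private readonly notifiedKeys = new Set<string>();

    // Returns true only for the first caller with a given request/stage/action triple.
    claim(requestId: string, decision: Decision): boolean {
        const key = notificationKey(requestId, decision);
        if (this.notifiedKeys.has(key)) return false;
        this.notifiedKeys.add(key);
        return true;
    }
}

const registry = new NotificationRegistry();
const ragFailure: Decision = {stage: "document_rag", action: "notify_user"};

const firstClaim = registry.claim("req-1", ragFailure);        // first notification for req-1
const repeatClaim = registry.claim("req-1", ragFailure);       // same triple, already claimed
const otherRequestClaim = registry.claim("req-2", ragFailure); // different request id

console.log(firstClaim, repeatClaim, otherRequestClaim); // prints: true false true
```

Keys are never removed, so the set grows for the lifetime of the registry instance. In the diff, each `PipelineFallbackNotifier` creates its own registry, so deduplication is scoped to that notifier.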
@@ -0,0 +1,26 @@
+import {Localization} from "../../common/localization.js";
+import type {PipelineFallbackAction, PipelineStageName} from "./types.js";
+
+export function resolvePipelineFallbackText(
+    stage: PipelineStageName,
+    action: PipelineFallbackAction,
+    locale?: string,
+): string | undefined {
+    if (action === "continue_without_stage") return undefined;
+    if (action === "fail_request") return Localization.text("pipelineFallback.failRequest", {}, "⚠️ I could not finish this request.", locale);
+
+    switch (stage) {
+        case "speech_to_text":
+            return Localization.text("pipelineFallback.speechToText", {}, "⚠️ Speech transcription failed, so I will continue without the audio transcript.", locale);
+        case "document_rag":
+            return Localization.text("pipelineFallback.documentRag", {}, "⚠️ Document retrieval failed, so I will answer without RAG.", locale);
+        case "tool_loop":
+            return Localization.text("pipelineFallback.toolLoop", {}, "⚠️ Tool execution failed, so I will continue without that tool.", locale);
+        case "text_to_speech":
+            return Localization.text("pipelineFallback.textToSpeech", {}, "⚠️ Text-to-speech failed, so I will continue without audio output.", locale);
+        default:
+            return action === "notify_user"
+                ? Localization.text("pipelineFallback.notifyUser", {}, "⚠️ I hit a problem and need to continue with a fallback.", locale)
+                : Localization.text("pipelineFallback.generic", {}, "⚠️ I had to skip part of the request, but I can continue.", locale);
+    }
+}
@@ -0,0 +1,43 @@
+import type {Message} from "typescript-telegram-bot-api";
+import {Localization} from "../../common/localization.js";
+import {replyToMessage, logError} from "../../util/utils.js";
+import type {PipelineFallbackDecision} from "./fallback-executor.js";
+import {PipelineFallbackNotificationRegistry} from "./fallback-notifier-registry.js";
+import {resolvePipelineFallbackText} from "./fallback-notifier-text.js";
+
+export class PipelineFallbackNotifier {
+    private readonly registry = new PipelineFallbackNotificationRegistry();
+
+    constructor(
+        private readonly sourceMessage: Message,
+        private readonly responseLanguage?: string,
+        private readonly sendFallbackMessage: (text: string) => Promise<void> = async text => {
+            await replyToMessage({
+                message: this.sourceMessage,
+                text,
+            });
+        },
+    ) {}
+
+    async notify(requestId: string, decision: PipelineFallbackDecision): Promise<{notified: boolean; text?: string}> {
+        if (!this.registry.claim(requestId, decision)) {
+            return {notified: false};
+        }
+
+        const locale = this.responseLanguage === "default"
+            ? Localization.currentLocale()
+            : Localization.normalizeLocale(this.responseLanguage) ?? Localization.currentLocale();
+        const text = resolvePipelineFallbackText(decision.stage, decision.action, locale);
+        if (!text) {
+            return {notified: false};
+        }
+
+        try {
+            await this.sendFallbackMessage(text);
+            return {notified: true, text};
+        } catch (error) {
+            logError(error instanceof Error ? error : String(error));
+            return {notified: false, text};
+        }
+    }
+}
@@ -0,0 +1,15 @@
+import {AiProvider} from "../../model/ai-provider.js";
+import type {RuntimeConfigSnapshot} from "../unified-ai-runner.shared.js";
+import {aiLogProviderTarget} from "../../logging/ai-logger.js";
+import {buildRankerTarget} from "../tool-ranker-pipeline.js";
+import {providerChatTarget} from "../unified-ai-runner.shared.js";
+
+export function buildToolRankFallbackTargetDetails(provider: AiProvider, config: RuntimeConfigSnapshot) {
+    const sourceTarget = buildRankerTarget(config, provider);
+    const alternateTarget = providerChatTarget(provider, config);
+
+    return {
+        sourceTarget: aiLogProviderTarget(sourceTarget),
+        alternateTarget: aiLogProviderTarget(alternateTarget),
+    };
+}
@@ -1,5 +1,6 @@
 import {DEFAULT_PIPELINE_FALLBACK_POLICIES, USER_REQUEST_PIPELINE_STAGES} from "./blueprint.js";
 import {decidePipelineFallback, type PipelineFallbackDecision} from "./fallback-executor.js";
+import {raisePipelineRequestFailure} from "./fallback-failure.js";
 import type {
    PipelineAuditEvent,
    PipelineFallbackPolicy,
@@ -66,7 +67,7 @@ export class UserRequestPipeline {
                    },
                }));
                if (decision.shouldFailRequest) {
-                    throw new Error(`Required pipeline stage is not registered: ${stageName}`);
+                    raisePipelineRequestFailure(decision, stageName);
                }
                continue;
            }
@@ -112,7 +113,7 @@ export class UserRequestPipeline {
                    error: error instanceof Error ? error.message : String(error),
                }));
                if (decision.shouldFailRequest) {
-                    throw error;
+                    raisePipelineRequestFailure(decision, stageName);
                }
            }
        }
@@ -0,0 +1,33 @@
+import {Message} from "typescript-telegram-bot-api";
+import {Command} from "../base/command.js";
+import {Requirements} from "../base/requirements.js";
+import {Requirement} from "../base/requirement.js";
+import {Environment} from "../common/environment.js";
+import {buildAiAuditReport, replyWithTrimmedText, resolveAuditTarget} from "./ai-observability.js";
+import {logError, sendErrorPlaceholder} from "../util/utils.js";
+
+export class AIAudit extends Command {
+    command = ["aiaudit", "audit"];
+    argsMode = "optional" as const;
+
+    requirements = Requirements.Build(Requirement.BOT_ADMIN);
+
+    title = Environment.commandTitles.aiAudit;
+    description = Environment.commandDescriptions.aiAudit;
+
+    async execute(msg: Message, match?: RegExpExecArray | null): Promise<void> {
+        try {
+            const target = resolveAuditTarget(msg, match?.[3] ?? null);
+            if (!target) {
+                await replyWithTrimmedText(msg, "Usage: reply to a message, or pass a messageId, or pass a chatId followed by a messageId.");
+                return;
+            }
+
+            const text = await buildAiAuditReport(target);
+            await replyWithTrimmedText(msg, text);
+        } catch (error) {
+            logError(error instanceof Error ? error : String(error));
+            await sendErrorPlaceholder(msg).catch(logError);
+        }
+    }
+}
@@ -0,0 +1,27 @@
+import {Message} from "typescript-telegram-bot-api";
+import {Command} from "../base/command.js";
+import {Requirements} from "../base/requirements.js";
+import {Requirement} from "../base/requirement.js";
+import {Environment} from "../common/environment.js";
+import {buildAiMetricsReport, replyWithTrimmedText} from "./ai-observability.js";
+import {logError, sendErrorPlaceholder} from "../util/utils.js";
+
+export class AIMetrics extends Command {
+    command = ["aimetrics", "metrics"];
+    argsMode = "none" as const;
+
+    requirements = Requirements.Build(Requirement.BOT_ADMIN);
+
+    title = Environment.commandTitles.aiMetrics;
+    description = Environment.commandDescriptions.aiMetrics;
+
+    async execute(msg: Message): Promise<void> {
+        try {
+            const text = await buildAiMetricsReport();
+            await replyWithTrimmedText(msg, text);
+        } catch (error) {
+            logError(error instanceof Error ? error : String(error));
+            await sendErrorPlaceholder(msg).catch(logError);
+        }
+    }
+}
@@ -0,0 +1,155 @@
+import {Message} from "typescript-telegram-bot-api";
+import {DatabaseManager} from "../db/database-manager.js";
+import type {AttachmentDbRow} from "../db/db-types.js";
+import {replyToMessage} from "../util/utils.js";
+import {snapshotAiObservability} from "../common/ai-observability.js";
+
+export type AuditTarget = {
+    chatId: number;
+    messageId: number;
+};
+
+export function resolveAuditTarget(msg: Message, argsText?: string | null): AuditTarget | null {
+    if (msg.reply_to_message) {
+        return {
+            chatId: msg.chat.id,
+            messageId: msg.reply_to_message.message_id,
+        };
+    }
+
+    const args = argsText?.trim().split(/\s+/).filter(Boolean) ?? [];
+    if (!args.length) return null;
+
+    if (args.length === 1) {
+        const messageId = Number(args[0]);
+        if (!Number.isFinite(messageId)) return null;
+        return {
+            chatId: msg.chat.id,
+            messageId,
+        };
+    }
+
+    const chatId = Number(args[0]);
+    const messageId = Number(args[1]);
+    if (!Number.isFinite(chatId) || !Number.isFinite(messageId)) return null;
+
+    return {chatId, messageId};
+}
+
+function formatSize(bytes: number | null | undefined): string {
+    if (!Number.isFinite(bytes ?? NaN)) return "n/a";
+    const value = Number(bytes);
+    if (value >= 1024 * 1024) return `${(value / (1024 * 1024)).toFixed(1)} MB`;
+    if (value >= 1024) return `${(value / 1024).toFixed(1)} KB`;
+    return `${value} B`;
+}
+
+function clip(value: string | null | undefined, max = 120): string {
+    const text = (value ?? "").trim();
+    if (!text) return "n/a";
+    return text.length <= max ? text : `${text.slice(0, max)}…`;
+}
+
+function formatAttachmentLine(index: number, attachment: AttachmentDbRow): string {
+    return [
+        `${index + 1}.`,
+        attachment.direction,
+        attachment.kind,
+        attachment.fileName,
+        `size=${formatSize(attachment.sizeBytes)}`,
+        attachment.artifactKind ? `artifact=${attachment.artifactKind}` : null,
+    ].filter(Boolean).join(" ");
+}
+
+export async function buildAiAuditReport(target: AuditTarget): Promise<string> {
+    const [request, audits, artifacts, attachments] = await Promise.all([
+        DatabaseManager.getAiRequestByMessage(target.chatId, target.messageId),
+        DatabaseManager.getRequestAuditsByMessage(target.chatId, target.messageId),
+        DatabaseManager.getArtifactsByMessage(target.chatId, target.messageId),
+        DatabaseManager.getAttachmentsByMessage(target.chatId, target.messageId),
+    ]);
+
+    const lines: string[] = [
+        "AI observability audit",
+        `chatId: ${target.chatId}`,
+        `messageId: ${target.messageId}`,
+        "",
+        "AI request:",
+    ];
+
+    if (request) {
+        lines.push(
+            `  requestId: ${request.requestId}`,
+            `  provider: ${request.provider}`,
+            `  model: ${request.model}`,
+            `  status: ${request.status}`,
+            `  startedAt: ${request.startedAt}`,
+            `  finishedAt: ${request.finishedAt ?? "n/a"}`,
+            `  error: ${clip(request.error, 240)}`,
+        );
+    } else {
+        lines.push("  not found");
+    }
+
+    lines.push("", `Pipeline audits: ${audits.length}`);
+    audits.slice(0, 12).forEach((audit, index) => {
+        lines.push(
+            `  ${index + 1}. ${audit.stage} ${audit.status}` +
+            `${audit.durationMs !== null ? ` ${audit.durationMs}ms` : ""}` +
+            `${audit.provider ? ` provider=${audit.provider}` : ""}` +
+            `${audit.model ? ` model=${audit.model}` : ""}` +
+            `${audit.error ? ` error=${clip(audit.error, 120)}` : ""}`,
+        );
+    });
+    if (audits.length > 12) {
+        lines.push(`  … and ${audits.length - 12} more`);
+    }
+
+    lines.push("", `Artifacts: ${artifacts.length}`);
+    artifacts.slice(0, 12).forEach((artifact, index) => {
+        lines.push(
+            `  ${index + 1}. ${artifact.kind} stage=${artifact.stage}` +
+            `${artifact.attachmentId ? ` attachmentId=${artifact.attachmentId}` : ""}` +
+            `${artifact.createdAt ? ` createdAt=${artifact.createdAt}` : ""}`,
+        );
+    });
+    if (artifacts.length > 12) {
+        lines.push(`  … and ${artifacts.length - 12} more`);
+    }
+
+    lines.push("", `Attachments: ${attachments.length}`);
+    attachments.slice(0, 12).forEach((attachment, index) => {
+        lines.push(`  ${formatAttachmentLine(index, attachment)}`);
+    });
+    if (attachments.length > 12) {
+        lines.push(`  … and ${attachments.length - 12} more`);
+    }
+
+    return lines.join("\n");
+}
+
+export async function buildAiMetricsReport(): Promise<string> {
+    const snapshot = snapshotAiObservability();
+    const [aiRequests, attachments, artifacts, requestAudits] = await Promise.all([
+        DatabaseManager.getAllAiRequests(),
+        DatabaseManager.getAllAttachments(),
+        DatabaseManager.getAllArtifacts(),
+        DatabaseManager.getAllRequestAudits(),
+    ]);
+
+    return [
+        "AI observability metrics",
+        `requests: total=${snapshot.requests.total} succeeded=${snapshot.requests.succeeded} failed=${snapshot.requests.failed} aborted=${snapshot.requests.aborted}`,
+        `fallbacks: total=${snapshot.fallbacks.total} ignore=${snapshot.fallbacks.ignore} notify_user=${snapshot.fallbacks.notifyUser} continue_without_stage=${snapshot.fallbacks.continueWithoutStage} use_alternate_target=${snapshot.fallbacks.useAlternateTarget} fail_request=${snapshot.fallbacks.failRequest}`,
+        `tool calls: ${snapshot.toolCalls}`,
+        `RAG runs: ${snapshot.ragRuns}`,
+        `TTS runs: total=${snapshot.ttsRuns.total} succeeded=${snapshot.ttsRuns.succeeded} failed=${snapshot.ttsRuns.failed} skipped=${snapshot.ttsRuns.skipped}`,
+        `db rows: ai_requests=${aiRequests.length} attachments=${attachments.length} artifacts=${artifacts.length} request_audit=${requestAudits.length}`,
+    ].join("\n");
+}
+
+export async function replyWithTrimmedText(msg: Message, text: string): Promise<void> {
+    const maxLength = 3800;
+    const nextText = text.length <= maxLength ? text : `${text.slice(0, maxLength)}\n… (trimmed)`;
+    await replyToMessage({message: msg, text: nextText});
+}
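The argument handling in `resolveAuditTarget` can be exercised in isolation. The sketch below copies only the text-argument branch and leaves out the `reply_to_message` shortcut. `parseAuditArgs` and the sample ids are hypothetical names and values chosen for illustration.

```typescript
type AuditTarget = {chatId: number; messageId: number};

// Hypothetical helper: the same parsing rules as the text-argument branch above.
function parseAuditArgs(currentChatId: number, argsText?: string | null): AuditTarget | null {
    const args = argsText?.trim().split(/\s+/).filter(Boolean) ?? [];
    if (args.length === 0) return null;

    if (args.length === 1) {
        // One argument: a message id in the current chat.
        const messageId = Number(args[0]);
        return Number.isFinite(messageId) ? {chatId: currentChatId, messageId} : null;
    }

    // Two or more arguments: the first two are chat id and message id.
    const chatId = Number(args[0]);
    const messageId = Number(args[1]);
    return Number.isFinite(chatId) && Number.isFinite(messageId) ? {chatId, messageId} : null;
}

const sameChat = parseAuditArgs(100, "42");
const otherChat = parseAuditArgs(100, "  -100123   42 ");
const notANumber = parseAuditArgs(100, "abc");
const emptyArgs = parseAuditArgs(100, "   ");
const fractional = parseAuditArgs(100, "1.5");

console.log(sameChat, otherChat, notANumber, emptyArgs, fractional);
```

`Number.isFinite` accepts fractional input such as `1.5`, which is not a valid message id. `Number.isInteger` would be a stricter check if that matters for the command.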
@@ -0,0 +1,51 @@
+import {Message} from "typescript-telegram-bot-api";
+import {Command} from "../base/command.js";
+import {Requirements} from "../base/requirements.js";
+import {Requirement} from "../base/requirement.js";
+import {Environment} from "../common/environment.js";
+import {DatabaseManager} from "../db/database-manager.js";
+import {logError, sendErrorPlaceholder} from "../util/utils.js";
+import {replyWithTrimmedText} from "./ai-observability.js";
+
+function formatRequestLine(index: number, request: Awaited<ReturnType<typeof DatabaseManager.getAllAiRequests>>[number]): string {
+    return [
+        `${index + 1}.`,
+        `requestId=${request.requestId}`,
+        `chatId=${request.chatId}`,
+        `messageId=${request.messageId}`,
+        request.responseMessageId ? `responseMessageId=${request.responseMessageId}` : null,
+        `provider=${request.provider}`,
+        `model=${request.model}`,
+        `status=${request.status}`,
+        `startedAt=${request.startedAt}`,
+        request.finishedAt ? `finishedAt=${request.finishedAt}` : null,
+        request.error ? `error=${request.error}` : null,
+    ].filter(Boolean).join(" ");
+}
+
+export class AIRequests extends Command {
+    command = ["airequests"];
+    argsMode = "none" as const;
+
+    requirements = Requirements.Build(Requirement.BOT_ADMIN);
+
+    title = Environment.commandTitles.aiRequests;
+    description = Environment.commandDescriptions.aiRequests;
+
+    async execute(msg: Message): Promise<void> {
+        try {
+            const requests = (await DatabaseManager.getAllAiRequests()).slice(-10).reverse();
+            const lines = [
+                "Recent AI requests",
+                `count: ${requests.length}`,
+                "",
+                ...requests.map((request, index) => formatRequestLine(index, request)),
+            ];
+
+            await replyWithTrimmedText(msg, lines.join("\n"));
+        } catch (error) {
+            logError(error instanceof Error ? error : String(error));
+            await sendErrorPlaceholder(msg).catch(logError);
+        }
+    }
+}
@@ -0,0 +1,123 @@
+import type {PipelineFallbackAction} from "../ai/user-request-pipeline";
+import type {StoredAiRequestStatus} from "../model/stored-ai-request.js";
+
+type CounterSnapshot = {
+    total: number;
+    succeeded: number;
+    failed: number;
+    aborted: number;
+};
+
+export type AiObservabilitySnapshot = {
+    requests: CounterSnapshot;
+    fallbacks: {
+        total: number;
+        ignore: number;
+        notifyUser: number;
+        continueWithoutStage: number;
+        useAlternateTarget: number;
+        failRequest: number;
+    };
+    toolCalls: number;
+    ragRuns: number;
+    ttsRuns: {
+        total: number;
+        succeeded: number;
+        failed: number;
+        skipped: number;
+    };
+};
+
+const requestCounters = {
+    total: 0,
+    succeeded: 0,
+    failed: 0,
+    aborted: 0,
+};
+
+const fallbackCounters = {
+    total: 0,
+    ignore: 0,
+    notifyUser: 0,
+    continueWithoutStage: 0,
+    useAlternateTarget: 0,
+    failRequest: 0,
+};
+
+const ttsCounters = {
+    total: 0,
+    succeeded: 0,
+    failed: 0,
+    skipped: 0,
+};
+
+let toolCalls = 0;
+let ragRuns = 0;
+
+function incrementFallback(action: PipelineFallbackAction): void {
+    fallbackCounters.total += 1;
+    switch (action) {
+        case "ignore":
+            fallbackCounters.ignore += 1;
+            break;
+        case "notify_user":
+            fallbackCounters.notifyUser += 1;
+            break;
+        case "continue_without_stage":
+            fallbackCounters.continueWithoutStage += 1;
+            break;
+        case "use_alternate_target":
+            fallbackCounters.useAlternateTarget += 1;
+            break;
+        case "fail_request":
+            fallbackCounters.failRequest += 1;
+            break;
+    }
+}
+
+export function recordAiRequestStart(): void {
+    requestCounters.total += 1;
+}
+
+export function recordAiRequestFinish(status: StoredAiRequestStatus): void {
+    switch (status) {
+        case "succeeded":
+            requestCounters.succeeded += 1;
+            break;
+        case "failed":
+            requestCounters.failed += 1;
+            break;
+        case "aborted":
+            requestCounters.aborted += 1;
+            break;
+        case "running":
+            break;
+    }
+}
+
+export function recordPipelineFallback(action: PipelineFallbackAction): void {
+    incrementFallback(action);
+}
+
+export function recordToolCall(): void {
+    toolCalls += 1;
+}
+
+export function recordRagRun(): void {
+    ragRuns += 1;
+}
+
+export function recordTtsRun(status: "succeeded" | "failed" | "skipped"): void {
+    ttsCounters.total += 1;
+    ttsCounters[status] += 1;
+}
+
+export function snapshotAiObservability(): AiObservabilitySnapshot {
+    return {
+        requests: {...requestCounters},
+        fallbacks: {...fallbackCounters},
+        toolCalls,
+        ragRuns,
+        ttsRuns: {...ttsCounters},
+    };
+}
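A small sketch of the counter-and-snapshot pattern used above, reduced to the TTS counters. `snapshotTts` is a hypothetical name. The point is that the object spread returns a shallow copy, so a snapshot does not change when the counters are incremented later.

```typescript
type TtsStatus = "succeeded" | "failed" | "skipped";

// Module-level counters, as in the diff; they live only in process memory.
const ttsCounters = {total: 0, succeeded: 0, failed: 0, skipped: 0};

function recordTtsRun(status: TtsStatus): void {
    ttsCounters.total += 1;
    ttsCounters[status] += 1;
}

// Hypothetical helper: returns a copy so callers cannot mutate the live counters.
function snapshotTts(): typeof ttsCounters {
    return {...ttsCounters};
}

recordTtsRun("succeeded");
recordTtsRun("skipped");
const before = snapshotTts();

recordTtsRun("failed");
const after = snapshotTts();

console.log(before.total, after.total, after.failed); // prints: 2 3 1
```

Because the counters are plain module variables, they reset to zero on every process restart. The `db rows` line in `buildAiMetricsReport` comes from the database instead, so the two sets of numbers can diverge after a restart.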
@@ -0,0 +1,9 @@
+import type {StoredAttachment} from "../model/stored-attachment";
+
+export function filterUserVisibleStoredAttachments(attachments: StoredAttachment[]): StoredAttachment[] {
+    return attachments.filter(attachment => attachment.scope !== "internal_artifact");
+}
+
+export function filterUserInputStoredAttachments(attachments: StoredAttachment[]): StoredAttachment[] {
+    return attachments.filter(attachment => attachment.scope === "user_input" || attachment.scope === undefined);
+}
@@ -3,18 +3,21 @@ import os from "node:os";
 import path from "node:path";
 import {parse as parseDotEnv} from "dotenv";
 import {z} from "zod";
-import {appLogger} from "../logging/logger";
+import {appLogger} from "../logging/logger.js";
 import type {BoundaryValue, ErrorLike} from "./boundary-types";

-import {saveData} from "../db/database";
-import {Answers} from "../model/answers";
-import {ifTrue} from "../util/utils";
-import {AiProvider} from "../model/ai-provider";
-import {ImageHandleFallbackPolicy, ImageHandlePolicy, RateLimitFallbackPolicy} from "./policies";
-import {ToolRankerFallbackPolicy} from "./policies";
-import type {ToolCallData} from "../ai/unified-ai-runner";
-import {PYTHON_INTERPRETER_TOOL_NAME} from "../ai/tools/python-interpretator";
-import {Localization, type LocalizationParams} from "./localization";
+import {Answers} from "../model/answers.js";
+import {AiProvider} from "../model/ai-provider.js";
+import {ImageHandleFallbackPolicy, ImageHandlePolicy, RateLimitFallbackPolicy} from "./policies.js";
+import {ToolRankerFallbackPolicy} from "./policies.js";
+import type {ToolCallData} from "../ai/unified-ai-runner.js";
+import {PYTHON_INTERPRETER_TOOL_NAME} from "../ai/tools/python-interpretator.js";
+import {Localization, type LocalizationParams} from "./localization.js";
+
+function parseBooleanLike(value: string): boolean {
+    const normalized = value.trim().toLowerCase();
+    return ["true", "t", "y", "1"].includes(normalized);
+}

 type EnvRecord = Record<string, string>;
 type StringEnumLike = Record<string, string>;
@@ -53,7 +56,7 @@ function booleanWithDefaultSchema(defaultValue: boolean) {
                return defaultValue;
            }

-            return ifTrue(normalized);
+            return parseBooleanLike(normalized);
        }, z.boolean())
        .default(defaultValue)
        .catch(defaultValue);
@@ -62,7 +65,7 @@ function booleanWithDefaultSchema(defaultValue: boolean) {
 const optionalBooleanSchema = z
    .preprocess(value => {
        const normalized = normalizeString(value as BoundaryValue);
-        return normalized === undefined ? undefined : ifTrue(normalized);
+        return normalized === undefined ? undefined : parseBooleanLike(normalized);
    }, z.boolean().optional())
    .optional()
    .catch(undefined);
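A quick check of which inputs `parseBooleanLike` treats as true. The function body is copied from the hunk above and the sample inputs are illustrative. Whether the removed `ifTrue` helper accepted the same set of values is not visible in this diff.

```typescript
function parseBooleanLike(value: string): boolean {
    const normalized = value.trim().toLowerCase();
    return ["true", "t", "y", "1"].includes(normalized);
}

// Case and surrounding whitespace are ignored.
const upperTrue = parseBooleanLike(" TRUE ");
const shortYes = parseBooleanLike("y");
// Anything outside the four accepted tokens is false, including "yes" and "on".
const longYes = parseBooleanLike("yes");
const on = parseBooleanLike("on");
const zero = parseBooleanLike("0");

console.log(upperTrue, shortYes, longYes, on, zero); // prints: true true false false false
```

Unrecognized values silently become `false` rather than raising an error, so a typo in an environment variable reads as disabled.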
@@ -820,6 +823,34 @@ export class Environment {
        return this.text("noTextToSynthesizeText", "No text to synthesize.");
    }

+    static get pipelineFallbackGenericText() {
+        return this.text("pipelineFallbackGenericText", "⚠️ I had to skip part of the request, but I can continue.");
+    }
+
+    static get pipelineFallbackNotifyText() {
+        return this.text("pipelineFallbackNotifyText", "⚠️ I hit a problem and need to continue with a fallback.");
+    }
+
+    static get pipelineFallbackFailText() {
+        return this.text("pipelineFallbackFailText", "⚠️ I could not finish this request.");
+    }
+
+    static get pipelineFallbackRagText() {
+        return this.text("pipelineFallbackRagText", "⚠️ Document retrieval failed, so I will answer without RAG.");
+    }
+
+    static get pipelineFallbackSpeechToTextText() {
+        return this.text("pipelineFallbackSpeechToTextText", "⚠️ Speech transcription failed, so I will continue without the audio transcript.");
+    }
+
+    static get pipelineFallbackTextToSpeechText() {
+        return this.text("pipelineFallbackTextToSpeechText", "⚠️ Text-to-speech failed, so I will continue without audio output.");
+    }
+
+    static get pipelineFallbackToolText() {
+        return this.text("pipelineFallbackToolText", "⚠️ Tool execution failed, so I will continue without that tool.");
+    }
+
    static get mistralTtsNoAudioDataText() {
        return this.text("mistralTtsNoAudioDataText", "Mistral TTS did not return audioData.");
    }
@@ -960,6 +991,9 @@ export class Environment {
        choice: "/choice a, b, ..., c",
        coin: "/coin",
        debug: "/debug",
+        aiRequests: "/aiRequests",
+        aiAudit: "/aiAudit [reply|messageId|chatId messageId]",
+        aiMetrics: "/aiMetrics",
        dice: "/dice",
        distort: "/distort [amp] [wavelength]",
        help: "/help",
@@ -1010,6 +1044,9 @@ export class Environment {
            choice: this.text("commandDescriptions.choice", "Choose a random value"),
            coin: this.text("commandDescriptions.coin", "Heads or tails"),
            debug: this.text("commandDescriptions.debug", "Returns msg (or reply) as json"),
+            aiRequests: this.text("commandDescriptions.aiRequests", "Show recent AI requests"),
+            aiAudit: this.text("commandDescriptions.aiAudit", "Inspect AI request audit and artifacts"),
+            aiMetrics: this.text("commandDescriptions.aiMetrics", "Show AI observability counters"),
            dice: this.text("commandDescriptions.dice", "Sends random or specific dice"),
            distort: this.text("commandDescriptions.distort", "Distortion of picture"),
            help: this.text("commandDescriptions.help", "Show list of commands"),
@@ -1939,6 +1976,7 @@ export class Environment {

        if (!has) {
            this.ADMIN_IDS.add(id);
+            const {saveData} = await import("../db/database.js");
            await saveData();
        }

@@ -1950,6 +1988,7 @@ export class Environment {

        if (has) {
            this.ADMIN_IDS.delete(id);
+            const {saveData} = await import("../db/database.js");
            await saveData();
        }

@@ -1966,6 +2005,7 @@ export class Environment {
        }

        this.MUTED_IDS.add(id);
+        const {saveData} = await import("../db/database.js");
        await saveData();
        return true;
    }
@@ -1976,6 +2016,7 @@ export class Environment {
        }

        this.MUTED_IDS.delete(id);
+        const {saveData} = await import("../db/database.js");
        await saveData();
        return true;
    }
@@ -1,7 +1,7 @@
 import {AsyncLocalStorage} from "node:async_hooks";
 import fs from "node:fs";
 import path from "node:path";
-import {appLogger} from "../logging/logger";
+import {appLogger} from "../logging/logger.js";

 const logger = appLogger.child("localization");

@@ -20,6 +20,9 @@ export type MessagePart = {
    audios?: string[];
    audioParts?: MessageAudioPart[];
    documents?: string[];
+    documentNames?: string[];
    videos?: string[];
    videoNotes?: string[];
+    videoNames?: string[];
+    videoNoteNames?: string[];
 }
@@ -1,6 +1,7 @@
 import path from "node:path";
 import {Environment} from "./environment";
 import {StoredAttachment} from "../model/stored-attachment";
+export {filterUserVisibleStoredAttachments} from "./attachment-visibility";

 export function photoCachePathForUniqueId(uniqueId: string): string {
    return path.join(Environment.DATA_PATH, "cache", "photo", `${uniqueId}.jpg`);
@@ -44,7 +45,3 @@ export function uniqueStoredAttachments(attachments: StoredAttachment[]): Stored

    return result;
 }
-
-export function filterUserVisibleStoredAttachments(attachments: StoredAttachment[]): StoredAttachment[] {
-    return attachments.filter(attachment => attachment.scope !== "internal_artifact");
-}
@@ -1,9 +1,9 @@
 import * as fs from "fs";
-import {Environment} from "../common/environment";
-import {logError} from "../util/utils";
-import {Answers} from "../model/answers";
+import {Environment} from "../common/environment.js";
+import {logError} from "../util/utils.js";
+import {Answers} from "../model/answers.js";
 import path from "node:path";
-import {KeyedAsyncLock} from "../util/async-lock";
+import {KeyedAsyncLock} from "../util/async-lock.js";

 type DataJsonFile = {
    admins: number[]
@@ -1,9 +1,9 @@
 import "dotenv/config";
-import {appLogger} from "./logging/logger";
-import {Environment} from "./common/environment";
+import {appLogger} from "./logging/logger.js";
+import {Environment} from "./common/environment.js";
 import {BotCommand, TelegramBot, User} from "typescript-telegram-bot-api";
-import {Command} from "./base/command";
-import type {LogDetails} from "./logging/logger";
+import {Command} from "./base/command.js";
+import type {LogDetails} from "./logging/logger.js";
 import {
    initSystemSpecs,
    logError,
@@ -13,68 +13,72 @@ import {
    processInlineQuery,
    processMyChatMember,
    processNewMessage
-} from "./util/utils";
-import {Ae} from "./commands/ae";
-import {Help} from "./commands/help";
-import {Ignore} from "./commands/ignore";
-import {Unignore} from "./commands/unignore";
-import {Ping} from "./commands/ping";
-import {RandomString} from "./commands/random-string";
-import {SystemInfo} from "./commands/system-info";
-import {Test} from "./commands/test";
-import {readData, retrieveAnswers} from "./db/database";
-import {Uptime} from "./commands/uptime";
-import {WhatBetter} from "./commands/what-better";
-import {When} from "./commands/when";
-import {RandomInt} from "./commands/random-int";
-import {Ban} from "./commands/ban";
-import {Quote} from "./commands/quote";
-import {OllamaSearch} from "./commands/ollama-search";
-import {Id} from "./commands/id";
-import {AdminsAdd} from "./commands/admins-add";
-import {AdminsRemove} from "./commands/admins-remove";
-import {Shutdown} from "./commands/shutdown";
-import {Leave} from "./commands/leave";
-import {OllamaChat} from "./commands/ollama-chat";
-import {Start} from "./commands/start";
-import {Choice} from "./commands/choice";
-import {Coin} from "./commands/coin";
-import {Qr} from "./commands/qr";
-import {Distort} from "./commands/distort";
-import {Dice} from "./commands/dice";
-import {Unban} from "./commands/unban";
-import {Title} from "./commands/title";
-import {MessageDao} from "./db/message-dao";
-import {DatabaseManager} from "./db/database-manager";
-import {UserDao} from "./db/user-dao";
-import {UserStore} from "./common/user-store";
-import {CallbackCommand} from "./base/callback-command";
-import {AiCancel} from "./callback_commands/ai-cancel";
-import {AiRegenerate} from "./callback_commands/ai-regenerate";
-import {MistralChat} from "./commands/mistral-chat";
-import {Transliteration} from "./commands/transliteration";
-import {OllamaListModels} from "./commands/ollama-list-models";
-import {OllamaGetModel} from "./commands/ollama-get-model";
-import {OllamaSetModel} from "./commands/ollama-set-model";
-import {MistralGetModel} from "./commands/mistral-get-model";
-import {MistralSetModel} from "./commands/mistral-set-model";
-import {MistralListModels} from "./commands/mistral-list-models";
-import {Debug} from "./commands/debug";
+} from "./util/utils.js";
+import {Ae} from "./commands/ae.js";
+import {Help} from "./commands/help.js";
+import {Ignore} from "./commands/ignore.js";
+import {Unignore} from "./commands/unignore.js";
+import {Ping} from "./commands/ping.js";
+import {RandomString} from "./commands/random-string.js";
+import {SystemInfo} from "./commands/system-info.js";
+import {Test} from "./commands/test.js";
+import {readData, retrieveAnswers} from "./db/database.js";
+import {Uptime} from "./commands/uptime.js";
+import {WhatBetter} from "./commands/what-better.js";
+import {When} from "./commands/when.js";
+import {RandomInt} from "./commands/random-int.js";
+import {Ban} from "./commands/ban.js";
+import {Quote} from "./commands/quote.js";
+import {OllamaSearch} from "./commands/ollama-search.js";
+import {Id} from "./commands/id.js";
+import {AdminsAdd} from "./commands/admins-add.js";
+import {AdminsRemove} from "./commands/admins-remove.js";
+import {Shutdown} from "./commands/shutdown.js";
+import {Leave} from "./commands/leave.js";
+import {OllamaChat} from "./commands/ollama-chat.js";
+import {Start} from "./commands/start.js";
+import {Choice} from "./commands/choice.js";
+import {Coin} from "./commands/coin.js";
+import {Qr} from "./commands/qr.js";
+import {Distort} from "./commands/distort.js";
+import {Dice} from "./commands/dice.js";
+import {Unban} from "./commands/unban.js";
+import {Title} from "./commands/title.js";
+import {MessageDao} from "./db/message-dao.js";
+import {DatabaseManager} from "./db/database-manager.js";
+import {UserDao} from "./db/user-dao.js";
+import {UserStore} from "./common/user-store.js";
+import {CallbackCommand} from "./base/callback-command.js";
+import {AiCancel} from "./callback_commands/ai-cancel.js";
+import {AiRegenerate} from "./callback_commands/ai-regenerate.js";
+import {MistralChat} from "./commands/mistral-chat.js";
+import {Transliteration} from "./commands/transliteration.js";
+import {OllamaListModels} from "./commands/ollama-list-models.js";
+import {OllamaGetModel} from "./commands/ollama-get-model.js";
+import {OllamaSetModel} from "./commands/ollama-set-model.js";
+import {MistralGetModel} from "./commands/mistral-get-model.js";
+import {MistralSetModel} from "./commands/mistral-set-model.js";
+import {MistralListModels} from "./commands/mistral-list-models.js";
+import {Debug} from "./commands/debug.js";
 import fs from "node:fs";
 import path from "node:path";
-import {OpenAIChat} from "./commands/openai-chat";
-import {OpenAIListModels} from "./commands/openai-list-models";
-import {OpenAIGetModel} from "./commands/openai-get-model";
-import {OpenAISetModel} from "./commands/openai-set-model";
-import {Info} from "./commands/info";
-import {AdminsList} from "./commands/admins-list";
-import {ExportDb} from "./commands/export-db";
-import {ImportDb} from "./commands/import-db";
-import {Settings} from "./commands/settings";
-import {UserSettingsCallback} from "./callback_commands/user-settings";
-import {TextToSpeech} from "./commands/text-to-speech";
-import {SpeechToText} from "./commands/speech-to-text";
-import {cleanupInternalArtifactCache} from "./ai/internal-artifact-store";
+import {OpenAIChat} from "./commands/openai-chat.js";
+import {OpenAIListModels} from "./commands/openai-list-models.js";
+import {OpenAIGetModel} from "./commands/openai-get-model.js";
+import {OpenAISetModel} from "./commands/openai-set-model.js";
+import {Info} from "./commands/info.js";
+import {AdminsList} from "./commands/admins-list.js";
+import {ExportDb} from "./commands/export-db.js";
+import {ImportDb} from "./commands/import-db.js";
+import {Settings} from "./commands/settings.js";
+import {UserSettingsCallback} from "./callback_commands/user-settings.js";
+import {TextToSpeech} from "./commands/text-to-speech.js";
+import {SpeechToText} from "./commands/speech-to-text.js";
+import {cleanupInternalArtifactCache} from "./ai/internal-artifact-store.js";
+import {AIAudit} from "./commands/ai-audit.js";
+import {AIMetrics} from "./commands/ai-metrics.js";
+import {AIRequests} from "./commands/ai-requests.js";
+import {cleanupStaleRagProviderState} from "./ai/rag-retention.js";

 process.setUncaughtExceptionCaptureCallback(logError);

@@ -119,6 +123,9 @@ export const commands: Command[] = [
    new Settings(),
    new TextToSpeech(),
    new SpeechToText(),
+    new AIRequests(),
+    new AIAudit(),
+    new AIMetrics(),

    new AdminsAdd(),
    new AdminsRemove(),
@@ -272,6 +279,22 @@ async function main() {
    }, () => ({notesRootFilePath}));

    await measureStartupStep("cleanup_internal_artifacts", () => cleanupInternalArtifactCache(), () => ({retentionDays: 14}));
+    await measureStartupStep("cleanup_stale_rag_provider_state", () => cleanupStaleRagProviderState(), () => ({retentionDays: 14}));
+    await measureStartupStep("observability.snapshot", async () => {
+        const [aiRequests, attachments, artifacts, requestAudits] = await Promise.all([
+            DatabaseManager.getAllAiRequests(),
+            DatabaseManager.getAllAttachments(),
+            DatabaseManager.getAllArtifacts(),
+            DatabaseManager.getAllRequestAudits(),
+        ]);
+
+        return {
+            aiRequests: aiRequests.length,
+            attachments: attachments.length,
+            artifacts: artifacts.length,
+            requestAudits: requestAudits.length,
+        };
+    }, () => ({tables: ["ai_requests", "attachments", "artifacts", "request_audit"]}));

    const cmds = await measureStartupStep("build_commands", () => commands.filter(cmd => {
        return cmd.title && cmd.title.startsWith("/") && cmd.title.split(" ").length === 1 && cmd.description;
@@ -1,5 +1,5 @@
 import {Message} from "typescript-telegram-bot-api";
-import {createLogger, formatDuration, LogDetails, LogLevel} from "./logger";
+import {createLogger, formatDuration, LogDetails, LogLevel} from "./logger.js";

 export type AiRunnerLogLevel = LogLevel;
 export type AiRunnerLogDetails = LogDetails;
@@ -1,4 +1,4 @@
-import {AiProvider} from "./ai-provider";
+import {AiProvider} from "./ai-provider.js";

 export type AiEndpointInfo = {
    provider?: AiProvider;
@@ -1,4 +1,4 @@
-import {AiCapabilityInfo} from "./ai-capability-info";
+import {AiCapabilityInfo} from "./ai-capability-info.js";

 export class AiModelCapabilities {
    chat: AiCapabilityInfo | undefined;
@@ -1,7 +1,7 @@
 import * as si from "systeminformation";
-import {appLogger} from "../logging/logger";
-import {Command} from "../base/command";
-import {CallbackCommand} from "../base/callback-command";
+import {appLogger} from "../logging/logger.js";
+import {Command} from "../base/command.js";
+import {CallbackCommand} from "../base/callback-command.js";
 import {
    CallbackQuery,
    ChatMember,
@@ -15,39 +15,40 @@ import {
    TelegramBot,
    User
 } from "typescript-telegram-bot-api";
-import {Environment} from "../common/environment";
-import {TelegramError} from "typescript-telegram-bot-api/dist/errors";
-import {bot, botUser, callbackCommands, commands, messageDao, photoDir} from "../index";
+import {Environment} from "../common/environment.js";
+import {TelegramError} from "typescript-telegram-bot-api/dist/errors.js";
+import {bot, botUser, callbackCommands, commands, messageDao, photoDir} from "../index.js";
 import os from "os";
 import axios from "axios";
-import {MessageAudioPart, MessageImagePart, MessagePart} from "../common/message-part";
-import {StoredMessage} from "../model/stored-message";
+import {MessageAudioPart, MessageImagePart, MessagePart} from "../common/message-part.js";
+import {StoredMessage} from "../model/stored-message.js";
 import sharp from "sharp";
-import {UserStore} from "../common/user-store";
+import {UserStore} from "../common/user-store.js";
 import fs from "node:fs";
 import path from "node:path";
-import {MessageStore} from "../common/message-store";
-import {SystemInfo} from "../commands/system-info";
-import {PrefixResponse} from "../commands/prefix-response";
-import {ChatCommand} from "../base/chat-command";
-import {AiProvider} from "../model/ai-provider";
-import {SendOptions} from "../model/send-options";
-import {EditOptions} from "../model/edit-options";
-import {StoredUser} from "../model/stored-user";
-import {StoredAttachment} from "../model/stored-attachment";
-import {AiDownloadedFile} from "../ai/telegram-attachments";
-import {runUnifiedAi} from "../ai/unified-ai-runner";
-import {enqueueTelegramApiCall} from "./telegram-api-queue";
-import {AsyncSemaphore, KeyedAsyncLock} from "./async-lock";
-import {resolveEffectiveAiProviderForUser, resolveInterfaceLocaleForUser} from "../common/user-ai-settings";
-import {Localization} from "../common/localization";
-import {createOllamaClient, resolveAiRuntimeTarget} from "../ai/ai-runtime-target";
-import {RandomUtils} from "./random-utils";
-import {HtmlUtils} from "./html-utils";
-import {ShellCommandResult, ShellCommandRunner} from "./shell-command-runner";
-import type {BoundaryValue, ErrorLike} from "../common/boundary-types";
-import {createStoredImageAttachment, photoCachePathForUniqueId, uniqueStoredAttachments} from "../common/stored-attachment-utils";
-import {runTelegramMessageAttachmentPipeline} from "../ai/user-request-pipeline";
+import {MessageStore} from "../common/message-store.js";
+import {filterUserInputStoredAttachments} from "../common/attachment-visibility.js";
+import {SystemInfo} from "../commands/system-info.js";
+import {PrefixResponse} from "../commands/prefix-response.js";
+import {ChatCommand} from "../base/chat-command.js";
+import {AiProvider} from "../model/ai-provider.js";
+import {SendOptions} from "../model/send-options.js";
+import {EditOptions} from "../model/edit-options.js";
+import {StoredUser} from "../model/stored-user.js";
+import {StoredAttachment} from "../model/stored-attachment.js";
+import {AiDownloadedFile} from "../ai/telegram-attachments.js";
+import {runUnifiedAi} from "../ai/unified-ai-runner.js";
+import {enqueueTelegramApiCall} from "./telegram-api-queue.js";
+import {AsyncSemaphore, KeyedAsyncLock} from "./async-lock.js";
+import {resolveEffectiveAiProviderForUser, resolveInterfaceLocaleForUser} from "../common/user-ai-settings.js";
+import {Localization} from "../common/localization.js";
+import {createOllamaClient, resolveAiRuntimeTarget} from "../ai/ai-runtime-target.js";
+import {RandomUtils} from "./random-utils.js";
+import {HtmlUtils} from "./html-utils.js";
+import {ShellCommandResult, ShellCommandRunner} from "./shell-command-runner.js";
+import type {BoundaryValue, ErrorLike} from "../common/boundary-types.js";
+import {createStoredImageAttachment, photoCachePathForUniqueId, uniqueStoredAttachments} from "../common/stored-attachment-utils.js";
+import {runTelegramMessageAttachmentPipeline} from "../ai/user-request-pipeline/index.js";

 const imageProcessingSemaphore = new AsyncSemaphore(2);
 const fileWriteLocks = new KeyedAsyncLock();
@@ -1487,12 +1488,13 @@ export async function collectReplyChainText(options: ReplyChainOptions): Promise
            const cleanText = cutPrefix ? cutPrefixes(rawText) : rawText;
            const imageNames = await loadImagesIfExists(msg);
            const messageDownloads = includeDownloads ? downloads : [];
-            const storedImageAttachments = isStoredMessage(msg)
-                ? (msg.attachments ?? []).filter(attachment => attachment.kind === "image" && fs.existsSync(attachment.cachePath))
+            const storedAttachments = isStoredMessage(msg)
+                ? filterUserInputStoredAttachments(msg.attachments ?? []).filter(attachment => fs.existsSync(attachment.cachePath))
                : [];
+            const storedImageAttachments = storedAttachments.filter(attachment => attachment.kind === "image");

            if (!cleanText && !quoteText && textRequired) return;
-            if (!cleanText && !quoteText && !imageNames?.length && !storedImageAttachments.length && !messageDownloads.length) return;
+            if (!cleanText && !quoteText && !imageNames?.length && !storedAttachments.length && !messageDownloads.length) return;

            const fromId = isStoredMessage(msg) ? msg.fromId : msg.from?.id;
            const user = await UserStore.get(isStoredMessage(msg) ? msg.fromId : msg.from?.id ?? -1);
@@ -1527,11 +1529,19 @@ export async function collectReplyChainText(options: ReplyChainOptions): Promise
            });
            const imageParts = [...photoImageParts, ...cachedImageParts];

+            const storedDocumentAttachments = storedAttachments.filter(attachment => attachment.kind === "document");
+            const storedVideoAttachments = storedAttachments.filter(attachment => attachment.kind === "video");
+            const storedVideoNoteAttachments = storedAttachments.filter(attachment => attachment.kind === "video-note");
+            const storedAudioAttachments = storedAttachments.filter(attachment => attachment.kind === "audio");
+
            const audios: string[] = [];
            const audioParts: MessageAudioPart[] = [];
            const documents: string[] = [];
+            const documentNames: string[] = [];
            const videos: string[] = [];
+            const videoNames: string[] = [];
            const videoNotes: string[] = [];
+            const videoNoteNames: string[] = [];

            if (messageDownloads.length) {
                messageDownloads
@@ -1544,21 +1554,51 @@ export async function collectReplyChainText(options: ReplyChainOptions): Promise

                messageDownloads
                    .filter(d => d.kind === "document")
-                    .forEach(d => documents.push(d.buffer.toString("base64")));
+                    .forEach(d => {
+                        documents.push(d.buffer.toString("base64"));
+                        documentNames.push(d.fileName);
+                    });

                messageDownloads
                    .filter(d => d.kind === "video")
-                    .forEach(v => videos.push(v.buffer.toString("base64")));
+                    .forEach(v => {
+                        videos.push(v.buffer.toString("base64"));
+                        videoNames.push(v.fileName);
+                    });

                messageDownloads
                    .filter(d => d.kind === "video-note")
                    .forEach(v => {
                        const data = v.buffer.toString("base64");
                        videoNotes.push(data);
+                        videoNoteNames.push(v.fileName);
                        audioParts.push({data, mimeType: mimeTypeFromAudioDownload(v)});
                    });
            }

+            storedAudioAttachments.forEach(attachment => {
+                const data = Buffer.from(fs.readFileSync(attachment.cachePath)).toString("base64");
+                audios.push(data);
+                audioParts.push({data, mimeType: attachment.mimeType || "audio/ogg"});
+            });
+
+            storedDocumentAttachments.forEach(attachment => {
+                documents.push(Buffer.from(fs.readFileSync(attachment.cachePath)).toString("base64"));
+                documentNames.push(attachment.fileName);
+            });
+
+            storedVideoAttachments.forEach(attachment => {
+                videos.push(Buffer.from(fs.readFileSync(attachment.cachePath)).toString("base64"));
+                videoNames.push(attachment.fileName);
+            });
+
+            storedVideoNoteAttachments.forEach(attachment => {
+                const data = Buffer.from(fs.readFileSync(attachment.cachePath)).toString("base64");
+                videoNotes.push(data);
+                videoNoteNames.push(attachment.fileName);
+                audioParts.push({data, mimeType: attachment.mimeType || "video/mp4"});
+            });
+
            const content = [
                quoteText ? `[citation]:\n${quoteText}\n\n[message]:\n` : "",
                cleanText ?? ""
@@ -1576,8 +1616,11 @@ export async function collectReplyChainText(options: ReplyChainOptions): Promise
                audios: audios.length ? audios : undefined,
                audioParts: audioParts.length ? audioParts : undefined,
                documents: documents.length ? documents : undefined,
+                documentNames: documentNames.length ? documentNames : undefined,
                videos: videos.length ? videos : undefined,
+                videoNames: videoNames.length ? videoNames : undefined,
                videoNotes: videoNotes.length ? videoNotes : undefined,
+                videoNoteNames: videoNoteNames.length ? videoNoteNames : undefined,
            });
        }
    };
@@ -0,0 +1,24 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const observability = await import("../dist/common/ai-observability.js");
+
+test("ai observability snapshot counts recorded events", () => {
+    const before = observability.snapshotAiObservability();
+
+    observability.recordAiRequestStart();
+    observability.recordAiRequestFinish("succeeded");
+    observability.recordPipelineFallback("notify_user");
+    observability.recordToolCall();
+    observability.recordRagRun();
+    observability.recordTtsRun("skipped");
+
+    const after = observability.snapshotAiObservability();
+
+    assert.equal(after.requests.total, before.requests.total + 1);
+    assert.equal(after.requests.succeeded, before.requests.succeeded + 1);
+    assert.equal(after.fallbacks.notifyUser, before.fallbacks.notifyUser + 1);
+    assert.equal(after.toolCalls, before.toolCalls + 1);
+    assert.equal(after.ragRuns, before.ragRuns + 1);
+    assert.equal(after.ttsRuns.skipped, before.ttsRuns.skipped + 1);
+});
@@ -0,0 +1,17 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {runSingleModelRequest} = await import("../dist/ai/model-call-stage.js");
+
+test("single model request wrapper executes exactly once", async () => {
+    let calls = 0;
+    const result = await runSingleModelRequest({
+        async execute() {
+            calls += 1;
+            return "ok";
+        },
+    });
+
+    assert.equal(result, "ok");
+    assert.equal(calls, 1);
+});
@@ -0,0 +1,33 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {PipelineFallbackNotificationRegistry} = await import("../dist/ai/user-request-pipeline/fallback-notifier-registry.js");
+const {resolvePipelineFallbackText} = await import("../dist/ai/user-request-pipeline/fallback-notifier-text.js");
+
+test("pipeline fallback text maps notify_user to a user-facing message", () => {
+    assert.match(resolvePipelineFallbackText("document_rag", "notify_user"), /RAG/i);
+    assert.match(resolvePipelineFallbackText("speech_to_text", "notify_user"), /transcription/i);
+    assert.match(resolvePipelineFallbackText("tool_loop", "notify_user"), /tool/i);
+});
+
+test("pipeline fallback text is localized when locale is provided", () => {
+    assert.match(resolvePipelineFallbackText("document_rag", "notify_user", "ru"), /RAG|документ/i);
+    assert.match(resolvePipelineFallbackText("text_to_speech", "notify_user", "ua"), /аудіо|мовлення/i);
+});
+
+test("pipeline fallback text stays silent for continue_without_stage", () => {
+    assert.equal(resolvePipelineFallbackText("document_rag", "continue_without_stage"), undefined);
+    assert.equal(resolvePipelineFallbackText("tool_loop", "continue_without_stage"), undefined);
+});
+
+test("pipeline fallback notification registry deduplicates one request-stage-action", () => {
+    const registry = new PipelineFallbackNotificationRegistry();
+    const decision = {
+        stage: "tool_loop",
+        action: "notify_user",
+    };
+
+    assert.equal(registry.claim("request-1", decision), true);
+    assert.equal(registry.claim("request-1", decision), false);
+    assert.equal(registry.claim("request-2", decision), true);
+});
@@ -0,0 +1,398 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {UserRequestPipeline} = await import("../dist/ai/user-request-pipeline/pipeline.js");
+const {PIPELINE_ATTACHMENT_LIMIT_BYTES} = await import("../dist/ai/user-request-pipeline/types.js");
+
+class FakeTelegramStreamMessage {
+    constructor() {
+        this.status = "";
+        this.text = "";
+        this.toolExecutions = [];
+        this.outputAttachments = [];
+        this.internalAttachments = [];
+        this.pipelineAudits = [];
+        this.finished = false;
+        this.failed = false;
+    }
+
+    setStatus(status) {
+        this.status = status;
+    }
+
+    clearStatus() {
+        this.status = "";
+    }
+
+    append(delta) {
+        this.text += delta;
+    }
+
+    replaceText(text) {
+        this.text = text;
+    }
+
+    getText() {
+        return this.text;
+    }
+
+    recordToolExecution(record) {
+        this.toolExecutions.push(record);
+    }
+
+    getToolExecutions() {
+        return [...this.toolExecutions];
+    }
+
+    recordOutputAttachment(record) {
+        this.outputAttachments.push(record);
+    }
+
+    getOutputAttachments() {
+        return [...this.outputAttachments];
+    }
+
+    async storeInternalAttachment(attachment) {
+        this.internalAttachments.push(attachment);
+    }
+
+    async storePipelineAudit(events) {
+        this.pipelineAudits.push(...events);
+    }
+
+    async finish() {
+        this.finished = true;
+    }
+
+    async fail() {
+        this.failed = true;
+    }
+}
+
+class FakeProviderAdapter {
+    constructor() {
+        this.calls = [];
+    }
+
+    async callModel(request, execute) {
+        this.calls.push(request);
+        return await execute();
+    }
+
+    appendToolResults(messages, calls, results) {
+        for (const [index, call] of calls.entries()) {
+            messages.push({
+                role: "tool",
+                name: call.name,
+                content: results[index] ?? "",
+            });
+        }
+    }
+}
+
+class FakeMemoryStore {
+    constructor() {
+        this.rows = [];
+    }
+
+    persist(state) {
+        this.rows.push({
+            requestId: state.requestId,
+            audit: [...state.audit],
+            artifacts: [...state.artifacts],
+            outputAttachments: [...state.outputAttachments],
+        });
+    }
+}
+
+function createBaseState() {
+    return {
+        requestId: "integration-request-1",
+        chatId: 10,
+        messageId: 20,
+        fromId: 30,
+        receivedAt: new Date().toISOString(),
+        text: "process my attachments",
+        settings: {
+            provider: "OLLAMA",
+            responseLanguage: "en",
+            voiceMode: "execute",
+            imageOutputMode: "photo",
+        },
+        inputAttachments: [],
+        outputAttachments: [],
+        artifacts: [],
+        toolRankDecisions: [],
+        audit: [],
+    };
+}
+
+function artifact(kind, stage, extra = {}) {
+    return {
+        kind,
+        stage,
+        createdAt: "2026-05-18T00:00:00.000Z",
+        ...extra,
+    };
+}
+
+function outputAttachment(fileName, kind = "file") {
+    return {
+        direction: "output",
+        kind,
+        fileId: `${fileName}-file-id`,
+        fileName,
+        sizeBytes: 1024,
+        cachePath: `/tmp/${fileName}`,
+    };
+}
+
+test("integration pipeline rejects oversized attachment before later stages", async () => {
+    const stream = new FakeTelegramStreamMessage();
+    const state = createBaseState();
+    state.inputAttachments.push({
+        direction: "input",
+        kind: "document",
+        fileId: "doc-oversized",
+        fileName: "big.pdf",
+        sizeBytes: PIPELINE_ATTACHMENT_LIMIT_BYTES + 1,
+        cachePath: "/tmp/big.pdf",
+    });
+
+    const pipeline = new UserRequestPipeline({
+        stages: [{
+            name: "input_size_gate",
+            async run() {
+                stream.setStatus("Checking size");
+                const tooLarge = state.inputAttachments.some(attachment => attachment.sizeBytes > PIPELINE_ATTACHMENT_LIMIT_BYTES);
+                stream.clearStatus();
+
+                return {
+                    stage: "input_size_gate",
+                    status: tooLarge ? "fallback" : "succeeded",
+                    fallbackAction: tooLarge ? "notify_user" : undefined,
+                };
+            },
+        }],
+        stageNames: ["input_size_gate"],
+    });
+
+    await pipeline.run(state, new AbortController().signal);
+
+    assert.equal(state.audit.at(-1)?.status, "fallback");
+    assert.equal(state.audit.at(-1)?.details?.fallbackAction, "notify_user");
+    assert.equal(stream.status, "");
+});
+
+test("integration pipeline carries artifacts through fake document, voice, tool and tts stages", async () => {
+    const stream = new FakeTelegramStreamMessage();
+    const adapter = new FakeProviderAdapter();
+    const store = new FakeMemoryStore();
+    const state = createBaseState();
+    state.inputAttachments.push(
+        {
+            direction: "input",
+            kind: "document",
+            fileId: "doc-1",
+            fileName: "contract.pdf",
+            sizeBytes: 1024,
+            cachePath: "/tmp/contract.pdf",
+        },
+        {
+            direction: "input",
+            kind: "audio",
+            fileId: "audio-1",
+            fileName: "voice.ogg",
+            sizeBytes: 2048,
+            cachePath: "/tmp/voice.ogg",
+        },
+    );
+
+    const pipeline = new UserRequestPipeline({
+        stages: [
+            {
+                name: "input_size_gate",
+                async run() {
+                    return {
+                        stage: "input_size_gate",
+                        status: "succeeded",
+                    };
+                },
+            },
+            {
+                name: "document_rag",
+                async run() {
+                    stream.setStatus("RAG");
+                    stream.clearStatus();
+                    return {
+                        stage: "document_rag",
+                        status: "succeeded",
+                        artifacts: [artifact("rag", "document_rag", {
+                            provider: "OLLAMA",
+                            sourceAttachmentIds: ["doc-1"],
+                            extractedText: "contract text",
+                        })],
+                    };
+                },
+            },
+            {
+                name: "speech_to_text",
+                async run() {
+                    return {
+                        stage: "speech_to_text",
+                        status: "succeeded",
+                        artifacts: [artifact("transcript", "speech_to_text", {
+                            text: "transcribed voice",
+                            sourceAttachmentIds: ["audio-1"],
+                            model: "fake-stt",
+                        })],
+                    };
+                },
+            },
+            {
+                name: "model_call",
+                async run() {
+                    const reply = await adapter.callModel({provider: "OLLAMA", model: "fake-model"}, async () => {
+                        stream.append("final answer");
+                        return "final answer";
+                    });
+
+                    return {
+                        stage: "model_call",
+                        status: "succeeded",
+                        artifacts: [artifact("final_text", "model_call", {
+                            text: reply,
+                        })],
+                    };
+                },
+            },
+            {
+                name: "tool_loop",
+                async run() {
+                    const calls = [{id: "tool-call-1", name: "read_file", argumentsText: "{\"path\":\"docs/a.md\"}"}];
+                    const results = ["tool result"];
+                    adapter.appendToolResults([], calls, results);
+                    stream.recordToolExecution({
+                        toolName: "read_file",
+                        callId: "tool-call-1",
+                        argumentsText: "{\"path\":\"docs/a.md\"}",
+                        resultChars: results[0].length,
+                        startedAt: "2026-05-18T00:00:00.000Z",
+                        finishedAt: "2026-05-18T00:00:01.000Z",
+                    });
+
+                    return {
+                        stage: "tool_loop",
+                        status: "succeeded",
+                        artifacts: [artifact("tool_result", "tool_loop", {
+                            toolName: "read_file",
+                            callId: "tool-call-1",
+                            resultText: results[0],
+                        })],
+                    };
+                },
+            },
+            {
+                name: "persist_output_artifacts",
+                async run() {
+                    const generatedFile = outputAttachment("report.txt", "file");
+                    stream.recordOutputAttachment({
+                        artifactKind: "generated_file",
+                        fileName: generatedFile.fileName,
+                        mimeType: "text/plain",
+                        sizeBytes: generatedFile.sizeBytes,
+                        messageId: 321,
+                    });
+
+                    return {
+                        stage: "persist_output_artifacts",
+                        status: "succeeded",
+                        artifacts: [artifact("generated_file", "persist_output_artifacts", {
+                            attachmentId: generatedFile.fileId,
+                        })],
+                        attachments: [generatedFile],
+                    };
+                },
+            },
+            {
+                name: "text_to_speech",
+                async run() {
+                    stream.recordOutputAttachment({
+                        artifactKind: "tts_audio",
+                        fileName: "answer.ogg",
+                        mimeType: "audio/ogg",
+                        sizeBytes: 4096,
+                        messageId: 322,
+                    });
+
+                    return {
+                        stage: "text_to_speech",
+                        status: "succeeded",
+                        artifacts: [artifact("tts_audio", "text_to_speech", {
+                            attachmentId: "tts-audio-id",
+                        })],
+                        attachments: [outputAttachment("answer.ogg", "audio")],
+                    };
+                },
+            },
+            {
+                name: "audit_finish",
+                async run() {
+                    store.persist(state);
+                    return {
+                        stage: "audit_finish",
+                        status: "succeeded",
+                    };
+                },
+            },
+        ],
+        stageNames: [
+            "input_size_gate",
+            "document_rag",
+            "speech_to_text",
+            "model_call",
+            "tool_loop",
+            "persist_output_artifacts",
+            "text_to_speech",
+            "audit_finish",
+        ],
+    });
+
+    await pipeline.run(state, new AbortController().signal);
+
+    assert.equal(adapter.calls.length, 1);
+    assert.equal(stream.getText(), "final answer");
+    assert.equal(stream.getToolExecutions().length, 1);
+    assert.equal(stream.getOutputAttachments().length, 2);
+    assert.equal(state.artifacts.some(entry => entry.kind === "rag"), true);
+    assert.equal(state.artifacts.some(entry => entry.kind === "transcript"), true);
+    assert.equal(state.artifacts.some(entry => entry.kind === "final_text"), true);
+    assert.equal(state.artifacts.some(entry => entry.kind === "tool_result"), true);
+    assert.equal(state.artifacts.some(entry => entry.kind === "generated_file"), true);
+    assert.equal(state.artifacts.some(entry => entry.kind === "tts_audio"), true);
+    assert.equal(store.rows.length, 1);
+    assert.equal(store.rows[0].artifacts.length >= 6, true);
+});
+
+test("integration pipeline stops on fail_request fallback", async () => {
+    const stream = new FakeTelegramStreamMessage();
+    const state = createBaseState();
+    const pipeline = new UserRequestPipeline({
+        stages: [{
+            name: "input_size_gate",
+            async run() {
+                stream.setStatus("Boom");
+                throw new Error("boom");
+            },
+        }],
+        stageNames: ["input_size_gate", "document_rag"],
+        fallbackPolicies: [{
+            stage: "input_size_gate",
+            onUnavailable: "fail_request",
+            onFailed: "fail_request",
+        }],
+    });
+
+    await assert.rejects(() => pipeline.run(state, new AbortController().signal), /PipelineRequestFailure/);
+    assert.equal(state.audit.some(entry => entry.stage === "document_rag"), false);
+});
@@ -0,0 +1,83 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {
+    extractOpenAiToolCalls,
+    extractOpenAiStreamingToolCalls,
+    extractOpenAiTextDelta,
+    extractMistralToolCalls,
+    extractMistralTextDelta,
+    extractOllamaToolCalls,
+    extractOllamaTextDelta,
+} = await import("../dist/ai/provider-adapter-contract.js");
+
+test("openai contract extracts text delta and function calls", () => {
+    assert.equal(extractOpenAiTextDelta({type: "response.output_text.delta", delta: "hello"}), "hello");
+
+    const calls = extractOpenAiToolCalls({
+        output: [{
+            type: "function_call",
+            call_id: "call-1",
+            name: "read_file",
+            arguments: "{\"path\":\"src/index.ts\"}",
+        }],
+    });
+
+    assert.equal(calls.length, 1);
+    assert.equal(calls[0].id, "call-1");
+    assert.equal(calls[0].name, "read_file");
+
+    const streamed = extractOpenAiStreamingToolCalls({
+        type: "response.output_item.added",
+        item: {
+            type: "function_call",
+            id: "call-2",
+            name: "search_files",
+            arguments: "{\"query\":\"sendMessage\"}",
+        },
+    });
+
+    assert.equal(streamed.length, 1);
+    assert.equal(streamed[0].id, "call-2");
+    assert.equal(streamed[0].name, "search_files");
+});
+
+test("mistral contract extracts content and tool calls", () => {
+    assert.equal(extractMistralTextDelta({
+        content: [{text: "hello"}, {text: " world"}],
+    }), "hello world");
+
+    const calls = extractMistralToolCalls({
+        toolCalls: [{
+            id: "m-1",
+            function: {
+                name: "get_weather",
+                arguments: {location: "Moscow"},
+            },
+        }],
+    });
+
+    assert.equal(calls.length, 1);
+    assert.equal(calls[0].id, "m-1");
+    assert.equal(calls[0].name, "get_weather");
+});
+
+test("ollama contract extracts content and tool calls", () => {
+    assert.equal(extractOllamaTextDelta({
+        message: {content: "hello from ollama"},
+    }), "hello from ollama");
+
+    const calls = extractOllamaToolCalls({
+        tool_calls: [{
+            id: "o-1",
+            function: {
+                name: "web_search",
+                arguments: {query: "openai docs"},
+            },
+        }],
+    });
+
+    assert.equal(calls.length, 1);
+    assert.equal(calls[0].id, "o-1");
+    assert.equal(calls[0].name, "web_search");
+});
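Note on the Mistral case above: the contract test passes `content` as an array of text parts and expects one concatenated string. The following is a minimal sketch of an extractor that would satisfy that expectation. It is not the code from `provider-adapter-contract.js`, which this diff does not show. The function name `extractMistralTextDeltaSketch` and the handling of plain-string content are assumptions made for illustration.

```javascript
// Hypothetical sketch; the real extractMistralTextDelta is not shown in this diff.
function extractMistralTextDeltaSketch(delta) {
    const content = delta?.content;

    // Assumption: content may also arrive as a plain string.
    if (typeof content === "string") {
        return content;
    }
    if (!Array.isArray(content)) {
        return "";
    }

    // Join the text of every part, ignoring parts that carry no text.
    return content
        .map(part => (typeof part?.text === "string" ? part.text : ""))
        .join("");
}
```

Returning an empty string for missing content keeps callers free of null checks when they append deltas to a stream buffer.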
@@ -1,32 +1,13 @@
-import test, {after} from "node:test";
+import test from "node:test";
 import assert from "node:assert/strict";
-import fs from "node:fs";
-import os from "node:os";
-import path from "node:path";

-const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "tg-chat-bot-rag-"));
-process.env.BOT_TOKEN = process.env.BOT_TOKEN ?? "test-token";
-process.env.CREATOR_ID = process.env.CREATOR_ID ?? "1";
-process.env.DATA_PATH = tempRoot;
-process.env.DB_PATH = `file:${path.join(tempRoot, "test.sqlite")}`;
-process.env.TEST_ENVIRONMENT = "true";
-
-const {Environment} = await import("../dist/common/environment.js");
-Environment.load();
-
-const {DatabaseManager} = await import("../dist/db/database-manager.js");
-DatabaseManager.init();
-await DatabaseManager.ready;
-
-const {ArtifactStore} = await import("../dist/common/artifact-store.js");
-const {filterUserVisibleStoredAttachments} = await import("../dist/common/stored-attachment-utils.js");
+const {
+    buildRagArtifactPayload,
+} = await import("../dist/ai/rag-artifact-payload.js");
+const {
+    filterUserVisibleStoredAttachments,
+} = await import("../dist/common/attachment-visibility.js");
 const {AiProvider} = await import("../dist/model/ai-provider.js");
-const {persistRagArtifactAttachment} = await import("../dist/ai/rag-artifact-store.js");
-
-after(async () => {
-    await DatabaseManager.close().catch(() => undefined);
-    fs.rmSync(tempRoot, {recursive: true, force: true});
-});

 test("internal artifacts are not treated as user-visible attachments", () => {
    const visible = filterUserVisibleStoredAttachments([
@@ -50,105 +31,57 @@ test("internal artifacts are not treated as user-visible attachments", () => {
    assert.equal(visible[0].fileId, "visible");
 });

-test("RAG artifacts persist structured ollama metadata", async () => {
-    const chatId = 42;
-    const messageId = 7;
-
-    const attachment = await persistRagArtifactAttachment({
+test("RAG artifact payload keeps ollama retrieval metadata", () => {
+    const payload = buildRagArtifactPayload({
        provider: AiProvider.OLLAMA,
-        prepared: {
-            provider: AiProvider.OLLAMA,
-            prepared: true,
-            cleanup: async () => undefined,
-            artifact: {
-                query: "What is in the file?",
-                extractedDocuments: [
-                    {documentIndex: 0, fileName: "report.txt", textChars: 120},
-                ],
-                selectedChunks: [
-                    {
-                        sourceId: "doc1-1",
-                        documentIndex: 0,
-                        documentName: "report.txt",
-                        chunkIndex: 0,
-                        chunkCount: 1,
-                        textChars: 120,
-                        score: 0.91,
-                    },
-                ],
-                skippedDocuments: [
-                    {documentIndex: 1, fileName: "ignored.bin", reason: "unsupported format"},
-                ],
-                providerState: {
-                    embeddingModel: "nomic-embed-text:latest",
-                    topK: 8,
-                    chunkSize: 1400,
-                    chunkOverlap: 220,
-                    maxContextChars: 14000,
-                    minScore: 0.12,
-                    maxArchiveFiles: 200,
-                    maxArchiveBytes: 50 * 1024 * 1024,
-                    maxArchiveDepth: 2,
-                },
-            },
-        },
-        downloads: [{
-            kind: "document",
+        createdAt: "2026-01-01T00:00:00.000Z",
+        sources: [{
            fileId: "file-1",
            fileName: "report.txt",
-            buffer: Buffer.from("hello world"),
-            path: path.join(tempRoot, "report.txt"),
+            mimeType: "text/plain",
+            sizeBytes: 12,
+            sha256: "abc123",
+            uploadedFileId: "uploaded-1",
        }],
-        chatId,
-        messageId,
-        details: {
+        providerState: {
+            provider: AiProvider.OLLAMA,
+            prepared: true,
            embeddingModel: "nomic-embed-text:latest",
            topK: 8,
            chunkSize: 1400,
            chunkOverlap: 220,
            maxContextChars: 14000,
-            artifact: {
-                query: "What is in the file?",
-                extractedDocuments: [
-                    {documentIndex: 0, fileName: "report.txt", textChars: 120},
-                ],
-                selectedChunks: [
-                    {
-                        sourceId: "doc1-1",
-                        documentIndex: 0,
-                        documentName: "report.txt",
-                        chunkIndex: 0,
-                        chunkCount: 1,
-                        textChars: 120,
-                        score: 0.91,
-                    },
-                ],
-                skippedDocuments: [
-                    {documentIndex: 1, fileName: "ignored.bin", reason: "unsupported format"},
-                ],
-                providerState: {
-                    embeddingModel: "nomic-embed-text:latest",
-                    topK: 8,
-                    chunkSize: 1400,
-                    chunkOverlap: 220,
-                    maxContextChars: 14000,
-                    minScore: 0.12,
-                    maxArchiveFiles: 200,
-                    maxArchiveBytes: 50 * 1024 * 1024,
-                    maxArchiveDepth: 2,
+            extractedDocuments: [
+                {documentIndex: 0, fileName: "report.txt", textChars: 120},
+            ],
+            selectedChunks: [
+                {
+                    sourceId: "doc1-1",
+                    documentIndex: 0,
+                    documentName: "report.txt",
+                    chunkIndex: 0,
+                    chunkCount: 1,
+                    textChars: 120,
+                    score: 0.91,
                },
-            },
+            ],
+            skippedDocuments: [
+                {documentIndex: 1, fileName: "ignored.bin", reason: "unsupported format"},
+            ],
+            minScore: 0.12,
+            maxArchiveFiles: 200,
+            maxArchiveBytes: 50 * 1024 * 1024,
+            maxArchiveDepth: 2,
+            query: "What is in the file?",
        },
    });

-    assert.equal(attachment?.artifactKind, "rag");
-    assert.equal(fs.existsSync(attachment.cachePath), true);
-
-    const stored = await ArtifactStore.getByMessage(chatId, messageId);
-    assert.equal(stored.length, 1);
-    assert.equal(stored[0].kind, "rag");
-    assert.equal(stored[0].payload.providerState.query, "What is in the file?");
-    assert.equal(stored[0].payload.providerState.selectedChunks[0].score, 0.91);
-    assert.equal(stored[0].payload.providerState.skippedDocuments[0].reason, "unsupported format");
-    assert.equal(stored[0].payload.providerState.ollama.embeddingModel, "nomic-embed-text:latest");
+    assert.equal(payload.artifactKind, "rag");
+    assert.equal(payload.provider, AiProvider.OLLAMA);
+    assert.equal(payload.sources[0].uploadedFileId, "uploaded-1");
+    assert.equal(payload.providerState.provider, AiProvider.OLLAMA);
+    assert.equal(payload.providerState.query, "What is in the file?");
+    assert.equal(payload.providerState.selectedChunks[0].score, 0.91);
+    assert.equal(payload.providerState.skippedDocuments[0].reason, "unsupported format");
+    assert.equal(payload.providerState.embeddingModel, "nomic-embed-text:latest");
 });
@@ -0,0 +1,53 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {buildStaleRagCleanupPlan} = await import("../dist/ai/rag-retention-planner.js");
+
+test("stale rag cleanup plan selects only older rag artifacts", () => {
+    const plan = buildStaleRagCleanupPlan([
+        {
+            id: "recent-openai",
+            createdAt: "2026-05-18T00:00:00.000Z",
+            payload: JSON.stringify({
+                artifactKind: "rag",
+                providerState: {
+                    provider: "OPENAI",
+                    vectorStoreIds: ["vs_1"],
+                    uploadedFileIds: ["file_1"],
+                },
+            }),
+        },
+        {
+            id: "stale-openai",
+            createdAt: "2026-04-01T00:00:00.000Z",
+            payload: JSON.stringify({
+                artifactKind: "rag",
+                providerState: {
+                    provider: "OPENAI",
+                    vectorStoreIds: ["vs_2"],
+                    uploadedFileIds: ["file_2"],
+                },
+            }),
+        },
+        {
+            id: "stale-ollama",
+            createdAt: "2026-04-01T00:00:00.000Z",
+            payload: JSON.stringify({
+                artifactKind: "rag",
+                providerState: {
+                    provider: "OLLAMA",
+                    prepared: true,
+                },
+            }),
+        },
+    ], 14, new Date("2026-05-18T00:00:00.000Z"));
+
+    assert.equal(plan.targets.length, 1);
+    assert.deepEqual(plan.targets[0], {
+        artifactId: "stale-openai",
+        createdAt: "2026-04-01T00:00:00.000Z",
+        provider: "OPENAI",
+        vectorStoreIds: ["vs_2"],
+        uploadedFileIds: ["file_2"],
+    });
+});
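Note on the retention test above: it expects a plan that keeps artifacts inside the retention window, and among the stale ones targets only those that reference provider-side resources. A planner with that behavior might look roughly like the sketch below. This is an illustration under assumptions, not the implementation in `rag-retention-planner.js`: the name `buildStaleRagCleanupPlanSketch`, the day-based cutoff arithmetic, and the rule "skip artifacts without vector store or uploaded file ids" are inferred from the test data only.

```javascript
// Hypothetical sketch; the real buildStaleRagCleanupPlan is not shown in this diff.
function buildStaleRagCleanupPlanSketch(rows, retentionDays, now = new Date()) {
    const cutoffMs = now.getTime() - retentionDays * 24 * 60 * 60 * 1000;
    const targets = [];

    for (const row of rows) {
        // Artifacts created at or after the cutoff are still within retention.
        if (Date.parse(row.createdAt) >= cutoffMs) continue;

        let payload;
        try {
            payload = JSON.parse(row.payload);
        } catch {
            continue; // unreadable payloads are skipped rather than guessed at
        }
        if (payload?.artifactKind !== "rag") continue;

        const state = payload.providerState ?? {};
        const vectorStoreIds = state.vectorStoreIds ?? [];
        const uploadedFileIds = state.uploadedFileIds ?? [];

        // Assumption: only artifacts holding remote resources need provider cleanup.
        if (vectorStoreIds.length === 0 && uploadedFileIds.length === 0) continue;

        targets.push({
            artifactId: row.id,
            createdAt: row.createdAt,
            provider: state.provider,
            vectorStoreIds,
            uploadedFileIds,
        });
    }

    return {targets};
}
```

Passing `now` as a parameter, as the test does, keeps the planner deterministic and free of clock access.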
@@ -0,0 +1,131 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {filterUserInputStoredAttachments} = await import("../dist/common/attachment-visibility.js");
+const {mergeReplyChainDownloads, shouldPreferCurrentDownloads} = await import("../dist/ai/reply-chain-downloads.js");
+
+test("reply chain attachment visibility keeps only user input attachments", () => {
+    const attachments = filterUserInputStoredAttachments([
+        {
+            kind: "document",
+            fileId: "user-doc",
+            fileName: "user.txt",
+            cachePath: "/tmp/user.txt",
+            scope: "user_input",
+        },
+        {
+            kind: "document",
+            fileId: "bot-doc",
+            fileName: "bot.txt",
+            cachePath: "/tmp/bot.txt",
+            scope: "bot_output",
+        },
+        {
+            kind: "document",
+            fileId: "internal-doc",
+            fileName: "internal.json",
+            cachePath: "/tmp/internal.json",
+            scope: "internal_artifact",
+        },
+    ]);
+
+    assert.equal(attachments.length, 1);
+    assert.equal(attachments[0].fileId, "user-doc");
+});
+
+test("reply chain downloads keep current input first and deduplicate chain copies", () => {
+    const merged = mergeReplyChainDownloads(
+        [
+            {
+                kind: "document",
+                fileId: "new-doc",
+                fileName: "new.txt",
+                buffer: Buffer.from("new"),
+                path: "/tmp/new.txt",
+            },
+            {
+                kind: "document",
+                fileId: "shared-doc",
+                fileName: "shared.txt",
+                buffer: Buffer.from("current"),
+                path: "/tmp/current-shared.txt",
+            },
+        ],
+        [
+            {
+                kind: "document",
+                fileId: "shared-doc",
+                fileName: "shared.txt",
+                buffer: Buffer.from("reply-chain"),
+                path: "/tmp/reply-shared.txt",
+            },
+            {
+                kind: "document",
+                fileId: "old-doc",
+                fileName: "old.txt",
+                buffer: Buffer.from("old"),
+                path: "/tmp/old.txt",
+            },
+        ],
+    );
+
+    assert.equal(merged.length, 3);
+    assert.equal(merged[0].fileId, "new-doc");
+    assert.equal(merged[1].fileId, "shared-doc");
+    assert.equal(merged[2].fileId, "old-doc");
+});
+
+test("reply chain downloads are used when there is no new document", () => {
+    const merged = mergeReplyChainDownloads([], [
+        {
+            kind: "document",
+            fileId: "reply-doc",
+            fileName: "reply.txt",
+            buffer: Buffer.from("reply"),
+            path: "/tmp/reply.txt",
+        },
+    ]);
+
+    assert.equal(merged.length, 1);
+    assert.equal(merged[0].fileId, "reply-doc");
+});
+
+test("reply chain prefers current downloads when user points to this file", () => {
+    assert.equal(
+        shouldPreferCurrentDownloads("Please answer about this file", [{
+            kind: "document",
+            fileId: "new-doc",
+            fileName: "new.txt",
+            buffer: Buffer.from("new"),
+            path: "/tmp/new.txt",
+        }]),
+        true,
+    );
+
+    assert.equal(
+        shouldPreferCurrentDownloads("ответь по этому файлу", [{
+            kind: "document",
+            fileId: "new-doc",
+            fileName: "new.txt",
+            buffer: Buffer.from("new"),
+            path: "/tmp/new.txt",
+        }]),
+        true,
+    );
+
+    assert.equal(
+        shouldPreferCurrentDownloads("ответь на этот файл", [{
+            kind: "document",
+            fileId: "new-doc",
+            fileName: "new.txt",
+            buffer: Buffer.from("new"),
+            path: "/tmp/new.txt",
+        }]),
+        true,
+    );
+
+    assert.equal(
+        shouldPreferCurrentDownloads("this file", []),
+        false,
+    );
+});
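Note on the merge tests above: they expect current downloads to come first and a reply-chain copy of an already present file to be dropped. One way to get that ordering is sketched below. It is only an illustration: `mergeReplyChainDownloadsSketch` is a made-up name, and using `fileId` as the deduplication key is an assumption drawn from the test fixtures, not from `reply-chain-downloads.js` itself.

```javascript
// Hypothetical sketch; the real mergeReplyChainDownloads is not shown in this diff.
function mergeReplyChainDownloadsSketch(currentDownloads, replyChainDownloads) {
    const seenFileIds = new Set();
    const merged = [];

    // Current input is iterated first, so its copy wins over a reply-chain duplicate.
    for (const download of [...currentDownloads, ...replyChainDownloads]) {
        if (seenFileIds.has(download.fileId)) continue;
        seenFileIds.add(download.fileId);
        merged.push(download);
    }

    return merged;
}
```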
@@ -0,0 +1,21 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {summarizeModelOutput} = await import("../dist/ai/response-model-output.js");
+
+test("model output summary trims text and copies attachment records", () => {
+    const toolExecutions = [{toolName: "read_file", callId: "1", argumentsText: "{}", resultChars: 10}];
+    const outputAttachments = [{artifactKind: "generated_file", fileName: "out.txt", sizeBytes: 12}];
+
+    const summary = summarizeModelOutput({
+        text: "  hello world  ",
+        toolExecutions,
+        outputAttachments,
+    });
+
+    assert.deepEqual(summary, {
+        text: "hello world",
+        toolExecutions,
+        outputAttachments,
+    });
+});
@@ -0,0 +1,40 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {decideToolLoopContinuation} = await import("../dist/ai/tool-loop-control.js");
+
+test("tool loop continuation stops when there are no tool calls", () => {
+    const decision = decideToolLoopContinuation({
+        round: 0,
+        maxRounds: 3,
+        toolCalls: [],
+    });
+
+    assert.equal(decision.continue, false);
+    assert.equal(decision.reason, "no_tool_calls");
+    assert.equal(decision.remainingRounds, 2);
+});
+
+test("tool loop continuation stops on the last allowed round", () => {
+    const decision = decideToolLoopContinuation({
+        round: 2,
+        maxRounds: 3,
+        toolCalls: [{id: "call-1", name: "read_file", argumentsText: "{}"}],
+    });
+
+    assert.equal(decision.continue, false);
+    assert.equal(decision.reason, "max_rounds_reached");
+    assert.equal(decision.remainingRounds, 0);
+});
+
+test("tool loop continuation allows further rounds when tools remain and rounds are left", () => {
+    const decision = decideToolLoopContinuation({
+        round: 1,
+        maxRounds: 3,
+        toolCalls: [{id: "call-1", name: "read_file", argumentsText: "{}"}],
+    });
+
+    assert.equal(decision.continue, true);
+    assert.equal(decision.reason, undefined);
+    assert.equal(decision.remainingRounds, 1);
+});
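Note on the three continuation tests above: together they imply that `remainingRounds` equals `maxRounds - round - 1`, and that an empty tool-call list is checked before the round limit. A decision function consistent with those expectations could be sketched as follows. The name `decideToolLoopContinuationSketch` and the clamp at zero are assumptions. The actual `tool-loop-control.js` is not part of this diff.

```javascript
// Hypothetical sketch; the real decideToolLoopContinuation is not shown in this diff.
function decideToolLoopContinuationSketch({round, maxRounds, toolCalls}) {
    // Rounds are zero-based, so round 2 of maxRounds 3 leaves nothing.
    const remainingRounds = Math.max(0, maxRounds - round - 1);

    if (toolCalls.length === 0) {
        return {continue: false, reason: "no_tool_calls", remainingRounds};
    }
    if (remainingRounds === 0) {
        return {continue: false, reason: "max_rounds_reached", remainingRounds};
    }

    return {continue: true, reason: undefined, remainingRounds};
}
```

Checking for missing tool calls first means a model that answers without tools on the last round is reported as `no_tool_calls`, not as a round-limit stop.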
@@ -0,0 +1,37 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {runToolLoopRounds} = await import("../dist/ai/tool-loop-runner.js");
+
+test("tool loop runner stops when handler requests it", async () => {
+    const rounds = [];
+
+    await runToolLoopRounds({
+        maxRounds: 5,
+        async onRound(round) {
+            rounds.push(round);
+            return {shouldContinue: round < 1};
+        },
+    });
+
+    assert.deepEqual(rounds, [0, 1]);
+});
+
+test("tool loop runner calls max rounds hook when handler never stops", async () => {
+    const rounds = [];
+    let maxRoundsReached = -1;
+
+    await runToolLoopRounds({
+        maxRounds: 3,
+        async onRound(round) {
+            rounds.push(round);
+            return {shouldContinue: true};
+        },
+        onMaxRoundsReached(round) {
+            maxRoundsReached = round;
+        },
+    });
+
+    assert.deepEqual(rounds, [0, 1, 2]);
+    assert.equal(maxRoundsReached, 2);
+});
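Note on the runner tests above: they describe a loop that calls `onRound` with a zero-based index, stops as soon as the handler returns `shouldContinue: false`, and otherwise reports the last executed round to `onMaxRoundsReached`. A minimal sketch of such a loop follows. It is not the code from `tool-loop-runner.js`: the name `runToolLoopRoundsSketch` is hypothetical, and treating the hook as optional is an assumption based on the first test omitting it.

```javascript
// Hypothetical sketch; the real runToolLoopRounds is not shown in this diff.
async function runToolLoopRoundsSketch({maxRounds, onRound, onMaxRoundsReached}) {
    for (let round = 0; round < maxRounds; round++) {
        const {shouldContinue} = await onRound(round);
        if (!shouldContinue) {
            return; // the handler asked to stop, so the hook is not called
        }
    }

    // Every round asked to continue: report the index of the last executed round.
    if (maxRounds > 0) {
        await onMaxRoundsReached?.(maxRounds - 1);
    }
}
```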
@@ -0,0 +1,47 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {summarizeToolLoop} = await import("../dist/ai/tool-loop-summary.js");
+
+test("tool loop summary skips empty tool execution batches", () => {
+    const summary = summarizeToolLoop({
+        text: "answer",
+        executions: [],
+        outputAttachments: [],
+    });
+
+    assert.equal(summary.status, "skipped");
+    assert.equal(summary.fallbackAction, "continue_without_stage");
+    assert.equal(summary.details.count, 0);
+    assert.deepEqual(summary.details.tools, []);
+    assert.deepEqual(summary.details.modelOutput, {
+        text: "answer",
+        toolExecutions: [],
+        outputAttachments: [],
+    });
+    assert.equal(summary.artifacts, undefined);
+});
+
+test("tool loop summary reports executions and summary artifact", () => {
+    const summary = summarizeToolLoop({
+        text: "answer",
+        executions: [{
+            toolName: "read_file",
+            callId: "call-1",
+            argumentsText: "{}",
+            resultChars: 12,
+        }],
+        outputAttachments: [],
+    });
+
+    assert.equal(summary.status, "succeeded");
+    assert.equal(summary.fallbackAction, undefined);
+    assert.equal(summary.details.count, 1);
+    assert.deepEqual(summary.details.tools, [{
+        toolName: "read_file",
+        callId: "call-1",
+        resultChars: 12,
+    }]);
+    assert.equal(summary.artifacts?.[0]?.kind, "tool_result");
+    assert.equal(summary.artifacts?.[0]?.stage, "tool_loop");
+});
@@ -0,0 +1,126 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {runToolRankStage} = await import("../dist/ai/tool-rank-stage.js");
+
+function createStreamMessage() {
+    const events = [];
+    const state = {
+        status: "",
+        events,
+        setStatus(value) {
+            state.status = value;
+        },
+        clearStatus() {
+            state.status = "";
+        },
+        async flush() {},
+        async storePipelineAudit(batch) {
+            events.push(...batch);
+        },
+    };
+
+    return state;
+}
+
+function createAuditRecorder() {
+    const events = [];
+    return {
+        events,
+        async storeAudit(params) {
+            events.push({
+                stage: "tool_rank",
+                status: params.error ? "failed" : "succeeded",
+                details: {
+                    round: params.round,
+                    availableTools: params.availableTools,
+                    selectedTools: params.selectedTools ?? [],
+                    usedRanker: params.usedRanker ?? false,
+                    toolRankDecision: {
+                        provider: params.provider,
+                        round: params.round,
+                        availableTools: params.availableTools,
+                        selectedTools: params.selectedTools ?? [],
+                        usedRanker: params.usedRanker ?? false,
+                    },
+                },
+            });
+        },
+    };
+}
+
+test("tool rank stage clears status after success and stores decision audit", async () => {
+    const streamMessage = createStreamMessage();
+    const audit = createAuditRecorder();
+    const result = await runToolRankStage({
+        provider: "OLLAMA",
+        model: "test-model",
+        round: 0,
+        config: {
+            toolRankerFallbackPolicy: "NO_TOOLS",
+        },
+        availableTools: [{name: "read_file"}],
+        messages: [{role: "user", content: "прочитай src/index.ts"}],
+        streamMessage,
+        signal: new AbortController().signal,
+        storeAudit: audit.storeAudit,
+        toolRanker: {
+            async selectTools() {
+                return {
+                    toolNames: ["read_file"],
+                    usedRanker: true,
+                };
+            },
+        },
+    });
+
+    assert.deepEqual(result.selectedToolNames, ["read_file"]);
+    assert.deepEqual(result.filteredTools, [{name: "read_file"}]);
+    assert.equal(result.usedRanker, true);
+    assert.equal(streamMessage.status, "");
+    assert.equal(audit.events.length, 1);
+    assert.equal(audit.events[0].stage, "tool_rank");
+    assert.equal(audit.events[0].status, "succeeded");
+    assert.deepEqual(audit.events[0].details.toolRankDecision, {
+        provider: "OLLAMA",
+        round: 0,
+        availableTools: ["read_file"],
+        selectedTools: ["read_file"],
+        usedRanker: true,
+    });
+});
+
+test("tool rank stage clears status after failure", async () => {
+    const streamMessage = createStreamMessage();
+    const audit = createAuditRecorder();
+    await assert.rejects(() => runToolRankStage({
+        provider: "OLLAMA",
+        model: "test-model",
+        round: 1,
+        config: {
+            toolRankerFallbackPolicy: "NO_TOOLS",
+        },
+        availableTools: [{name: "read_file"}],
+        messages: [{role: "user", content: "прочитай src/index.ts"}],
+        streamMessage,
+        signal: new AbortController().signal,
+        storeAudit: audit.storeAudit,
+        toolRanker: {
+            async selectTools() {
+                throw new Error("ranker failed");
+            },
+        },
+    }), /ranker failed/);
+
+    assert.equal(streamMessage.status, "");
+    assert.equal(audit.events.length, 1);
+    assert.equal(audit.events[0].stage, "tool_rank");
+    assert.equal(audit.events[0].status, "failed");
+    assert.deepEqual(audit.events[0].details.toolRankDecision, {
+        provider: "OLLAMA",
+        round: 1,
+        availableTools: ["read_file"],
+        selectedTools: [],
+        usedRanker: false,
+    });
+});
@@ -0,0 +1,69 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+const {ToolRankerFallbackPolicy} = await import("../dist/common/policies.js");
+const {
+    decideToolRankerFallback,
+    resolveToolRankerFallbackSelection,
+} = await import("../dist/ai/tool-ranker-fallback.js");
+
+const availableToolNames = ["read_file", "search_files"];
+
+test("tool ranker fallback returns no tools when policy is NO_TOOLS", () => {
+    assert.deepEqual(
+        resolveToolRankerFallbackSelection({
+            fallbackPolicy: ToolRankerFallbackPolicy.NO_TOOLS,
+            availableToolNames,
+        }),
+        {
+            toolNames: [],
+            usedRanker: false,
+        },
+    );
+});
+
+test("tool ranker fallback returns all tools when policy is ALL_TOOLS", () => {
+    assert.deepEqual(
+        resolveToolRankerFallbackSelection({
+            fallbackPolicy: ToolRankerFallbackPolicy.ALL_TOOLS,
+            availableToolNames,
+        }),
+        {
+            toolNames: ["read_file", "search_files"],
+            usedRanker: false,
+        },
+    );
+});
+
+test("tool ranker fallback decision uses executor semantics", () => {
+    assert.deepEqual(
+        decideToolRankerFallback({
+            fallbackPolicy: ToolRankerFallbackPolicy.MAIN_MODEL,
+            availableToolNames,
+            reason: "failed",
+        }),
+        {
+            stage: "tool_rank",
+            reason: "failed",
+            action: "use_alternate_target",
+            shouldContinue: true,
+            shouldNotifyUser: false,
+            shouldFailRequest: false,
+            toolNames: ["read_file", "search_files"],
+            usedRanker: false,
+        },
+    );
+});
+
+test("tool ranker fallback keeps all tools when policy is MAIN_MODEL", () => {
+    assert.deepEqual(
+        resolveToolRankerFallbackSelection({
+            fallbackPolicy: ToolRankerFallbackPolicy.MAIN_MODEL,
+            availableToolNames,
+        }),
+        {
+            toolNames: ["read_file", "search_files"],
+            usedRanker: false,
+        },
+    );
+});
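Note on the fallback selection tests above: they expect `NO_TOOLS` to yield an empty list, while `ALL_TOOLS` and `MAIN_MODEL` both keep every available tool, always with `usedRanker: false`. The sketch below shows one function shape that matches those expectations. It is an assumption-laden illustration: the enum object `ToolRankerFallbackPolicySketch` and its string values are invented here (the real values live in `policies.js`, which this diff does not show), and the function name is hypothetical.

```javascript
// Hypothetical stand-in for ToolRankerFallbackPolicy; real values are not shown in this diff.
const ToolRankerFallbackPolicySketch = Object.freeze({
    NO_TOOLS: "NO_TOOLS",
    ALL_TOOLS: "ALL_TOOLS",
    MAIN_MODEL: "MAIN_MODEL",
});

// Hypothetical sketch of resolveToolRankerFallbackSelection.
function resolveToolRankerFallbackSelectionSketch({fallbackPolicy, availableToolNames}) {
    // Only NO_TOOLS removes tools; the other policies pass the full list on.
    const toolNames = fallbackPolicy === ToolRankerFallbackPolicySketch.NO_TOOLS
        ? []
        : [...availableToolNames];

    // The ranker did not produce this selection, so usedRanker is always false.
    return {toolNames, usedRanker: false};
}
```

Copying `availableToolNames` instead of returning it directly keeps later mutations of the selection from leaking back into the caller's list.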
Author	SHA1	Message	Date
melod1n	a143d512ab	Remove pipeline todo checklist	2026-05-18 22:43:51 +03:00
melod1n	d47e2288d6	Add pipeline integration tests	2026-05-18 22:09:44 +03:00
melod1n	7b2bc93bc1	Add stale RAG provider cleanup	2026-05-18 21:27:41 +03:00
melod1n	75253534d8	Add AI observability commands and metrics	2026-05-18 20:58:19 +03:00
melod1n	53e9798193	Merge reply-chain documents into AI requests	2026-05-18 20:43:35 +03:00
melod1n	df39d89ea8	Localize pipeline fallback notifications	2026-05-18 20:31:04 +03:00
melod1n	1773b44edd	Add fallback target logging and unified failures	2026-05-18 20:22:47 +03:00
melod1n	507b15aa5f	Add centralized pipeline fallback notifier	2026-05-18 20:13:19 +03:00
melod1n	d163d72a0b	Split model call and tool loop helpers	2026-05-18 19:55:00 +03:00
melod1n	57985ce87b	Persist tool loop summary artifact	2026-05-18 19:31:48 +03:00
melod1n	9a105caf0b	Add shared tool loop stop policy	2026-05-18 19:24:39 +03:00
melod1n	13df2a1c23	Extract shared tool batch adapter helper	2026-05-18 19:18:22 +03:00
melod1n	9352ade19f	Summarize tool loop output	2026-05-18 19:05:13 +03:00
melod1n	9d6cdb008b	Normalize model call output	2026-05-18 18:59:09 +03:00
melod1n	e520c412af	Route tool ranker fallback through executor	2026-05-18 17:16:28 +03:00
melod1n	58f5a645fd	Add tool ranker fallback policy tests	2026-05-18 16:23:32 +03:00
melod1n	c3481dfcfe	Inline tool rank audit into stage	2026-05-18 16:10:03 +03:00
melod1n	b16c213afb	Isolate tool rank stage pipeline	2026-05-18 16:03:47 +03:00
melod1n	8aede4b053	Add unified request pipeline stages	2026-05-18 15:45:39 +03:00