From 5f73298cd5c1aa53f010c4db3477a434701a854c Mon Sep 17 00:00:00 2001 From: Charlotte Hoblik Date: Wed, 17 Dec 2025 09:49:35 +0100 Subject: [PATCH 1/6] Add token usage section --- .../agent-builder/limitations-known-issues.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/solutions/search/agent-builder/limitations-known-issues.md b/solutions/search/agent-builder/limitations-known-issues.md index db7d53dcfc..c108a924dc 100644 --- a/solutions/search/agent-builder/limitations-known-issues.md +++ b/solutions/search/agent-builder/limitations-known-issues.md @@ -24,6 +24,21 @@ However, it must be enabled for non-serverless deployments {applies_to}`stack: p In the first release of {{agent-builder}} on serverless, the feature is **only available on {{es}} projects**. +### Token usage and counting + +When using {{agent-builder}}, total token usage will typically exceed the visible conversation text. Because {{agent-builder}} utilizes an agentic framework, a single user request often triggers multiple model calls (rounds) to process reasoning steps, execute tools, and interpret results. + +Token counts include: + +* **Input Tokens:** These accumulate throughout the session. They include the user's current query, the conversation history from previous rounds, system prompts, and the results returned from any tools used during execution. +* **Output Tokens:** These include the final response visible to the user, as well as all internal reasoning steps, tool calls, and intermediate results generated by the model. + +:::{note} +As the conversation history grows and the agent performs more complex reasoning loops, the input and output token counts will increase multiplicatively for each round of execution. +::: + +For more information on billing and token costs, refer to [Elastic pricing](https://www.elastic.co/pricing). 
+ ## Known issues ### Incompatible LLMs From b4d6b6057b2ca6278b58b4e32637c475869ef4ee Mon Sep 17 00:00:00 2001 From: Charlotte Hoblik Date: Wed, 17 Dec 2025 12:16:57 +0100 Subject: [PATCH 2/6] Fix word choice --- solutions/search/agent-builder/limitations-known-issues.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/solutions/search/agent-builder/limitations-known-issues.md b/solutions/search/agent-builder/limitations-known-issues.md index c108a924dc..6e77d1ae27 100644 --- a/solutions/search/agent-builder/limitations-known-issues.md +++ b/solutions/search/agent-builder/limitations-known-issues.md @@ -26,7 +26,7 @@ In the first release of {{agent-builder}} on serverless, the feature is **only a ### Token usage and counting -When using {{agent-builder}}, total token usage will typically exceed the visible conversation text. Because {{agent-builder}} utilizes an agentic framework, a single user request often triggers multiple model calls (rounds) to process reasoning steps, execute tools, and interpret results. +When using {{agent-builder}}, total token usage typically exceeds the visible conversation text. Because {{agent-builder}} utilizes an agentic framework, a single user request often triggers multiple model calls (rounds) to process reasoning steps, run tools, and interpret results. Token counts include: @@ -34,7 +34,7 @@ Token counts include: * **Output Tokens:** These include the final response visible to the user, as well as all internal reasoning steps, tool calls, and intermediate results generated by the model. :::{note} -As the conversation history grows and the agent performs more complex reasoning loops, the input and output token counts will increase multiplicatively for each round of execution. +As the conversation history grows and the agent performs more complex reasoning loops, the input and output token count increases multiplicatively for each round of execution. 
::: For more information on billing and token costs, refer to [Elastic pricing](https://www.elastic.co/pricing). From f8b4c12c35db88af2deb483480aca8a7e167d0ee Mon Sep 17 00:00:00 2001 From: Charlotte Hoblik Date: Thu, 18 Dec 2025 14:00:31 +0100 Subject: [PATCH 3/6] move token usage to a new page --- .../agent-builder/agent-builder-agents.md | 4 ++++ solutions/search/agent-builder/chat.md | 4 ++++ .../agent-builder/limitations-known-issues.md | 15 ------------ .../search/agent-builder/monitor-usage.md | 24 +++++++++++++++++++ solutions/search/agent-builder/tools.md | 4 ++++ solutions/search/elastic-agent-builder.md | 6 +++++ 6 files changed, 42 insertions(+), 15 deletions(-) create mode 100644 solutions/search/agent-builder/monitor-usage.md diff --git a/solutions/search/agent-builder/agent-builder-agents.md b/solutions/search/agent-builder/agent-builder-agents.md index 7c28d382fa..fe3e561be0 100644 --- a/solutions/search/agent-builder/agent-builder-agents.md +++ b/solutions/search/agent-builder/agent-builder-agents.md @@ -18,6 +18,10 @@ An agent parses user requests to define a goal and then runs tools in a loop to When you ask a question to an agent, it analyzes your request to define a specific goal. It selects the most appropriate tools and determines the right arguments to use. The agent evaluates the information returned after each action and decides whether to use additional tools or formulate a response. This iterative process of tool selection, execution, and analysis continues until the agent can provide a complete answer. +:::{note} +This iterative process consumes tokens. To understand how usage is calculated, refer to [Token usage in Elastic Agent Builder](monitor-usage.md#token-usage-in-elastic-agent-builder). +::: + {{agent-builder}} includes a default agent (named `Elastic AI Agent`) with access to all built-in tools. You can create specialized agents with custom instructions and selected tools to address specific use cases or workflows. 
:::{note} diff --git a/solutions/search/agent-builder/chat.md b/solutions/search/agent-builder/chat.md index 3b00b1d8d3..3102a8e436 100644 --- a/solutions/search/agent-builder/chat.md +++ b/solutions/search/agent-builder/chat.md @@ -35,6 +35,10 @@ This takes you to the chat GUI: Use the text input area to chat with an agent in real time. By default, you chat with the built-in Elastic AI Agent. +:::{note} +Conversations with agents consume tokens. To understand how usage is calculated, refer to [Token usage in Elastic Agent Builder](monitor-usage.md#token-usage-in-elastic-agent-builder). +::: + :::{image} images/agent-builder-chat-input.png :alt: Text input area for chatting with agents :width: 850px diff --git a/solutions/search/agent-builder/limitations-known-issues.md b/solutions/search/agent-builder/limitations-known-issues.md index 6e77d1ae27..db7d53dcfc 100644 --- a/solutions/search/agent-builder/limitations-known-issues.md +++ b/solutions/search/agent-builder/limitations-known-issues.md @@ -24,21 +24,6 @@ However, it must be enabled for non-serverless deployments {applies_to}`stack: p In the first release of {{agent-builder}} on serverless, the feature is **only available on {{es}} projects**. -### Token usage and counting - -When using {{agent-builder}}, total token usage typically exceeds the visible conversation text. Because {{agent-builder}} utilizes an agentic framework, a single user request often triggers multiple model calls (rounds) to process reasoning steps, run tools, and interpret results. - -Token counts include: - -* **Input Tokens:** These accumulate throughout the session. They include the user's current query, the conversation history from previous rounds, system prompts, and the results returned from any tools used during execution. -* **Output Tokens:** These include the final response visible to the user, as well as all internal reasoning steps, tool calls, and intermediate results generated by the model. 
- -:::{note} -As the conversation history grows and the agent performs more complex reasoning loops, the input and output token count increases multiplicatively for each round of execution. -::: - -For more information on billing and token costs, refer to [Elastic pricing](https://www.elastic.co/pricing). - ## Known issues ### Incompatible LLMs diff --git a/solutions/search/agent-builder/monitor-usage.md b/solutions/search/agent-builder/monitor-usage.md new file mode 100644 index 0000000000..a4572167eb --- /dev/null +++ b/solutions/search/agent-builder/monitor-usage.md @@ -0,0 +1,24 @@ +--- +navigation_title: "Monitor usage" +applies_to: + stack: preview 9.2 + serverless: + elasticsearch: preview + observability: unavailable + security: unavailable +--- + +# Token usage in Elastic Agent Builder + +When using {{agent-builder}}, total token usage typically exceeds the visible conversation text. Because {{agent-builder}} utilizes an agentic framework, a single user request often triggers multiple model calls (rounds) to process reasoning steps, run tools, and interpret results. + +Token counts include: + +* **Input Tokens:** These accumulate throughout the session. They include the user's current query, the conversation history from previous rounds, system prompts, and the results returned from any tools used during execution. +* **Output Tokens:** These include the final response visible to the user, as well as all internal reasoning steps, tool calls, and intermediate results generated by the model. + +:::{note} +As the conversation history grows and the agent performs more complex reasoning loops, the input and output token count increases multiplicatively for each round of execution. +::: + +For more information on billing and token costs, refer to [Elastic pricing](https://www.elastic.co/pricing). 
\ No newline at end of file diff --git a/solutions/search/agent-builder/tools.md b/solutions/search/agent-builder/tools.md index 52a8afce0e..86b105e07b 100644 --- a/solutions/search/agent-builder/tools.md +++ b/solutions/search/agent-builder/tools.md @@ -34,6 +34,10 @@ Tools enable agents to work with {{es}} data. When an agent receives a natural l Each tool is an atomic operation with a defined signature - accepting typed parameters and returning structured results in a format the agent can parse, transform, and incorporate into its response generation. +:::{note} +Tool execution and result processing consume tokens. To understand how usage is calculated, refer to [Token usage in Elastic Agent Builder](monitor-usage.md#token-usage-in-elastic-agent-builder). +::: + ## Built-in tools {{agent-builder}} ships with a comprehensive set of built-in tools that provide core capabilities for working with your {{es}} data. These tools are ready to use. They cannot be modified or deleted. diff --git a/solutions/search/elastic-agent-builder.md b/solutions/search/elastic-agent-builder.md index 5b6c0915fa..07cd8f4f3b 100644 --- a/solutions/search/elastic-agent-builder.md +++ b/solutions/search/elastic-agent-builder.md @@ -72,6 +72,12 @@ Configure security roles and API keys to control who can use agents, which tools [**Learn more about permissions and access control**](agent-builder/permissions.md) +## Monitor usage + +Understand how tokens are calculated and accumulated during agent execution to predict the impact on your usage and costs. + +[**Learn more about token usage**](agent-builder/monitor-usage.md) + ## Limitations and known issues {{agent-builder}} is in technical preview. 
From 336c538ce69b2e9b6bb84e07381bcc8c973b2702 Mon Sep 17 00:00:00 2001 From: Copilot <198982749+Copilot@users.noreply.github.com> Date: Thu, 18 Dec 2025 14:46:05 +0100 Subject: [PATCH 4/6] Remove anchor fragments from monitor-usage.md links (#4413) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Removed the `#token-usage-in-elastic-agent-builder` anchor from all links to `monitor-usage.md` to simplify navigation and improve maintainability. **Changes:** - Updated links in `chat.md`, `agent-builder-agents.md`, and `tools.md` to reference `monitor-usage.md` directly without anchor fragments - Links now point to the page itself rather than a specific section, allowing the page structure to evolve independently Before: ```markdown [Token usage in Elastic Agent Builder](monitor-usage.md#token-usage-in-elastic-agent-builder) ``` After: ```markdown [Token usage in Elastic Agent Builder](monitor-usage.md) ``` --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: leemthompo <32779855+leemthompo@users.noreply.github.com> --- solutions/search/agent-builder/agent-builder-agents.md | 2 +- solutions/search/agent-builder/chat.md | 2 +- solutions/search/agent-builder/tools.md | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/solutions/search/agent-builder/agent-builder-agents.md b/solutions/search/agent-builder/agent-builder-agents.md index fe3e561be0..c563112dd8 100644 --- a/solutions/search/agent-builder/agent-builder-agents.md +++ b/solutions/search/agent-builder/agent-builder-agents.md @@ -19,7 +19,7 @@ An agent parses user requests to define a goal and then runs tools in a loop to When you ask a question to an agent, it analyzes your request to define a specific goal. 
It selects the most appropriate tools and determines the right arguments to use. The agent evaluates the information returned after each action and decides whether to use additional tools or formulate a response. This iterative process of tool selection, execution, and analysis continues until the agent can provide a complete answer. :::{note} -This iterative process consumes tokens. To understand how usage is calculated, refer to [Token usage in Elastic Agent Builder](monitor-usage.md#token-usage-in-elastic-agent-builder). +This iterative process consumes tokens. To understand how usage is calculated, refer to [Token usage in Elastic Agent Builder](monitor-usage.md). ::: {{agent-builder}} includes a default agent (named `Elastic AI Agent`) with access to all built-in tools. You can create specialized agents with custom instructions and selected tools to address specific use cases or workflows. diff --git a/solutions/search/agent-builder/chat.md b/solutions/search/agent-builder/chat.md index 3102a8e436..b547c691b1 100644 --- a/solutions/search/agent-builder/chat.md +++ b/solutions/search/agent-builder/chat.md @@ -36,7 +36,7 @@ This takes you to the chat GUI: Use the text input area to chat with an agent in real time. By default, you chat with the built-in Elastic AI Agent. :::{note} -Conversations with agents consume tokens. To understand how usage is calculated, refer to [Token usage in Elastic Agent Builder](monitor-usage.md#token-usage-in-elastic-agent-builder). +Conversations with agents consume tokens. To understand how usage is calculated, refer to [Token usage in Elastic Agent Builder](monitor-usage.md). ::: :::{image} images/agent-builder-chat-input.png diff --git a/solutions/search/agent-builder/tools.md b/solutions/search/agent-builder/tools.md index 86b105e07b..d95e3af371 100644 --- a/solutions/search/agent-builder/tools.md +++ b/solutions/search/agent-builder/tools.md @@ -35,7 +35,7 @@ Tools enable agents to work with {{es}} data. 
When an agent receives a natural l Each tool is an atomic operation with a defined signature - accepting typed parameters and returning structured results in a format the agent can parse, transform, and incorporate into its response generation. :::{note} -Tool execution and result processing consume tokens. To understand how usage is calculated, refer to [Token usage in Elastic Agent Builder](monitor-usage.md#token-usage-in-elastic-agent-builder). +Tool execution and result processing consume tokens. To understand how usage is calculated, refer to [Token usage in Elastic Agent Builder](monitor-usage.md). ::: ## Built-in tools From 6bfce8fee622252dadc0005d3c3f1a7c0338b9eb Mon Sep 17 00:00:00 2001 From: Copilot <198982749+Copilot@users.noreply.github.com> Date: Thu, 18 Dec 2025 15:11:07 +0100 Subject: [PATCH 5/6] [WIP] Add token usage and counting to Agent Builder (#4414) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Thanks for the feedback on #4384. I've created this new PR, which merges into #4384, to address your comment. I will work on the changes and keep this PR's description up to date as I make progress. Original PR: #4384 Triggering comment (https://github.com/elastic/docs-content/pull/4384#issuecomment-3670426298): > @copilot we need to add the new page to the `/solutions/toc.yml` file too 
--------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: leemthompo <32779855+leemthompo@users.noreply.github.com> --- solutions/toc.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/solutions/toc.yml b/solutions/toc.yml index 82f9883d84..205b3bee8f 100644 --- a/solutions/toc.yml +++ b/solutions/toc.yml @@ -77,6 +77,7 @@ toc: - file: search/agent-builder/kibana-api.md - file: search/agent-builder/a2a-server.md - file: search/agent-builder/mcp-server.md + - file: search/agent-builder/monitor-usage.md - file: search/agent-builder/permissions.md - file: search/agent-builder/limitations-known-issues.md - file: search/rag.md From 3e7f4da36ea382f616a1b1ddf02cdae2f6667567 Mon Sep 17 00:00:00 2001 From: Liam Thompson Date: Fri, 19 Dec 2025 16:19:03 +0100 Subject: [PATCH 6/6] Clarify how token usage scales with conversation rounds. --- solutions/search/agent-builder/monitor-usage.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/solutions/search/agent-builder/monitor-usage.md b/solutions/search/agent-builder/monitor-usage.md index a4572167eb..7703f56bc2 100644 --- a/solutions/search/agent-builder/monitor-usage.md +++ b/solutions/search/agent-builder/monitor-usage.md @@ -18,7 +18,7 @@ Token counts include: * **Output Tokens:** These include the final response visible to the user, as well as all internal reasoning steps, tool calls, and intermediate results generated by the model. :::{note} -As the conversation history grows and the agent performs more complex reasoning loops, the input and output token count increases multiplicatively for each round of execution. +Each conversation round includes all previous rounds as context. This means token usage at each step depends on the entire conversation size, not just the current message. ::: -For more information on billing and token costs, refer to [Elastic pricing](https://www.elastic.co/pricing). 
\ No newline at end of file +For more information on billing and token costs, refer to [Elastic pricing](https://www.elastic.co/pricing).
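
Reviewer's illustration (not part of the patch): the revised note in PATCH 6/6 says each round includes all previous rounds as context. The sketch below shows what that implies for input tokens under simplified assumptions. The function name `cumulative_input_tokens`, its parameters, and all token figures are hypothetical; this is not how {{agent-builder}} actually meters usage, and it ignores output tokens and any provider-side caching.

```python
def cumulative_input_tokens(new_tokens_per_round, system_prompt_tokens=0):
    """Input tokens sent in each round, assuming every round resends
    the system prompt plus everything accumulated so far.

    Hypothetical model for illustration only.
    """
    per_round = []
    context = system_prompt_tokens
    for new_tokens in new_tokens_per_round:
        # The new content (user query, tool results, and so on) joins the
        # running context, and the whole context is sent as input.
        context += new_tokens
        per_round.append(context)
    return per_round


# Four rounds that each add 100 new tokens, on top of a 50-token system prompt.
per_round = cumulative_input_tokens([100, 100, 100, 100], system_prompt_tokens=50)
total = sum(per_round)

print(per_round)  # [150, 250, 350, 450]
print(total)      # 1200
```

In this toy model, only 450 distinct tokens are ever introduced (50 + 4 × 100), yet 1200 input tokens are sent across the four rounds. This is why usage at each step depends on the size of the whole conversation rather than the current message alone.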