SimpleMCP - server/public_simplechat tiny client updated with reasoning, vision, builtin clientside tool calls, markdown and mcpish toolcall support [WIP] #17853
base: master
Conversation
Expose pdf2text tool call to ai server and handshake with simple proxy for the same.
Make the description a bit more explicit about it supporting local file paths as part of the URL scheme, as the tested AI model was complaining about the file URL scheme not being supported. Need to check whether this new description makes things better. Convert the text to bytes before writing to the HTTP pipe. Keep CORS happy by passing Access-Control-Allow-Origin in the response header.
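A minimal sketch of the kind of response handling described here, assuming a http.server based handler; the names are illustrative, not the actual simpleproxy.py code:

```python
import http.server

class ProxyHandler(http.server.BaseHTTPRequestHandler):

    def send_text(self, text: str, status: int = 200):
        # Convert the extracted text to bytes before writing to the http pipe
        data = text.encode("utf-8")
        self.send_response(status)
        # Keep browser side CORS happy wrt the chat client's fetch() calls
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)
```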
Allow the user to limit the maximum amount of result data returned to the AI after a tool call; it defaults to 2K. Update the pdf2text tool description to try to make the local file path support more explicit.
Needed to tweak the description further for the AI model to understand that it is OK to pass file:// scheme based URLs. Had forgotten how big web site pages have become, as well as the need for a larger ResultDataLength wrt a one shot PDF read, to get at least a good enough amount of content out of large PDFs.
Half asleep as usual ;)
This makes the logic more generic, as well as preparing for additional parameters to be passed through the simpleproxy.py helper handshakes. Ex: restrict the extracted contents of a PDF to specified start and end page numbers or so.
As I was seeing the truncated message even for stripped plain-text web access, relooking at the initial go at truncating revealed an oversight: the truncation logic triggered any time iResultMaxDataLength was greater than 0, irrespective of whether the actual result was smaller than the allowed limit, thus appending the truncation message to the end of the result unnecessarily. Have fixed that oversight. Also the recent any-number-of-args based simpleproxy handshake helper in toolweb seems to be working (at least for the existing single-arg based calls).
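The corrected behaviour, as a small illustrative sketch (the actual fix lives in the client side JS; names here are hypothetical):

```python
TRUNCATED_NOTE = "\n...[result truncated]..."

def maybe_truncate(result: str, result_max_len: int) -> str:
    # Truncate (and only then append the truncation notice) when a limit is
    # set AND the result actually exceeds it; previously the notice was
    # appended whenever the limit was greater than 0.
    if result_max_len > 0 and len(result) > result_max_len:
        return result[:result_max_len] + TRUNCATED_NOTE
    return result
```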
Initial go, need to review the code flow as well as test it out
This is an initial go wrt the new overall flow; it should work, but needs cross checking.
Copy validate_url and build initial skeleton
Check whether the specified scheme is allowed. If allowed, call the corresponding validator to check that the remaining part of the URL is fine.
Add the --allowed.schemes config entry as a needed config. Set up the URL validator and use it wrt urltext, urlraw and pdf2text. This allows the user to control whether local file access is enabled. By default, in the sample simpleproxy.json config file, local file access is allowed.
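A rough sketch of the scheme gating idea, with hypothetical config and validator names (the real simpleproxy.py code may differ):

```python
from urllib.parse import urlparse, ParseResult

# Hypothetical: filled in from the --allowed.schemes config entry,
# e.g. ["http", "https", "file"]; dropping "file" disables local file access.
allowed_schemes: list[str] = ["http", "https", "file"]

def validate_file_url(parsed: ParseResult) -> tuple[bool, str]:
    # Placeholder per-scheme check for file:// urls (real code would vet the path)
    return (bool(parsed.path), "ok" if parsed.path else "empty file path")

def validate_web_url(parsed: ParseResult) -> tuple[bool, str]:
    # Placeholder per-scheme check for http(s) urls (real code would vet the host)
    return (bool(parsed.netloc), "ok" if parsed.netloc else "empty host")

def validate_url(url: str) -> tuple[bool, str]:
    parsed = urlparse(url)
    # First check whether the scheme itself is allowed at all
    if parsed.scheme not in allowed_schemes:
        return (False, f"scheme '{parsed.scheme}' not allowed")
    # If allowed, hand over to the scheme specific validator for the rest of the url
    if parsed.scheme == "file":
        return validate_file_url(parsed)
    return validate_web_url(parsed)
```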
Also trap any exceptions while handling requests and send the exception info to the client requesting the service.
Also move the debug dump helper to its own module. Also remember to specify the class name in quotes (similar to referring to a class within a member of that class) wrt Python type checking.
It is not necessary to always request a page number range. Take care of page numbers starting from 1 while the underlying data starts at index 0.
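A sketch of the off-by-one handling this refers to, assuming pypdf style 0-indexed pages and user facing 1-based page numbers (names illustrative):

```python
def select_pages(num_pages: int, start: int | None = None, end: int | None = None) -> range:
    """Map optional 1-based start/end page numbers to 0-based page indices."""
    first = 1 if start is None else max(1, start)              # no range given => whole pdf
    last = num_pages if end is None else min(num_pages, end)
    # Users count pages from 1, while the underlying page list starts at index 0
    return range(first - 1, last)

# e.g. a 10 page pdf with pages 2..4 requested -> indices 1, 2, 3
assert list(select_pages(10, 2, 4)) == [1, 2, 3]
assert list(select_pages(3)) == [0, 1, 2]
```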
Added logic to help get a file from either the local file system or from the web, based on the URL specified. Update the pdfmagic module to use the same, so that it can support both local and web based PDFs. Bring in the debug module, which I had forgotten to commit after moving the debug helper code from simpleproxy.py into it.
This also indirectly adds support for local file system access through the web / fetch (i.e. urlraw and urltext) service request paths.
Make it a details block and update the content a bit
Usage Note: * Clean up / fix some wording. * Pick the chat history handshake length from the config. Ensure the settings info is up to date wrt available tool names by chaining a re-show with the tools manager initialisation.
Rename the path and tags/identifiers from Pdf2Text to PdfText. Rename the function call to pdf_to_text; this should also help indicate the semantics more unambiguously, just in case, especially for smaller models.
Chances are that for AI models which don't support tool calling, the tool calls metadata shared will be silently ignored without much issue, so enable the tool calling feature by default, so that in case one is using an AI model with tool calling, the feature is readily available. Revert the SlidingWindow ChatHistory in Context from last 10 to last 5 by default (2 more than the original, given the larger context support in today's models), given that tool handshakes now go through the tools related side channel in the HTTP handshake and aren't morphed into the normal user-assistant channel of the handshake.
Helps ensure only service paths that can actually be serviced are enabled. Use the same to check for pypdf wrt pdftext.
Define a type alias HttpHeaders and use it wherever needed. For now map this to email.message.Message and dict. If and when Python evolves its HTTP headers type into a better one, it needs replacing in only one place. Add a ToolManager class which * maintains the list of tool calls and in turn allows any given tool call to be executed and its response returned along with the needed metadata * generates the overall tool calls metadata * add ToolCallResponseEx, which maintains the full TCOutResponse for use by tc_handle callers. Avoid duplicating handling of some of the basic needed HTTP header entries. Move the check for any dependencies before enabling a tool call into the respective tc??? module. * for now this also demotes the logic from the previous fine-grained per tool call dependency check to a more global dependency check at the respective module level.
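A very rough skeleton of the shape described above; the actual classes in the PR will differ in names and details, so treat this as a guess at the intent:

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Kept simple for now; if Python grows a better http headers type,
# only this one alias needs to change.
HttpHeaders = dict[str, str]

@dataclass
class ToolCallResponseEx:
    # Full response plus whatever meta data the tc_handle caller may need
    name: str
    result: str
    error: str = ""

@dataclass
class ToolManager:
    # tool name -> (callable, tool call meta data)
    tools: dict[str, tuple[Callable[..., str], dict[str, Any]]] = field(default_factory=dict)

    def meta(self) -> list[dict[str, Any]]:
        # Overall tool calls meta data, as a plain list, for the tools handshake
        return [m for _fn, m in self.tools.values()]

    def run(self, name: str, args: dict[str, Any]) -> ToolCallResponseEx:
        if name not in self.tools:
            return ToolCallResponseEx(name, "", f"unknown tool: {name}")
        fn, _meta = self.tools[name]
        return ToolCallResponseEx(name, fn(**args))
```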
Build the list of tool calls. Trap some of the MCP POST JSON based requests and map them to related handlers. In turn implement the tool call execution handler. Add some helper dataclasses wrt the expected MCP response structure. TOTHINK: For now maintain id as a string and not an int, with the idea of mapping it directly to the call id wrt the tool call handshake by the AI model. TOCHECK: For now the order of the jsonrpc and type fields wrt the MCP response related structures is shuffled, assuming the order shouldn't matter. Need to cross check.
Fix an oversight wrt ToolManager.meta, where I had created a dict of name-keyed tool call metas instead of a simple list of tool call metas; I had blindly duplicated the structure used for storing the tool calls in tc_switch in the anveshika sallap client side code. Add dataclasses to mimic the MCP tools/list response. However, wrt the 2 odd differences between the MCP structure and the OpenAI tools handshake structure, for now I have retained the OpenAI tools handshake structure. Add a common helper send_mcp to ProxyHandler, given that both mcp_toolscall and mcp_toolslist, and even others like mcp_initialise in future, require a common response mechanism. With the above and a bit more, an initial go at the tools/list response is in place.
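A hedged sketch of what the JSON-RPC style MCP response envelope being described could look like; field names are a guess at the intent, not a copy of the PR's code:

```python
from dataclasses import dataclass, field, asdict
from typing import Any

@dataclass
class MCPToolResultContent:
    text: str
    type: str = "text"

@dataclass
class MCPToolsCallResult:
    content: list[MCPToolResultContent] = field(default_factory=list)
    isError: bool = False

@dataclass
class MCPResponse:
    # id kept as a string so it can directly mirror the tool call id from the ai model
    id: str
    result: dict[str, Any]
    jsonrpc: str = "2.0"

resp = MCPResponse(id="call_0",
                   result=asdict(MCPToolsCallResult([MCPToolResultContent("hello")])))
print(asdict(resp))
```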
Given that there could be other service paths beyond /mcp exposed in future, and given that their POST body need not contain JSON data, move the conversion to JSON into the mcp_run handler. Retaining the reading of the body in the generic do_POST ensures that the read size limit is implicitly enforced, whether for /mcp now or any other path in future.
By default the bearer based auth check is always done, whether in HTTPS or HTTP mode. However, by setting the sec.bAuthAlways config entry to false, the bearer auth check is carried out only in HTTPS mode.
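A minimal sketch of the auth gating described, with hypothetical names for the config entry and helpers:

```python
def need_bearer_check(is_https: bool, auth_always: bool = True) -> bool:
    # auth_always mirrors the sec.bAuthAlways config entry: when True the bearer
    # check runs in both http and https mode, when False only in https mode.
    return auth_always or is_https

def check_bearer(headers: dict[str, str], expected_token: str) -> bool:
    return headers.get("Authorization", "") == f"Bearer {expected_token}"
```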
As expected, dataclass mutable default field values need default_factory. Don't forget to return after sending an error response. The TypeAlias type hinting flow seems to go beyond TYPE_CHECKING. Also email.message.Message[str, str] is not accepted, so keep things simple wrt HttpHeaders for now.
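The mutable default gotcha mentioned here, in isolation:

```python
from dataclasses import dataclass, field

@dataclass
class ToolsList:
    # tools: list = []                                # rejected: mutable default not allowed
    tools: list[dict] = field(default_factory=list)   # what dataclasses require instead
```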
Also enforce the need for a reasonably sane Content-Length header entry in our case. NOTE: it does allow 0 or other small content lengths, which aren't necessarily valid.
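A sketch of the kind of sanity check meant here (illustrative; the actual limit and handling in the PR may differ):

```python
MAX_BODY_LEN = 1 * 1024 * 1024  # hypothetical read size limit

def sane_content_length(headers: dict[str, str]) -> int | None:
    """Return the Content-Length if it is present and sane, else None.

    Note: 0 or other small lengths still pass, even if not necessarily valid.
    """
    raw = headers.get("Content-Length")
    if raw is None:
        return None
    try:
        length = int(raw)
    except ValueError:
        return None
    if length < 0 or length > MAX_BODY_LEN:
        return None
    return length
```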
Given that toolcall.py maintains ToolCall, ToolManager and the MCP related types and base classes, rename it to toolcalls.py. Also add the bash script with curl used for testing the tools/list MCP command. Remove the sample function meta ref, as tools/list is working OK.
Add logic to fetch tools/list from mcp server and pass it to tools manager.
Also fix some minor oversights wrt tools/list
Set up to test the initial go of the mcp-ish server and client logic.
Move the search web tool call from the previous js client + python simpleproxy based logic to the new simplemcp based logic, while following the same overall approach of reusing HtmlText's equivalent logic, now with predefined and user non-replaceable (at runtime) tagDrops and template urls.
Update the documentation a bit wrt the switch from simpleproxy to simplemcp with the mcp-ish kind of handshake between the chat client and simplemcp. Rename proxyUrl and related to mcpServerUrl and mcpServerAuth. Now include the path in the url itself, given that in future we may want to allow the chat client logic to handshake with other MCP servers, which may expose their services through a different path. Drop the SearchEngine related config entries from the chat session settings, given that this is now controlled directly in SimpleMCP.
Hi @ggerganov I understand you have a different view to mine wrt this, but I feel you should rethink things once, given all the feature additions I have done compared to the last time you looked at it, and also given the slightly different philosophy between this chat client and the default web UI, some of which I have noted below. If you look at this PR, you will notice that this alternate client UI continues to use a pure HTML + CSS + JS based flow (also avoiding dependence on external libraries in general) and now supports reasoning, vision, tool calling (with a bunch of useful built-in client side tool calls needing no additional setup, ++) and minimal MCP client capability. In turn all of this fits within 50KB of compressed source code size (including the Python simplemcp.py for web access and related tool calls). Also the logical UI elements have their own unique id/class, if one wants to theme. The default web UI, by contrast, is around 1.2 MB or so compressed, needs one to understand the Svelte framework (in addition to HTML/CSS/JS) and needs one to track the different bundled external modules. Also it currently doesn't support tool calling, and the plan leans more towards server side / back end MCP based tool call support, if I understand correctly. Given the above significant differences, I feel it makes more sense to continue this updated lightweight alternate chat client + UI option within llama.cpp itself, parallel to the default webui. My embedded background also biases me towards simple yet flexible and functional options. Either way, the final decision is up to you and the team of open source developers who work on llama.cpp proactively, rather than once-in-a-blue-moon me, as to whether you would prefer to apply these into llama.cpp itself or not. Do let me know your thoughts. NOTE: When I revisited AI after almost a year++, wanting to explore some of the recent AI developments, I couldn't find any sensible zero or minimal setup based tool calling supported open source AI clients out there, including the default bundled web UI, so I started on this series of patches/PRs.
@hanishkvc Appreciate the dedication, but I still think your client should be moved to a separate project. It's better to focus our efforts on the official WebUI as it is more feature-complete, secure and has wider developer adoption.
Had forgotten to update the docs wrt the renamed --op.configFile arg. Remove the unneeded space from details.md, which was triggering the editorconfig check at the upstream GitHub repo.
Hi @ggerganov Thanks for getting back. One suggestion and request I have is that the webui team should allow tool calling support to be added to the web UI. Then there is also the issue of whether tool calling / MCP should be supported through the back end, i.e. through the llama engine or llama server, or through the chat client logic (like what SimpleChat+SimpleMCP does). There are use cases for both kinds of flows, and based on the webui team's previous comments it appeared they are waiting for the backend to be updated wrt tool calling / MCP; that, along with me wanting to experiment with some aspects of it now and also wanting the flexibility of client side based tool calling, is what led to this series of PRs. What are your thoughts on where to place the tool calling / MCP handshake, i.e. at the backend, at the chat client end, or both? You mentioned security; is there any specific reason why you feel llama-server+webui+(whatever MCP solution is finally employed) is more secure compared to llama-server+simplechat+simplemcp? Interested to understand your perspective there. I could be wrong, but chances are the architecture simplechat+simplemcp follows, and the principle of minimal or no external dependencies, along with HTTPS + bearer auth, in turn along with the option to place the tool calling provider into a separate VM or so if needed (the basics of most of these are already included in this patch set), should ideally provide a fairly secure, configurable environment / setup. Is there some aspect wrt security I have missed? One place where maybe I am purposefully being a bit contrarian wrt security is in using Python's built-in socket and HTTP mechanisms to build the HTTPS server logic, instead of any 3rd party logic for the same, or say Go's excellent built-in standard modules. But given that over time the core Python standard modules will be the most checked, tested and fixed option, along with it being a readily experimented interpreted runtime with the source used directly as is, with minimal in-between transforms et al, that is the reason for picking it over a 3rd party module, or building things around Rust (whose standard bundled module set and its management is not tightly coupled enough) or Go. Also, with the external_ai tool call mechanism which I have included, even AI could be used to validate a tool call before triggering it, if needed in future. Interested to hear your thoughts.
I think the tool calls should be done on the client - don't think there is a plan to do them on the server-side. The WebUI will likely soon add official support for tool calling and MCP. About security - I noticed that your client has server-side PDF parsing (if I understood this correctly), which I think is less secure than the client-side PDF package that we use in the WebUI. Overall, I'm not very familiar with Web programming best practices and can't strongly comment on which approach/framework is better. I can appreciate the minimalism of your implementation, but on the other hand, I consider it much healthier for the project to have a rapidly evolving WebUI without compromises. The Svelte WebUI and the team behind it so far are doing a great job, so IMO it's better to focus on that.
With this and other PRs in this series, the alternate tiny tools/server/public_simplechat chat client has been updated to support reasoning, vision, built-in client side tool calls, markdown and mcp-ish tool call support.
Using this client UI along with llama-server, one can get the local AI to fetch and summarise the latest news, or get the latest research papers / details from arxiv / ... for a topic of interest and summarise the same, or generate javascript code snippets and test them out, or use it to validate mathematical statements the AI might make and/or answer queries around these, or ... it's up to you and the AI model ... the AI model can even call into system prompt based self-modified variants of itself, if it deems it necessary ...
Remember to cross check the tool calls before allowing their execution, and similarly cross check the responses before submitting them to the AI model, just to be on the safe side.
One can peek into the reasoning from AI models that support it. And for AI models that support vision, one can send images to explore.
One could get going with (update the arguments as needed)
build/bin/llama-server -m ../llama.cpp.models/gpt-oss-20b-mxfp4.gguf --jinja --path tools/server/public_simplechat/ --ctx-size 64000 --n-gpu-layers 12 -fa on
NOTE: even the default context size should be good enough for simple stuff. Explicitly set --ctx-size if working with many web site / PDF contents, as needed.
If one needs the additional power/flexibility that comes with the web search, web fetch and PDF related tool calls, then also run
cd tools/server/public_simplechat/local.tools; python3 ./simplemcp.py --op.configFile simplemcp.json
NOTE: Remember to edit simplemcp.json with the list of sites you want to allow access to, as well as to disable local file access, if needed.
Look into the included readme.md, details.md and changelog.md for additional info. Previous PR in this series: #17506
All features (except for PDF, which uses pypdf) are implemented internally without depending on any external libraries/modules, thereby avoiding the need to track multiple external dependencies, and the whole thing should fit within ~50KB of compressed size. This is built using pure HTML+CSS+JS in general, with additionally Python for simplemcp, to bypass the CORS++ restrictions in the browser environment for direct web access.
NOTE: The MCP-ish handshake between the chat client and simplemcp is currently implemented after glancing through the architecture page on the MCP standard's website and the sample JSON shown there, along with some logical guess work. Need to look into the actual MCP standard later, if needed. Also noticed some minimal differences between the MCP handshake structures and the OpenAI REST API handshake structures; for now I have followed the OpenAI related structures, with those tiny differences, even in the MCP handshake.