An asynchronous terminal server / multi-client setup for conducting and managing chats with LLMs.
This is the successor project to llama-farm.
The RAG/agent functionality should be split out into an API layer.
- client/server RPC-type architecture
- message signing
- ensure chunk ordering
- basic chat persistence and management
- set, switch to saved system prompts (personalities)
- manage prompts like chats (as files)
- chat truncation to token length
- rename chat
- profiles (profile x personalities -> sets of chats)
- import/export chat to local file
- context workspace (load/drop files)
- client inject from file
- client inject from other sources, e.g. youtube (trag)
- templates for standard instruction requests (trag)
- context workspace - bench/suspend files (hidden by filename)
- local files / folders in transient workspace
- checkboxes for delete / show / hide
- can switch between Anthropic, OpenAI, tabbyAPI providers and models
- streaming
- syntax highlighting
- decent REPL
- REPL command mode
- cut/copy from output
- client-side prompt editing
- vimish keys in output
- client-side chat/message editing (how? temporarily set the input field history? fire up $EDITOR in client?)
- edit via chat local import/export
- latex rendering (this is tricky in the context of prompt-toolkit, but see flatlatex)
- generation cancellation
- design with multimodal models in mind
- image sending and use
- image display
- use proper config dir (group?)
- dump default conf if missing
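The message-signing and chunk-ordering items above can be sketched together. This is a minimal illustration, not the project's actual wire format: it assumes a hypothetical shared secret and a JSON message envelope, signs each message with an HMAC, and reassembles streamed chunks by monotonically increasing sequence number so out-of-order delivery still yields ordered output.

```python
import hashlib
import hmac
import json

SECRET = b"shared-client-server-key"  # hypothetical shared secret


def sign(payload: dict, seq: int) -> dict:
    """Wrap a message with a sequence number and an HMAC-SHA256 signature."""
    body = json.dumps({"seq": seq, "payload": payload}, sort_keys=True)
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}


def verify(message: dict) -> dict:
    """Check the signature in constant time and return the inner message."""
    expected = hmac.new(SECRET, message["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["sig"]):
        raise ValueError("bad signature")
    return json.loads(message["body"])


class ChunkReassembler:
    """Deliver streamed chunks in order even if they arrive shuffled."""

    def __init__(self):
        self.next_seq = 0
        self.pending = {}

    def feed(self, seq: int, chunk: str) -> list[str]:
        # Buffer the chunk, then release the longest contiguous run.
        self.pending[seq] = chunk
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

In practice the secret would come from the config dir mentioned above rather than a constant, and the envelope would ride over whatever RPC transport the client/server pair uses.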
Use agents at the API level, which is to say, use an intelligent router. This separates the chatthy system from the RAG/LLM logic.
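One way to realise that separation is a small intent router at the API boundary: chatthy hands a message to the router, and registered handlers own all RAG/agent logic. The sketch below is hypothetical (handler names and the keyword classifier are placeholders; a real router might ask a small LLM to classify).

```python
from typing import Callable

# Hypothetical registry mapping an inferred intent to an API-level handler,
# keeping the chat system ignorant of RAG/LLM internals.
HANDLERS: dict[str, Callable[[str], str]] = {}


def route(intent: str):
    """Decorator registering a handler for an intent."""
    def deco(fn):
        HANDLERS[intent] = fn
        return fn
    return deco


@route("rag")
def rag_query(message: str) -> str:
    return f"rag:{message}"  # stand-in for a vdb-backed answer


@route("chat")
def plain_chat(message: str) -> str:
    return f"chat:{message}"  # stand-in for a direct LLM call


def classify(message: str) -> str:
    # Naive keyword check; a smarter router could use a cheap model here.
    return "rag" if "search" in message.lower() else "chat"


def dispatch(message: str) -> str:
    return HANDLERS[classify(message)](message)
```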
- (auto) tools (evolve from llama-farm -> trag)
- user defined tool plugins
- server use of vdb context at the LLM's discretion (tool)
- iterative workflows (refer to llama-farm)
- tool chains
- tool: workspace file write, delete
- tool: workspace file patch/diff
- tool: rag query tool
- MCP agents?
- smolagents / archgw?
- summaries and standard client instructions (trag)
- server use vdb context on request
- consider best method of pdf conversion / ingestion, OOB
- full arxiv paper ingestion (fvdb) - consolidate into one latex file OOB
- vdb result reranking with context, and winnowing (agent?)
- vdb results -> workspace (agent?)
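For the user-defined tool plugins above, one plausible shape is a decorator-based registry that derives a tool's parameter list from its Python signature, so plugins are just decorated functions. The names here (`tool`, `workspace_write`, `call_tool`) are illustrative, not the project's API.

```python
import inspect

TOOLS: dict[str, dict] = {}


def tool(fn):
    """Register a function as an LLM-callable tool, deriving metadata from its signature."""
    TOOLS[fn.__name__] = {
        "fn": fn,
        "params": list(inspect.signature(fn).parameters),
        "doc": inspect.getdoc(fn) or "",
    }
    return fn


@tool
def workspace_write(path: str, text: str) -> str:
    """Write text to a workspace file (illustrative stub, no real I/O)."""
    return f"wrote {len(text)} chars to {path}"


def call_tool(name: str, **kwargs) -> str:
    """Invoked when the model emits a tool call with JSON arguments."""
    return TOOLS[name]["fn"](**kwargs)
```

The registry's `params`/`doc` metadata is what would be serialised into the tool schema sent to Anthropic/OpenAI/tabbyAPI, so the same plugin works across providers.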
- audio streaming? (see matatonic's servers)
- workflows (trees of instruction templates)
- tasks:
  - arXiv paper -> latex / md
  - pdf paper -> latex / md