# chatthy

An asynchronous terminal server/multiple-client setup for conducting and managing chats with LLMs.

This is the successor project to llama-farm.

The RAG/agent functionality should be split out into an API layer.

## network architecture

- client/server RPC-type architecture
- message signing
- ensure chunk ordering (signing and ordering are sketched together after this list)
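
A minimal sketch of how signed, ordered chunks might look, using only the stdlib. The pre-shared key, wire format, and HMAC choice are assumptions for illustration, not the project's actual protocol:

```python
import hmac, hashlib, json

SECRET = b"shared-secret"  # assumption: a pre-shared key between client and server

def sign_chunk(seq: int, payload: str) -> dict:
    """Wrap a streamed chunk with a sequence number and an HMAC signature."""
    body = json.dumps({"seq": seq, "payload": payload}, sort_keys=True).encode()
    return {"seq": seq, "payload": payload,
            "sig": hmac.new(SECRET, body, hashlib.sha256).hexdigest()}

def verify_and_order(chunks: list[dict]) -> str:
    """Verify each signature, then reassemble the stream by sequence number."""
    for c in chunks:
        body = json.dumps({"seq": c["seq"], "payload": c["payload"]},
                          sort_keys=True).encode()
        if not hmac.compare_digest(
                c["sig"], hmac.new(SECRET, body, hashlib.sha256).hexdigest()):
            raise ValueError(f"bad signature on chunk {c['seq']}")
    return "".join(c["payload"] for c in sorted(chunks, key=lambda c: c["seq"]))
```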

## chat management

- basic chat persistence and management
- set, switch to saved system prompts (personalities)
- manage prompts like chats (as files)
- chat truncation to token length (see the sketch after this list)
- rename chat
- profiles (profile x personalities -> sets of chats)
- import/export chat to local file
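
A rough sketch of truncation to a token budget, assuming a tiktoken-style tokeniser; the encoding name and message shape are illustrative, not the project's actual data model:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def truncate_chat(messages: list[dict], max_tokens: int) -> list[dict]:
    """Drop the oldest non-system messages until the chat fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    def total(ms):
        return sum(len(enc.encode(m["content"])) for m in ms)
    while rest and total(system + rest) > max_tokens:
        rest.pop(0)  # discard the oldest turn first, keep the system prompt
    return system + rest
```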

## context workspace

- context workspace (load/drop files)
- client inject from file
- client inject from other sources, e.g. youtube (trag)
- templates for standard instruction requests (trag)
- context workspace - bench/suspend files (hidden by filename; one convention is sketched after this list)
- local files / folders in transient workspace
- checkboxes for delete / show / hide
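
One possible naming convention for benching: a leading underscore hides a file from context assembly without deleting it. The prefix choice is an assumption for illustration:

```python
from pathlib import Path

HIDE_PREFIX = "_"  # assumption: benched files are marked by this prefix

def is_benched(path: Path) -> bool:
    return path.name.startswith(HIDE_PREFIX)

def bench(path: Path) -> Path:
    """Suspend a file: rename it so context assembly skips it."""
    return path.rename(path.with_name(HIDE_PREFIX + path.name))

def unbench(path: Path) -> Path:
    """Restore a benched file to the active workspace."""
    return path.rename(path.with_name(path.name.removeprefix(HIDE_PREFIX)))

def active_files(workspace: Path) -> list[Path]:
    """Everything that should be injected into the prompt context."""
    return [p for p in workspace.iterdir() if p.is_file() and not is_benched(p)]
```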

## client interface

- can switch between Anthropic, OpenAI and tabbyAPI providers and models
- streaming
- syntax highlighting
- decent REPL
- REPL command mode
- cut/copy from output
- client-side prompt editing
- vimish keys in output
- client-side chat/message editing (how? temporarily set the input-field history? fire up $EDITOR in the client?) - edit via chat local import/export
- latex rendering (tricky under prompt-toolkit, but see flatlatex and the sketch after this list)
- generation cancellation
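
flatlatex converts LaTeX source to plain unicode text, which sidesteps the rendering problem in a pure-text prompt-toolkit UI. A minimal sketch:

```python
import flatlatex

converter = flatlatex.Converter()

def render_latex(src: str) -> str:
    """Best-effort unicode rendering; fall back to the raw source on failure."""
    try:
        return converter.convert(src)
    except Exception:
        return src

# prints a unicode approximation of the formula
print(render_latex(r"\sum_{i=1}^{n} i = \frac{n(n+1)}{2}"))
```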

## multimodal

- design with multimodal models in mind
- image sending and use
- image display

## miscellaneous / extensions

- use proper config dir (group?)
- dump default conf if missing (both items sketched below)
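
A sketch of both items using the platformdirs package; the app name and default config contents are placeholders:

```python
from pathlib import Path
from platformdirs import user_config_dir

DEFAULT_CONF = "provider: openai\nmodel: gpt-4o\n"  # illustrative defaults only

def config_path(app: str = "chatthy") -> Path:
    return Path(user_config_dir(app)) / "config.yaml"

def ensure_config(app: str = "chatthy") -> Path:
    """Create the config dir and dump a default config if none exists."""
    path = config_path(app)
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():
        path.write_text(DEFAULT_CONF)
    return path
```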

## tool / agentic use

Use agents at the API level, which is to say, use an intelligent router. This separates the chatthy system from the RAG/LLM logic.

- (auto) tools (evolve from llama-farm -> trag)
- user-defined tool plugins (one possible shape is sketched after this list)
- server use of vdb context at the LLM's discretion (tool)
- iterative workflows (refer to llama-farm)
- tool chains
- tool: workspace file write, delete
- tool: workspace file patch/diff
- tool: rag query tool
- MCP agents?
- smolagents / archgw?
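
One hypothetical shape for user-defined tool plugins: a decorator registers a callable together with the JSON-schema description a router could hand to the model. The names here are illustrative, not chatthy's actual API:

```python
from typing import Callable

TOOLS: dict[str, dict] = {}

def tool(name: str, description: str, parameters: dict) -> Callable:
    """Register a callable as an LLM-invokable tool with its schema."""
    def register(fn: Callable) -> Callable:
        TOOLS[name] = {"fn": fn, "schema": {
            "name": name, "description": description, "parameters": parameters}}
        return fn
    return register

@tool("workspace_write", "Write text to a workspace file.",
      {"type": "object",
       "properties": {"path": {"type": "string"}, "text": {"type": "string"}},
       "required": ["path", "text"]})
def workspace_write(path: str, text: str) -> str:
    with open(path, "w") as f:
        f.write(text)
    return f"wrote {len(text)} chars to {path}"

def dispatch(call: dict) -> str:
    """Run a tool call of the form {'name': ..., 'arguments': {...}}."""
    return TOOLS[call["name"]]["fn"](**call["arguments"])
```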

## RAG

- summaries and standard client instructions (trag)
- server use of vdb context on request
- consider best method of pdf conversion / ingestion, OOB
- full arxiv paper ingestion (fvdb) - consolidate into one latex file OOB
- vdb result reranking with context, and winnowing (agent?) - see the sketch after this list
- vdb results -> workspace (agent?)
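
A sketch of reranking and winnowing vdb hits with a cross-encoder, assuming the sentence-transformers package; the model name and threshold are illustrative assumptions:

```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, hits: list[str], keep: int = 5,
           min_score: float = 0.0) -> list[str]:
    """Score (query, hit) pairs, winnow low scorers, keep the best few."""
    scores = reranker.predict([(query, h) for h in hits])
    ranked = sorted(zip(scores, hits), key=lambda p: p[0], reverse=True)
    return [h for s, h in ranked[:keep] if s >= min_score]
```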

## unallocated / out of scope

- audio streaming? - see matatonic's servers
- workflows (tree of instruction templates)
- tasks
- arXiv paper -> latex / md
- pdf paper -> latex / md