h2oGPT 0.2.0 Release
Official Release for h2oGPT 0.2.0
What's Changed
- Add code to push spaces chatbot by @pseudotensor in #46
- Fixes #48 by @pseudotensor in #55
- More HF spaces restrictions to prevent OOM or no-good choices being chosen by @pseudotensor in #57
- Add max_beams to client_test.py by @lo5 in #64
- Fix directory name from h2o-llm to h2ogpt on install tutorial by @cpatrickalves in #63
- h2o theme for background by @jefffohl in #68
- Add option to save prompt and response as .json. by @arnocandel in #69
- Update tos.md by @eltociear in #70
- Use SAVE_DIR and --save_dir instead of SAVE_PATH and --save_path. by @arnocandel in #71
- Make chat optional from UI/client by @pseudotensor in #74
- Compare models by @pseudotensor in #42
- H2O gradio theme by @jefffohl in #84
- Refactor gradio into separate file and isolate it from torch specific stuff by @pseudotensor in #85
- Refactor finetune so some of it can be used to check data and its tokenization by @pseudotensor in #93
- Llama flash attn by @arnocandel in #86
- Give default context to help chatbot by @pseudotensor in #100
- CUDA mismatch work-around for no gradio case by @pseudotensor in #101
- Add Triton deployment template. by @arnocandel in #91
- Check data for unhelpful responses by @pseudotensor in #103
- Clear torch cache memory every 20s by @pseudotensor in #90
- Try transformers experimental streaming. Still uses threads, so probably won't fix browser exit GPU memory issue by @pseudotensor in #98
- Handle thread stream generate exceptions. by @pseudotensor in #110
- Specify chat separator by @pseudotensor in #114
- [DOCS] README typo fix and readability improvements by @zainhaq-h2o in #118
- Support OpenAssistant models in basic form, including 30B xor one by @pseudotensor in #119
- Add stopping condition to pipeline case by @pseudotensor in #120
- Allow auth control from CLI by @pseudotensor in #123
- Improve data prep by @arnocandel in #122
- [DOCS] Grammar / readability improvements for FAQ.md by @zainhaq-h2o in #124
- neox Flash attn by @arnocandel in #31
- Langchain integration by @pseudotensor in #111
- Allow CLI add to db and clean-up handling of evaluate args by @pseudotensor in #137
- Add zip upload and parallel doc processing by @pseudotensor in #138
- Control visibility of buttons, but still gradio issues mean can't spin/block button while processing in background by @pseudotensor in #140
- Add URL support by @pseudotensor in #142
- HTML, DOCX, and better markdown support by @pseudotensor in #143
- odt, pptx, epub, UI text paste, eml support (both text/html and text/plain) and refactor so glob simpler by @pseudotensor in #144
- Reform chatbot client API code by @pseudotensor in #117
- Add import control check to avoid leaking optional langchain stuff into generate/gradio. Add test by @pseudotensor in #146
- [DevOps] Snyk Integration by @ChathurindaRanasinghe in #131
- Add image support and show sources after upload by @pseudotensor in #147
- Update finetune.py by @orellavie1212 in #132
- ArXiv support via URL in chatbot UI by @pseudotensor in #152
- Improve caption, include blip2 as option by @pseudotensor in #153
- Control chats, save, export, import and otherwise manage by @pseudotensor in #156
- Mac/Windows install and GPT4All as base model for pure CPU mode support by @pseudotensor in #157
- Move loaders out of finetune, which is only for training, while loader used for generation too by @pseudotensor in #161
- Allow selection of subset of docs in collection for query by @pseudotensor in #163
- Improve datasource layout by @pseudotensor in #164
- Refactor run_qa_db a bit, so can do other tasks by @pseudotensor in #167
- Use latest peft/transformers/accelerate/bitsandbytes for 4-bit (qlora) by @arnocandel in #166
- Refactor run_eval out of generate.py by @pseudotensor in #173
- Add CLI mode with tests by @pseudotensor in #174
- Separate out FAISS from requirements by @pseudotensor in #184
- Generalize h2oai_pipeline so works for any instruct model we have prompt_type for, so run_db_qa will stream and stop just like non-db code path by @pseudotensor in #190
- Ensure can use offline by @pseudotensor in #191
- Fix and test llamacpp by @pseudotensor in #197
- Improve use of ctx vs. max_new_tokens for non-HF models, and if no docs, don't insert == since no docs, just confuses model by @pseudotensor in #199
- UI help in FAQ by @pseudotensor in #205
- Quantized model updates, switch to recommending TheBloke by @pseudotensor in #208
- Fix nochat API by @pseudotensor in #209
- Move docs and optional reqs to directories by @pseudotensor in #214
- Allow for custom eval json file by @pseudotensor in #227
- Fix run_eval and validate parameters are all passed by @pseudotensor in #228
- Add setup.py wheel building option by @pseudotensor in #229
- [DevOps] Fix condition issue for snyk test & snyk monitor by @ChathurindaRanasinghe in #169
- Add weaviate support by @hsm207 in #218
- More weaviate tests by @pseudotensor in #231
- Allow add to db when loading from generate by @pseudotensor in #212
- Allow update db from UI if files changed, since normally not constantly checking for new files by @pseudotensor in #232
- More control over max_max_new_tokens and memory behavior from generate args by @pseudotensor in #234
- Make API easier, and add prompt_dict for custom control over prompt as example of new API parameter don't need to pass by @pseudotensor in #238
- Chunk improve by @pseudotensor in #239
- Fix `TypeError: can only concatenate str (not "list") to str` on startup by @this in #242
- Fix nochat in UI so Enter works to submit again, and if langchain mode is used, show HTML links for sources by @pseudotensor in #244
- Improve subset words and code by @pseudotensor in #245
- use instructor embedding, and add migration of embeddings if ever changes, at least for chroma by @pseudotensor in #247
- Add extra clear torch cache calls so embedding on GPU doesn't stick to GPU by @pseudotensor in #252
- Fixes #249 by @pseudotensor in #255
- Support connecting to a local weaviate instance by @hsm207 in #236
- .gitignore updated for .idea and venv by @fazpu in #256
- move enums and add test for export copy since keep changing what files have what structures by @pseudotensor in #260
- Ensure generate hyperparameters are passed through to h2oai_pipeline.py for generation by @pseudotensor in #265
- Submit button is now primary + more spacing between prompt area and action buttons by @fazpu in #261
- input prompt - primary color border added + change in label text by @fazpu in #259
- prompt form moved to a separate file by @fazpu in #258
- Upgrade gradio by @pseudotensor in #269
- Fixes #270 by @pseudotensor in #272
- A couple of small updates to the documentation by @3x0dv5 in #274
- Add documentation on how to connect to weaviate by @hsm207 in #267
- Update Weaviate and FAISS a bit to be closer to Chroma in h2oGPT with limitations. Add testing. by @pseudotensor in #275
- Escape so it outputs `$LD_LIBRARY_PATH:/usr/local/cuda/lib64/` by @3x0dv5 in #276
- Add h2oGPT Client by @this in #133
- Update requirements and add code to get latest versions by @pseudotensor in #281
- Pass actual eos id to generate, else doesn't know how to stop early if using non-standard eos id (normally=0, falcon GM was 11) by @pseudotensor in #286
- Fixes #291 -- make user_path if doesn't exist but passed, and move gradio temp file to user_path if passed to generate. by @pseudotensor in #292
- Add QUIP et al. metrics for context-Q-A testing by @pseudotensor in #293
- Add ci support + wheel by @achraf-mer in #243
- Option to fill-up context if top_k_docs=-1 by @pseudotensor in #294
- Update README.md by @arnocandel in #296
- Fixes #279 by @pseudotensor in #297
- Update README.md by @eltociear in #301
- Add support for text-generation-server, gradio inference server, OpenAI inference server. by @pseudotensor in #295
- Use latest peft, to fix export failure. by @arnocandel in #310
- Model N by @pseudotensor in #313
- Control maxtime for TGI by @pseudotensor in #322
- Fix prompting by @pseudotensor in #327
- Protect evaluate against bad inputs by @pseudotensor in #326
- Add explicit cast to bool for visible=kwarg[...] by @parkeraddison in #328
- For multi-model ChatAll view, save all models together so can recover together by @pseudotensor in #331
- Fixes #333 by quickly checking if can reach endpoint using requests by @pseudotensor in #334
- ChatAll: stream as fast as one can with short timeout=0.01 to avoid stalling any single endpoint to appear in UI by @pseudotensor in #335
- Handle exceptions better when doing multi-view model lock, and don't block good endpoints by @pseudotensor in #337
- Queue input to avoid fresh submit using input_list at click/enter time, else truncates result because it uses input_list from time of click by @pseudotensor in #341
- Cleanup gradio UI a bit, ask at top so not lost at bottom by @pseudotensor in #344
- Log extra information, and fix max_max_new_tokens by @pseudotensor in #347
- Fix prompting for gradio->gradio by @pseudotensor in #348
- Reset client hash every client call, and reset client state if server changes for when want stateless client by @pseudotensor in #351
- If have HF model/tokenizer, use that instead of fake tokenizer (tiktoken), since see too large differences and failures even with 250 token buffer, still off by another 350. by @pseudotensor in #352
- If only can add to MyData, automatically add without having to click button by @pseudotensor in #353
- Get number of tokens after limited, although before prompt_type is applied, to reduce max_new_tokens for OpenAI by @pseudotensor in #366
- Typo: One some systems -> On some systems by @ernstvanderlinden in #377
- 8bit mode command fix FAQ.md by @0xraks in #375
- Fixes #368 by rotating session hash for streaming too. by @pseudotensor in #384
- Add a docker runtime, to be used to run h2o gpt models by @achraf-mer in #381
- Update README_LangChain.md by @cimadure in #387
- Fixes #382 for offloading llama to GPU using llama.cpp. by @pseudotensor in #393
- Fix nochat by @pseudotensor in #394
- Client revamp by @this in #349
- Update README_CPU.md by @wienke in #402
- Add summarization action by @pseudotensor in #365
- Move files to src by @pseudotensor in #406
- Add test to the Client to validate parameters order with h2ogpt by @this in #392
- Tweaks for MAC M1 by @Mathanraj-Sharma in #408
- Add AutoGPTQ -- Fixes #263 and Fixes #339 and Fixes #417 by @pseudotensor in #416
- Fix prompt answer after broken after vicuna addition by @pseudotensor in #422
- Improve UI and UX -- Fixes #285 by @pseudotensor in #429
- FAQ.md - make the information about hugging face models stand out + fix for the prompter link by @fazpu in #435
- [DOCS] Fix typos and improve readability (FAQ page) by @zainhaq-h2o in #441
- Autoset langchain_mode if not passed by @pseudotensor in #436
- Fix attribute error for NoneType - Python 3.9 by @Mathanraj-Sharma in #445
- Add vLLM support -- Fixes #312 by @pseudotensor in #454
- add more deps to docker by @achraf-mer in #452
- Update Windows bitsandbytes wheel link by @jllllll in #458
- Update docs for llama metal support by @Mathanraj-Sharma in #470
- [DevOps] Update README for docker runtime image consumption by @ChathurindaRanasinghe in #477
- Fix typo in timeout_iterator.py by @eltociear in #479
- Add E2E test for fine-tuning/export/tgi, exposed issue with TGI 0.9.1, works in 0.8.2/0.9.2 by @arnocandel in #424
- Use latest bitsandbytes and accelerate. by @arnocandel in #485
- Revert "Use latest bitsandbytes and accelerate." by @pseudotensor in #488
- Update main readme for docker runtime consumption by @ChathurindaRanasinghe in #489
- Package more modules to python wheel by @achraf-mer in #492
- Add note to install torch 2.1 for MPS by @Mathanraj-Sharma in #491
- UI is spread to the full width by @fazpu in #495
- add prompt template for llama2 by @arnocandel in #494
- Fixes #398 -- Custom collection/db and user_path, persisted to disk f… by @pseudotensor in #476
- Remove system prompt for llama2, too guarded. by @arnocandel in #506
- Minor Shell Script Changes by @slycordinator in #487
- Add tagging docker runtime with semver by @ChathurindaRanasinghe in #500
- Update readme for semver docker runtime image by @ChathurindaRanasinghe in #511
- Fixes #446 Control chat history being added to context for langchain or not by @pseudotensor in #507
- LLaMa2 with AutoGPTQ and 16-bit mode with RoPE scaling by @pseudotensor in #517
- Fixes #514 Fix llama2 prompting by @pseudotensor in #523
- Support exllama by @pseudotensor in #526
- Exclude unnecessary files & directories from wheel by @ChathurindaRanasinghe in #537
- Improve docs by @pseudotensor in #542
- feat: expose 'root_path' from gradio by @jcatana in #547
- Add long context tests and fix tokenizer truncation for activated rope_scaling by @arnocandel in #524
- Set max_seq_len outside config, config can't always set due to protections in class by @pseudotensor in #554
- Add latest tag to docker runtime by @ChathurindaRanasinghe in #560
- Load args from env vars as long as var starts with H2OGPT_ by @achraf-mer in #556
- Isolate n_gpus base handling for CUDA from MPS by @Mathanraj-Sharma in #563
- Efficient parallel summarization and use full docs, not vectordb chunks. by @pseudotensor in #551
- Unblock streaming for multi-stream case by @pseudotensor in #577
- Use docker entrypoint args instead of custom entry point by @achraf-mer in #521
- Minor docs update by @achraf-mer in #591
- Control embedding migration by @pseudotensor in #585
- Fix client's ValueError: An event handler (fn) didn't receive enough input values (needed: 34, got: 32). by @pseudotensor in #600
- Upgrade gradio-client version to v0.3.0 in the Client by @this in #601
- Test docker of TGI + h2oGPT dockers by @pseudotensor in #602
- docs: Add GCR link to the Docker README by @ChathurindaRanasinghe in #604
- Better handling of pdfs if broken by @pseudotensor in #608
- Add replicate support, Fixes #603 by @pseudotensor in #606
- Document how to disable chroma telemetry by @mmalohlava in #543
- Use `/submit_nochat_api` for the Text Completion API in the Client by @this in #609
- Allow server to save history.json with request headers by @pseudotensor in #613
- Fine tune llama2 by @arnocandel in #574
- [Client] Parse the return value from `/submit_nochat_api` to extract the response by @this in #627
- Handle persistence of user states for personal/scratch spaces. by @pseudotensor in #618
- Windows installer by @pseudotensor in #647
- Ensure meta data in response and Fixes #649 and upgrade gradio by @pseudotensor in #653
- Fix docker permissions and allow using a non root user by @achraf-mer in #664
- some fixes for docker run by @zba in #659
- Minor grammatical changes by @anfrd in #663
- mac install readme updated - the tessaract command by @fazpu in #665
- the message copy button moved closer to top border and message padding increased to 16px by @fazpu in #666
- Azure OpenAI by @pseudotensor in #667
- the copy button is placed at the bottom of each message by @fazpu in #668
- Fixes for conda installation issues by @ChathurindaRanasinghe in #681
- Improve bitsandbytes usage and control in UI by @pseudotensor in #682
- Add pandas as a direct dependency for vLLM by @ChathurindaRanasinghe in #686
- GitHub Action Workflow to Publish Python Package by @ChathurindaRanasinghe in #670
- Fixes #678 and Fixes #451 and Fixes #434 by @pseudotensor in #690
- Adding install Git command to windows installation instructions by @ceriseghost in #698
- Fixes #709 -- improve in-context learning control by @pseudotensor in #720
- add build id as a docker tag (will make it easier to trace in CI history) by @achraf-mer in #719
- Improve offline caching by @achraf-mer in #715
- Make sure cache directory is consistent, and is pointing to /workspace/.cache by @achraf-mer in #716
- Add performance benchmarks. by @arnocandel in #648
- Better control over prompting for document Q/A by @pseudotensor in #721
- fix: Modify JavaScript code generation to be compatible with Gradio Blocks. by @mmalohlava in #696
- explicitly set additional cache directories to be under ~/.cache (or /workspace/.cache) by @achraf-mer in #728
- change doc to run with local host user, so local host cache can be reused. by @achraf-mer in #730
- More documentation updates by @achraf-mer in #732
- Add vLLM in docker by @ChathurindaRanasinghe in #714
- Fix llama2 by @arnocandel in #747
- Fix Llama2 7B fine-tuning by @arnocandel in #644
- Add ability to control quality-effort of ingestion/parsing and add support for json, jsonl, gzip by @pseudotensor in #737
- Fix make_db.py from docker and document in readme by @achraf-mer in #750
- [Docs] Change docker image name in vllm by @ChathurindaRanasinghe in #753
- adding first draft of doctr integration by @ryanchesler in #752
- Rc/#762 fixes file upload hanging on UI by @ryanchesler in #765
- Added prompter entries for lmsys/vicuna-7b-v1.5, lmsys/vicuna-13-v1.5… by @patrickhwood in #756
- don't set envs, just keep the defaults from HOME env var by @achraf-mer in #733
- [DOCS] Fix the link to offline README by @muendelezaji in #778
- Added the option to create OCRed documents that are layout aware by @ryanchesler in #779
- h2oGPT Helm Chart by @EshamAaqib in #770
- doc(macos): pin `llama-cpp-python` version to support GGML by @iam4x in #780
- Fixes #703 Bugfix: Broken multilanguage output by @Mins0o in #790
- added pix2struct by @ryanchesler in #792
- Fixes #508 by @pseudotensor in #805
- attach button added to the prompt form by @fazpu in #674
- DocTR handling of pdfs by @ryanchesler in #787
- Improve docker layer caching to reduce overall image size by @achraf-mer in #803
- Add prompt type for Falcon-180B(-chat) by @arnocandel in #806
- [DevOps] Packer scripts for Azure, GCP & Jenkins pipeline by @ChathurindaRanasinghe in #788
- Add softlink to preserve compatibility with old commands from docs and readme(s) by @achraf-mer in #808
- consolidate install script in one place, speed up build, + fix caching for TGI and vLLM by @achraf-mer in #813
- Rebuild duckdb with control over threads to avoid excessive threads per db when system has large core count by @pseudotensor in #810
- enable_pdf_doctr in utils by @ffalkenberg in #819
- Allow choose model from UI and client via model_active_choice option when using model_lock by @pseudotensor in #820
- Merge nochat API model_active_choice with visible_models by @pseudotensor in #823
- landing screens components re-ordered on mobile screens by @fazpu in #827
- header styling changed on mobile screen by @fazpu in #829
- prompt area and upload button adjusted for mobile screens by @fazpu in #830
- visible models don't have the remove-all button by @fazpu in #831
- Build duckdb using manylinux by @achraf-mer in #834
- Bump helm chart build to 85. by @tomkraljevic in #833
- Fix build tag by @achraf-mer in #836
- Update to new chroma to fix DB corruption issues by @pseudotensor in #837
- labels are brighter by @fazpu in #818
- app styling updated by @fazpu in #856
- Keyed access by @pseudotensor in #850
- css cleanup - two unused css id definitions removed by @fazpu in #852
- dark theme - secondary button styling improved, label background color a bit lighter by @fazpu in #857
- helm chart improvements by @achraf-mer in #825
- Simplify system_prompt, no more separate use_system_prompt, and ensure pass-through to all models that take a system prompt, e.g. openai, replicate if supported, llama2, beluga, falcon180 by @pseudotensor in #867
- Allow pre-appending chat conversation by @pseudotensor in #869
- Chore: Add printing Makefile variables by @ChathurindaRanasinghe in #872
- Fixes #873 by @pseudotensor in #874
- Better prepare offline docs and code by @pseudotensor in #877
- Add text_context_list to directly pass text lists to LLM to avoid db etc. steps if don't care about persisting state and just want LLM to use context as if uploaded docs by @pseudotensor in #879
- Fix locking by @pseudotensor in #883
- Account for prompt when counting tokens in prompt template by @pseudotensor in #891
- Remove extra wget by @lamw in #892
- Move the `h2ogpt_key` param to the constructor of the `Client` by @this in #899
- Allow rw to /workspace data by @achraf-mer in #902
- more cleanup to docker build scripts by @achraf-mer in #903
- Web search and Agents by @pseudotensor in #858
- configure update strategy by @lweren in #909
- Standardizes `--llamacpp_dict` usage in docs by @jamesbraza in #845
- [DevOps] Build wheel after modifying version in workflow by @ChathurindaRanasinghe in #921
- fix volume mounts by @achraf-mer in #919
- Fix handling of chat_conversation+system prompt using doing langchain by @pseudotensor in #920
- Bump Helm Version by @EshamAaqib in #824
- Fix OpenAI summarization and use of text_context_list prompting and simplify code by @pseudotensor in #924
- Speed-up sim search if only doing chunk_id filter. Speed-up other various tasks if large db. by @pseudotensor in #929
- Update docs for MACOS MPS by @Mathanraj-Sharma in #911
- Update Windows Installer files for October 2023 by @pseudotensor in #930
- Add docker compose for vllm and when running on CPU mode by @achraf-mer in #927
- add extra env variables by @lweren in #940
- Hk/main/benchmark plots by @hemenkapadia in #941
- Improve summarization and add extraction -- speed-up streaming by @pseudotensor in #935
- External LLM Support - Helm by @EshamAaqib in #944
- Improve airgapped cache by @achraf-mer in #952
- Add AWQ by @pseudotensor in #954
- Bump helm chart version by @EshamAaqib in #957
- Update README_ui.md by @squidwardthetentacles in #925
- Improve timeout via max_time in UI/API by @pseudotensor in #958
- [DOCS] Improve FAQ readability by @zainhaq-h2o in #959
- Ensure clone takes into account client inside endpoints. Persist client typically unless can't or don't request, since always using clone now. by @pseudotensor in #966
- [DOCS] Improve readability of README (second edit) by @zainhaq-h2o in #968
- [DOCS] Improve readability of INSTALL.md by @zainhaq-h2o in #971
- Add attention_sinks support for arbitrarily long generation by @pseudotensor in #973
- Add gputil python package by @tomkraljevic in #982
- Fixed typo gpu_mem_track.py modelling_RW_falcon40b.py modelling_RW_falcon7b.py by @AniketP04 in #981
- Fixed typo timeout_iterator.py by @AniketP04 in #988
- Catch exception if not quite in job.future._exception and raise up to gradio for adding to chat exceptions in UI or raise direct if API. by @pseudotensor in #989
- Add prompt and test for https://huggingface.co/BAAI/AquilaChat2-34B-16K and related chat models by @pseudotensor in #986
- fix the problem with image pull secrets by @lweren in #992
- relax max_new_tokens to be per prompt by @pseudotensor in #998
- Avoid system OOM when too many pages for doctr by @pseudotensor in #999
- Implement HYDE by @pseudotensor in #1004
- Use migration-safe `/submit_nochat_api` for the ChatCompletion API by @this in #1010
- Add `client.list_models()` method by @this in #1012
- Update get_limited_prompt and use tokenizer from llama.cpp directly by @pseudotensor in #1015
- Typo by @MSZ-MGS in #1016
- init container image override by @lweren in #1019
- Fix source file link by @us8945 in #1024
- For codellama or other JSON friendly models, stack system prompt with instructions and give document chunks in json, and ask for json output. by @pseudotensor in #978
- Refactor models API in the Client by @this in #1026
- Rename `models` param to `model` in the Client by @this in #1029
- Update client/README.md by @surenH2oai in #1031
- One click installer setup for MacOS by @Mathanraj-Sharma in #1033
- Add `client.server` API to the Client by @this in #1036
- [Client] Refactor classes in the completion APIs into a separate sub-module by @this in #1039
- Make Helm work with external and local LLM's by @EshamAaqib in #1034
- Add annotations to h2ogpt web svc by @ozahavi in #1044
- [Docs] Add downloading client from the GH release by @ChathurindaRanasinghe in #1042
- [Client] Add streaming support for the text completion API by @this in #1046
- Various summarization/extraction fixes + easier llama.cpp control + redesign of Models UI by @pseudotensor in #1045
- Allow multiple llama, but llama.cpp is not thread safe, so only allowed if doing inference server for all but one. by @pseudotensor in #1050
- Update links by @arnocandel in #1056
- Windows one-click Nov5 by @pseudotensor in #1055
- Made changes Singtel requested by @overaneout in #1038
- [DevOps] Cloud Image Fixes by @ChathurindaRanasinghe in #1057
- Add configs related to MPS for one click installer by @Mathanraj-Sharma in #1060
- Add Mac one click installer to README - NOV 08, 2023 by @Mathanraj-Sharma in #1064
- Fix prompting in langchain pandas csv agents, missing format_instructions and uses mrkl prompt even if make class on top, and no way to work around by @pseudotensor in #1058
- Youtube and local audio transcription by @pseudotensor in #1070
- Commands to give permissions for Mac one-click installers by @Mathanraj-Sharma in #1071
- Fix preload of ASR and allow embedding model to be on any GPU by @pseudotensor in #1074
- Parse files inside tar.gz by @Mathanraj-Sharma in #1073
- Reorganize UI a bit, and make it easier to upload url vs. text, autodetect by @pseudotensor in #1075
- Add deepseek coder prompt by @pseudotensor in #1083
- Update README_offline.md by @achraf-mer in #1096
- [DOCS] Improve readability of README_ui.md by @zainhaq-h2o in #1103
- Streaming Speech-to-Text (STT) and Streaming Text-to-Speech (TTS) with Voice Cloning and Hands-Free Chat by @pseudotensor in #1089
- Update README.md (cosmetics) by @MSZ-MGS in #1104
- Fix Typo in FAQ by @daanknoope in #1112
- [DOCS] Client APIs README typo fixes and readability edit by @zainhaq-h2o in #1117
- [HELM] Remove default values from overrideConfig by @EshamAaqib in #1120
- [DOCS] Improve GPU readme by @zainhaq-h2o in #1141
- Upgrade to gradio4 by @pseudotensor in #1110
- For Issue #1142, not a specific fix yet. Noticed documents that failed to parse were coming up as selectable documents. Fix that. by @pseudotensor in #1150
- More for Issue #1142 -- allow filter files and content by substrings and operations and/or by @pseudotensor in #1151
- Return prompt_raw so e.g. LLM and langchain prompting with docs can be seen by API by @pseudotensor in #1152
- Update gpt_langchain.py to support Youtube Shorts by @cherrerajobs in #1154
- web scrape by @pseudotensor in #1156
- Chunk streaming to help speed due to gradio UI slowness/bugs by @pseudotensor in #1162
- Use openai v1 for vllm by @pseudotensor in #1164
- [DOCS] Improve Linux readme by @zainhaq-h2o in #1149
- More streaming optimizations for good UX by @pseudotensor in #1171
- Gradio API call examples by @us8945 in #1174
- Make Claude and other non-system prompt models use chat history to mimic system prompt by @pseudotensor in #1177
- Minor doc improvements by @zainhaq-h2o in #1187
- Remove call to ngpus and openai/vllm client creation so faster when using by @pseudotensor in #1192
- Add video frame extraction, image chat, and image generation by @pseudotensor in #1181
- [HELM] Add option to run vLLM and h2oGPT on same pod by @EshamAaqib in #1194
- Gemini by @pseudotensor in #1208
- [DOCS] Minor doc fixes and improvements by @zainhaq-h2o in #1205
- Support docsgpt https://huggingface.co/Arc53/docsgpt-7b-mistral by @pseudotensor in #1215
- improve streaming and error logging by @pseudotensor in #1218
- docs: faq: document auth.json file format by @Blacksuan19 in #1206
- Use transformers version of attention sinks: https://github.com/huggingface/transformers/releases/tag/v4.36.0 by @pseudotensor in #1219
- Allow private model that fails to load to not revert tokenizer to None if passed tokenizer_base_model by @pseudotensor in #1223
- hide action selection if only one action is enabled by @Blacksuan19 in #1224
- add ability to set custom page title and favicon by @Blacksuan19 in #1225
- OpenAI Proxy Server redirects to Gradio Server by @pseudotensor in #1231
- Improve testing for OpenAI server and fix key issues with auth etc. by @pseudotensor in #1234
- Handle errors better for OpenAI client by @pseudotensor in #1235
- Reachout by @pseudotensor in #1236
- Allow langchain for eval and add test -- Fixes #1244 by @pseudotensor in #1246
- Allow persistence for GradioClient for Issue #1247 by @pseudotensor in #1249
- Fixes #1247 by @pseudotensor in #1251
- Go back to checking system hash since stored in docker image now, even if takes 0.2s, worth it. Could delay checks to every minute or something, but more risky. by @pseudotensor in #1253
- [HELM] Add vLLM check when running as stack by @EshamAaqib in #1255
- Control llava prompt by @pseudotensor in #1262
- Remove HYDE accordion outputs if present before giving history to LLM, and remove chat=True/False for prompt generation, hold-over and led to bugs in prompting for gradio->gradio by @pseudotensor in #1263
- Better exceptions docview by @pseudotensor in #1264
- Fixes to Helm Chart by @EshamAaqib in #1269
- use docker compose with a Dockerfile to force rebuild if new by @achraf-mer in #1273
- Windows update Jan 8, 2024 by @pseudotensor in #1272
- minor package upgrades by @pseudotensor in #1275
- MistralAI by @pseudotensor in #1290
- Enforce allow_upload_to_user_data and allow_upload_to_my_data -- Fixes #1296 by @pseudotensor in #1297
- Update README_MACOS.md by @antoninadert in #1292
- [DOCS] readme minor readability improvements by @zainhaq-h2o in #1299
- Ensure parameters for OpenAI->h2oGPT are transcribed. by @pseudotensor in #1301
- Rotate image before OCR/DocTR - WIP by @pseudotensor in #1239
- Allow API call for conversion of text to audio by @pseudotensor in #1310
- Update README_MACOS.md by @antoninadert in #1308
- More API protection by @pseudotensor in #1314
- [HELM] Fix `PodLabels` by @EshamAaqib in #1318
- Update README_MACOS.md by @antoninadert in #1325
- exposing imagePullSecret and tag in values.yaml by @robinliubin in #1328
- Update QR code. by @arnocandel in #1336
- h2ogpt support namespaceOverride by @robinliubin in #1337
- Update docker for better vllm support, go to higher cuda for cuda kernels to exist by @pseudotensor in #1339
- Some package updates by @pseudotensor in #1344
- Add verifier -- only via API for now by @pseudotensor in #1267
- fix-33075_adding_shared_memory by @robinliubin in #1352
- Cu121 by @pseudotensor in #1368
- Add vision models as llms by @pseudotensor in #1369
- Upgrade to gradio4 3rd attempt by @pseudotensor in #1380
- Increase timeout when have failure to make sure we know the reason. by @pseudotensor in #1384
- Faster for llava by @pseudotensor in #1392
- Fixes #1270 by @pseudotensor in #1396
- [DOCS] Minor doc improvements by @zainhaq-h2o in #1402
- Fixes #1324 -- clear memory when browser tab closes by @pseudotensor in #1407
- Fix TEI use of HuggingFaceHubEmbeddings by @pseudotensor in #1424
- Fix login if chatbot counts differ from in auth file by @pseudotensor in #1429
- Improve auth/login for OpenAI API and fix AWQ by @pseudotensor in #1434
- GPT's user review functionality added by @Darshan-Malaviya in #1436
- [HELM] Add option to disable anti affinity by @EshamAaqib in #1423
- Update MacOS doc with information related to BFloat16 error by @Mathanraj-Sharma in #1442
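Several of the configuration changes above (#556, and the earlier SAVE_DIR work in #71) revolve around letting environment variables stand in for CLI arguments when the variable name starts with `H2OGPT_`. The sketch below illustrates that general idea only; the function name and mapping rules are illustrative assumptions, not the actual h2oGPT code:

```python
import os

PREFIX = "H2OGPT_"

def env_overrides(environ=None):
    """Collect H2OGPT_-prefixed environment variables into a kwargs dict.

    Hypothetical sketch of the prefix-to-argument idea; not h2oGPT's
    real implementation.
    """
    environ = os.environ if environ is None else environ
    overrides = {}
    for key, value in environ.items():
        if key.startswith(PREFIX):
            # e.g. H2OGPT_SAVE_DIR -> save_dir
            overrides[key[len(PREFIX):].lower()] = value
    return overrides

# Example with an explicit dict instead of the real environment:
print(env_overrides({"H2OGPT_SAVE_DIR": "/tmp/chats", "PATH": "/usr/bin"}))
# → {'save_dir': '/tmp/chats'}
```

In this scheme, `H2OGPT_SAVE_DIR=/tmp/chats` would play the same role as passing `--save_dir=/tmp/chats` on the command line.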
New Contributors
- @lo5 made their first contribution in #64
- @cpatrickalves made their first contribution in #63
- @jefffohl made their first contribution in #68
- @eltociear made their first contribution in #70
- @ChathurindaRanasinghe made their first contribution in #131
- @orellavie1212 made their first contribution in #132
- @hsm207 made their first contribution in #218
- @this made their first contribution in #242
- @fazpu made their first contribution in #256
- @3x0dv5 made their first contribution in #274
- @parkeraddison made their first contribution in #328
- @ernstvanderlinden made their first contribution in #377
- @0xraks made their first contribution in #375
- @cimadure made their first contribution in #387
- @wienke made their first contribution in #402
- @jllllll made their first contribution in #458
- @slycordinator made their first contribution in #487
- @jcatana made their first contribution in #547
- @mmalohlava made their first contribution in #543
- @zba made their first contribution in #659
- @anfrd made their first contribution in #663
- @ceriseghost made their first contribution in #698
- @ryanchesler made their first contribution in #752
- @patrickhwood made their first contribution in #756
- @muendelezaji made their first contribution in #778
- @iam4x made their first contribution in #780
- @Mins0o made their first contribution in #790
- @ffalkenberg made their first contribution in #819
- @tomkraljevic made their first contribution in #833
- @lamw made their first contribution in #892
- @lweren made their first contribution in #909
- @jamesbraza made their first contribution in #845
- @hemenkapadia made their first contribution in #941
- @squidwardthetentacles made their first contribution in #925
- @AniketP04 made their first contribution in #981
- @MSZ-MGS made their first contribution in #1016
- @us8945 made their first contribution in #1024
- @surenH2oai made their first contribution in #1031
- @ozahavi made their first contribution in #1044
- @overaneout made their first contribution in #1038
- @daanknoope made their first contribution in #1112
- @cherrerajobs made their first contribution in #1154
- @Blacksuan19 made their first contribution in #1206
- @antoninadert made their first contribution in #1292
- @Darshan-Malaviya made their first contribution in #1436
Full Changelog: https://github.com/h2oai/h2ogpt/commits/0.2.0