
list of open-source publicly-available llms for code #49

andre15silva opened this issue May 3, 2023 · 23 comments

andre15silva commented May 3, 2023

| Name | Publication Date | Model Type | Sizes | URL |
|---|---|---|---|---|
| CodeGen | 03/22 | Decoder | 350M, 2B, 6B, 16B | https://huggingface.co/Salesforce/codegen-16B-mono |
| InCoder | 04/22 | Decoder | 1.3B, 6.7B | https://huggingface.co/facebook/incoder-6B |
| CodeGeeX | 09/22 | Decoder | 13B | https://huggingface.co/spaces/THUDM/CodeGeeX |
| SantaCoder | | | | https://huggingface.co/bigcode/santacoder |
| Replit | | | | https://huggingface.co/replit/replit-code-v1_5-3b |
| CodeT5 | | | | https://huggingface.co/Salesforce/codet5-large |
| PLBART | | | | https://huggingface.co/models?other=plbart |
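
For anyone trying these out, a minimal sketch of sampling from one of the listed checkpoints with Hugging Face transformers (the model ID is from the table above; the prompt and generation settings are illustrative assumptions):

```python
# Minimal sketch: greedy completion with the smallest CodeGen checkpoint.
# Any decoder-only model ID from the table can be swapped in.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```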
@monperrus

diff-codegen-350m
diff-codegen-2b
diff-codegen-6b

all fine-tuned from Salesforce’s CodeGen code synthesis models

ref: https://carper.ai/diff-models-a-new-way-to-edit-code/
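
A hedged sketch of prompting one of these diff models: the `<NME>/<BEF>/<MSG>/<DFF>` section markers follow my reading of the CarperAI post above, and the checkpoint name and exact format are assumptions to verify against the model card:

```python
# Sketch only: diff models are prompted with a filename, the file before the
# change, and a commit message, and asked to generate a unified diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CarperAI/diff-codegen-350m-v2"  # assumed Hub ID for diff-codegen-350m
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = (
    "<NME> utils.py\n"
    "<BEF> def add(a, b):\n"
    "    return a - b\n"
    "<MSG> Fix subtraction bug in add()\n"
    "<DFF>"
)
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```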

monperrus commented May 10, 2023

andre15silva commented May 10, 2023

to merge: https://github.com/eugeneyan/open-llms (section "Open LLMs for code")

@andre15silva

codegen2 (also supports infilling)

https://github.com/salesforce/CodeGen2
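
A sketch of that infilling mode, following the sentinel format described in the CodeGen2 model card (checkpoint name as of the release; the repo ships custom model code, hence `trust_remote_code`):

```python
# Sketch of CodeGen2 infilling: a <mask_1> sentinel marks the hole, and the
# model generates the masked span after a <sep> token.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Salesforce/codegen2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prefix = "def count_words(path):\n    "
suffix = "\n    return len(words)"
prompt = prefix + "<mask_1>" + suffix + "<|endoftext|>" + "<sep>" + "<mask_1>"

inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0])[len(prompt):])  # keep only the generated infill
```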

@monperrus

https://github.com/bigcode-project/starcoder
15.5B-parameter model that supports code generation and infilling
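
StarCoder's infilling uses dedicated fill-in-the-middle tokens; a minimal sketch (the checkpoint is gated on the Hub, so this assumes you have accepted the license and logged in):

```python
# Sketch of StarCoder fill-in-the-middle: the model generates the code that
# belongs between the <fim_prefix> and <fim_suffix> spans when prompted
# with a trailing <fim_middle> token.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = (
    "<fim_prefix>def fib(n):\n    <fim_suffix>\n"
    "    return fib(n - 1) + fib(n - 2)<fim_middle>"
)
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```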

monperrus changed the title from "list of llms (for code)" to "list of publicly-available llms for code" on Aug 26, 2023
monperrus commented Aug 26, 2023

Code Llama by Meta
https://about.fb.com/news/2023/08/code-llama-ai-for-coding/

Code Llama: Open Foundation Models for Code
https://arxiv.org/pdf/2308.12950

monperrus commented Oct 16, 2023

The Mistral models https://mistral.ai/

@martinezmatias says they are good.

Mistral 7B
https://arxiv.org/pdf/2310.06825

@monperrus

CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model
https://arxiv.org/pdf/2310.06266

monperrus commented Oct 21, 2023

Qwen

CODE-QWEN and CODE-QWEN-CHAT, 通义千问 (Tongyi Qianwen, Alibaba)

QWEN TECHNICAL REPORT
https://arxiv.org/pdf/2309.16609.pdf
https://github.com/QwenLM/Qwen

Qwen2.5-Coder Technical Report
https://arxiv.org/pdf/2409.12186

Nov 2024:

Qwen 2.5-Coder-32B-Instruct Performance: @Alibaba_Qwen announced Qwen 2.5-Coder-32B-Instruct, which matches or surpasses GPT-4o on multiple coding benchmarks. Early testers reported it as "indistinguishable from o1-preview results" (@hrishioa) and noted its competitive performance in code generation and reasoning.

updated @andre15silva
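
A minimal sketch of chatting with a Qwen2.5-Coder instruct checkpoint via the standard transformers chat-template API (a smaller sibling of the 32B model is used here purely to keep the example lightweight):

```python
# Sketch: instruct models are driven through the tokenizer's chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```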

@monperrus

DeepSeek Coder: Let the Code Write Itself

@monperrus

Magicoder
Magicoder: Source Code Is All You Need
https://arxiv.org/abs/2312.02120
https://huggingface.co/TheBloke/Magicoder-S-DS-6.7B-GGUF
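
Since that GGUF build runs locally, a sketch with llama-cpp-python (the exact .gguf filename is one of the quantizations in TheBloke's repo and is an assumption here):

```python
# Sketch: running a GGUF quantization locally with llama-cpp-python.
# pip install llama-cpp-python; download one of the .gguf files first.
from llama_cpp import Llama

llm = Llama(model_path="magicoder-s-ds-6.7b.Q4_K_M.gguf", n_ctx=4096)
out = llm(
    "Write a Python function that merges two sorted lists.\n",
    max_tokens=256,
)
print(out["choices"][0]["text"])
```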

@monperrus

CodeShell Technical Report
https://arxiv.org/pdf/2403.15747

CodeShell-Base is a seven-billion-parameter foundation model with 8K context length, showcasing exceptional proficiency in code comprehension; it outperforms CodeLlama on HumanEval after training on just 500 billion tokens (5 epochs).

@monperrus

Mixtral: @FredBonux is able to use it via Groq.
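
For reference, a hedged sketch of calling Mixtral through Groq with the official Python client (the model ID is the one Groq advertised for Mixtral at the time and may have changed since):

```python
# Sketch: Groq exposes an OpenAI-style chat completions API.
# pip install groq; expects GROQ_API_KEY in the environment.
from groq import Groq

client = Groq()
resp = client.chat.completions.create(
    model="mixtral-8x7b-32768",  # Groq's Mixtral model ID (assumption)
    messages=[{"role": "user", "content": "Explain briefly what a Mixture-of-Experts model is."}],
)
print(resp.choices[0].message.content)
```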

andre15silva commented Jun 7, 2024

mistralai/Codestral-22B-v0.1

https://huggingface.co/mistralai/Codestral-22B-v0.1

@monperrus

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
https://arxiv.org/pdf/2406.11931

@monperrus

aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Completion
https://www.semanticscholar.org/reader/2c5dd0f56eff1caa3edb20354374a9585181ea73

monperrus changed the title from "list of publicly-available llms for code" to "list of open-source publicly-available llms for code" on Oct 22, 2024
@monperrus

Tencent's Hunyuan (huggingface, paper)

@monperrus

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
https://arxiv.org/pdf/2411.04905

@monperrus

update from https://github.com/underlines/awesome-ml/

| Model | Link | Description |
|---|---|---|
| CursorCore | https://huggingface.co/collections/TechxGenus/cursorcore-series-6706618c38598468866b60e2 | series of coding LLMs for programming assistance (paper, code) |
| gemma2 2b | https://huggingface.co/bartowski/gemma-2-2b-it-GGUF | 2B small language model by Google achieving SOTA performance for sub-3B models on coding benchmarks |
| DeepSeekCoderv2 | https://github.com/deepseek-ai/DeepSeek-Coder-V2?tab=readme-ov-file#2-model-downloads | 16B and 236B Mixture-of-Experts code models (news) |
| codegemma | https://huggingface.co/google/codegemma-7b | Google's coding models in 2B base, 7B base, and 7B instruct variants |
| granite | https://huggingface.co/collections/ibm-granite/granite-code-models-6624c5cec322e4c148c8b330 | IBM's code models in 3B, 8B, and 20B sizes as base and instruct variants, with up to 128K context size |
| codeqwen1.5 | https://huggingface.co/Qwen/CodeQwen1.5-7B | 7B code model in base and chat variants (model weights) |
| InternLM2.5 | https://huggingface.co/internlm/internlm2_5-7b-chat | 7B base and chat models focusing on reasoning, math, and code |
| CodeGeeX4 | https://huggingface.co/THUDM/codegeex4-all-9b | 9B multilingual code generation model supporting chat and code completion |
| Mamba-Codestral | https://huggingface.co/mistralai/Mamba-Codestral-7B-v0.1 | 7B model by Mistral based on the Mamba2 architecture, performing on par with current SOTA transformer models |
| CodeStral-22B | https://huggingface.co/mistralai/Codestral-22B-v0.1 | coding model trained on 80+ languages with instruct and fill-in-the-middle support |
| Granite | https://huggingface.co/ibm-granite | family of code models from IBM (see the granite collection above) |
| wavecoder-ultra-6.7b | https://huggingface.co/microsoft/wavecoder-ultra-6.7b | covers four general code-related tasks: code generation, code summarization, code translation, and code repair |
| aiXcoder | https://huggingface.co/aiXcoder/aixcoder-7b-base | 7B code LLM for code completion, comprehension, and generation |
| StarCoder2 | https://huggingface.co/bigcode/starcoder2-15b | 15B, 7B, and 3B code completion models trained on The Stack v2 |
| Poro | https://huggingface.co/LumiOpen/Poro-34B | SiloGen checkpoints of a family of multilingual open-source LLMs covering all official European languages and code (news) |
| deepseek-coder | https://github.com/deepseek-ai/DeepSeek-Coder | code language models trained on 2T tokens (87% code, 13% English/Chinese), up to 33B with 16K context size, achieving SOTA performance on coding benchmarks |
| CodeShell | https://github.com/WisdomShell/codeshell/blob/main/README_EN.md | 7B code LLM trained on 500B tokens with 8K context length, outperforming CodeLlama and StarCoder on HumanEval (weights) |
| salesforce/CodeT5 | https://github.com/salesforce/codet5 | code assistant models; Salesforce has also released CodeT5+ at up to 16B |
| replit-code | https://huggingface.co/replit/ | focused on code completion; trained on a subset of the Stack Dedup v1.2 dataset |
| BigCode | https://huggingface.co/bigcode | open scientific collaboration to train a coding LLM |
| CodeGeeX 13B | https://huggingface.co/spaces/THUDM/CodeGeeX | multilingual code generation model |
