View ezlandau's full-sized avatar

Set of tools to assess and improve LLM security.

Python 2,834 469 Updated Jan 16, 2025

Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".

Python 155 32 Updated Oct 1, 2024
Python 89 11 Updated Nov 13, 2023

We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.

Python 264 29 Updated Feb 23, 2024

Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding

Jupyter Notebook 112 9 Updated Jul 19, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 33,918 5,203 Updated Jan 17, 2025

LaTeX Templates for TU Darmstadt

TeX 218 72 Updated Jan 17, 2025

A collection of benchmarks and datasets for evaluating LLM.

367 24 Updated Jul 13, 2024

Papers about red teaming LLMs and Multimodal models.

91 5 Updated Nov 22, 2024

The guide to online assessments and interviews

1,752 130 Updated Sep 21, 2024

AIR-Bench 2024 is a safety benchmark that aligns with emerging government regulations and company policies

Jupyter Notebook 12 2 Updated Aug 14, 2024

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image …

Python 2,016 265 Updated Jan 17, 2025

Test assignments for independent practice from various IT companies

HTML 6,157 907 Updated Dec 13, 2024

A Domain-Specific Language, Jailbreak Attack Synthesizer and Dynamic LLM Redteaming Toolkit

Jupyter Notebook 7 1 Updated Dec 5, 2024

Repository for "StrongREJECT for Empty Jailbreaks" paper

Jupyter Notebook 112 5 Updated Nov 3, 2024

Pure C++ implementation of several models for real-time chatting on your computer (CPU)

C++ 487 36 Updated Jan 16, 2025

The official Python library for the OpenAI API

Python 24,029 3,423 Updated Jan 17, 2025

Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs

Jupyter Notebook 204 25 Updated Jun 7, 2024

Universal and Transferable Attacks on Aligned Language Models

Python 3,570 485 Updated Aug 2, 2024

SuperPrompt is an attempt to engineer prompts that might help us understand AI agents.

5,725 535 Updated Dec 1, 2024

Fire native system events from Cypress.

HTML 775 69 Updated Jan 17, 2025

Yacine's LLM nvim scripts

Lua 729 68 Updated Oct 24, 2024

Röttger et al. (2023): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models"

Jupyter Notebook 77 9 Updated Dec 28, 2023

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.

Zig 36,593 2,633 Updated Jan 17, 2025

Chaos Monkey is a resiliency tool that helps applications tolerate random instance failures.

Go 15,420 1,179 Updated Jan 6, 2025

LLM101n: Let's build a Storyteller

31,032 1,697 Updated Aug 1, 2024

Huly — All-in-One Project Management Platform (alternative to Linear, Jira, Slack, Notion, Motion)

TypeScript 18,292 1,141 Updated Jan 17, 2025

PDF++: the most Obsidian-native PDF annotation & viewing tool ever. Comes with optional Vim keybindings.

TypeScript 994 25 Updated Jan 15, 2025

Fast, easy and reliable testing for anything that runs in a browser.

JavaScript 47,661 3,210 Updated Jan 17, 2025

a script to run docker-compose.yml using podman

Python 5,194 494 Updated Jan 16, 2025