GitHub
δΈ­EN

An easy-to-use Python framework to generate adversarial jailbreak prompts by assembling different methods

EasyJailbreak is an easy-to-use Python framework designed for researchers and developers focusing on LLM security. Specifically, EasyJailbreak decomposes the mainstream jailbreaking process into several iterable steps: initialize mutation seeds, select suitable seeds, add constraint, mutate, attack, and evaluate. On this basis, EasyJailbreak provides a component for each step, constructing a playground for further research and attempts. More details can be found in our paper.

Model
ReNeLLM
GPTFuzz
ICA
AutoDAN
PAIR
JailBroken
Cipher
JailBroken
DeepInception
MultiLingual
GCG
Avg
GPT3.5
87%
86%
0%
19%
100%
80%
100%
66%
100%
12%
61.1%
GPT4
38%
0%
1%
12%
58%
75%
58%
35%
63%
0%
31.3%
Llama2-7B-chat
31%
46%
0%
25%
52%
6%
61%
6%
8%
2%
46%
27.7%
Llama2-13B-chat
69%
42%
0%
8%
4%
90%
4%
0%
0%
46%
28.8%
Vicuna7B
77%
100%
52%
100%
100%
100%
57%
100%
29%
94%
94%
80.3%
Vicuna13B
87%
100%
80%
100%
100%
100%
61%
100%
17%
100%
94%
83.9%
ChatGLM3
86%
100%
54%
100%
96%
95%
32%
95%
33%
100%
34%
73.0%
Qwen-7B-chat
70%
100%
37%
100%
82%
100%
34%
100%
58%
99%
48%
72.8%
Intern7B
67%
100%
23%
100%
96%
100%
85%
100%
36%
99%
10%
71.6%
Mistral
90%
100%
67%
100%
94%
100%
60%
100%
40%
100%
82%
83.3%
Avg
70%
77%
31%
89%
66%
76%
64%
76%
32%
76%
47%