Skip to content

lmaeda/LLMGoat

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LLMGoat Logo

LLMGoat

Offensive LLM Security — Learn by Exploiting

OWASP LLM Top 10 Python 3.12 Mistral / Ollama Status

🎥 Watch Demo🚀 Get Started📋 Challenges


What is LLMGoat?

LLMGoat is a gamified, hands-on exploitation environment for learning offensive LLM security. As LLMs get rapidly adopted in production, developers and security teams often underestimate attack surfaces unique to these models — prompt injection, data poisoning, system prompt leakage, and more.

LLMGoat lets you safely exploit, observe, and measure the impact of these attacks in realistic scenarios — so you understand the risks before they hit production.

10 core challenges, each mapped directly to the OWASP LLM Top 10.


🎥 Demo

LLMGoat Demo

LLMGoat Screenshot


📋 OWASP LLM Top 10 Challenges

# Vulnerability Description
LLM01 Prompt Injection Manipulate the model's behavior by injecting adversarial instructions
LLM02 Sensitive Information Disclosure Trick the model into revealing confidential or private data
LLM03 Supply Chain Exploit vulnerabilities in third-party LLM components or pipelines
LLM04 Data and Model Poisoning Corrupt training data or fine-tuning to alter model behavior
LLM05 Improper Output Handling Exploit unsafe handling of LLM-generated output downstream
LLM06 Excessive Agency Abuse over-permissioned LLM agents to take unintended actions
LLM07 System Prompt Leakage Extract hidden system-level instructions from the model
LLM08 Vector and Embedding Weaknesses Attack RAG pipelines via embedding manipulation
LLM09 Misinformation Cause the model to generate and spread false or harmful content
LLM10 Unbounded Consumption Exploit resource exhaustion or runaway inference costs

🚀 Deployment Steps

Prerequisites

  • Ollama installed and running
  • Python 3.12

Pull the required models:

# LLM for inference
ollama pull mistral

# Embedding model (all-MiniLM-L6-v2) — downloaded automatically via gpt4all on first run
# No manual download needed

Setup

# 1. Move into the app directory
cd app

# 2. Create a virtual environment
python3.12 -m venv venv

# 3. Activate it
source venv/bin/activate          # macOS / Linux
# venv\Scripts\activate           # Windows

# 4. Install dependencies
pip3 install -r requirements.txt

# 5. Run the app
python3 app.py

Open your browser at http://127.0.0.1:8000 and start exploiting.


🗂️ Project Structure

LLMGoat/
├── app/
│   ├── app.py              # Main application entry point
│   ├── requirements.txt    # Python dependencies
│   ├── llms/               # LLM integrations and configurations
│   ├── embeddings/         # Vector/embedding logic
│   ├── templates/          # HTML challenge templates
│   └── static/             # Static assets (CSS, JS)
├── assets/                 # Images and media
├── solution-manuals/       # Step-by-step challenge walkthroughs
└── readme.md

📖 Solution Manuals

Step-by-step walkthroughs are available for each challenge in the solution-manuals/ directory.

Challenge Walkthrough
LLM01: Prompt Injection View
LLM02: Sensitive Information Disclosure View
LLM03: Supply Chain Compromise View
LLM04: Data and Model Poisoning View
LLM05: Improper Output Handling View
LLM06: Excessive Agency View
LLM07: System Prompt Leakage View
LLM08: Vector and Embedding Weakness View
LLM09: Misinformation View
LLM10: Unbounded Consumption View

⚠️ Disclaimer

LLMGoat is intended strictly for educational and authorized security research purposes. All exploitation scenarios are contained within the local environment. Do not use techniques learned here against systems you do not own or have explicit permission to test.


Built for security practitioners, red teamers, and anyone curious about LLM attack surfaces.

About

LLMGoat: Offensive LLM Security Environment

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • HTML 50.3%
  • Python 42.6%
  • CSS 5.4%
  • JavaScript 1.7%