All posts
-
Best LLM Vulnerability Scanners 2026: Garak, PyRIT, Promptfoo, and Mindgard Compared
A practitioner's guide to the best LLM vulnerability scanners in 2026 — Garak v0.15.0, PyRIT, Promptfoo (now OpenAI), and Mindgard. OWASP LLM Top 10 coverage, CI/CD fit, and buyer profiles.
-
Open Source LLM Red Teaming Tools: PyRIT, Garak, HarmBench, and What to Use When
A practitioner's guide to the main open source LLM red teaming tools — PyRIT, Garak, HarmBench, TextAttack — what each does, what it misses, and how to build them into a real testing pipeline.
-
Automated LLM Red-Teaming in CI: garak vs PyRIT vs Promptfoo
Three open-source tools can gate your pipeline on LLM security findings — garak, PyRIT, and Promptfoo. A practitioner comparison of how each fits CI/CD, what it scans, and which to run where.
-
Choosing an LLM Guardrail: Llama Guard, NeMo Guardrails, Guardrails AI
A decision guide for picking an LLM guardrail in 2026 — Meta's Llama Guard 4, NVIDIA's NeMo Guardrails, and Guardrails AI. What each one actually is, and which shape fits your problem.
-
Prompt-Injection Detectors Compared: Rebuff, Vigil, and LLM Guard
A practitioner comparison of open-source prompt-injection detectors — Rebuff, Vigil, and LLM Guard's PromptInjection scanner — including detection architecture, maintenance status, and which to actually deploy in 2026.
-
LLM Guard: Input and Output Scanning for Production LLM Apps
A practical breakdown of LLM Guard by Protect AI — its input and output scanners, how the sanitize/scan pipeline works, where it fits as a runtime guardrail, and its real limits.
-
PyRIT: Microsoft's AI Red-Teaming Framework, Explained
A technical breakdown of PyRIT, Microsoft's Python Risk Identification Tool for generative AI — its target/dataset/orchestrator/converter/scorer architecture, multi-turn attack strategies, and where it fits next to garak.
-
False Positive Cost in Refusal Systems: Measure and Tune
Practical methods for quantifying the cost of refusal false positives in LLM products — eval design, baseline rates, threshold tuning, and the regression suite you need to keep them stable.
-
Best LLM Security Scanners: Open-Source and Enterprise Compared
A practitioner's comparison of the best LLM security scanners — Garak, PyRIT, LLM Guard, Promptfoo, Vigil, and enterprise options. Coverage, CI/CD fit, and runtime use cases.
-
Garak LLM Vulnerability Scanner: How It Works and When to Use It
A technical breakdown of the garak LLM vulnerability scanner — its probe architecture, supported attack categories, CLI workflow, and how it fits into a real AI red-teaming pipeline.
-
Classifier-on-Output: Catching Misbehavior Post-Generation
How production teams use post-generation classifiers to catch what input filters and refusal training miss — architectures, tradeoffs, and where output classifiers earn their latency budget.
-
Llama Guard vs NeMo vs OpenAI Moderation: Production Tradeoffs
A practitioner comparison of Llama Guard, NeMo Guardrails, and the OpenAI Moderation API — coverage, latency, customization, and where each one breaks in production.
-
What this site is for
Best LLM Scanners is a practitioner's comparison of LLM security scanners (Garak, PyRIT, promptmap, and vendor scanners), covering their coverage gaps, false-positive profiles, integration cost, and the cases where 'best' depends on what you're defending.