Best LLM Scanners

Automated LLM Red-Teaming in CI: garak vs PyRIT vs Promptfoo

Three open-source tools can gate your pipeline on LLM security findings — garak, PyRIT, and Promptfoo. A practitioner comparison of how each fits CI/CD, what it scans, and which to run where.

By Best LLM Scanners Editorial · 8 min read

The point of automated LLM red-teaming is to fail a build when a model or prompt change introduces a security regression — before it ships, not after an incident. Three open-source tools can play that gate: NVIDIA’s garak, Microsoft’s PyRIT, and Promptfoo. They overlap enough to be confused and differ enough that the right answer is usually “more than one.” This guide compares how each fits a CI/CD pipeline, what it actually scans, and where each earns its place.

The CI gate, defined

A useful CI security gate has four properties: it runs unattended, it produces a machine-parseable result, it lets you set pass/fail thresholds, and it’s fast enough that engineers don’t route around it. Hold the three tools against that bar.

garak: the breadth-first probe scanner

garak (NVIDIA, Apache-2.0) is the closest thing LLM security has to a Nessus-style scanner: a large library of pre-built probes covering jailbreaks, prompt injection, toxicity, hallucination, data leakage, and encoding attacks, run with a single command. For CI, its strengths are exactly the ones that matter:

  • One-command invocation. python -m garak --target_type ... --probes ... runs unattended with no code.
  • Machine-parseable output. garak writes a JSONL report and a JSONL hit log alongside an HTML report; a shell step can parse the JSONL, check whether any probe category exceeded a threshold, and fail the build.
  • Targeted subsets. Running all probes takes hours; in CI you run a focused subset to bring scan time down to minutes, then run the full suite nightly.
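
The threshold check described above can be sketched in a few lines of Python. This is a minimal illustration, not garak's own tooling: it assumes each evaluation record in the report JSONL carries `entry_type`, `probe`, `passed`, and `total` fields, which should be verified against the report your installed garak version actually writes. The threshold values and the function names are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical per-category ceilings on failure rate (share of attempts that produced a hit).
THRESHOLDS = {"promptinject": 0.05, "encoding": 0.10}
DEFAULT_THRESHOLD = 0.0  # categories not listed above must pass cleanly


def failure_rates(lines):
    """Aggregate the failure rate per probe category from report JSONL lines.

    Assumed record shape (check against your garak version):
    {"entry_type": "eval", "probe": "encoding.InjectBase64", "passed": 9, "total": 10}
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("entry_type") != "eval":
            continue  # skip setup, attempt, and other non-evaluation records
        category = record["probe"].split(".")[0]
        passed[category] += record["passed"]
        total[category] += record["total"]
    return {c: 1 - passed[c] / total[c] for c in total if total[c] > 0}


def breaches(rates, thresholds=THRESHOLDS, default=DEFAULT_THRESHOLD):
    """Return only the categories whose failure rate exceeds their ceiling."""
    return {c: r for c, r in rates.items() if r > thresholds.get(c, default)}


def gate(path):
    """Return a process exit code: 1 if any category breached, otherwise 0."""
    with open(path, encoding="utf-8") as fh:
        failed = breaches(failure_rates(fh))
    for category, rate in sorted(failed.items()):
        print(f"FAIL {category}: failure rate {rate:.1%}")
    return 1 if failed else 0
```

A CI step would call `sys.exit(gate(report_path))` with the path of the report file, so a breached category fails the job. Grouping by the module prefix of the probe name is one reasonable definition of "category"; a team could equally key thresholds on full probe names.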

garak is the natural always-on regression gate: pin a probe subset, set per-category failure thresholds, and run it on every model or system-prompt change. Our garak walkthrough covers the probe/detector/generator architecture in depth. Its limit for CI is that it’s a fixed-corpus scanner — strong on known attack classes, thinner on multi-turn and bespoke campaigns.

PyRIT: the programmable campaign

PyRIT (Microsoft, MIT) is a red-teaming SDK, not a one-command scanner. You compose targets, datasets, orchestrators, converters, and scorers in Python. That makes it more work to wire into CI — there’s no single CLI gate out of the box — but it buys two things garak can’t:

  • Multi-turn attack strategies (Crescendo, TAP, Skeleton Key) that carry conversation state, catching vulnerabilities a single-shot scanner structurally can’t.
  • Custom scorers tied to your own policy, so the pass/fail signal reflects your definition of harm rather than a generic detector’s.

In CI, PyRIT fits as a scheduled campaign rather than a per-commit gate: a committed Python harness (orchestrators, datasets, scorers in version control) run on a schedule, emitting scores your pipeline thresholds against. Our PyRIT explainer covers the architecture. The cost is that you own the integration glue; the benefit is depth garak doesn’t reach.
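
To make the multi-turn point concrete, here is a plain-Python sketch of what "carrying conversation state" and "custom scorer" mean for a gate. It deliberately does not use the PyRIT API: `run_campaign`, `toy_target`, and `policy_scorer` are hypothetical names, and the target is a stand-in function, not a real model. A real harness would replace them with PyRIT's own targets, orchestrators, and scorers.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


@dataclass
class Conversation:
    """Conversation state shared across turns: a list of (prompt, response) pairs."""
    turns: List[Tuple[str, str]] = field(default_factory=list)


def run_campaign(
    send: Callable[[Conversation, str], str],
    escalation: Iterable[str],
    score: Callable[[str], float],
    threshold: float,
) -> Tuple[float, Conversation]:
    """Send escalating prompts in one conversation; stop once a response scores at or above threshold."""
    convo = Conversation()
    worst = 0.0
    for prompt in escalation:
        response = send(convo, prompt)
        convo.turns.append((prompt, response))
        worst = max(worst, score(response))
        if worst >= threshold:
            break
    return worst, convo


def toy_target(convo: Conversation, prompt: str) -> str:
    # Stand-in model: refuses at first, gives way once two prior turns exist.
    if len(convo.turns) >= 2:
        return "Sure, here is the internal runbook: ..."
    return "I can't help with that."


def policy_scorer(response: str) -> float:
    # Hypothetical policy: leaking the internal runbook is the harm we gate on.
    return 1.0 if "internal runbook" in response.lower() else 0.0
```

Against this toy target, a one-prompt run never scores above zero, while a run of three or more prompts reaches the threshold on the third turn. That difference is what a single-shot scanner cannot observe.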

Promptfoo: red-teaming built for the pipeline

Promptfoo (MIT-licensed; the project is now part of OpenAI and remains open source) was designed CI-first. It started as a prompt/RAG evaluation tool and grew a red-team mode that auto-generates adversarial prompts using a large library of attack plugins — prompt injection, jailbreaks, PII leakage, excessive agency, and many more — driven by a declarative YAML config rather than code.

For CI specifically, Promptfoo is the most turnkey of the three:

  • Declarative config. The whole eval/red-team run is a YAML file you commit — no harness code.
  • First-class CI/CD integration, including a GitHub Action for red-team scanning, so a failing finding blocks a pull request directly.
  • Compliance mappings. Its red-team presets map to the OWASP LLM Top 10, NIST AI RMF, and MITRE ATLAS, which is useful when the gate has to produce an auditable report, not just a pass/fail.

Promptfoo’s sweet spot is teams that want red-teaming wired into pull-request CI with minimal custom code and a compliance-flavored report at the end.

Side-by-side for CI

| | garak | PyRIT | Promptfoo |
| --- | --- | --- | --- |
| Maintainer | NVIDIA | Microsoft | Promptfoo (part of OpenAI) |
| License | Apache-2.0 | MIT | MIT |
| Shape | Probe scanner (CLI) | Red-team SDK (Python) | Eval + red-team (declarative + CLI) |
| CI integration | JSONL parse + threshold | Custom harness, scheduled | Native GitHub Action |
| Multi-turn attacks | Limited | Strong (Crescendo, TAP) | Plugin-based |
| Config style | CLI flags / option files | Python code | YAML |
| Compliance reporting | Manual | Manual | OWASP / NIST / MITRE presets |
| Best CI role | Per-commit regression gate | Scheduled deep campaign | PR-blocking red-team gate |

They’re complementary, not exclusive

The mature pipeline uses more than one, mapped to where each is strong:

  • Promptfoo or garak as the fast per-commit / per-PR gate — declarative and CI-native (Promptfoo) or one-command with threshold parsing (garak).
  • garak full suite nightly, for breadth the per-commit subset skips.
  • PyRIT on a schedule for the deep, multi-turn, custom-scored campaigns that need conversation state and your own policy.

All three are pre-deployment tools. None provides runtime protection — for live input/output screening you need a separate layer like LLM Guard or a guardrail model, covered in our guardrail selection guide and at guardml.io. And a gate is only as good as its thresholds: set per-category pass/fail bars deliberately, because they will otherwise be negotiated under pressure after an incident.

The threshold problem

A red-team CI gate fails builds, which means a badly tuned gate either lets regressions through (thresholds too loose) or blocks every release on noise (thresholds too tight). Two disciplines keep it honest:

  1. Pin the target model to a dated snapshot. A gate that runs against a floating model alias will flip pass/fail when the provider silently updates the model, and you’ll waste days chasing a “regression” you didn’t cause. aisecbench.com covers the reproducibility discipline these gates depend on.
  2. Set thresholds against measured false-positive cost. An attack-detection threshold that’s too aggressive blocks legitimate releases; our false-positive cost guide covers turning detection rates into a defensible bar.
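
Both disciplines can be enforced mechanically before the scan runs. The sketch below is one possible approach under stated assumptions, not the method from the guides referenced above: `is_pinned` treats any identifier ending in a date stamp as a pinned snapshot (providers name snapshots differently, so the pattern is a guess to adapt), and `threshold_from_baseline` derives a bar from the failure rates of repeated runs against a known-good build, using mean plus a margin of standard deviations as a simple stand-in for a measured false-positive cost.

```python
import re
import statistics

# Assumed convention: pinned snapshots end in a date such as 2024-08-06 or 20240806.
DATED_SUFFIX = re.compile(r"\d{4}-?\d{2}-?\d{2}$")


def is_pinned(model_id: str) -> bool:
    """True when the model identifier ends in a date stamp rather than a floating alias."""
    return bool(DATED_SUFFIX.search(model_id))


def threshold_from_baseline(rates, margin: float = 2.0) -> float:
    """Baseline mean plus `margin` sample standard deviations, capped at 1.0.

    `rates` are failure rates from repeated scans of a build accepted as good,
    so the result sits above ordinary run-to-run noise.
    """
    mean = statistics.mean(rates)
    spread = statistics.stdev(rates) if len(rates) > 1 else 0.0
    return min(1.0, mean + margin * spread)
```

A pipeline might refuse to start when `is_pinned` returns False for the configured target, and recompute thresholds only when the pinned snapshot is deliberately changed.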

For the attack techniques all three tools generate, aisec.blog breaks down the mechanics.

Practical recommendation

If you want the fastest path to a PR-blocking gate with a compliance report, start with Promptfoo — declarative YAML and a native GitHub Action make it the lowest-friction CI option. If you’re already running garak, keep it as the per-commit regression gate (subset) plus a nightly full suite; its JSONL output parses cleanly into a threshold check. Add PyRIT as a scheduled deep campaign when your threat model needs multi-turn strategies or custom scoring tied to your policy. Pin the target to a dated snapshot in every case, and set per-category thresholds before they matter. For where these scanners sit in a complete stack, see Best LLM Security Scanners: Open-Source and Enterprise Compared, and aidefense.dev for surrounding defense strategy.


Sources

  • NVIDIA/garak — GitHub: LLM vulnerability scanner. Apache-2.0. Single-command probe runs, JSONL/HTML output suitable for CI threshold parsing.
  • microsoft/PyRIT — GitHub: Python Risk Identification Tool. MIT. Programmable orchestrators with multi-turn strategies (Crescendo, TAP, Skeleton Key) and custom scorers.
  • promptfoo/promptfoo — GitHub: Test and red-team LLM apps. MIT-licensed; now part of OpenAI and remains open source. Declarative configs, CI/CD and GitHub Action integration, attack plugins with OWASP/NIST/MITRE mappings.
