Tag
#content-moderation
2 posts tagged content-moderation.
- guardrails
Classifier-on-Output: Catching Misbehavior Post-Generation
How production teams use post-generation classifiers to catch what input filters and refusal training miss — architectures, tradeoffs, and where output classifiers earn their latency budget.
- guardrails
Llama Guard vs NeMo vs OpenAI Moderation: Production Tradeoffs
A practitioner comparison of Llama Guard, NeMo Guardrails, and the OpenAI Moderation API — coverage, latency, customization, and where each one breaks in production.