Toxicity Classifier

Check whether text is toxic, insulting, threatening, or hateful before you post or share it. Runs a small AI model 100% in your browser — no account, no upload.

Your data never leaves your device

Frequently Asked Questions

  • What counts as "toxic"?

    The model returns a score for each of six categories defined by the Jigsaw/Conversation AI toxic-comment dataset, which is built from Wikipedia talk-page comments and is widely used to train moderation models: toxic, severe toxic, obscene, threat, insult, and identity hate. A higher score means the model is more confident that the category applies.
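
    As a rough illustration of how the six scores might be read, the sketch below keeps only the categories at or above a cutoff. The helper name `flaggedCategories`, the 0.5 default threshold, the snake_case label spellings (taken from the Jigsaw dataset column names), and the example scores are all assumptions made for illustration; none of them come from this page.

```typescript
// Category names as spelled in the Jigsaw dataset columns
// (assumed here to match the labels the model reports).
type Category =
  | "toxic"
  | "severe_toxic"
  | "obscene"
  | "threat"
  | "insult"
  | "identity_hate";

interface CategoryScore {
  label: Category;
  score: number; // treated here as a value between 0 and 1
}

// Hypothetical helper: return the categories whose score meets the threshold,
// highest score first. The 0.5 default is an illustrative choice only.
function flaggedCategories(
  scores: CategoryScore[],
  threshold: number = 0.5
): CategoryScore[] {
  return scores
    .filter((s) => s.score >= threshold)
    .sort((a, b) => b.score - a.score);
}

// Made-up scores for illustration; these are not real model output.
const example: CategoryScore[] = [
  { label: "toxic", score: 0.93 },
  { label: "severe_toxic", score: 0.04 },
  { label: "obscene", score: 0.12 },
  { label: "threat", score: 0.01 },
  { label: "insult", score: 0.71 },
  { label: "identity_hate", score: 0.02 },
];

// Logs the labels "toxic" and "insult", in that order.
console.log(flaggedCategories(example).map((s) => s.label));
```

    A single cutoff is the simplest possible policy. Rare categories such as threat may warrant a different threshold than the general toxic label.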

  • Which model is used?

    A BERT-based classifier (Xenova/toxic-bert) served as a quantized ONNX file (~60 MB). It runs on WebAssembly via @huggingface/transformers, and the weights are cached in your browser after the first use.
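
    For readers curious how a page might call such a model, here is a minimal sketch, not the code this page actually runs. It assumes the `pipeline` function from @huggingface/transformers and its `top_k` option for requesting every label; the `Classifier` type, the `scoreText` helper, and the stand-in classifier with made-up scores are hypothetical and exist only so the sketch runs without downloading the model.

```typescript
// One result item, in the shape text-classification pipelines commonly return.
interface LabelScore {
  label: string;
  score: number;
}

// Any async function that maps text to per-label scores.
// Assumed wiring in a real page (not executed here):
//   import { pipeline } from "@huggingface/transformers";
//   const classifier = await pipeline("text-classification", "Xenova/toxic-bert");
type Classifier = (
  text: string,
  options?: { top_k?: number | null }
) => Promise<LabelScore[]>;

// Hypothetical helper: ask for every label's score and return the result
// as a plain label -> score record.
async function scoreText(
  classifier: Classifier,
  text: string
): Promise<Record<string, number>> {
  // top_k: null is assumed to mean "return all labels, not just the top one".
  const results = await classifier(text, { top_k: null });
  const scores: Record<string, number> = {};
  results.forEach((item) => {
    scores[item.label] = item.score;
  });
  return scores;
}

// Stand-in classifier with made-up scores, used instead of the real model.
const fakeClassifier: Classifier = async () => [
  { label: "toxic", score: 0.9 },
  { label: "insult", score: 0.6 },
];

scoreText(fakeClassifier, "example input").then((scores) => {
  console.log(scores);
});
```

    Passing the classifier in as a parameter keeps the helper independent of the model download, which also makes it easy to exercise with a stub.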

  • Is my text uploaded anywhere?

    No. All classification runs in your browser. Nothing is sent to a server.

  • How accurate is it?

    The model was trained on English-language social-media and comment data, so it performs best on similar content. It can miss sarcasm, dog whistles, and toxicity in languages other than English. Treat it as an assistant, not as the sole arbiter.