
Model · Demo · Quickstart · Quick comparisons


PixAI Tagger v0.9

A practical anime multi-label tagger. Not trying to win benchmarks; trying to be useful.
High recall, updated character coverage, trained on a fresh Danbooru snapshot (2025-01).
We’ll keep shipping: v1.0 (with updated tags) is next.

TL;DR

  • ~13.5k Danbooru-style tags (general, character, copyright)
  • Headline: strong character performance; recall-leaning defaults
  • Built for search, dataset curation, caption assistance, and text-to-image conditioning

What it is (in one breath)

pixai-tagger-v0.9 is a multi-label image classifier for anime images. It predicts Danbooru-style tags and aims to find more of the right stuff (recall) so you can filter later. We continued training the classification head of EVA02 (from WD v3) on a newer dataset, and used embedding-space MixUp to help calibration.

  • Last trained: 2025-04
  • Data snapshot: Danbooru IDs 1–8,600,750 (2025-01)
  • Finetuned from: SmilingWolf/wd-eva02-large-tagger-v3 (encoder frozen)
  • License (weights): Apache 2.0 (Note: Danbooru content has its own licenses.)

Why you might care

  • Newer data. Catches more recent IPs/characters.
  • Recall-first defaults. Good for search and curation; dial thresholds for precision.
  • Character focus. We spent time here; it shows up in evals.
  • Simple to run. Works as an endpoint or locally; small set of knobs.

Quickstart

Recommended defaults (balanced):

  • top_k = 128
  • threshold_general = 0.30
  • threshold_character = 0.75

Coverage preset (recall-heavier): threshold_general = 0.10 (expect more false positives)
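In effect, the presets are just a per-group score cutoff plus a top_k cap. A minimal sketch of that logic (tag names, scores, and groups below are made up for illustration; the endpoint presumably applies equivalent filtering server-side):

```python
def filter_tags(scores, groups, threshold_general=0.30,
                threshold_character=0.75, top_k=128):
    """Keep tags whose score clears their group's threshold, capped at top_k."""
    thresholds = {"general": threshold_general, "character": threshold_character}
    kept = [
        (tag, score) for tag, score in scores.items()
        if score >= thresholds.get(groups[tag], threshold_general)
    ]
    # Highest-scoring tags first, capped at top_k
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]

# Illustrative scores, not real model output
scores = {"1girl": 0.97, "smile": 0.41, "hat": 0.12, "mika_(blue_archive)": 0.80}
groups = {"1girl": "general", "smile": "general", "hat": "general",
          "mika_(blue_archive)": "character"}
print(filter_tags(scores, groups))
```

With the coverage preset (threshold_general = 0.10), the low-confidence "hat" would survive the filter as well.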

1) Inference Endpoint

Deploy as an HF Inference Endpoint and test with the following command:

# Replace with your own endpoint URL
curl "https://YOUR_ENDPOINT_URL.huggingface.cloud" \
  -X POST \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {"url": "https://your.cdn/image.jpg"},
    "parameters": {
      "top_k": 128,
      "threshold_general": 0.10,
      "threshold_character": 0.75
    }
  }'

2) Python (InferenceClient)

import json

from huggingface_hub import InferenceClient

client = InferenceClient("https://YOUR_ENDPOINT_URL.huggingface.cloud")
raw = client.post(json={
    "inputs": {"url": "https://your.cdn/image.jpg"},
    "parameters": {"top_k": 128, "threshold_general": 0.10, "threshold_character": 0.75}
})
out = json.loads(raw)  # client.post returns raw bytes
# out: [{"tag": "1girl", "score": 0.97, "group": "general"}, {"tag": "mika_(blue_archive)", "score": 0.92, "group": "character"}, ...]
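Downstream code usually wants the flat response split by tag group. A small helper, assuming the response shape shown above:

```python
def split_by_group(tags):
    """Group a flat list of {"tag", "score", "group"} dicts by group."""
    grouped = {}
    for t in tags:
        grouped.setdefault(t["group"], []).append((t["tag"], t["score"]))
    return grouped

# Response shape as documented above (values illustrative)
resp = [{"tag": "1girl", "score": 0.97, "group": "general"},
        {"tag": "mika_(blue_archive)", "score": 0.92, "group": "character"}]
print(split_by_group(resp))
```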

3) Local Deployment

Alternatively, the tagger can be run locally via the imgutils tool.


Training notes (short version)

  • Source: Danbooru (IDs 1–8,600,750; snapshot 2025-01)
  • Tag set: ~13,461 tags (≥600 occurrences); grouped as general/character/copyright
  • Filtering: remove images with <10 general tags (WD v3 heuristic)
  • Setup: EVA02 encoder frozen; classification head continued training
  • Input: 448×448; standard Danbooru tag normalization
  • Augment: MixUp in embedding space (α=200)
  • Optim: Adam 1e-5, cycle schedule; batch 2048; full precision
  • Compute: ~1 day on a single 8×H100 node
  • (Explored full-backbone training; deferred—head-only was more stable and faster for data iteration.)
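Embedding-space MixUp here follows the standard recipe, applied after the frozen encoder: sample λ ~ Beta(α, α) and blend both embeddings and multi-hot label vectors. With α = 200 the Beta distribution concentrates tightly around 0.5, so pairs are mixed almost evenly. A sketch with NumPy (shapes and names are illustrative; the actual training pass may differ in detail):

```python
import numpy as np

def mixup_embeddings(emb, labels, alpha=200.0, rng=None):
    """Mix a batch of frozen-encoder embeddings and multi-hot labels.

    emb:    (batch, dim) float array of encoder outputs
    labels: (batch, num_tags) multi-hot targets
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)       # concentrates near 0.5 for large alpha
    perm = rng.permutation(len(emb))   # random pairing within the batch
    mixed_emb = lam * emb + (1 - lam) * emb[perm]
    mixed_labels = lam * labels + (1 - lam) * labels[perm]
    return mixed_emb, mixed_labels

emb = np.random.default_rng(0).normal(size=(4, 8))
labels = np.eye(4, 6)  # toy multi-hot targets
mixed_emb, mixed_labels = mixup_embeddings(emb, labels, alpha=200.0)
```

Because the encoder is frozen, embeddings can be precomputed once and mixed cheaply every epoch, which is part of why head-only training iterates quickly.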

Evaluation (what to expect)

Metric style: Fixed thresholds (above). Reported as micro-averaged unless noted.

  • All-tags (13k) micro-F1: ~0.60 (recall-leaning)
  • Character subset (4k) micro-F1: 0.865 @ t_char=0.75
  • Reference: WD v3 SwinV2 character F1 ≈ 0.608 (same protocol)
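Micro-averaging pools true/false positives and false negatives across all tags and images before computing F1, so frequent tags dominate the score (macro-averaging, by contrast, averages per-tag F1s). A minimal sketch of the metric over per-image tag sets (example tags are illustrative):

```python
def micro_f1(pred_sets, true_sets):
    """Micro-averaged F1 over per-image sets of predicted/ground-truth tags."""
    tp = fp = fn = 0
    for pred, true in zip(pred_sets, true_sets):
        tp += len(pred & true)   # tags predicted and present
        fp += len(pred - true)   # tags predicted but absent
        fn += len(true - pred)   # tags present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

preds = [{"1girl", "smile"}, {"1boy"}]
truth = [{"1girl", "smile", "hat"}, {"1boy", "outdoors"}]
print(round(micro_f1(preds, truth), 3))  # 0.75
```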

Internal “accuracy/coverage” snapshot

| Model | Coverage-F1 | Accuracy-F1 | Acc-Recall | Acc-Precision | Cov-Precision | Cov-Recall |
|---|---|---|---|---|---|---|
| PixAI v0.9 | 0.4910 | 0.4403 | 0.6654 | 0.3634 | 0.4350 | 0.6547 |
| WD-v3-EVA02 | 0.4155 | 0.4608 | 0.4465 | 0.5248 | 0.4580 | 0.4083 |
| WD-v3-SwinV2 | 0.3349 | 0.3909 | 0.3603 | 0.4821 | 0.3906 | 0.3171 |
| Camie-70k | 0.4877 | 0.4800 | 0.5743 | 0.4123 | 0.4288 | 0.5930 |

Notes:

  • Character uses t≈0.75; coverage often uses t≈0.10.
  • Keep micro vs macro consistent when updating numbers.

[Figure: evaluation plots for internal candidate versions]

Note: Plots show internal candidate versions (v2.x). The current release, pixai-tagger-v0.9, corresponds to internal candidate v2.4.1. A follow-up version is in progress.


Quick comparisons

A fast feel for where v0.9 sits. Numbers are from our protocol and may differ from others’.

| Topic | PixAI Tagger v0.9 | WD v3 (EVA02 / SwinV2) | What it means in practice |
|---|---|---|---|
| Data snapshot | Danbooru to 2025-01 | Danbooru to 2024-02 | Better coverage of newer IPs |
| Tag vocabulary | ~13.5k tags | ~10.8k tags | More labels to catch the long tail |
| Character F1 | ≈0.865 (@ 0.75 threshold) | ~0.61 (SwinV2 ref) | Stronger character recognition |
| Default posture | Recall-leaning (tune down for precision) | Often more balanced | Good for search/curation; more false positives; set your own thresholds |
| Model size | ~1.27 GB checkpoint | Similar ballpark | Easy to host; endpoint-friendly |
| Training strategy | Head-only; encoder frozen (EVA02) | Depends on release | Faster iteration on data updates |

Intended use

You can:

  • Auto-tag anime images with Danbooru-style tags
  • Build tag-search indices
  • Assist caption generation (merge tags with NL captions)
  • Feed tags into text-to-image pipelines (alone or alongside text)

Please don’t rely on it for:

  • Legal/safety moderation or age verification
  • Non-anime imagery (performance will drop)
  • Fine-grained counting/attributes without human review

Limitations & risks

  • NSFW & sensitive tags. The dataset contains them; outputs may too.
  • Recall vs precision. Low thresholds increase false positives.
  • Hallucinations. Number-sensitive or visually similar tags can be mispredicted.
  • Representation bias. Mirrors Danbooru’s styles, tropes, and demographics.
  • IP/character names. Can be wrong or incomplete; use allow/deny lists and co-occurrence rules.

Tuning tips

  • Set different thresholds for general vs character tags.
  • Consider allow/deny lists for your domain.
  • Add simple co-occurrence rules to suppress contradictions.
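The last two tips combine naturally into a small post-processing pass. A sketch with hypothetical rules (the deny list and contradiction pairs below are examples; you would define your own for your domain):

```python
def postprocess(tags, deny=frozenset(), contradictions=()):
    """tags: {tag: score}. Drop denied tags, then for each contradictory
    pair keep only the higher-scoring side."""
    tags = {t: s for t, s in tags.items() if t not in deny}
    for a, b in contradictions:
        if a in tags and b in tags:
            loser = a if tags[a] < tags[b] else b
            del tags[loser]
    return tags

# Hypothetical raw output and rules
raw = {"1girl": 0.95, "2girls": 0.40, "watermark": 0.33}
clean = postprocess(raw, deny={"watermark"},
                    contradictions=[("1girl", "2girls")])
print(clean)  # {"1girl": 0.95}
```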

Authors / Contributors

  • Linso — primary contributor (training, data processing)
  • narugo1992 — contributions
  • AngelBottomless (PixAI) — contributions
  • trojblue (PixAI) — contributions
  • The rest of the PixAI team — further development support and testing

We also appreciate the broader anime image generation community. Several ideas, discussions, and experiments from outside PixAI helped shape this release.


Maintenance

  • We plan future releases with updated snapshots.
  • v1.0 will include updated tags + packaging improvements.
  • Changelog will live in the repo.
