Subject: Trainable classifier completes with 0% accuracy / "training completed with failures" — need diagnostics
Product: Microsoft Purview — Trainable Classifiers (Data Classification)
Summary: We created a custom trainable classifier using seed content stored in SharePoint. Training completes but returns a 0% accuracy score with the message "Training completed with failures — based on your accuracy score, we recommend trying again with a different set of samples." The test-results view for the classifier is empty, so we have no per-document detail on what failed.
Environment / setup:
- Seed content is in a dedicated SharePoint site, in the default document library, organized into top-level folders.
- Each positive folder has 50+ files; the negative pool has well over 150. Files are unencrypted Office/PDF documents in English.
- Classifier was created successfully and the seed folders were selectable in the picker.
- Account creating the classifier has Compliance Administrator and Security Administrator roles; tenant is licensed for E5/E5 Compliance.
What we have already checked:
- Confirmed sample counts meet the 50 positive / 150 negative minimums.
- Confirmed files are unencrypted and in supported formats/language.
- Confirmed the seed site/library is indexed and the folders resolve in the classifier picker.
- Reviewed the classifier detail page — it shows the 0% score but the test-items / matched-items view is empty.
Questions / requests:
- Can you check the service-side training job for this classifier and confirm the specific reason training failed (e.g. ingestion/parse errors, insufficient differentiation between positive and negative sets, or other)?
- Are there any per-document or per-sample diagnostic logs available on your side indicating which samples could not be ingested/parsed, or were found non-differentiable? The portal exposes none to us.
- Is the 0% score the result of the model falling below an internal confidence threshold, and if so, is that threshold documented anywhere we can reference?
- Is there anything on the service/indexing side (content classification crawl, sample ingestion) that could be contributing, independent of the sample content itself?
- Is there any recommended remediation beyond "choose a different sample set", for example specific guidance on positive/negative set composition for this failure mode?
Goal: Confirm whether this is purely a sample-composition issue or whether anything service-side is contributing, before we invest in rebuilding the sample sets.
Impact: Blocking an auto-labeling rollout; medium priority. Classifier ID and supporting screenshots available on request through the case.