Classification Performance

Performance changelog for the Classification processor. This section covers label accuracy improvements, confidence calibration, drift monitoring, and optimizations for high-throughput document routing.


What’s Included

  • Accuracy: Precision and recall improvements across document categories
  • Confidence Calibration: Score alignment with observed prediction quality
  • Drift Detection: Monitoring for distribution shifts in incoming documents
  • Human-in-the-Loop: Review queue integration for uncertain predictions

Recent Updates

2024-12-14 — Confidence Score Recalibration

Deployed an updated calibration model that reduces expected calibration error (ECE) from 0.08 to 0.03. Confidence scores now track observed prediction accuracy more closely.

  • Impact: Accuracy
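
For reference, ECE is commonly computed by binning predictions by confidence and taking the weighted average gap between each bin's mean confidence and its observed accuracy. A minimal sketch, assuming equal-width bins. The function name and the 10-bin default are illustrative and are not taken from the processor's implementation:

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width binned ECE.

    confidences: predicted confidence per document, each in [0, 1]
    correct:     True/False per document, whether the prediction was right
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    # Weighted average of |accuracy - mean confidence| over non-empty bins.
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

A lower value means confidence scores sit closer to observed accuracy; a score of 0.9 that is right 75% of the time contributes a gap of 0.15 for its bin.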

2024-12-02 — Multi-Label Classification Support

Added support for assigning multiple labels per document. Useful for documents spanning categories such as “Invoice + Contract” or “Receipt + Warranty”.

  • Impact: UX
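
One common way to produce multiple labels is to score each label independently and keep every label whose score clears a cutoff. A minimal sketch under that assumption. The function name, the score dictionary shape, and the 0.5 cutoff are hypothetical and do not describe the API's actual response format:

```python
def assign_labels(scores, threshold=0.5):
    """Return every label whose score meets the threshold, highest score first.

    scores: dict mapping label name -> independent score in [0, 1]
    threshold: hypothetical cutoff, not a documented default
    """
    kept = [(label, score) for label, score in scores.items() if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [label for label, _ in kept]


# A document that is both an invoice and a contract keeps both labels.
labels = assign_labels({"Invoice": 0.91, "Contract": 0.78, "Receipt": 0.05})
```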

2024-11-20 — Active Learning Pipeline

Introduced an automated active learning loop that identifies high-uncertainty samples for human review. F1 score on receipt classification improved by 4 points after two review cycles.

  • Impact: Accuracy
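
The entry does not say how uncertainty is measured. As an illustration only, a typical approach is entropy-based uncertainty sampling: rank documents by the entropy of their predicted class probabilities and send the top few to review. The function names and input shape below are assumptions:

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_review(predictions, k):
    """Pick the k documents whose predictions are most uncertain.

    predictions: dict mapping doc_id -> list of class probabilities
    """
    ranked = sorted(predictions, key=lambda doc_id: entropy(predictions[doc_id]), reverse=True)
    return ranked[:k]


predictions = {
    "doc-a": [0.98, 0.01, 0.01],  # confident prediction, low entropy
    "doc-b": [0.40, 0.35, 0.25],  # near-uniform, high entropy
    "doc-c": [0.60, 0.30, 0.10],
}
to_review = select_for_review(predictions, k=2)
```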

2024-11-06 — Batch Inference Streaming

Batch classification jobs now stream partial results for runs exceeding 500 documents. Webhook notifications are available at 25%, 50%, 75%, and 100% completion.

  • Impact: UX
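
To make the milestone behavior concrete, the sketch below works out which completion milestones a job crosses as its processed count advances. It models only the arithmetic. The function names are hypothetical, and the real webhook payload and delivery logic are not shown here:

```python
MILESTONES = (25, 50, 75, 100)  # completion percentages from the entry above
STREAMING_MIN_DOCS = 500        # partial results stream only above this size


def streams_partial_results(total_docs):
    """True when a run is large enough to stream partial results."""
    return total_docs > STREAMING_MIN_DOCS


def newly_reached_milestones(done_before, done_after, total_docs):
    """Milestones crossed when the processed count moves from done_before to done_after."""
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    # Integer comparison avoids floating-point rounding at exact boundaries.
    return [
        m for m in MILESTONES
        if done_before * 100 < m * total_docs <= done_after * 100
    ]
```

For a 1,000-document run, moving from 240 to 510 processed documents crosses both the 25% and 50% milestones, so a consumer polling at coarse intervals should expect more than one notification per interval.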

2024-10-25 — Drift Detection Alerts

Added automated drift detection with configurable thresholds. Alerts trigger when the distribution of incoming documents diverges from the training baseline by more than 15%.

  • Impact: Reliability
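
The entry does not name the divergence measure behind the 15% figure. Purely as an illustration, the sketch below compares category frequencies using total variation distance, which is an assumed stand-in and not necessarily the metric the processor uses. All names are hypothetical:

```python
def total_variation_distance(baseline, incoming):
    """Half the sum of absolute differences between two category distributions.

    baseline, incoming: dicts mapping category -> share of documents (each sums to 1)
    """
    categories = set(baseline) | set(incoming)
    return 0.5 * sum(
        abs(baseline.get(c, 0.0) - incoming.get(c, 0.0)) for c in categories
    )


def drift_alert(baseline, incoming, threshold=0.15):
    """True when the distance exceeds the threshold (0.15 mirrors the 15% above)."""
    return total_variation_distance(baseline, incoming) > threshold


baseline = {"invoice": 0.5, "receipt": 0.3, "contract": 0.2}
shifted = {"invoice": 0.3, "receipt": 0.3, "contract": 0.4}   # distance 0.20
stable = {"invoice": 0.45, "receipt": 0.3, "contract": 0.25}  # distance 0.05
```

With these inputs, `drift_alert(baseline, shifted)` fires and `drift_alert(baseline, stable)` does not.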

2024-10-12 — Benchmark Expansion

Expanded internal benchmark suite to include transportation, insurance, and healthcare document sets. Total benchmark coverage now exceeds 12,000 labeled samples.

  • Impact: Accuracy

2024-09-29 — Human Review Queue

Launched Console integration for human-in-the-loop review. Documents scoring below the confidence threshold are routed automatically to a review queue with an annotation interface.

  • Impact: Accuracy
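
The routing rule can be pictured as a single comparison against the threshold. A minimal sketch: the `Prediction` type, the destination names, and the 0.8 default are hypothetical and are not part of the Console integration:

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    doc_id: str
    label: str
    confidence: float


def route(prediction, threshold=0.8):
    """Send predictions below the threshold to human review, accept the rest."""
    if prediction.confidence < threshold:
        return "review_queue"
    return "auto_accept"


low = route(Prediction("doc-1", "Invoice", 0.62))
high = route(Prediction("doc-2", "Receipt", 0.93))
```

A prediction exactly at the threshold is accepted in this sketch, since only scores strictly below it are routed to review.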

2024-09-16 — GPU Memory Optimization

Reduced GPU memory footprint by 22%, allowing larger batch sizes on standard GPU instances. No accuracy degradation observed.

  • Impact: Latency

Compatibility Notes

  • Multi-label classification requires API v2.1 or later
  • Drift detection available for Enterprise tier
  • Active learning requires a minimum of 50 reviewed samples to activate

Roadmap (Next Quarter)

  • Zero-shot classification for new document types without training
  • Explainability reports with feature attribution
  • Custom confidence thresholds per label category