3-5× fewer labels for the same accuracy

Label only what your model is unsure about.

The fastest way to improve a CV model isn't labelling more images — it's labelling the right images. Active learning sorts your unlabelled pool by model uncertainty so annotators spend time on the 50 images that move accuracy the most, not the 500 the model already gets right.

Strategy: Detector confidence + entropy + margin sampling
Inputs: project_id (images already in project)
Outputs: Uncertainty-sorted image queue
Speed-up: 3-5× vs random sampling
Free quota: 300 calls / month

Standard labelling sprints pick images at random from the unlabelled pool. That works, but it's wasteful — most of the labels re-confirm what the model already knows. Active learning runs the current model over the unlabelled pool, scores each image by uncertainty, and hands the most-uncertain ones to your annotators. Empirically this hits a target accuracy with 3-5× fewer labels.

mSightFlow's /v1/label/score-batch endpoint applies confidence, margin, and entropy sampling — combined or individually — and optionally pairs uncertainty with CLIP-based diversity so the top-N doesn't cluster on visually similar hard cases.

When active learning is the right tool

Limited labelling budget

You have 1,000 images and labels for 100. Active learning picks the 100 that maximise accuracy.

Closed-loop retraining

Train → score → label uncertain → retrain (see the closed-loop snippet below). Accuracy typically plateaus after 3-5 rounds; each round adds the next 50-100 highest-value labels.

Edge-case discovery

Models fail at decision boundaries. Margin sampling surfaces them so annotators find blind spots the team didn't know existed.

Sampling strategies

lowest_confidence: Sort by minimum max-class probability per image. Simplest and usually strong.
margin_sampling: Sort by the smallest gap between top-1 and top-2 class probabilities. Surfaces decision-boundary cases.
entropy: Sort by entropy of the class distribution. Surfaces cases where the model is genuinely confused vs confidently wrong.
diverse_uncertainty: Combines uncertainty with CLIP-based diversity so the top-N covers different visual modes; best for batch labelling.

Default strategy is diverse_uncertainty. Pass strategy=... to switch.
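
The three base strategies are standard textbook scores. As a minimal local sketch (not mSightFlow's internal implementation), here's how each would be computed from per-image class probabilities produced by your own model:

import numpy as np

def base_uncertainty_scores(probs):
    # probs: (n_images, n_classes) class probabilities, rows summing to 1
    p = np.asarray(probs, dtype=float)
    top2 = np.sort(p, axis=1)[:, -2:]           # two largest probabilities per image
    lowest_confidence = 1.0 - top2[:, 1]        # high when the max prob is low
    margin = top2[:, 1] - top2[:, 0]            # small gap = decision-boundary case
    entropy = -(p * np.log(p + 1e-12)).sum(1)   # high when genuinely confused
    return lowest_confidence, margin, entropy

Sorting ascending by margin, for example, reproduces what margin_sampling surfaces.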

Code — score, loop, diversify

Score a project
import os, requests

base = "https://api.msightflow.ai/v1"
hdr = {"Authorization": f"Bearer {os.environ['MSF_API_KEY']}"}

resp = requests.get(
    base + "/label/score-batch",
    headers=hdr,
    params={
        "project_id": "PROJECT_ID",
        "strategy": "diverse_uncertainty",   # uncertainty + CLIP-based diversity
        "limit": 50,
    },
).json()

for item in resp["queue"]:
    print(f"{item['uncertainty_score']:.3f}  {item['image_id']}  {item['thumbnail']}")
Closed loop
# Closed-loop active learning: auto-label → score → relabel → retrain.
def active_learning_round(project_id):
    # 1. Auto-label the current pool
    requests.post(base + "/labeling/auto",
        headers=hdr, data={"project_id": project_id, "tasks": "detect"})

    # 2. Score the unlabelled pool by uncertainty
    queue = requests.get(base + "/label/score-batch",
        headers=hdr, params={"project_id": project_id, "limit": 50}).json()

    # 3. Send the top 50 most-uncertain images to the annotation queue
    requests.post(base + "/assignments/queue",
        headers=hdr, json={"image_ids": [q["image_id"] for q in queue["queue"]]})

    # 4. After human review, retrain (external) and run the round again

for round_n in range(5):
    active_learning_round("PROJECT_ID")
    # Typical: model accuracy plateaus after 3-5 rounds for most datasets
Diverse uncertainty
# Diversity-aware uncertainty: combines model uncertainty with CLIP-based
# visual diversity. Reduces the 'top-N all look the same' failure mode.
resp = requests.get(base + "/label/score-batch",
    headers=hdr,
    params={
        "project_id": "PROJECT_ID",
        "strategy": "diverse_uncertainty",
        "diversity_weight": 0.4,    # 0 = pure uncertainty, 1 = pure diversity
        "limit": 50,
    },
).json()

Pricing — same as every other endpoint

Free

$0

  • 300 API calls / month
  • Uncertainty scoring on any project
Start free

Pro

$29/mo

  • Unlimited calls
  • Tunable strategy weights
  • Higher per-provider quotas
Go Pro

FAQ

How does this beat random sampling?

Empirically, active learning typically gets you to a target accuracy with 3-5× fewer labels than random sampling on object-detection tasks. The reason: a model that's already 90% accurate doesn't learn much from labels on images it already gets right. Uncertainty sampling surfaces the images where the model's wrong or near the decision boundary — exactly where new labels move the needle.

What sampling strategies do you use?

Three combined: (1) lowest-confidence — sort by the lowest max-class probability per image, (2) margin sampling — sort by the smallest gap between top-1 and top-2 class probabilities, (3) entropy — sort by the entropy of the class probability distribution. Each surfaces slightly different uncertainty patterns; we return a weighted combination. Strategy weights are tunable on the Pro tier.
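
The exact weighting isn't published. As an illustrative sketch with hypothetical equal weights, a blend consistent with the API's lower-is-more-uncertain convention could look like this (reusing base_uncertainty_scores from the sketch earlier on this page):

def combined_score(probs, weights=(1/3, 1/3, 1/3)):
    # Hypothetical blend; every term is oriented so LOWER = more uncertain,
    # matching the API's uncertainty_score convention.
    lc, margin, entropy = base_uncertainty_scores(probs)
    n_classes = np.asarray(probs).shape[1]
    w1, w2, w3 = weights
    return (w1 * (1 - lc)                              # max-class probability
            + w2 * margin                              # top-1/top-2 gap
            + w3 * (1 - entropy / np.log(n_classes)))  # inverted, normalised entropy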

Does this work without a trained model?

Yes — you can use any of mSightFlow's hosted models (YOLO, Grounding DINO) to score the queue. The closer the scoring model matches your eventual deployment model, the better the labels-per-improvement ratio. For best results: train a baseline model first, score with it, label the top N, retrain.

How do I avoid bias toward 'hard' images?

Uncertainty sampling can over-prioritise truly impossible cases (motion-blurred, occluded, weird angles). Best practice: pair with diversity sampling — pick the top N uncertain *and* visually diverse images using CLIP embeddings. mSightFlow's score-batch endpoint accepts a `strategy=diverse_uncertainty` flag that does this combination.
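
The exact combination algorithm isn't documented; a common approach is greedy selection that trades raw uncertainty against embedding-space novelty. A sketch under that assumption (select_diverse_uncertain and its inputs are illustrative, not part of the API):

import numpy as np

def select_diverse_uncertain(embeddings, uncertainty, n=50, diversity_weight=0.4):
    # embeddings: (N, d) CLIP embeddings; uncertainty: (N,), higher = more
    # uncertain in this sketch. Greedily picks items that are both uncertain
    # and far (in cosine distance) from everything already selected.
    emb = np.asarray(embeddings, dtype=float)
    emb /= np.linalg.norm(emb, axis=1, keepdims=True)
    unc = np.asarray(uncertainty, dtype=float)
    selected = [int(unc.argmax())]                   # seed with the most uncertain
    while len(selected) < n:
        dist = 1.0 - emb @ emb[selected].T           # cosine distance to each pick
        novelty = dist.min(axis=1)                   # distance to the nearest pick
        score = (1 - diversity_weight) * unc + diversity_weight * novelty
        score[selected] = -np.inf                    # never re-pick
        selected.append(int(score.argmax()))
    return selected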

What does the response look like?

A JSON object whose queue array is sorted ascending by combined uncertainty_score, so the most uncertain images come first. Each item has image_id, uncertainty_score (0-1, lower = more uncertain), per-strategy scores, and a thumbnail URL for quick review. Take the top N for your next labelling sprint.
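
As an illustration (values invented; the "scores" key name for the per-strategy values is an assumption, the other fields follow the description above), one queue item might look like:

example_item = {
    "image_id": "img_0291",
    "uncertainty_score": 0.041,    # 0-1, lower = more uncertain
    "scores": {                    # per-strategy scores; key name assumed
        "lowest_confidence": 0.05,
        "margin_sampling": 0.03,
        "entropy": 0.12,
    },
    "thumbnail": "https://...",    # thumbnail URL for quick review
}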

Label the 50 images that matter.

300 free API calls / month. Uncertainty-sorted, diversity-weighted, closed-loop ready.