One click — three formats — webhook included

Export to COCO, YOLO, or Pascal VOC in one call.

Take your labelled dataset to any trainer in any framework. Auto-generated dataset.yaml, deterministic train/val/test split, optional webhook on completion, and a DatasetVersion snapshot for reproducibility. No format-conversion scripts.

Formats
COCO · YOLO · Pascal VOC
Inputs
project_id · format · split · webhook URL
Outputs
zipped dataset · dataset.yaml · webhook on completion
Speed
~5k images / min
Free quota
50 exports / month

Every CV team eventually writes its own COCO-to-YOLO converter, its own train/val/test splitter, its own dataset.yaml generator. Then they write the converter the other way. mSightFlow gives you all three formats out of one project state — pick by what your trainer needs, switch when you swap frameworks. Webhook + DatasetVersion means exports are CI-friendly and reproducible.

Three formats — pick by trainer

COCO JSON

  • Detection + segmentation + keypoints in one file
  • Universal — every modern trainer reads it
  • RLE-encoded masks for instance segmentation

YOLO TXT

  • One TXT per image with normalised bbox coords
  • Auto-generated dataset.yaml ready for Ultralytics
  • Polygon format for YOLOv8-seg

Pascal VOC XML

  • One XML per image (annotation_file)
  • Legacy support for older trainers
  • Detection only (no native mask format)
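The three formats store the same bounding box differently: COCO uses [x_min, y_min, width, height] in pixels, YOLO uses a normalised (x_center, y_center, width, height), and Pascal VOC uses (xmin, ymin, xmax, ymax) in pixels. The sketch below shows that arithmetic so you can sanity-check an export by hand. The function names are illustrative and not part of the mSightFlow API, and the sketch ignores the 1-based pixel convention that some VOC tooling applies.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def coco_to_yolo(box: Box, img_w: int, img_h: int) -> Box:
    """COCO [x_min, y_min, w, h] in pixels -> YOLO (cx, cy, w, h) normalised to 0..1."""
    x, y, w, h = box
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def yolo_to_coco(box: Box, img_w: int, img_h: int) -> Box:
    """YOLO (cx, cy, w, h) normalised -> COCO [x_min, y_min, w, h] in pixels."""
    cx, cy, w, h = box
    box_w, box_h = w * img_w, h * img_h
    return (cx * img_w - box_w / 2, cy * img_h - box_h / 2, box_w, box_h)


def coco_to_voc(box: Box) -> Box:
    """COCO [x_min, y_min, w, h] -> Pascal VOC (xmin, ymin, xmax, ymax) in pixels."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


# A 200x100 px box whose top-left corner is at (100, 50) in a 640x480 image.
box = (100.0, 50.0, 200.0, 100.0)
print(coco_to_yolo(box, 640, 480))
print(coco_to_voc(box))  # (100.0, 50.0, 300.0, 150.0)
```

Converting to YOLO and back should return the original pixel box up to floating-point rounding, which makes a quick round-trip check on a handful of exported labels straightforward.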

Code — export, cURL, webhook

Export — Python
import os, requests

# Synchronous GET — the response body is the dataset ZIP.
# split syntax is "train/val/test" as integers, e.g. "80/10/10".
resp = requests.get(
    "https://api.msightflow.ai/v1/projects/PROJECT_ID/export",
    headers={"Authorization": f"Bearer {os.environ['MSF_API_KEY']}"},
    params={"format": "yolo", "split": "80/10/10"},
    stream=True,
)
resp.raise_for_status()
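bytes_written = 0
with open("dataset.zip", "wb") as f:
    for chunk in resp.iter_content(chunk_size=1 << 20):
        f.write(chunk)
        bytes_written += len(chunk)
# resp.content is unavailable once a streamed body has been consumed, so count bytes while writing.
print("saved dataset.zip ·", bytes_written, "bytes")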
Export — cURL
# Same call as a one-liner — pipes the ZIP directly to disk.
curl -L \
  -H "Authorization: Bearer $MSF_API_KEY" \
  -o dataset.zip \
  "https://api.msightflow.ai/v1/projects/PROJECT_ID/export?format=yolo&split=80/10/10"
Webhook on completion
# Each export also fires a project-level "project.exported" webhook if the
# project has a webhook URL configured. Configure it once in project settings
# (or via the projects API); every subsequent export delivers a POST to the URL.

# Webhook payload (POST to your URL):
# {
#   "event": "project.exported",
#   "project_id": "PROJECT_ID",
#   "format": "yolo",
#   "image_count": 4231,
#   "version_id": "v3",
#   "user": "you@example.com"
# }
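On the receiving side, a small HTTP handler is enough to pick up the event in CI. The sketch below uses only the Python standard library and assumes the payload fields shown in the example above. The names `parse_export_event`, `ExportWebhookHandler`, and `make_server` are hypothetical, and no signature check is included because this page does not describe one.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Fields taken from the example payload; treat the exact set as an assumption.
REQUIRED_FIELDS = ("event", "project_id", "format", "image_count", "version_id")


def parse_export_event(body: bytes) -> dict:
    """Decode a webhook body and check that it looks like a project.exported event."""
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("payload is not a JSON object")
    missing = [key for key in REQUIRED_FIELDS if key not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if payload["event"] != "project.exported":
        raise ValueError(f"unexpected event: {payload['event']}")
    return payload


class ExportWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = parse_export_event(self.rfile.read(length))
        except ValueError:  # also covers malformed JSON and bad UTF-8
            self.send_error(400, "invalid payload")
            return
        print(f"export {event['version_id']} ready: "
              f"{event['image_count']} images ({event['format']})")
        self.send_response(204)  # acknowledge with an empty body
        self.end_headers()


def make_server(port: int = 8000) -> HTTPServer:
    """Build (but do not start) a server that listens for export webhooks."""
    return HTTPServer(("0.0.0.0", port), ExportWebhookHandler)
```

To try it locally, call `make_server(8000).serve_forever()` and point the project's webhook URL at that port through a tunnel of your choice.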
Generated dataset.yaml
# Example dataset.yaml generated for YOLO format
path: ./
train: images/train
val: images/val
test: images/test

# Classes
names:
  0: person
  1: hard_hat
  2: safety_vest

# Metadata
exported_at: '2026-05-14T13:42:00Z'
source: mSightFlow
project_id: 'PROJECT_ID'
version: 'v3'

Pricing — same as every other endpoint

Free

$0

  • 50 exports / month
  • All 3 formats
  • DatasetVersion snapshots
Start free

Pro

$29/mo

  • Unlimited exports
  • Custom output schema
Go Pro

FAQ

Which format should I use?

COCO if you have segmentation masks or keypoints: of the three formats, it's the only one that natively supports all annotation types. YOLO if you're training a YOLO-family model and want the fastest data loading. Pascal VOC if your downstream tool (legacy Detectron, MMDetection) expects it. For most projects, default to COCO, which every modern training pipeline can read.

Are train/val/test splits reproducible?

Yes. Splits use a deterministic hash of image_id seeded by the export request, so the same project + same split ratios always produce the same files. Add a seed parameter to vary the split for cross-validation experiments without changing the underlying ratios.
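To illustrate the idea, here is a minimal sketch of a hash-based split. It is not mSightFlow's actual implementation: the choice of SHA-256, the `seed:image_id` key layout, and the function names are all assumptions made for the example.

```python
import hashlib
from typing import Tuple


def parse_split(spec: str) -> Tuple[int, int, int]:
    """Turn a "train/val/test" string such as "80/10/10" into integers."""
    train, val, test = (int(part) for part in spec.split("/"))
    return train, val, test


def assign_split(image_id: str, spec: str = "80/10/10", seed: int = 0) -> str:
    """Map an image to train, val, or test using only its id, the ratios, and the seed."""
    train, val, test = parse_split(spec)
    digest = hashlib.sha256(f"{seed}:{image_id}".encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as a stable pseudo-random bucket number.
    bucket = int.from_bytes(digest[:8], "big") % (train + val + test)
    if bucket < train:
        return "train"
    if bucket < train + val:
        return "val"
    return "test"


# The same id, ratios, and seed always land in the same split.
print(assign_split("img_0001") == assign_split("img_0001"))  # True
```

Because each image is assigned independently, the resulting ratios are approximate rather than exact, and changing the seed reshuffles the assignment without changing the target ratios.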

What does dataset.yaml contain?

For YOLO format, the YAML has train/val/test paths (relative), class names (index → name), and number of classes. It drops straight into Ultralytics' train command: `yolo detect train data=dataset.yaml model=yolov8m.pt`. We also generate a README.md with class distribution + export timestamp.

Can I include only verified annotations?

Yes — pass include_unverified=false to filter out annotations flagged source=ai_generated that haven't been human-confirmed. Combined with the active-learning workflow, you can ship a high-quality export even from a mostly-auto-labelled project.
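The optional parameters can be combined with format and split in a single request. The helper below only builds the URL and makes no network call. The `seed` and `include_unverified` names come from the FAQ answers on this page, but sending them as query-string values on the export endpoint is an assumption, and `build_export_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.msightflow.ai/v1"


def build_export_url(
    project_id: str,
    fmt: str = "yolo",
    split: str = "80/10/10",
    seed: Optional[int] = None,
    include_unverified: Optional[bool] = None,
) -> str:
    """Assemble an export URL; optional parameters are omitted when left as None."""
    params = {"format": fmt, "split": split}
    if seed is not None:
        params["seed"] = str(seed)
    if include_unverified is not None:
        # Assumed wire format: lowercase "true" / "false".
        params["include_unverified"] = "true" if include_unverified else "false"
    # safe="/" keeps the split readable as 80/10/10 instead of 80%2F10%2F10.
    return f"{API_BASE}/projects/{project_id}/export?{urlencode(params, safe='/')}"


print(build_export_url("PROJECT_ID", include_unverified=False))
# https://api.msightflow.ai/v1/projects/PROJECT_ID/export?format=yolo&split=80/10/10&include_unverified=false
```

The returned string can be passed to `requests.get` or `curl` with the same Authorization header used in the export examples.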

Are exports versioned?

Each export creates a DatasetVersion snapshot — the project state at export time is preserved so you can re-download the same dataset later even if the project changes. Versions are listed in the project settings and can be tagged (v1.0, v1.1-augmented).

From project to dataset.zip, one call.

50 free exports / month. COCO, YOLO, Pascal VOC. Webhook + version snapshots.