Body keypoints, 17 joints, one API call.
Detect human bodies and 17 COCO keypoints (eyes, shoulders, elbows, wrists, hips, knees, ankles) in any image — multi-person, with skeleton overlay, in under a second.
- Model: YOLOv8-Pose (m)
- Inputs: JPG/PNG ≤ 25 MB
- Outputs: 17 keypoints/person · bbox · skeleton overlay
- Latency: ~150 ms p50
- Free quota: 300 calls / month
YOLOv8-Pose extends YOLOv8 detection with a keypoint regression head, giving you per-person bounding boxes and the 17 standard COCO keypoints in a single forward pass. Faster than two-stage top-down models, accurate enough for production sports analysis, ergonomics, AR avatars, and behavioural monitoring.
mSightFlow hosts YOLOv8m-Pose (medium-size checkpoint) by default with optional skeleton overlay rendering. For custom keypoint schemas — hands, faces, animals, mechanical parts — define your template via /skeletons/ and we'll render and export accordingly.
When pose estimation is the right tool
Sports & fitness
Form analysis, rep counting, joint-angle measurement, technique scoring. Build a fitness coach in an afternoon.
Workplace ergonomics
Posture monitoring on production lines, lifting-form risk scoring, fatigue-trigger detection. RULA / REBA pipelines.
AR & avatars
Body tracking for AR filters, avatar driving, motion-capture-lite. Combine with depth for 3D-aware avatars.
The 17 COCO keypoints
Default skeleton schema. Index, name, and the order they appear in the keypoints array of each person.
| Index | Keypoint | Index | Keypoint |
|---|---|---|---|
| 0 | nose | 9 | left_wrist |
| 1 | left_eye | 10 | right_wrist |
| 2 | right_eye | 11 | left_hip |
| 3 | left_ear | 12 | right_hip |
| 4 | right_ear | 13 | left_knee |
| 5 | left_shoulder | 14 | right_knee |
| 6 | right_shoulder | 15 | left_ankle |
| 7 | left_elbow | 16 | right_ankle |
| 8 | right_elbow | | |
For custom schemas (hands, faces, animals, mechanical), see the skeleton templates API.
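For convenience, the table above can be encoded as a plain lookup list in client code (a small sketch independent of the API itself):

```python
# COCO-17 keypoint names in array order (matches the table above).
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

def keypoint_index(name: str) -> int:
    """Return the COCO array index for a keypoint name."""
    return COCO_KEYPOINTS.index(name)

print(keypoint_index("left_wrist"))  # 9
```

Indexing by name instead of hard-coded integers keeps downstream code readable and resistant to off-by-one mistakes.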
Code — Python, Node, cURL
```python
import os, requests
from pathlib import Path

resp = requests.post(
    "https://api.msightflow.ai/v1/pose",
    headers={"Authorization": f"Bearer {os.environ['MSF_API_KEY']}"},
    files={"image": Path("athlete.jpg").read_bytes()},
    data={"return_overlay": "true"},
)
for p in resp.json()["persons"]:
    nose = p["keypoints"][0]  # COCO index 0 = nose
    l_wrist, r_wrist = p["keypoints"][9], p["keypoints"][10]
    print(f"person bbox={p['box']}, nose={nose}, wrists={l_wrist}, {r_wrist}")
```
```javascript
import fetch from "node-fetch";
import FormData from "form-data";
import fs from "fs";

const form = new FormData();
form.append("image", fs.createReadStream("athlete.jpg"));
form.append("return_overlay", "true");

const resp = await fetch("https://api.msightflow.ai/v1/pose", {
  method: "POST",
  headers: { Authorization: `Bearer ${process.env.MSF_API_KEY}` },
  body: form,
});
const { persons } = await resp.json();
console.log(`${persons.length} persons, total keypoints: ${persons.reduce((n, p) => n + p.keypoints.length, 0)}`);
```
```bash
curl -X POST https://api.msightflow.ai/v1/pose \
  -H "Authorization: Bearer $MSF_API_KEY" \
  -F "image=@athlete.jpg" \
  -F "return_overlay=true"
```
```python
# Compute elbow angle for form analysis (rep counting, ergonomics).
import numpy as np

def angle(a, b, c):
    ba, bc = np.array(a) - np.array(b), np.array(c) - np.array(b)
    cos = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return float(np.degrees(np.arccos(np.clip(cos, -1, 1))))

person = resp.json()["persons"][0]
kp = person["keypoints"]
# COCO indices: 5 left_shoulder, 7 left_elbow, 9 left_wrist
l_elbow_angle = angle(kp[5][:2], kp[7][:2], kp[9][:2])
print(f"left elbow flexion: {l_elbow_angle:.1f}°")
```
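Building on the elbow angle above, a rep counter only needs hysteresis between a flexed and an extended threshold. This is a sketch; the 60°/160° thresholds are illustrative defaults, not API values:

```python
class RepCounter:
    """Count reps from a stream of joint angles using hysteresis.

    One rep = extended (angle > up_thresh) -> flexed (angle < down_thresh)
    -> extended again. Hysteresis prevents jitter near a single threshold
    from registering phantom reps.
    """
    def __init__(self, down_thresh=60.0, up_thresh=160.0):
        self.down, self.up = down_thresh, up_thresh
        self.flexed = False
        self.reps = 0

    def update(self, angle_deg: float) -> int:
        if not self.flexed and angle_deg < self.down:
            self.flexed = True            # reached the bottom of the curl
        elif self.flexed and angle_deg > self.up:
            self.flexed = False           # back to full extension: rep done
            self.reps += 1
        return self.reps

counter = RepCounter()
for a in [170, 120, 50, 90, 165, 40, 170]:  # two simulated curls
    counter.update(a)
print(counter.reps)  # 2
```

Feed it the per-frame elbow angle from the snippet above to count curls in a video or webcam stream.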
Pricing — same as every other endpoint
Free
$0
- 300 API calls / month
- Multi-person, COCO schema
- Skeleton overlay PNG
- No credit card
Standard
$10/mo
- 5,400 API calls / month
- Custom skeleton templates
- Batch up to 10 images / call
Related features
Depth estimation
Combine 17 keypoints with per-pixel depth for 3D-aware pose, AR avatars, ergonomics scoring.
Object detection
Detect people, equipment, and PPE alongside pose for full-scene workplace and sports analytics.
SAM segmentation
Pixel-level body masks (silhouettes) for compositing, background removal, AR effects.
FAQ
Which keypoints are returned?
The 17 standard COCO keypoints: nose, left/right eye, left/right ear, left/right shoulder, left/right elbow, left/right wrist, left/right hip, left/right knee, left/right ankle. Each comes with an (x, y) image coordinate and a confidence score between 0 and 1.
What if I need keypoints for hands, faces, or animals?
Use /features/pose-estimation/skeletons to define a custom keypoint schema. mSightFlow ships templates for COCO Person (17), MediaPipe Hand (21), MediaPipe Face Mesh (468), and Animal Pose (24). For brand-new schemas, contact us — we can host a fine-tuned checkpoint on the Pro tier.
How are multiple people handled?
The response is an array of detected persons, each with its own bbox, 17 keypoints, and confidence. The model handles occlusion by down-weighting confidence; expect lower per-keypoint confidence for partially visible bodies.
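A common pattern is to drop low-confidence keypoints before downstream geometry. This sketch assumes each keypoint is an `[x, y, conf]` triple, as in the Python example above; adapt it if your response schema differs:

```python
def visible_keypoints(person, min_conf=0.5):
    """Return {index: (x, y)} for keypoints above a confidence threshold.

    Assumes each keypoint is an [x, y, conf] triple (an assumption based
    on the examples in this page, not a schema guarantee).
    """
    return {
        i: (x, y)
        for i, (x, y, conf) in enumerate(person["keypoints"])
        if conf >= min_conf
    }

# Toy person with three keypoints; the middle one is occluded (low conf).
person = {"keypoints": [[100, 50, 0.9], [110, 45, 0.2], [90, 45, 0.8]]}
print(visible_keypoints(person))  # {0: (100, 50), 2: (90, 45)}
```

Thresholding first means joint-angle code like the elbow example never runs on occluded, unreliable points.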
How accurate is YOLOv8-Pose?
On COCO Person Keypoints, YOLOv8m-Pose hits ~64 mAP — strong for a single-stage model. Faster than top-down models like HRNet, less accurate than the largest two-stage models. Best balance for real-time and batch processing.
Can I use this for video?
Yes — submit a video via /v1/video/upload (tier-gated by length: 8 s free / 15 s Standard / unlimited Pro). The pose model runs on sampled frames, returning per-frame keypoints. For continuous streams the streaming endpoint is on the roadmap.
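Per-frame keypoints from sampled video frames tend to jitter, so a light temporal smoother is often worth adding client-side. A minimal exponential-smoothing sketch, assuming you have collected one person's keypoints as a list of (17, 2) coordinate arrays per frame (the exact per-frame response shape may differ):

```python
import numpy as np

def smooth_keypoints(frames, alpha=0.6):
    """Exponentially smooth (x, y) keypoints across video frames.

    `frames`: list of (17, 2) arrays for one tracked person (an assumed
    shape -- adapt to the actual per-frame response). Higher alpha tracks
    motion faster; lower alpha suppresses jitter more.
    """
    smoothed, state = [], None
    for kp in frames:
        kp = np.asarray(kp, dtype=float)
        state = kp if state is None else alpha * kp + (1 - alpha) * state
        smoothed.append(state)
    return smoothed

frames = [np.zeros((17, 2)), np.ones((17, 2))]
out = smooth_keypoints(frames)
print(out[1][0])  # [0.6 0.6]
```

For multi-person video you would first need to associate detections across frames (e.g. by bbox IoU) before smoothing each track independently.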
17 joints, every person, one call.
300 free API calls / month. YOLOv8-Pose. Custom schemas on Pro.