YOLOv8-Pose — multi-person, real-time

Body keypoints, 17 joints, one API call.

Detect human bodies and 17 COCO keypoints (eyes, shoulders, elbows, wrists, hips, knees, ankles) in any image — multi-person, with skeleton overlay, in under a second.

Model: YOLOv8-Pose (m)
Inputs: JPG/PNG ≤ 25 MB
Outputs: 17 keypoints/person · bbox · skeleton overlay
Latency: ~150 ms p50
Free quota: 300 calls / month

YOLOv8-Pose extends YOLOv8 detection with a keypoint regression head, giving you per-person bounding boxes and the 17 standard COCO keypoints in a single forward pass. Faster than two-stage top-down models, accurate enough for production sports analysis, ergonomics, AR avatars, and behavioural monitoring.

mSightFlow hosts YOLOv8m-Pose (medium-size checkpoint) by default with optional skeleton overlay rendering. For custom keypoint schemas — hands, faces, animals, mechanical parts — define your template via /skeletons/ and we'll render and export accordingly.

When pose estimation is the right tool

Sports & fitness

Form analysis, rep counting, joint-angle measurement, technique scoring. Build a fitness coach in an afternoon.

Workplace ergonomics

Posture monitoring on production lines, lifting-form risk scoring, fatigue-trigger detection. RULA / REBA pipelines.

AR & avatars

Body tracking for AR filters, avatar driving, motion-capture-lite. Combine with depth for 3D-aware avatars.

The 17 COCO keypoints

Default skeleton schema. The indices below match the order keypoints appear in each person's keypoints array.

Index  Keypoint          Index  Keypoint
0      nose              9      left_wrist
1      left_eye          10     right_wrist
2      right_eye         11     left_hip
3      left_ear          12     right_hip
4      right_ear         13     left_knee
5      left_shoulder     14     right_knee
6      right_shoulder    15     left_ankle
7      left_elbow        16     right_ankle
8      right_elbow

For custom schemas (hands, faces, animals, mechanical), see the skeleton templates API.
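In code it helps to have the schema as a constant rather than magic indices. A minimal sketch — the names mirror the table above; the list itself is client-side convenience, not part of the API response:

```python
# COCO-17 keypoint names, in the order they appear in each person's
# "keypoints" array (position in this list == COCO index).
COCO_KEYPOINTS = [
    "nose",
    "left_eye", "right_eye",
    "left_ear", "right_ear",
    "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
    "left_hip", "right_hip",
    "left_knee", "right_knee",
    "left_ankle", "right_ankle",
]

# Reverse lookup: name -> index, e.g. KP_INDEX["left_wrist"] == 9
KP_INDEX = {name: i for i, name in enumerate(COCO_KEYPOINTS)}
```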

Code — Python, Node, cURL

Python
import os, requests
from pathlib import Path

resp = requests.post(
    "https://api.msightflow.ai/v1/pose",
    headers={"Authorization": f"Bearer {os.environ['MSF_API_KEY']}"},
    # Tuple form keeps the filename and content type in the multipart upload.
    files={"image": ("athlete.jpg", Path("athlete.jpg").read_bytes(), "image/jpeg")},
    data={"return_overlay": "true"},
)
resp.raise_for_status()
for p in resp.json()["persons"]:
    nose = p["keypoints"][0]            # COCO index 0 = nose
    l_wrist, r_wrist = p["keypoints"][9], p["keypoints"][10]
    print(f"person bbox={p['box']}, nose={nose}, wrists={l_wrist}, {r_wrist}")
Node.js
import fetch from "node-fetch";
import FormData from "form-data";
import fs from "fs";

const form = new FormData();
form.append("image", fs.createReadStream("athlete.jpg"));
form.append("return_overlay", "true");

const resp = await fetch("https://api.msightflow.ai/v1/pose", {
  method: "POST",
  headers: { Authorization: `Bearer ${process.env.MSF_API_KEY}` },
  body: form,
});
const { persons } = await resp.json();
console.log(`${persons.length} persons, total keypoints: ${persons.reduce((n,p)=>n+p.keypoints.length,0)}`);
cURL
curl -X POST https://api.msightflow.ai/v1/pose \
  -H "Authorization: Bearer $MSF_API_KEY" \
  -F "image=@athlete.jpg" \
  -F "return_overlay=true"
Joint angles for form analysis
# Compute elbow angle for form analysis (rep counting, ergonomics).
import numpy as np

def angle(a, b, c):
    ba, bc = np.array(a) - np.array(b), np.array(c) - np.array(b)
    cos = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return float(np.degrees(np.arccos(np.clip(cos, -1, 1))))

person = resp.json()["persons"][0]   # resp from the Python example above
kp = person["keypoints"]
# COCO indices: 5 left-shoulder, 7 left-elbow, 9 left-wrist
l_elbow_angle = angle(kp[5][:2], kp[7][:2], kp[9][:2])
print(f"left elbow flexion: {l_elbow_angle:.1f}°")
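For rep counting, track that angle through flexion and extension with hysteresis, so jitter near a single threshold doesn't double-count. A sketch building on the angle stream above — the thresholds are illustrative defaults, not API values:

```python
class RepCounter:
    """Count reps from a stream of joint angles (degrees) using hysteresis:
    a rep completes on the transition flexed (< low) -> extended (> high)."""
    def __init__(self, low=70.0, high=160.0):
        self.low, self.high = low, high
        self.flexed = False
        self.reps = 0

    def update(self, angle_deg):
        if angle_deg < self.low:
            self.flexed = True
        elif angle_deg > self.high and self.flexed:
            self.flexed = False
            self.reps += 1
        return self.reps
```

Call `update()` once per frame with the elbow (or knee, hip) angle; the two-threshold band means a rep only counts after a full excursion.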

Pricing — same as every other endpoint

Free

$0

  • 300 API calls / month
  • Multi-person, COCO schema
  • Skeleton overlay PNG
  • No credit card
Start free

Pro

$29/mo

  • Unlimited calls
  • Higher per-provider quotas
Go Pro

FAQ

Which keypoints are returned?

The 17 standard COCO keypoints: nose, left/right eye, left/right ear, left/right shoulder, left/right elbow, left/right wrist, left/right hip, left/right knee, left/right ankle. Each comes with an (x, y) image coordinate and a confidence score in [0, 1].

What if I need keypoints for hands, faces, or animals?

Use /features/pose-estimation/skeletons to define a custom keypoint schema. mSightFlow ships templates for COCO Person (17), MediaPipe Hand (21), MediaPipe Face Mesh (468), and Animal Pose (24). For brand-new schemas, contact us — we can host a fine-tuned checkpoint on the Pro tier.

Multi-person handling?

Yes — the response is an array of detected persons, each with their own bbox + 17 keypoints + confidence. The model handles occlusion via confidence-down-weighting; expect lower per-keypoint confidence for partially-visible bodies.
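In practice you often want to drop low-confidence joints before doing downstream math on them. A minimal sketch, assuming each keypoint is `[x, y, confidence]` as elsewhere on this page:

```python
def reliable_keypoints(person, min_conf=0.5):
    """Map keypoint index -> (x, y) for joints at or above min_conf,
    so occluded joints are simply absent rather than wildly wrong."""
    return {
        i: (x, y)
        for i, (x, y, conf) in enumerate(person["keypoints"])
        if conf >= min_conf
    }
```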

How accurate is YOLOv8-Pose?

On COCO Person Keypoints, YOLOv8m-Pose hits ~64 mAP — strong for a single-stage model. Faster than top-down models like HRNet, less accurate than the largest two-stage models. Best balance for real-time and batch processing.

Can I use this for video?

Yes — submit a video via /v1/video/upload (tier-gated by length: 8 s free / 15 s Standard / unlimited Pro). The pose model runs on sampled frames, returning per-frame keypoints. For continuous streams the streaming endpoint is on the roadmap.
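Per-frame keypoints jitter; an exponential moving average across frames is a cheap stabiliser for overlays and angle traces. A sketch, assuming frames arrive as lists of `[x, y, confidence]` in a fixed keypoint order:

```python
def ema_smooth(frames, alpha=0.4):
    """Exponentially smooth (x, y) per keypoint across frames.
    alpha near 1 follows the raw signal; near 0 smooths heavily.
    Confidence values are passed through unchanged."""
    smoothed, prev = [], None
    for frame in frames:
        if prev is None:
            cur = [list(kp) for kp in frame]
        else:
            cur = [
                [alpha * x + (1 - alpha) * px,
                 alpha * y + (1 - alpha) * py,
                 c]
                for (x, y, c), (px, py, _) in zip(frame, prev)
            ]
        smoothed.append(cur)
        prev = cur
    return smoothed
```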

17 joints, every person, one call.

300 free API calls / month. YOLOv8-Pose. Custom schemas on Pro.