This repository contains Python scripts to:
- extract frames from local videos,
- run YOLO live detection on webcam frames,
- expose video streams and detection/tracking data through Flask HTTP endpoints.
Requirements:
- Python 3.8+
- A trained YOLO model (`best.pt`) compatible with your classes (apples, pears)
- Webcam connected to the Jetson device (or host machine)
Install dependencies:

```bash
pip3 install ultralytics opencv-python flask
```

Note: on Jetson, OpenCV and CUDA/TensorRT are often installed differently. Keep the package versions aligned with your JetPack setup.
Each script has inline config variables near the top, including:
- `MODEL_PATH`
- camera index and frame size
- inference settings (`imgsz`, confidence, device)
- optional tracker config (`bytetrack.yaml` or `botsort.yaml`)

Update `MODEL_PATH` in each script before running.
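As a sketch, that top-of-file configuration block might look like this (every name except `MODEL_PATH`, `imgsz`, `bytetrack.yaml`, and `botsort.yaml` is an illustrative assumption, not necessarily the scripts' exact variable names):

```python
# --- inline configuration, edit before running ---
MODEL_PATH = "best.pt"             # trained weights for the apples/pears classes
CAMERA_INDEX = 0                   # e.g. /dev/video0 on the Jetson
FRAME_WIDTH, FRAME_HEIGHT = 640, 480
IMGSZ = 320                        # inference resolution
CONF_THRESHOLD = 0.5               # minimum confidence to keep a detection
DEVICE = 0                         # 0 = first CUDA GPU, "cpu" for CPU inference
TRACKER_CONFIG = "bytetrack.yaml"  # or "botsort.yaml"
```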
Create an "Object Detection" Project in Roboflow
Once you have downloaded the latest YOLO11n from the Ultralytics GitHub (the n stands for nano, the lightest variant), you can launch the following command to have the Jetson train the model.
```bash
yolo train model=yolo11n.pt data=./data.yaml epochs=100 imgsz=320 batch=2 workers=0 device=0
```

- `model=yolo11n.pt` → the pre-trained starting weights; `yolo11n` = nano (light), `.pt` = pre-trained
- `data=./data.yaml` → tells the trainer: Where are the images? Where are the labels? How many classes are there? What are they called?
- `epochs=100` → how many times the model processes the entire dataset; 1 epoch means all images have been seen once, so 100 epochs = 100 passes
- `imgsz=320` → images are resized to 320 × 320
- `batch=2` → 2 images at a time
- `workers=0` → all data loading in one thread (slower but more stable)
- `device=0` → use the CUDA GPU; running on the CPU instead would take far more time
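The `data.yaml` passed to the trainer answers exactly those dataset questions. A minimal example for this project's two classes might look like the following (the paths are assumptions about your dataset layout, not the repo's actual file):

```yaml
# data.yaml - dataset description consumed by `yolo train`
path: .                    # dataset root
train: images/train        # where are the training images (labels are looked up alongside)
val: images/val            # where are the validation images
nc: 2                      # how many classes
names: ["apples", "pears"] # what are they called
```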
Extracts evenly distributed frames from videos inside ./videos and saves JPEGs in ./frames_out.
Run:
```bash
python3 extract_frames.py
```

What it does:
- scans `./videos` for supported formats (`.mp4`, `.mov`, `.m4v`, `.avi`, `.mkv`)
- extracts `FRAMES_PER_VIDEO` frames per video (or fewer if the video is short)
- optionally resizes images while preserving the aspect ratio (`RESIZE_LONG_SIDE`)
HTTP endpoints: none
JSON output: none (console logs only)
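The even-sampling step can be sketched as follows; the function names are illustrative assumptions, and only the index computation is implied by the description above (OpenCV's frame seeking does the rest):

```python
import numpy as np

def evenly_spaced_indices(total_frames, n):
    """Indices of up to n evenly spaced frames (fewer if the video is short)."""
    n = min(n, total_frames)
    if n <= 0:
        return []
    return np.linspace(0, total_frames - 1, n, dtype=int).tolist()

def extract_frames(video_path, n_frames=10):
    import cv2  # imported lazily so the sampling helper stays dependency-light
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for idx in evenly_spaced_indices(total, n_frames):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # seek to the sampled frame
        ok, frame = cap.read()
        if ok:
            frames.append((idx, frame))
    cap.release()
    return frames
```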
Runs YOLO detection from webcam and shows an annotated OpenCV window.
Run:
```bash
python3 live_detect.py
```

Controls:
- press `q` to quit
HTTP endpoints: none
JSON output: none (local window stream only)
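A minimal sketch of such a display loop, assuming an Ultralytics model object (the function and window names are assumptions, not the script's actual code):

```python
def is_quit(keycode, quit_key="q"):
    """True when the masked OpenCV keycode matches the quit key."""
    return (keycode & 0xFF) == ord(quit_key)

def run_display_loop(model, camera_index=0):
    import cv2  # requires opencv-python
    cap = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            results = model(frame)          # one inference pass per frame
            annotated = results[0].plot()   # Ultralytics draws the boxes
            cv2.imshow("YOLO live", annotated)
            if is_quit(cv2.waitKey(1)):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```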
Runs YOLO detection from webcam and publishes an MJPEG stream via Flask.
Run:
```bash
python3 live_detect_web.py
```

Default server:
- `http://<JETSON_IP>:5000/` - simple HTML page with embedded stream
- `http://<JETSON_IP>:5000/video_feed` - MJPEG stream endpoint

- `GET /` - returns a minimal HTML page showing the stream
- `GET /video_feed` - returns `multipart/x-mixed-replace` MJPEG frames
JSON output: none
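The `multipart/x-mixed-replace` framing behind `/video_feed` can be sketched as a plain generator (the boundary token `frame` is an assumption about the script; any token works as long as the Response mimetype declares the same one):

```python
BOUNDARY = b"--frame\r\nContent-Type: image/jpeg\r\n\r\n"

def mjpeg_stream(jpeg_frames):
    """Wrap an iterable of JPEG-encoded frames in MJPEG multipart framing.

    In the Flask app the result would typically be returned as
    Response(mjpeg_stream(...), mimetype="multipart/x-mixed-replace; boundary=frame"),
    so the browser replaces the displayed image each time a new part arrives.
    """
    for jpeg in jpeg_frames:
        yield BOUNDARY + jpeg + b"\r\n"
```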
Runs YOLO detection, overlays live counters on the frame, streams MJPEG, and exposes current counts as JSON.
Run:
```bash
python3 live_detect_count_web.py
```

Default server:
- `http://<JETSON_IP>:5000/`
- `http://<JETSON_IP>:5000/video_feed`
- `http://<JETSON_IP>:5000/counts`

- `GET /` - HTML page with stream preview and a link to `/counts`
- `GET /video_feed` - MJPEG stream with bounding boxes and overlay text
- `GET /counts` - latest detection counters and runtime metadata in JSON
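The aggregation behind `/counts` can be sketched like this (the helper name and input shapes are assumptions; in the real script the class IDs would come from `results[0].boxes.cls` and the names from `model.names`):

```python
from collections import Counter

def count_detections(class_ids, names):
    """Build the per-class counters served as JSON by /counts."""
    counts = Counter(names[int(c)] for c in class_ids)
    payload = dict(counts)
    payload["total"] = sum(counts.values())
    return payload
```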
Example `/counts` response:

```json
{
  "apples": 2,
  "pears": 1,
  "total": 3,
  "timestamp": 1776846031.52,
  "fps": 14.6
}
```

Runs YOLO tracking (`model.track`) to avoid double counting by using persistent object IDs.
Exposes both aggregate counters and per-object tracking data.
Run:
```bash
python3 live_track_count_web.py
```

Default server:
- `http://<JETSON_IP>:5000/`
- `http://<JETSON_IP>:5000/video_feed`
- `http://<JETSON_IP>:5000/counts`
- `http://<JETSON_IP>:5000/objects`
- `http://<JETSON_IP>:5000/reset_counts` (GET or POST)
- `GET /` - HTML page with stream and links
- `GET /video_feed` - MJPEG stream with boxes, IDs, and counters overlay
- `GET /counts` - aggregated unique/visible counts and telemetry JSON
- `GET /objects` - list of currently visible tracked objects (ID/class/confidence/bbox/center)
- `GET /reset_counts` / `POST /reset_counts` - clears unique counted IDs for apples and pears
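The double-counting protection can be sketched as one set of already-seen track IDs per class (the names here are illustrative; the real script gets its IDs from `model.track` results):

```python
def update_unique_counts(seen_ids, frame_tracks):
    """Record new track IDs and return cumulative unique counts per class.

    seen_ids: dict mapping class name -> set of IDs counted so far (mutated).
    frame_tracks: list of (track_id, class_name) visible in the current frame.
    A re-appearing ID lands in a set that already contains it, so it is
    never counted twice.
    """
    for track_id, class_name in frame_tracks:
        seen_ids.setdefault(class_name, set()).add(track_id)
    return {f"unique_{name}": len(ids) for name, ids in seen_ids.items()}
```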
Example `/counts` response:

```json
{
  "unique_apples": 6,
  "unique_pears": 4,
  "unique_total": 10,
  "visible_apples": 2,
  "visible_pears": 1,
  "visible_total": 3,
  "tracked_ids_total": 10,
  "timestamp": 1776846031.52,
  "fps": 13.9
}
```

Example `/objects` response:

```json
[
  {
    "id": 12,
    "class_name": "apples",
    "confidence": 0.9134,
    "bbox": [101, 55, 188, 140],
    "center": [144, 97]
  },
  {
    "id": 21,
    "class_name": "pears",
    "confidence": 0.8761,
    "bbox": [260, 82, 320, 170],
    "center": [290, 126]
  }
]
```

Example `/reset_counts` response:

```json
{
  "status": "ok",
  "message": "Counts reset"
}
```
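The reset behaviour can be sketched as clearing those per-class ID sets (the helper name is an assumption; the returned JSON matches the response above):

```python
def reset_counts(seen_ids):
    """Clear all already-counted track IDs, as the /reset_counts endpoint does."""
    for ids in seen_ids.values():
        ids.clear()  # keep the class keys, drop the counted IDs
    return {"status": "ok", "message": "Counts reset"}
```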