Small, focused scripts for preparing YOLO detection/pose datasets: annotation generation, label cleanup/merging, mosaics, and image preprocessing.

- Current Working Directory (CWD): where your `images/` and `labels/` live.
- Script Directory: where the scripts, their `.md` docs, and `.pt` models live.
- Most tools read/write relative to CWD unless a path is provided.
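
The CWD versus Script Directory split can be sketched as below. This is only an illustration of the convention described above, not code taken from any of the scripts; the helper name `resolve_dirs` and its `data_path` parameter are hypothetical.

```python
from pathlib import Path


def resolve_dirs(script_file, data_path=None):
    """Hypothetical helper: locate the data folders and the script folder.

    script_file: path of the running script (typically __file__).
    data_path:   optional explicit dataset root; falls back to the CWD.
    """
    # Script Directory: where the scripts, their .md docs, and .pt models live.
    script_dir = Path(script_file).resolve().parent
    # Working directory: an explicit path wins, otherwise use the CWD.
    work_dir = Path(data_path).resolve() if data_path else Path.cwd()
    return {
        "images": work_dir / "images",
        "labels": work_dir / "labels",
        "script_dir": script_dir,
    }
```

Inside a script this would be called as `resolve_dirs(__file__)`, so data is read from wherever the command is run while models are still found next to the script.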

- `annotate_images.py`: Run a YOLO pose model on images and write pose labels, with optional flipped inference.
- `cleanup_labels.py`: Replace spaces with underscores in `images/` and `labels/`, move orphan labels to `labels-x`, and create empty labels for unlabeled images.
- `correct_keypoints.py`: Normalize person keypoint visibility flags and zero coordinates when invisible.
- `correct_face_keypoints.py`: Merge improved face keypoints into pose labels with weighted blending and deviation stats.
- `add_face_keypoints.py`: Add face keypoints from face labels into base pose labels using closest face-area matching (or nose matching with `--nose`).
- `correct_mpii_keypoints.py`: Replace selected COCO keypoints with MPII pose keypoints using bbox-based matching.
- `predict_pose_optical_flow.py`: Predict pose keypoints for a target indexed image/range by tracking from previous labeled frames with pyramidal Lucas-Kanade optical flow, reverse back-check, and optional multi-frame fusion to reduce drift.
- `crop_portrait_square_yolo.py`: Face-centered square crops of portraits using YOLO pose keypoints, with optional resize/rotate/ratio/flip/debug.
- `crop_detected_objects_yolo.py`: Crop images to keep all YOLO-detected boxes and visible keypoints with configurable pixel boundary, then rewrite labels.
- `crop_to_annotations_yolo.py`: Crop images and YOLO labels around all boxes and keypoints with padded aspect ratio selection and prefixed outputs.
- `download_google_images.py`: Download full-resolution images from Google or Yandex Images search URLs, save as JPEGs, and optionally resize.
- `download_videos_yt_dlp.py`: Download TikTok/YouTube videos from `urls.txt` (or a single CLI URL) in best available quality using yt-dlp.
- `extend_flip_yolo.py`: Extend images with a flipped duplicate and update bounding boxes/keypoints.
- `extract_video_frames.py`: Extract frames from all videos in CWD into per-video folders, with optional frame skipping.
- `delete_similar_frames.py`: Delete near-duplicate frames by comparing each image to the previous one.
- `extract_tfrecord_images.py`: Extract JPEG images from TFRecord files with progress and stats.
- `merge_datasets.py`: Merge multiple datasets into a unified train/val layout and write `content.md` counts.
- `merge_pose_results.py`: Merge body and face pose labels, refining face points.
- `mosaic_self_yolo.py`: Build self-mosaics and rewrite YOLO detection/pose labels.
- `mosaic_yolo.py`: Build multi-image mosaics with optional flip/rotate and merged labels.
- `optimize_dataset_tiles_yolo.py`: Tile images by size/aspect into mosaics, rewrite YOLO labels, and copy/rescale remaining images.
- `resize_images.py`: Resize images and convert formats to JPEG by default with progress and stats.
- `sam3.py`: Run SAM3 text prompts on one image and export YOLO bbox labels.
- `rename_images_labels.py`: Rename images with matching labels using a pattern and update label filenames.
- `rotate_head_tilt_yolo.py`: Rotate portraits based on head tilt and update pose labels.
- `rotate_images_labels.py`: Rotate images to fixed angles and update labels, supporting YOLO detection and pose.
- `yolo_pose_to_coco_json.py`: Convert YOLO11-pose labels into COCO JSON files for train/val splits.
- `visualize-pose.py`: Overlay YOLO pose keypoints and boxes onto images for quick inspection.

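
As a rough illustration of the kind of work these tools do, the three steps listed for `cleanup_labels.py` might look like the sketch below. It is not the script's actual implementation: the function name `cleanup_dataset`, the `IMAGE_EXTS` set, and the assumption that labels are `.txt` files sharing the image's stem are all illustrative, and filename collisions after renaming are not handled.

```python
from pathlib import Path
import shutil

# Assumed image extensions; the real script may accept a different set.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def cleanup_dataset(root):
    """Hypothetical sketch of the cleanup steps, run against a dataset root."""
    root = Path(root)
    images = root / "images"
    labels = root / "labels"
    orphans = root / "labels-x"

    # 1. Replace spaces with underscores in file names in both folders.
    for folder in (images, labels):
        for path in list(folder.iterdir()):
            if path.is_file() and " " in path.name:
                path.rename(path.with_name(path.name.replace(" ", "_")))

    image_stems = {p.stem for p in images.iterdir() if p.suffix.lower() in IMAGE_EXTS}
    label_stems = {p.stem for p in labels.glob("*.txt")}

    # 2. Move labels that have no matching image into labels-x.
    for stem in sorted(label_stems - image_stems):
        orphans.mkdir(exist_ok=True)
        shutil.move(str(labels / f"{stem}.txt"), str(orphans / f"{stem}.txt"))

    # 3. Create an empty label file for every image that has none.
    for stem in sorted(image_stems - label_stems):
        (labels / f"{stem}.txt").touch()
```

Calling `cleanup_dataset(Path.cwd())` would apply the steps to the `images/` and `labels/` folders in the current working directory.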
Each script has a matching .md file in this directory with full usage and
arguments.