# PySceneDetect

PySceneDetect is an open-source Python library and command-line tool for automatically detecting scene changes (cuts and fades) in video files. It analyzes video frame-by-frame and produces a list of timecoded scene boundaries that can be used to split the video into individual clips, export scene data in various formats (CSV, HTML, EDL, FCPXML, OTIO), or drive custom downstream processing pipelines. The library supports multiple video backends (OpenCV, PyAV, MoviePy) and requires only OpenCV as a mandatory dependency.

The core functionality centers on a `SceneManager` that coordinates one or more `SceneDetector` instances against a `VideoStream`. Five built-in detectors cover the common cases: `ContentDetector` for fast cuts via HSV color-space differencing, `AdaptiveDetector` for the same scoring with a rolling-average threshold, `ThresholdDetector` for fade-in/fade-out detection, `HistogramDetector` for YUV histogram comparison, and `HashDetector` for perceptual-hash–based detection. Results are returned as lists of `FrameTimecode` pairs and can be exported or fed directly into ffmpeg/mkvmerge for video splitting.

---

## `detect` — One-call scene detection

High-level convenience function that opens a video, runs a detector from start to end (or a specified window), and returns a `SceneList`. Ideal for simple scripts where manual `SceneManager` control is not needed.
```python
from scenedetect import detect, ContentDetector

# Detect all fast cuts in a video file (default threshold = 27.0)
scenes = detect("video.mp4", ContentDetector())
for i, (start, end) in enumerate(scenes):
    print(f"Scene {i+1}: {start.get_timecode()} -> {end.get_timecode()} "
          f"({end.frame_num - start.frame_num} frames)")

# Detect only within a time window and save per-frame stats for tuning
scenes = detect(
    "video.mp4",
    ContentDetector(threshold=30.0),
    stats_file_path="video.stats.csv",
    show_progress=True,
    start_time="00:00:10",  # HH:MM:SS string
    end_time=120.0,         # seconds as float
    start_in_scene=True,    # treat the first frame as inside a scene
)
# Expected output: list of (FrameTimecode, FrameTimecode) tuples
# e.g. [(<00:00:10.000>, <00:00:15.040>), (<00:00:15.040>, <00:01:59.960>)]
```

---

## `open_video` — Open a video with a specific backend

Opens a video file (or URL) and returns a `VideoStream` object suitable for passing to `SceneManager.detect_scenes()`. Falls back to OpenCV if the requested backend is unavailable or fails.

```python
from scenedetect import open_video, AVAILABLE_BACKENDS

# List backends available on this system
print(AVAILABLE_BACKENDS)  # e.g. {'opencv': <class ...>, 'pyav': <class ...>}

# Open with the default backend (opencv)
video = open_video("video.mp4")
print(video.frame_rate, video.frame_size, video.duration)

# Open with the PyAV backend, overriding the detected framerate
video = open_video("video.mp4", framerate=23.976, backend="pyav")

# Open with the OpenCV backend explicitly
video = open_video("video.mp4", backend="opencv")

# Read frames manually
while True:
    frame = video.read()
    if frame is False:
        break
print(f"Total frames read: {video.frame_number}")
```

---

## `SceneManager` — Core detection orchestrator

Coordinates one or more `SceneDetector` instances against a `VideoStream`. Decodes video in a background thread for better performance.
Supports optional cropping, downscaling, callbacks on detected scenes, and time-window limits.

```python
import numpy as np

from scenedetect import open_video, SceneManager, ContentDetector, StatsManager

video = open_video("video.mp4")

# Attach a StatsManager to persist per-frame metrics for later threshold tuning
stats = StatsManager()
scene_manager = SceneManager(stats_manager=stats)
scene_manager.add_detector(ContentDetector(threshold=27.0, min_scene_len=15))

# Optional: crop to a region of interest (X0, Y0, X1, Y1), zero-indexed inclusive
scene_manager.crop = (0, 0, 1279, 359)  # top half of a 1280x720 video

# Optional callback invoked on the first frame of each new detected scene
def on_scene(frame_img: np.ndarray, frame_num: int):
    print(f"New scene at frame {frame_num}")

num_frames = scene_manager.detect_scenes(
    video=video,
    end_time="00:05:00",  # stop after 5 minutes
    frame_skip=0,         # must be 0 when using a StatsManager
    show_progress=True,
    callback=on_scene,
)
print(f"Processed {num_frames} frames")

scenes = scene_manager.get_scene_list(start_in_scene=True)
for start, end in scenes:
    print(f"{start} -> {end} ({(end - start).seconds:.3f}s)")

# Save statistics CSV for manual threshold analysis
stats.save_to_csv("video.stats.csv")
```

---

## `ContentDetector` — Fast-cut detection via HSV differencing

Detects hard cuts by comparing the per-channel HSV difference between consecutive frames against a fixed `threshold`. The default weights use hue, saturation, and luma equally (edge differences are disabled by default). A `FlashFilter` prevents double-detection of very brief cuts.
```python
from scenedetect import detect, ContentDetector

# Default settings (threshold=27, min_scene_len=15 frames)
scenes = detect("video.mp4", ContentDetector())

# Luma-only mode – more robust on desaturated content
scenes = detect("video.mp4", ContentDetector(luma_only=True, threshold=20.0))

# Custom component weights: emphasise luma and edges, ignore hue
weights = ContentDetector.Components(
    delta_hue=0.0,
    delta_sat=0.5,
    delta_lum=1.0,
    delta_edges=0.5,
)
scenes = detect(
    "video.mp4",
    ContentDetector(
        threshold=25.0,
        min_scene_len="0.5s",  # 500 ms minimum scene length
        weights=weights,
        kernel_size=7,         # explicit edge-dilation kernel (must be odd, >= 3)
        filter_mode=ContentDetector.FlashFilter.Mode.MERGE,
    ),
)
print(f"Detected {len(scenes)} scenes")
```

---

## `AdaptiveDetector` — Adaptive-threshold fast-cut detection

Extends `ContentDetector` with a two-pass approach: the raw `content_val` score is divided by the rolling average of the scores of `window_width` neighboring frames on each side, yielding an `adaptive_ratio` that is robust to global illumination changes and fast camera pans.

```python
from scenedetect import detect, AdaptiveDetector

# Default: adaptive_threshold=3.0, window_width=2 frames each side
scenes = detect("video.mp4", AdaptiveDetector())

# A lower adaptive_threshold catches more subtle scene changes;
# min_content_val filters out frames with very little motion
scenes = detect(
    "video.mp4",
    AdaptiveDetector(
        adaptive_threshold=2.5,
        min_content_val=10.0,
        window_width=4,    # wider context window
        min_scene_len=25,  # at least 25 frames between cuts
        luma_only=True,
    ),
)
for start, end in scenes:
    print(start.get_timecode(), "->", end.get_timecode())
```

---

## `ThresholdDetector` — Fade-in/fade-out detection

Detects transitions to/from black (or another fixed luminance level) by comparing average frame brightness against a pixel-intensity threshold. Use `Method.FLOOR` for fade-to-black and `Method.CEILING` for fade-from-black.
```python
from scenedetect import detect, ThresholdDetector

# Default: threshold=12, detects fade-outs to near-black
scenes = detect("video.mp4", ThresholdDetector())

# Custom fade-to-black detection with timing bias and final scene
scenes = detect(
    "video.mp4",
    ThresholdDetector(
        threshold=20,
        min_scene_len=30,
        fade_bias=0.0,         # cut placed at the midpoint of the fade
        add_final_scene=True,  # emit a scene for the trailing fade-out
        method=ThresholdDetector.Method.FLOOR,
    ),
)
print(f"Scenes: {len(scenes)}")  # e.g. Scenes: 5

# CEILING mode: detects when brightness rises above the threshold
scenes = detect(
    "video.mp4",
    ThresholdDetector(threshold=200, method=ThresholdDetector.Method.CEILING),
)
```

---

## `HistogramDetector` — YUV histogram comparison

Compares the luma (Y) channel histogram between consecutive frames using the OpenCV correlation metric. A drop in correlation below `1.0 - threshold` triggers a cut.

```python
from scenedetect import detect, HistogramDetector

# Default: threshold=0.05, bins=256 (a 5 % correlation drop triggers a cut)
scenes = detect("video.mp4", HistogramDetector())

# More sensitive: lower threshold, fewer bins
scenes = detect(
    "video.mp4",
    HistogramDetector(
        threshold=0.03,
        bins=128,
        min_scene_len="0.4s",
    ),
)
for start, end in scenes:
    print(f"{start} -> {end}")

# Compute a histogram manually (static helper)
import cv2

frame = cv2.imread("frame.jpg")
hist = HistogramDetector.calculate_histogram(frame, bins=256, normalize=True)
print(hist.shape)  # (256,)
```

---

## `HashDetector` — Perceptual hash detection

Computes a perceptual hash for each frame via DCT lowpass filtering (pHash) and triggers a cut when the normalized Hamming distance between consecutive hashes exceeds `threshold`.
```python
from scenedetect import detect, HashDetector

# Default: threshold=0.395, size=16, lowpass=2
scenes = detect("video.mp4", HashDetector())

# More conservative: require a larger hash distance to fire
scenes = detect(
    "video.mp4",
    HashDetector(
        threshold=0.45,
        size=16,    # DCT block size (larger = more robust but slower)
        lowpass=2,  # keep the lower 1/lowpass fraction of frequency data
        min_scene_len=20,
    ),
)
print(f"Detected {len(scenes)} scenes")

# Compute a hash directly
import cv2

frame = cv2.imread("frame.jpg")
h = HashDetector.hash_frame(frame, hash_size=16, factor=2)
print(h.shape, h.dtype)  # (16, 16) bool
```

---

## `FrameTimecode` — Frame-accurate timestamp arithmetic

Represents a timestamp as a frame number, a seconds float, or an `HH:MM:SS[.nnn]` string, all tied to a framerate. Supports arithmetic and comparison with mixed types.

```python
from scenedetect.common import FrameTimecode

fps = 29.97
t1 = FrameTimecode(timecode="00:01:00.000", fps=fps)
t2 = FrameTimecode(timecode=30, fps=fps)   # 30 frames
t3 = FrameTimecode(timecode=5.0, fps=fps)  # 5 seconds

print(t1.frame_num)       # 1798
print(t1.seconds)         # ~60.0 (quantized to the nearest frame)
print(t1.get_timecode())  # '00:01:00.000'

# Arithmetic with mixed types
t4 = t1 + 10          # add 10 frames -> FrameTimecode
t5 = t1 + 2.5         # add 2.5 seconds
t6 = t1 + "00:00:05"  # add 5 seconds
print(t4.get_timecode())  # 10 frames (~334 ms) past t1

print(t1 > t2)               # True (60 s > ~1 s)
print(t1 == "00:01:00.000")  # True

# Subtraction clamps to 0
result = t2 - t1
print(result.frame_num)  # 0 (clamped, since t1 > t2)
```

---

## `save_images` — Extract representative frames per scene

Saves a configurable number of image files for each scene, distributing them evenly across the scene's duration. Encoding and disk I/O are handled in background threads for performance.
```python
from scenedetect import open_video, SceneManager, ContentDetector
from scenedetect.output import save_images

video = open_video("video.mp4")
scene_manager = SceneManager()
scene_manager.add_detector(ContentDetector())
scene_manager.detect_scenes(video)
scenes = scene_manager.get_scene_list(start_in_scene=True)

# Save 3 images per scene (start, middle, end) as high-quality JPEGs
image_files = save_images(
    scene_list=scenes,
    video=video,
    num_images=3,
    frame_margin=1,    # frames to pad from each scene boundary
    image_extension="jpg",
    encoder_param=95,  # JPEG quality 0-100
    image_name_template="$VIDEO_NAME-Scene-$SCENE_NUMBER-$IMAGE_NUMBER",
    output_dir="./scene_images/",
    show_progress=True,
    height=480,        # rescale to 480 px height, keeping aspect ratio
    # width=640,       # or specify width instead
    # scale=0.5,       # or a fractional scale factor
)

# Returns {scene_index: [filepath, ...], ...}
for scene_idx, paths in image_files.items():
    print(f"Scene {scene_idx + 1}: {paths}")
# Scene 1: ['scene_images/video-Scene-001-01.jpg', ..., 'video-Scene-001-03.jpg']
```

---

## `split_video_ffmpeg` — Split video into clips using ffmpeg

Invokes `ffmpeg` to extract each detected scene as a separate video file. Supports custom encoding arguments, filename templates, and an optional custom formatter callback.
```python
from scenedetect import detect, ContentDetector
from scenedetect.output import split_video_ffmpeg, is_ffmpeg_available

if not is_ffmpeg_available():
    raise RuntimeError("ffmpeg is not installed or not on PATH")

scenes = detect("video.mp4", ContentDetector())

# Default: H.264 + AAC, veryfast preset, CRF 22
ret = split_video_ffmpeg(
    input_video_path="video.mp4",
    scene_list=scenes,
    output_dir="./clips/",
    output_file_template="$VIDEO_NAME-Scene-$SCENE_NUMBER.mp4",
    show_progress=True,
)
print(f"ffmpeg exited with code {ret}")  # 0 = success

# Stream-copy mode (no re-encode; cuts are only frame-accurate at keyframes)
ret = split_video_ffmpeg(
    "video.mp4",
    scenes,
    output_file_template="$VIDEO_NAME-$START_TIME-$END_TIME.mp4",
    arg_override="-c copy",
    show_output=True,  # show ffmpeg output for the first clip
)

# Custom formatter: include millisecond PTS in the filename
from scenedetect.output.video import default_formatter

formatter = default_formatter("$VIDEO_NAME-$SCENE_NUMBER-pts$START_PTS.mp4")
split_video_ffmpeg("video.mp4", scenes, formatter=formatter)
```

---

## `split_video_mkvmerge` — Split video into MKV clips using mkvmerge

Uses `mkvmerge` (from MKVToolNix) for lossless, keyframe-accurate splitting into Matroska containers. Faster than an ffmpeg re-encode for archival use cases.

```python
from scenedetect import detect, ContentDetector
from scenedetect.output import split_video_mkvmerge, is_mkvmerge_available

if not is_mkvmerge_available():
    raise RuntimeError("mkvmerge not found")

scenes = detect("video.mp4", ContentDetector())
ret = split_video_mkvmerge(
    input_video_path="video.mp4",
    scene_list=scenes,
    output_dir="./mkv_clips/",
    output_file_template="$VIDEO_NAME.mkv",  # mkvmerge appends -001, -002, ...
    show_output=False,
)
# Returns 0 on success, non-zero on error
assert ret == 0
```

---

## `write_scene_list` / `write_scene_list_html` — Export scene list to CSV or HTML

Writes the scene list to a CSV or HTML file for review in spreadsheet editors or browsers.
The HTML export can optionally embed image thumbnails from `save_images`.

```python
from scenedetect import detect, open_video, SceneManager, ContentDetector
from scenedetect.output import write_scene_list, write_scene_list_html, save_images

scenes = detect("video.mp4", ContentDetector())

# Write CSV to disk
with open("scenes.csv", "w", newline="") as f:
    write_scene_list(
        output_csv_file=f,
        scene_list=scenes,
        include_cut_list=True,  # prepend a row with all cut timecodes
        col_separator=",",
    )
# scenes.csv columns: Scene Number, Start Frame, Start Timecode,
# Start Time (seconds), End Frame, End Timecode, End Time (seconds),
# Length (frames), Length (timecode), Length (seconds)

# Write HTML with embedded thumbnails
video = open_video("video.mp4")
sm = SceneManager()
sm.add_detector(ContentDetector())
sm.detect_scenes(video)
scenes2 = sm.get_scene_list(start_in_scene=True)
image_files = save_images(scenes2, video, num_images=1, output_dir="./thumbs/")

write_scene_list_html(
    output_html_filename="scenes.html",
    scene_list=scenes2,
    image_filenames=image_files,
    image_width=200,
)
```

---

## `write_scene_list_edl` / `write_scene_list_fcpx` / `write_scene_list_fcp7` / `write_scene_list_otio` — Export to NLE formats

Export the scene list to CMX 3600 EDL, Final Cut Pro X XML (FCPXML 1.9), Final Cut Pro 7 xmeml, or OpenTimelineIO JSON for direct import into professional video editors.
```python
from fractions import Fraction

from scenedetect import detect, ContentDetector, open_video
from scenedetect.output import (
    write_scene_list_edl,
    write_scene_list_fcpx,
    write_scene_list_fcp7,
    write_scene_list_otio,
)

video = open_video("video.mp4")
scenes = detect("video.mp4", ContentDetector())
fps = Fraction(video.frame_rate).limit_denominator(1001)
size = video.frame_size  # (width, height)

# CMX 3600 EDL (compatible with Avid, DaVinci Resolve, Premiere)
write_scene_list_edl(
    output_path="scenes.edl",
    scene_list=scenes,
    title="My Project",
    reel="AX",
    start_timecode="01:00:00:00",  # optional SMPTE offset
)

# FCPXML 1.9 for Final Cut Pro X
write_scene_list_fcpx(
    output_path="scenes.fcpxml",
    scene_list=scenes,
    video_path="video.mp4",
    frame_rate=fps,
    frame_size=size,
    video_name="My Project",
)

# Final Cut Pro 7 / DaVinci Resolve xmeml
write_scene_list_fcp7(
    output_path="scenes.xml",
    scene_list=scenes,
    video_path="video.mp4",
    frame_rate=fps,
    frame_size=size,
)

# OpenTimelineIO JSON (compatible with many NLEs via OTIO adapters)
write_scene_list_otio(
    output_path="scenes.otio",
    scene_list=scenes,
    video_path="video.mp4",
    frame_rate=fps,
    audio=True,
)
```

---

## `StatsManager` — Per-frame metrics persistence

Provides a key-value store for detector-computed metrics (e.g. `content_val`, `adaptive_ratio`) indexed by `FrameTimecode`. Can be saved to and loaded from a CSV file, enabling threshold tuning without re-processing the video.
```python
from scenedetect import open_video, SceneManager, ContentDetector, StatsManager

STATS_PATH = "video.stats.csv"

# --- Pass 1: compute and persist metrics ---
video = open_video("video.mp4")
stats = StatsManager()
sm = SceneManager(stats_manager=stats)
sm.add_detector(ContentDetector(threshold=27.0))
sm.detect_scenes(video)
stats.save_to_csv(STATS_PATH)

# --- Pass 2: reuse cached metrics (fast) ---
video2 = open_video("video.mp4")
stats2 = StatsManager()
stats2.load_from_csv(STATS_PATH)  # cached metrics loaded; detection reuses them
sm2 = SceneManager(stats_manager=stats2)
sm2.add_detector(ContentDetector(threshold=20.0))  # try a different threshold
sm2.detect_scenes(video2)

scenes = sm2.get_scene_list(start_in_scene=True)
print(f"Found {len(scenes)} scenes with threshold=20")
```

---

## `VideoCaptureAdapter` — Wrap an existing `cv2.VideoCapture`

Wraps a raw OpenCV `VideoCapture` object (or any device/pipe) as a `VideoStream` so it can be used with a `SceneManager`. Useful for webcams, RTSP streams, and GStreamer pipelines.

```python
import cv2

from scenedetect import SceneManager, ContentDetector
from scenedetect.backends import VideoCaptureAdapter

# Use webcam device 0
cap = cv2.VideoCapture(0)
video = VideoCaptureAdapter(cap)

scene_manager = SceneManager()
scene_manager.add_detector(ContentDetector(threshold=20.0))

# Detect scenes in the first 500 frames
scene_manager.detect_scenes(video=video, duration=500)
scenes = scene_manager.get_scene_list(start_in_scene=True)
print(f"Detected {len(scenes)} scene transitions in the webcam stream")
cap.release()
```

---

## CLI — `scenedetect` command-line interface

The `scenedetect` command exposes all major features without writing any Python code. Detection method, time window, output commands, and backend are composable sub-commands.
```bash
# Detect cuts with the adaptive detector and split into MP4 clips (requires ffmpeg)
scenedetect -i video.mp4 detect-adaptive split-video

# Detect threshold fades, list scenes, save 3 images per scene
scenedetect -i video.mp4 detect-threshold -t 15 list-scenes save-images -n 3

# Process only 00:01:00 -> 00:03:00 with the content detector, saving a stats CSV
scenedetect -i video.mp4 -s video.stats.csv \
    time -s 00:01:00 -e 00:03:00 \
    detect-content -t 30 \
    list-scenes

# Split into lossless MKV clips (codec copy, no re-encode)
scenedetect -i video.mp4 detect-adaptive split-video --mkvmerge

# Use the PyAV backend and custom output directories
scenedetect -i video.mp4 --backend pyav \
    detect-content \
    save-images --output ./frames/ --png \
    split-video --output ./clips/ --high-quality

# Hash-based detection
scenedetect -i video.mp4 detect-hash list-scenes

# Histogram-based detection with a custom threshold
scenedetect -i video.mp4 detect-hist --threshold 0.04 list-scenes
```

---

## Summary

PySceneDetect is used in two primary modes: as a **Python library** embedded in video-processing pipelines, and as a **standalone CLI tool** for one-shot scene splitting and export.

In library mode the typical integration pattern is: open a video with `open_video`, construct a `SceneManager` (optionally with a `StatsManager` for stats persistence), add one or more detectors, call `detect_scenes`, retrieve the `SceneList`, and then pass it to output helpers (`save_images`, `split_video_ffmpeg`, `write_scene_list_*`). The `detect` convenience function collapses this into a single call for scripts that only need the scene list.

For automation pipelines the exported formats (EDL, FCPXML, FCP7 xmeml, OTIO, CSV, HTML) cover virtually every professional NLE workflow, and the `PathFormatter` callback in `split_video_ffmpeg` enables fully custom output naming.

The `VideoCaptureAdapter` allows PySceneDetect to operate on live camera feeds, RTSP streams, and any other source accessible via `cv2.VideoCapture`, making it suitable for real-time segmentation and surveillance analytics. For difficult content (fast camera pans, variable lighting), `AdaptiveDetector` is generally the best first choice, while `ThresholdDetector` remains the right tool for traditionally edited content with explicit fade-to-black transitions.
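
The rolling-average scoring that makes `AdaptiveDetector` a good default can be sketched in plain Python. This is a simplified illustration of the idea described in the `AdaptiveDetector` section above, not the library's actual implementation; the function name `adaptive_ratios` and the toy score series are invented for this sketch.

```python
# Simplified sketch of AdaptiveDetector-style scoring: each frame's raw
# content_val is divided by the average score of its neighbors, so a hard cut
# stands out even when the whole clip is changing (pans, lighting shifts).
def adaptive_ratios(content_vals, window_width=2):
    """Return per-frame adaptive ratios using `window_width` frames each side."""
    ratios = []
    for i in range(len(content_vals)):
        lo = max(0, i - window_width)
        hi = min(len(content_vals), i + window_width + 1)
        # Average of the surrounding scores, excluding the frame itself.
        neighbors = [v for j, v in enumerate(content_vals[lo:hi], start=lo) if j != i]
        avg = sum(neighbors) / len(neighbors)
        # Guard against division by zero on completely static footage.
        ratios.append(content_vals[i] / avg if avg > 0 else 0.0)
    return ratios

# A mostly-static clip with one hard cut at index 4:
scores = [1.0, 1.2, 0.9, 1.1, 30.0, 1.0, 0.8, 1.1]
ratios = adaptive_ratios(scores, window_width=2)
cuts = [i for i, r in enumerate(ratios) if r > 3.0]  # adaptive_threshold = 3.0
print(cuts)  # [4]
```

Note that a fixed threshold on the raw scores would struggle here if the baseline drifted; dividing by the local average normalizes that drift away, which is why the adaptive approach tolerates fast pans and gradual lighting changes.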