Split video by chapters (#158)

* New options `--split-chapters` and `--no-split-chapters`
* The output/path of the split files can be given using the key `chapter`
* Additional keys `section_title`, `section_number`, `section_start`, `section_end` are available in the output template
* Alias `--split-tracks` for parity with animelover/youtube-dl
* `--sponskrub-cut` and `--split-chapters` cannot work together

Closes:
https://github.com/blackjack4494/yt-dlc/issues/277
https://github.com/ytdl-org/youtube-dl/issues/28438
https://github.com/ytdl-org/youtube-dl/issues/12907
https://github.com/ytdl-org/youtube-dl/issues/6480
https://github.com/ytdl-org/youtube-dl/pull/25005

Rewritten from the implementation by: femaref and Wattux
https://github.com/Wattux/youtube-dl/tree/split-at-timestamps
https://github.com/ytdl-org/youtube-dl/pull/25005
https://github.com/femaref/youtube-dl/tree/split-track
2024-06-17 19:20:11 +02:00 · 2021-03-15 04:32:13 +05:30
commit 7275535116
parent a1c5d2ca64
6 changed files with 79 additions and 6 deletions
--- a/README.md
+++ b/README.md
@@ -693,6 +693,13 @@ ## Post-Processing Options:
                                     push {} /sdcard/Music/ && rm {}'
    --convert-subs FORMAT            Convert the subtitles to other format
                                     (currently supported: srt|ass|vtt|lrc)
+    --split-chapters                 Split video into multiple files based on
+                                     internal chapters. The "chapter:" prefix
+                                     can be used with "--paths" and "--output"
+                                     to set the output filename for the split
+                                     files. See "OUTPUT TEMPLATE" for details
+    --no-split-chapters              Do not split video based on chapters
+                                     (default)

 ## SponSkrub (SponsorBlock) Options:
 [SponSkrub](https://github.com/yt-dlp/SponSkrub) is a utility to
@@ -810,7 +817,7 @@ # OUTPUT TEMPLATE

 The basic usage of `-o` is not to set any template arguments when downloading a single file, like in `yt-dlp -o funny_video.flv "https://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations. Date/time fields can also be formatted according to [strftime formatting](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) by specifying it inside the parentheses, separated from the field name using a `>`. For example, `%(duration>%H-%M-%S)s`.

-Additionally, you can set different output templates for the various metadata files separately from the general output template by specifying the type of file followed by the template, separated by a colon ":". The different filetypes supported are `subtitle|thumbnail|description|annotation|infojson|pl_description|pl_infojson`. For example, `-o '%(title)s.%(ext)s' -o 'thumbnail:%(title)s\%(title)s.%(ext)s'` will put the thumbnails in a folder with the same name as the video.
+Additionally, you can set different output templates for the various metadata files separately from the general output template by specifying the type of file followed by the template, separated by a colon ":". The different filetypes supported are `subtitle`, `thumbnail`, `description`, `annotation`, `infojson`, `pl_description`, `pl_infojson`, `chapter`. For example, `-o '%(title)s.%(ext)s' -o 'thumbnail:%(title)s\%(title)s.%(ext)s'` will put the thumbnails in a folder with the same name as the video.

 The available fields are:

@@ -901,6 +908,13 @@ # OUTPUT TEMPLATE
 - `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
 - `release_year` (numeric): Year (YYYY) when the album was released

+Available when using `--split-chapters` for videos with internal chapters:
+
+ - `section_title` (string): Title of the chapter
+ - `section_number` (numeric): Number of the chapter within the file
+ - `section_start` (numeric): Start time of the chapter in seconds
+ - `section_end` (numeric): End time of the chapter in seconds
+
 Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with placeholder value provided with `--output-na-placeholder` (`NA` by default).

 For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `yt-dlp test video` and id `BaW_jenozKcj`, this will result in a `yt-dlp test video-BaW_jenozKcj.mp4` file created in the current directory.
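
As an illustration of how the new `section_*` fields combine with the `chapter` template, here is a minimal sketch that expands the default chapter template from this commit using plain `%` formatting. The helper `render_chapter_name` is hypothetical and not part of yt-dlp; the real `prepare_filename` also sanitizes names and substitutes placeholders for missing fields, which this sketch leaves out.

```python
# Default chapter template, copied from the DEFAULT_OUTTMPL change in this commit.
CHAPTER_TMPL = '%(title)s - %(section_number)03d %(section_title)s [%(id)s].%(ext)s'


def render_chapter_name(info, number, chapter, tmpl=CHAPTER_TMPL):
    """Hypothetical helper: expand tmpl with the chapter fields added to a copy of info."""
    fields = dict(info)  # copy so the caller's info dict is not modified
    fields.update({
        'section_number': number,
        'section_title': chapter.get('title'),
        'section_start': chapter.get('start_time'),
        'section_end': chapter.get('end_time'),
    })
    return tmpl % fields


# Made-up sample metadata, for illustration only.
info = {'title': 'yt-dlp test video', 'id': 'BaW_jenozKcj', 'ext': 'mp4'}
chapter = {'title': 'Intro', 'start_time': 0, 'end_time': 42.5}
name = render_chapter_name(info, 1, chapter)
print(name)
```

With the sample values above, the chapter number is zero-padded to three digits by the `03d` conversion, so the chapter title follows `001` in the file name.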
--- a/yt_dlp/__init__.py
+++ b/yt_dlp/__init__.py
@@ -279,9 +279,14 @@ def parse_retries(retries, name=''):

    def report_conflict(arg1, arg2):
        write_string('WARNING: %s is ignored since %s was given\n' % (arg2, arg1), out=sys.stderr)
+
    if opts.remuxvideo and opts.recodevideo:
        report_conflict('--recode-video', '--remux-video')
        opts.remuxvideo = False
+    if opts.sponskrub_cut and opts.split_chapters and opts.sponskrub is not False:
+        report_conflict('--split-chapters', '--sponskrub-cut')
+        opts.sponskrub_cut = False
+
    if opts.allow_unplayable_formats:
        if opts.extractaudio:
            report_conflict('--allow-unplayable-formats', '--extract-audio')
@@ -371,11 +376,7 @@ def report_conflict(arg1, arg2):
        })
        if not already_have_thumbnail:
            opts.writethumbnail = True
-    # XAttrMetadataPP should be run after post-processors that may change file
-    # contents
-    if opts.xattrs:
-        postprocessors.append({'key': 'XAttrMetadata'})
-    # This should be below all ffmpeg PP because it may cut parts out from the video
+    # This should be below most ffmpeg PP because it may cut parts out from the video
    # If opts.sponskrub is None, sponskrub is used, but it silently fails if the executable can't be found
    if opts.sponskrub is not False:
        postprocessors.append({
@@ -386,6 +387,11 @@ def report_conflict(arg1, arg2):
            'force': opts.sponskrub_force,
            'ignoreerror': opts.sponskrub is None,
        })
+    if opts.split_chapters:
+        postprocessors.append({'key': 'FFmpegSplitChapters'})
+    # XAttrMetadataPP should be run after post-processors that may change file contents
+    if opts.xattrs:
+        postprocessors.append({'key': 'XAttrMetadata'})
    # ExecAfterDownload must be the last PP
    if opts.exec_cmd:
        postprocessors.append({
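
The conflict handling and post-processor ordering introduced above can be modeled in isolation. The following is a simplified sketch, not the actual yt-dlp code: `resolve_and_order` is a hypothetical function, the `'SponSkrub'` key and its `'cut'` entry are assumptions (the hunk does not show them), and `opts` is a stand-in namespace with only the attributes used here.

```python
from types import SimpleNamespace


def resolve_and_order(opts):
    """Hypothetical, simplified model of the option handling shown in this diff."""
    warnings = []
    # Splitting takes precedence: cutting is disabled when both are requested
    # and sponskrub itself has not been turned off.
    if opts.sponskrub_cut and opts.split_chapters and opts.sponskrub is not False:
        warnings.append('--sponskrub-cut is ignored since --split-chapters was given')
        opts.sponskrub_cut = False

    postprocessors = []
    if opts.sponskrub is not False:
        # Key name and 'cut' entry are assumptions for illustration.
        postprocessors.append({'key': 'SponSkrub', 'cut': opts.sponskrub_cut})
    if opts.split_chapters:
        postprocessors.append({'key': 'FFmpegSplitChapters'})
    # XAttrMetadata runs after anything that may change file contents.
    if opts.xattrs:
        postprocessors.append({'key': 'XAttrMetadata'})
    return warnings, postprocessors


opts = SimpleNamespace(sponskrub=None, sponskrub_cut=True, split_chapters=True, xattrs=True)
warnings, pps = resolve_and_order(opts)
print(warnings, [pp['key'] for pp in pps])
```

The sketch keeps the same relative order as the diff: the sponskrub step, then chapter splitting, then extended attributes.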
--- a/yt_dlp/options.py
+++ b/yt_dlp/options.py
@@ -1183,6 +1183,17 @@ def _dict_from_multiple_values_options_callback(
        '--convert-subs', '--convert-subtitles',
        metavar='FORMAT', dest='convertsubtitles', default=None,
        help='Convert the subtitles to other format (currently supported: srt|ass|vtt|lrc)')
+    postproc.add_option(
+        '--split-chapters', '--split-tracks',
+        dest='split_chapters', action='store_true', default=False,
+        help=(
+            'Split video into multiple files based on internal chapters. '
+            'The "chapter:" prefix can be used with "--paths" and "--output" to '
+            'set the output filename for the split files. See "OUTPUT TEMPLATE" for details'))
+    postproc.add_option(
+        '--no-split-chapters', '--no-split-tracks',
+        dest='split_chapters', action='store_false',
+        help='Do not split video based on chapters (default)')

    sponskrub = optparse.OptionGroup(parser, 'SponSkrub (SponsorBlock) Options', description=(
        'SponSkrub (https://github.com/yt-dlp/SponSkrub) is a utility to mark/remove sponsor segments '
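
The paired `store_true`/`store_false` options added above share one `dest`, so whichever flag appears last on the command line wins. A minimal standalone sketch with the standard `optparse` module shows the behavior; the bare parser and the `parse` helper are assumptions for illustration, and the help strings are shortened.

```python
import optparse

parser = optparse.OptionParser()
postproc = optparse.OptionGroup(parser, 'Post-Processing Options')
postproc.add_option(
    '--split-chapters', '--split-tracks',
    dest='split_chapters', action='store_true', default=False,
    help='Split video into multiple files based on internal chapters')
postproc.add_option(
    '--no-split-chapters', '--no-split-tracks',
    dest='split_chapters', action='store_false',
    help='Do not split video based on chapters (default)')
parser.add_option_group(postproc)


def parse(argv):
    """Hypothetical helper: return the resulting split_chapters value for argv."""
    opts, _args = parser.parse_args(argv)
    return opts.split_chapters


print(parse([]))
print(parse(['--split-tracks']))
print(parse(['--split-chapters', '--no-split-chapters']))
```

Only the first option declares a default; the second reuses the same `dest`, so the default stays `False` when neither flag is given.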
--- a/yt_dlp/postprocessor/__init__.py
+++ b/yt_dlp/postprocessor/__init__.py
@@ -13,6 +13,7 @@
    FFmpegVideoConvertorPP,
    FFmpegVideoRemuxerPP,
    FFmpegSubtitlesConvertorPP,
+    FFmpegSplitChaptersPP,
 )
 from .xattrpp import XAttrMetadataPP
 from .execafterdownload import ExecAfterDownloadPP
@@ -31,6 +32,7 @@ def get_postprocessor(key):
    'ExecAfterDownloadPP',
    'FFmpegEmbedSubtitlePP',
    'FFmpegExtractAudioPP',
+    'FFmpegSplitChaptersPP',
    'FFmpegFixupM3u8PP',
    'FFmpegFixupM4aPP',
    'FFmpegFixupStretchedPP',
--- a/yt_dlp/postprocessor/ffmpeg.py
+++ b/yt_dlp/postprocessor/ffmpeg.py
@@ -10,6 +10,7 @@

 from .common import AudioConversionError, PostProcessor

+from ..compat import compat_str
 from ..utils import (
    encodeArgument,
    encodeFilename,
@@ -769,3 +770,40 @@ def run(self, info):
                }

        return sub_filenames, info
+
+
+class FFmpegSplitChaptersPP(FFmpegPostProcessor):
+
+    def _prepare_filename(self, number, chapter, info):
+        info = info.copy()
+        info.update({
+            'section_number': number,
+            'section_title': chapter.get('title'),
+            'section_start': chapter.get('start_time'),
+            'section_end': chapter.get('end_time'),
+        })
+        return self._downloader.prepare_filename(info, 'chapter')
+
+    def _ffmpeg_args_for_chapter(self, number, chapter, info):
+        destination = self._prepare_filename(number, chapter, info)
+        if not self._downloader._ensure_dir_exists(encodeFilename(destination)):
+            return
+
+        chapter['_filename'] = destination
+        self.to_screen('Chapter %03d; Destination: %s' % (number, destination))
+        return (
+            destination,
+            ['-ss', compat_str(chapter['start_time']),
+             '-to', compat_str(chapter['end_time'])])
+
+    def run(self, info):
+        chapters = info.get('chapters') or []
+        if not chapters:
+            self.report_warning('There are no tracks to extract')
+            return [], info
+
+        self.to_screen('Splitting video by chapters; %d chapters found' % len(chapters))
+        for idx, chapter in enumerate(chapters):
+            destination, opts = self._ffmpeg_args_for_chapter(idx + 1, chapter, info)
+            self.real_run_ffmpeg([(info['filepath'], opts)], [(destination, ['-c', 'copy'])])
+        return [], info
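
Note that `_ffmpeg_args_for_chapter` returns `None` when the destination directory cannot be created, while `run` unpacks the result into two names, which would raise a `TypeError` in that case. The sketch below shows one possible way to build per-chapter arguments while skipping such chapters. It is an illustration only: `chapter_commands` and the `destination_for` callback are hypothetical, and the command line is a simplified approximation that assumes the seek options are passed as input options before `-i`; it is not the exact command `real_run_ffmpeg` produces.

```python
def chapter_commands(filepath, chapters, destination_for):
    """Build one simplified ffmpeg argv per chapter.

    destination_for is a hypothetical callback (number, chapter) -> path or None;
    chapters for which it returns None are skipped instead of being unpacked.
    """
    commands = []
    for number, chapter in enumerate(chapters, start=1):
        destination = destination_for(number, chapter)
        if destination is None:
            continue  # e.g. the output directory could not be created
        commands.append([
            'ffmpeg',
            '-ss', str(chapter['start_time']),  # seek to the chapter start
            '-to', str(chapter['end_time']),    # stop at the chapter end
            '-i', filepath,
            '-c', 'copy',                       # copy streams, no re-encoding
            destination,
        ])
    return commands


# Made-up chapter data; the first destination is deliberately unavailable.
chapters = [
    {'title': 'Intro', 'start_time': 0, 'end_time': 10},
    {'title': 'Main', 'start_time': 10, 'end_time': 25.5},
]
cmds = chapter_commands(
    'in.mp4', chapters,
    lambda number, chapter: None if number == 1 else 'out-%03d.mp4' % number)
print(cmds)
```

Because `-c copy` avoids re-encoding, cuts can typically only be made at keyframes, so chapter boundaries in the split files may be slightly off from the listed times.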
--- a/yt_dlp/utils.py
+++ b/yt_dlp/utils.py
@@ -4182,8 +4182,10 @@ def q(qid):

 DEFAULT_OUTTMPL = {
    'default': '%(title)s [%(id)s].%(ext)s',
+    'chapter': '%(title)s - %(section_number)03d %(section_title)s [%(id)s].%(ext)s',
 }
 OUTTMPL_TYPES = {
+    'chapter': None,
    'subtitle': None,
    'thumbnail': None,
    'description': 'description',