Commit Graph

428 Commits

Author SHA1 Message Date
coletdjnz
c35448b7b1
[extractor/youtube] Extract more metadata for comments (#7179)
Adds new comment fields:
* `author_url` - The URL to the comment author's page
* `author_is_verified` - Whether the author is verified on the platform
* `is_pinned` - Whether the comment is pinned to the top of the comments

Closes https://github.com/yt-dlp/yt-dlp/issues/5411

Authored by: coletdjnz
2023-06-01 08:43:32 +00:00
coletdjnz
18f8fba7c8
[extractor/youtube] Fix continuation loop with no comments (#7148)
Deep check the response for incomplete data.

Authored by: coletdjnz
2023-05-31 07:08:28 +00:00
coletdjnz
93e12ed76e
[extractor/youtube] Extract uploader metadata for feed/playlist items
Fixes https://github.com/yt-dlp/yt-dlp/issues/7104

Authored by: coletdjnz
2023-05-28 11:31:45 +12:00
Audrey
5caf30dbc3
[extractor/youtube] Extract heatmap data (#7100)
Closes #3888
Authored by: tntmod54321
2023-05-26 17:54:39 +05:30
pukkandan
4823ec9f46
Update to ytdl-commit-d1c6c5
[YouTube] [core] Improve platform debug log, based on yt-dlp
d1c6c5c4d6

Except:
    * 6ed34338285f722d0da312ce0af3a15a077a3e2a [jsinterp] Add short-cut evaluation for common expression
        * There was no performance improvement when tested with https://github.com/ytdl-org/youtube-dl/issues/30641
    * e8de54bce50f6f77a4d7e8e80675f7003d5bf630 [core] Handle `/../` sequences in HTTP URLs
        * We plan to implement this differently
2023-05-24 23:30:43 +05:30
kangalio
69a40e4a7f
[extractor/youtube:music:search_url] Extract title (#7102)
Authored by: kangalio
Closes #7095
2023-05-22 17:17:06 +05:30
coletdjnz
447afb9eaa
[extractor/youtube] Support podcasts and releases tabs
Closes https://github.com/yt-dlp/yt-dlp/issues/6893

Authored by: coletdjnz
2023-05-20 19:11:03 +12:00
coletdjnz
7666b93604
[extractor/youtube] Define strict uploader metadata mapping (#6384)
New mapping:
```
channel -> channel name
channel_id -> UCID
channel_url -> UCID channel url

uploader -> channel name (same as channel field)
uploader_id -> @handle
uploader_url -> @handle channel url 
```

Authored by: coletdjnz
2023-04-14 07:58:36 +00:00
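The mapping in 7666b93604 can be pictured with a short Python sketch. The helper name `build_uploader_fields`, its parameters, and the URL patterns are assumptions made for illustration; this is not the extractor's actual code.

```python
def build_uploader_fields(channel_name, ucid, handle):
    """Return info-dict style fields following the mapping above.

    Hypothetical helper: the name, the parameters and the URL patterns
    are illustrative assumptions, not yt-dlp's implementation.
    """
    return {
        # channel_* fields are keyed on the UCID
        'channel': channel_name,
        'channel_id': ucid,
        'channel_url': f'https://www.youtube.com/channel/{ucid}' if ucid else None,
        # uploader mirrors the channel name; its id and URL use the @handle
        'uploader': channel_name,
        'uploader_id': handle,
        'uploader_url': f'https://www.youtube.com/{handle}' if handle else None,
    }


info = build_uploader_fields('Example Channel', 'UCxxxxxxxxxxxxxxxxxxxxxx', '@example')
print(info['uploader_id'], info['channel_id'])
```

Keeping the UCID and the @handle in separate fields means neither identifier has to be guessed from the other.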
pukkandan
c6786ff3ba
[extractor/youtube] Revert default formats to https
2023-04-11 21:43:31 +05:30
coletdjnz
141a8dff98
[extractor/youtube] Fix comment loop detection for pinned comments (#6714)
Pinned comments may repeat a second time - this is expected.

Fixes https://github.com/yt-dlp/yt-dlp/issues/6712

Authored by: coletdjnz
2023-04-06 07:44:22 +00:00
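A rough Python sketch of the idea behind 141a8dff98 and 7f51861b18: stop paging once an already-seen comment comes back, but tolerate one repeat of a pinned comment. The function name `collect_comments` and the page/comment shapes are hypothetical; only the `is_pinned` field name comes from this log, and the real extractor logic may differ.

```python
def collect_comments(pages):
    """Gather comments from an iterable of pages, breaking on loops.

    Hypothetical sketch: each page is a list of dicts with an 'id' key
    and an optional 'is_pinned' flag.
    """
    seen_ids = set()
    pinned_repeats = set()
    collected = []
    for page in pages:
        for comment in page:
            comment_id = comment['id']
            if comment_id in seen_ids:
                if comment.get('is_pinned') and comment_id not in pinned_repeats:
                    # A pinned comment may show up a second time; skip it once.
                    pinned_repeats.add(comment_id)
                    continue
                # Any other repeat is treated as a loop: stop paging.
                return collected
            seen_ids.add(comment_id)
            collected.append(comment)
    return collected


pages = [
    [{'id': 'a', 'is_pinned': True}, {'id': 'b'}],
    [{'id': 'a', 'is_pinned': True}, {'id': 'c'}],  # pinned repeat, tolerated
    [{'id': 'b'}, {'id': 'd'}],                     # 'b' again: loop detected
]
comment_ids = [c['id'] for c in collect_comments(pages)]
print(comment_ids)
```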
Nicholas Defranco
071670cbea
[extractor/youtube] Fix parsing comment_count (#6523)
Closes #5849
Authored by: nick-cd
2023-03-15 04:51:14 +05:30
coletdjnz
607510b9f2
[extractor/youtube] Handle incomplete initial data from watch page (#6510)
Authored by: coletdjnz
2023-03-13 01:43:37 +00:00
pukkandan
e389d172b6
Fix 2a23d92d9e
Closes #6517
2023-03-12 14:47:05 +05:30
pukkandan
2a23d92d9e
[extractor/youtube] Construct fragment list lazily
Building the fragment list for all formats takes significant time for large videos
2023-03-11 22:46:47 +05:30
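The benefit of lazy construction in 2a23d92d9e can be sketched with plain Python generators. This is only an illustration under assumptions: `make_formats`, the dict layout, and the example URLs are hypothetical, and a generator stands in for whatever lazy container the extractor actually uses.

```python
def make_formats(format_ids, fragment_count):
    """Return formats whose fragment lists are only built when iterated.

    Hypothetical sketch; `built` records which formats were expanded.
    """
    built = []

    def fragments_for(format_id):
        # A generator body does not run until the first item is requested.
        built.append(format_id)
        for index in range(fragment_count):
            yield {'url': f'https://example.invalid/{format_id}/{index}'}

    formats = [
        {'format_id': format_id, 'fragments': fragments_for(format_id)}
        for format_id in format_ids
    ]
    return formats, built


formats, built = make_formats(['137', '140'], fragment_count=3)
built_before = list(built)                   # nothing expanded yet
fragments = list(formats[1]['fragments'])    # expand only the chosen format
print(built_before, built, len(fragments))
```

Only the format that is actually downloaded pays the cost of building its fragments.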
pukkandan
86cb922118
[extractor/youtube] Add extractor-arg include_duplicate_formats
2023-03-11 22:34:13 +05:30
Lesmiscore
c795c39f27
[extractor/youtube] Add client name to format_note when -v (#6254)
Authored by: Lesmiscore, pukkandan
2023-03-11 22:33:23 +05:30
pukkandan
c9abebb851
[extractor/youtube] Bypass throttling for -f17
and related cleanup

Thanks @AudricV for the finding
2023-03-09 22:13:03 +05:30
pukkandan
392389b7df
[cleanup] Misc
2023-03-05 03:34:55 +05:30
mushbite
22ccd5420b
[extractor/rutube] Extract chapters from description (#6345)
Authored by: mushbite
2023-03-04 19:03:17 +05:30
coletdjnz
7f51861b18
[extractor/youtube] Detect and break on looping comments (#6301)
Fixes https://github.com/yt-dlp/yt-dlp/issues/6290

Authored by: coletdjnz
2023-03-01 07:56:53 +00:00
pukkandan
5b28cef72d
[cleanup] Misc
2023-02-28 23:51:06 +05:30
pukkandan
31e183557f
[extractor/youtube] Extract channel view_count when /about tab is passed
2023-02-28 23:51:03 +05:30
pukkandan
f34804b2f9
[extractor/youtube] Fix 5038f6d713
* [fragment] Fix `request_data`
* [youtube] Don't use POST for now. It may be easier to break in future

Authored by: bashonly, coletdjnz
2023-02-28 23:34:43 +05:30
pukkandan
5038f6d713
[extractor/youtube] Construct dash formats with range query
Closes #6369
2023-02-28 23:14:37 +05:30
pukkandan
a538772969
[cleanup] Misc
Closes #5897
2023-02-17 17:52:22 +05:30
bashonly
c61cf091a5
[extractor/youtube] uploader_id includes @ with handle
Authored by: bashonly
2023-02-17 02:14:45 -06:00
bashonly
149eb0bbf3
[extractor/youtube] Fix uploader_id extraction
Closes #6247
Authored by: bashonly
2023-02-16 08:51:45 -06:00
Bruno Guerreiro
78a78fa74d
[extractor/youtube] Add hyperpipe instances (#6020)
Authored by: Generator
2023-02-12 14:03:45 +05:30
Roland Hieber
05799a48c7
[extractor/youtube] Update invidious and piped instances (#6030)
Authored by: rohieb
2023-02-12 13:22:07 +05:30
Simon Sawicki
6839ae1f6d
[utils] traverse_obj: Fix more bugs
and cleanup uses of `default=[]`

Continued from b1bde57bef
2023-02-10 19:36:55 +05:30
pukkandan
b032ff0f03
[extractor/youtube] Handle consent.youtube
2023-02-03 23:53:42 +05:30
pukkandan
dad2210c0c
[extractor/youtube] Support /live/ URL
2023-02-03 23:53:41 +05:30
mzhou
253ac4ba6a
[extractor/youtube] Retry manifest refresh for live-from-start (#5670)
Avoids ending download early when live stream is temporarily offline.
Best used with somewhat large `--retry-sleep extractor:` and `--extractor-retries`

Authored by: mzhou
2023-01-07 01:00:42 +05:30
pukkandan
08e29b9f1f
[cleanup] Misc
Closes #5576, closes #5887
2023-01-02 19:40:15 +05:30
pukkandan
9bb856998b
[extractor/youtube] Extract DRC formats
2022-12-30 15:50:17 +05:30
Matthew
c733555106
[extractor/youtube:tab] Extract metadata from channel items (#5569)
Authored by: coletdjnz
2022-12-12 23:08:14 +00:00
pukkandan
71eb82d1b2
[extractor/youtube] Subtitles cannot be translated to und
Closes #5674
2022-11-30 05:18:18 +05:30
Bnyro
bc87dac75f
[extractor/youtube] Add piped.video (#5571)
Closes #5518
Authored by: Bnyro
2022-11-17 18:45:38 +05:30
pukkandan
9f14daf22b
[extractor] Deprecate _sort_formats
2022-11-17 11:40:17 +05:30
pukkandan
6368e2e639
[cleanup] Misc
Closes #5541
2022-11-16 06:57:07 +05:30
pukkandan
a4894d3e25
[extractor/youtube] Consider language in format de-duplication
2022-11-15 05:23:46 +05:30
pukkandan
171a31dbe8
[extractor] Add a way to distinguish IEs that return only videos
2022-11-13 10:56:04 +05:30
pukkandan
a8c754cc00
[extractor/youtube] Fix bug in handling of music URLs
Bug in bd7e919a75
Closes #5502
2022-11-12 00:02:13 +05:30
pukkandan
08270da5c3
[extractor/youtube] Fix ytuser:
2022-11-11 16:29:52 +05:30
pukkandan
bd7e919a75
[extractor/youtube:tab] Improvements to tab handling (#5487)
* Better handling of direct channel URLs - See https://github.com/yt-dlp/yt-dlp/pull/5439#issuecomment-1309322019
* Prioritize tab id from URL slug - Closes #5486
* Add metadata for the wrapping playlist
* Simplify redirect for music playlists
2022-11-11 13:52:40 +05:30
Matthew
e72e48c53f
[extractor/youtube] Ignore incomplete data error for comment replies (#5490)
When --ignore-errors is used.
Closes https://github.com/yt-dlp/yt-dlp/issues/4669
Authored by: coletdjnz
2022-11-10 06:35:22 +00:00
Matthew
0cf643b234
[extractor/youtube] Differentiate between no and disabled comments (#5491)
When comments are disabled, `comments` and `comment_count` are set to None,
rather than an empty list and 0, respectively.

Fixes https://github.com/yt-dlp/yt-dlp/issues/5068

Authored by: coletdjnz, pukkandan
2022-11-10 03:33:03 +00:00
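For code that consumes the extracted metadata, the distinction made in 0cf643b234 might be handled as in the sketch below. The helper `describe_comments` is hypothetical, and it assumes that None means comments are disabled (or were not extracted at all), while an empty list and 0 mean comments are enabled but none exist.

```python
def describe_comments(info):
    """Summarize the comment state of an info dict.

    Hypothetical helper; assumes None signals disabled or unextracted
    comments, and an empty list / 0 signals "no comments posted".
    """
    comments = info.get('comments')
    count = info.get('comment_count')
    if comments is None and count is None:
        return 'disabled or not extracted'
    if not comments and not count:
        return 'enabled, none posted'
    total = count if count is not None else len(comments)
    return f'{total} comment(s)'


print(describe_comments({'comments': None, 'comment_count': None}))
print(describe_comments({'comments': [], 'comment_count': 0}))
print(describe_comments({'comments': [{'id': 'a'}], 'comment_count': 1}))
```

Checking `is None` explicitly matters here, because a plain truthiness test would treat the disabled case and the empty case the same way.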
Matthew
4dc23a8051
[extractor/youtube:tab] Fix video metadata from tabs (#5489)
Closes #5488
Authored by: coletdjnz
2022-11-10 08:14:12 +05:30
Matthew
86973308cd
[extractor/youtube:tab] Update tab handling for redesign (#5439)
Closes #5432, #5430, #5419
Authored by: coletdjnz, pukkandan
2022-11-09 14:28:44 +05:30
Bruno Guerreiro
e14ea7fbd9
[extractor/youtube] Update piped instances (#5441)
Closes #5286
Authored by: Generator
2022-11-06 23:12:23 +05:30
Matthew
6141346d18
[extractor/youtube] Update playlist metadata extraction for new layout (#5376)
Fixes https://github.com/yt-dlp/yt-dlp/issues/5373

Authored by: coletdjnz
2022-11-06 05:25:31 +00:00
pukkandan
2e30b46fe4
[extractor/youtube] Improve chapter parsing from description
Closes #5448
2022-11-05 15:34:53 +05:30
nosoop
9da6612b0f
[extractor/youtube] Fix duration for premieres (#5382)
Closes #5378
Authored by: nosoop
2022-10-29 00:00:33 +05:30
coletdjnz
e63faa101c
[extractor/youtube] Fix live_status extraction for playlist videos
Regression in 867c66ff97

Authored by: coletdjnz
2022-10-27 17:36:54 +13:00
bsun0000
5318156f1c
[extractor/youtube] Mark videos as fully watched
Closes #2555
Authored by: bsun0000
2022-10-19 00:07:47 +05:30
pukkandan
d5d1df8afd
[cleanup] Misc
Closes #5162
2022-10-18 23:52:44 +05:30
pukkandan
6678a4f0b3
[extractor/youtube] Fix live_status
Bug in 4d37720a0c
2022-10-14 07:41:53 +05:30
pukkandan
5225df50cf
[extractor/youtube:tab] Let approximate_date return timestamp
2022-10-13 15:30:15 +05:30
pukkandan
0468a3b325
[jsinterp] Improve separating regex
Fixes https://github.com/yt-dlp/yt-dlp/issues/4635#issuecomment-1273974909
2022-10-11 08:02:26 +05:30
Matthew
867c66ff97
[extractor/youtube] Extract concurrent view count for livestreams (#5152)
Adds new field `concurrent_view_count`
Closes https://github.com/yt-dlp/yt-dlp/issues/4843

Authored by: coletdjnz
2022-10-07 07:00:40 +00:00
Lesmiscore
4d37720a0c
[extractor/youtube] Download post_live videos from start (#5091)
* The fragments are generated as a `LazyList`. So only the required formats are expanded during download, but all fragment lists are printed/written in infojson.
* The m3u8 formats which cannot be downloaded from start are not extracted by default, but can be enabled with an extractor-arg. The extractor-arg `include_live_dash` is renamed to `include_incomplete_formats` to account for this new use-case.

Closes #1564
Authored by: Lesmiscore, pukkandan
2022-10-04 08:18:31 +05:30
pukkandan
a057779d5e
[cleanup] Minor fixes
Closes #5129, Closes #4982
2022-10-04 01:48:14 +05:30
pukkandan
7a32c70d13
[cleanup] Fix flake8 and minor refactor
Issues from ab029d7e92, 1fb53b946c
2022-09-27 08:32:57 +05:30
pukkandan
709ee21417
[extractor/youtube] Do not warn on duplicate chapters
Eg: vYbaM8w8yzw
2022-09-27 08:26:26 +05:30
pukkandan
1fb53b946c
[extractor/youtube:tab] Improve continuation items extraction
2022-09-27 04:44:54 +05:30
pukkandan
1dd18a8808
[extractor/YoutubeShortsAudioPivot] Support source URLs
`ytshortsap:` is no longer needed
2022-09-27 04:44:50 +05:30
pukkandan
0a5095fe8d
[extractor/youtube:tab] Support reporthistory page
Closes #4929
2022-09-27 04:44:50 +05:30
coletdjnz
0ca0f88121
[extractor/heise] Fix extractor (#5029)
Fixes https://github.com/yt-dlp/yt-dlp/issues/1520
Authored by: coletdjnz
2022-09-26 00:58:06 +00:00
coletdjnz
80eb0bd9b9
[extractor/youtube] Add support for Shorts audio pivot feed (#4932)
This feed shows Shorts using the audio of a given video. 

ytshortsap: prefix can be used as a shortcut until YouTube
implements an official view. 

Closes #4911
Authored by: coletdjnz
2022-09-22 05:39:02 +00:00
coletdjnz
c26f9b991a
[extractor/youtube] Support changing extraction language (#4470)
Adds `--extractor-args youtube:lang=<supported lang code>` extractor arg to prefer translated fields (e.g. title and description) of that language, if available, for all YouTube extractors. See README or error message for list of supported language codes.

Closes https://github.com/yt-dlp/yt-dlp/issues/387

Authored by: coletdjnz
2022-09-09 05:16:46 +00:00
coletdjnz
3ffb2f5bea
[extractor/youtube] Fix video like count extraction
Support new combined button layout
Authored by: coletdjnz
2022-09-09 12:34:39 +12:00
pukkandan
17ffed1842
[docs] Improvements
* Move detailed installation instructions to https://github.com/yt-dlp/yt-dlp/wiki/Installation
* Link to wiki where applicable
* Fix some mistakes. Closes #4853, Closes #4855, Closes #4852
* Improve some error messages
2022-09-07 17:38:05 +05:30
pukkandan
7c6eb424d3
[extractor/youtube] Detect lazy-load-for-videos embeds
Closes #4812
2022-09-02 02:01:57 +05:30
pukkandan
05deb747bb
[jsinterp] Fix escape in regex
2022-09-01 16:46:32 +05:30
pukkandan
b505e8517a
[extractor/youtube] Fallback regex for nsig code extraction
2022-09-01 16:46:32 +05:30
coletdjnz
1ff88b7aec
[extractor/youtube] Add no-youtube-prefer-utc-upload-date compat option (#4771)
This option reverts 992f9a730b and 17322130a9 to prefer the non-UTC upload date in microformats.

Authored by: coletdjnz, pukkandan
2022-09-01 10:02:28 +00:00
pukkandan
da4db748fa
[utils] Add deprecation_warning
See https://github.com/yt-dlp/yt-dlp/pull/2173#issuecomment-1097021515
2022-08-30 21:03:07 +05:30
pukkandan
d81ba7d491
[jsinterp, extractor/youtube] Minor fixes
2022-08-30 18:13:37 +05:30
pukkandan
c4b2df872d
[jsinterp] Fix _separate
Ref: https://github.com/yt-dlp/yt-dlp/issues/4635#issuecomment-1231126941
2022-08-30 16:06:40 +05:30
Samantaz Fox
224b5a35f7
[extractor/youtube] Update iOS Innertube clients (#4792)
Authored by: SamantazFox
2022-08-29 03:36:55 +00:00
coletdjnz
50ac0e5416
[extractor/youtube] Use device-specific user agent (#4770)
Thwart latest fingerprinting attempt (see https://github.com/iv-org/invidious/issues/3230#issuecomment-1226887639)

Authored by: coletdjnz
2022-08-28 22:59:54 +00:00
pukkandan
5e01315aa1
[cache, extractor/youtube] Invalidate old cache
2022-08-27 07:25:14 +05:30
pukkandan
992dc6b486
[jsinterp] Implement timeout
Workaround for #4716
2022-08-22 06:19:06 +05:30
pukkandan
b25cac650f
[extractor/youtube] Fix bug in format sorting
2022-08-21 00:56:27 +05:30
pukkandan
90a1df305b
[test] Fix test_youtube_signature
2022-08-21 00:51:03 +05:30
pukkandan
a831c2ea90
[cleanup] Misc
2022-08-19 05:08:21 +05:30
pukkandan
25836db6be
[extractor/youtube] Add fallback to phantomjs
Related #4635
2022-08-18 21:35:18 +05:30
pukkandan
580ce00782
[youtube] Improve signature caching
and refactor related functions
2022-08-18 21:33:30 +05:30
pukkandan
f6ca640b12
[jsinterp] Fix for youtube player 1f7d5369
Closes #4635 again
2022-08-18 16:38:35 +05:30
pukkandan
3ce2933693
[youtube] Fix error reporting of "Incomplete data"
Related: #4669
2022-08-16 22:01:48 +05:30
pukkandan
5c6d2ef9d1
[youtube] Improve format sorting for iOS formats
When no itag/resolution is available for reference, use the closest resolution
2022-08-15 14:04:05 +05:30
Lesmiscore
62b58c0936
[docs] Consistent use of e.g. (#4643)
Authored by: Lesmiscore
2022-08-14 17:34:13 +05:30
pukkandan
8f53dc44a0
[jsinterp] Handle new youtube signature functions
Closes #4635
2022-08-14 05:12:32 +05:30
pukkandan
7e798d725e
[extractor] Fix format sorting of channels
2022-08-11 07:23:46 +05:30
coletdjnz
c7dcf0b31e
[extractor/youtube] Add androidSdkVersion parameter to Android Innertube clients
Required to prevent YouTube returning a bad player response in some cases.

See: https://github.com/yt-dlp/yt-dlp/pull/4593, https://github.com/TeamNewPipe/NewPipe/issues/8713, https://github.com/iv-org/invidious/issues/3230, https://github.com/Tyrrrz/YoutubeExplode/issues/647

Authored by: coletdjnz
2022-08-08 12:03:10 +12:00
pukkandan
a416623436
[extractor/youtube] Extract more format info
2022-08-08 01:47:07 +05:30
coletdjnz
a3e9642116
[extractor/youtube] Prevent redirect to unwanted videos (#4593)
Example: https://www.youtube.com/watch?v=aQvGIIdgFDM

Authored by: coletdjnz
2022-08-07 19:13:20 +05:30
coletdjnz
a0c830f488
[extractor/youtube] Bump Innertube client versions
YouTube may be requiring new versions soon. See https://github.com/iv-org/invidious/issues/3230, https://github.com/TeamNewPipe/NewPipe/issues/8713

Authored by: coletdjnz
2022-08-02 19:02:05 +12:00
pukkandan
be5c1ae862
Standardize retry mechanism (#1649)
* [utils] Create `RetryManager`
* Migrate all retries to use the manager
* [extractor] Add wrapper methods for convenience
* Standardize console messages for retries
* Add `--retry-sleep` for extractors
2022-08-02 01:43:18 +05:30
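The general shape of a standardized retry loop, as described in be5c1ae862, can be sketched in plain Python. This is not the `RetryManager` from the commit: `retry_call`, its parameters, and the message wording are assumptions chosen for illustration.

```python
import time


def retry_call(func, retries=3, sleep_func=None, sleep=time.sleep):
    """Call func(), retrying up to `retries` more times on failure.

    Hypothetical sketch: `sleep_func` maps the attempt number to a delay
    in seconds, and `sleep` is injectable so the loop is easy to test.
    """
    attempt = 0
    while True:
        try:
            return func()
        except Exception as err:
            attempt += 1
            if attempt > retries:
                raise  # out of retries: surface the last error
            # One consistent console message for every retry
            print(f'Got error: {err}. Retrying ({attempt}/{retries})...')
            delay = sleep_func(attempt) if sleep_func else 0
            if delay:
                sleep(delay)


calls = []
delays = []


def flaky():
    # Fails twice, then succeeds
    calls.append(1)
    if len(calls) < 3:
        raise OSError('temporary failure')
    return 'ok'


result = retry_call(flaky, retries=3, sleep_func=lambda n: n, sleep=delays.append)
print(result, len(calls), delays)
```

Routing every retry through one helper is what makes a shared sleep policy and uniform messages possible.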
pukkandan
bfd973ece3
[extractors] Use new framework for existing embeds (#4307)
`Brightcove` is difficult to migrate because its subclasses may depend
on the signature of the current functions, so it is left as-is for now

Note: Tests have not been migrated
2022-08-02 01:08:16 +05:30