Compare commits

...

65 Commits

Author SHA1 Message Date
coletdjnz
f1fdb20476
Merge e42c1e67e9 into 28d485714f 2024-07-27 18:52:48 +12:00
bashonly
28d485714f
[ie/tva] Fix extractor (#10567)
Closes #10555
Authored by: bashonly
2024-07-25 22:30:00 +00:00
bashonly
0b77286184
[ie/DiscoveryPlus] Support olympics URLs (#10566)
Closes #10564
Authored by: bashonly
2024-07-25 22:00:58 +00:00
github-actions[bot]
6b1e430d8e Release 2024.07.25
Created by: bashonly

:ci skip all :ci run dl
2024-07-25 03:29:27 +00:00
coletdjnz
e42c1e67e9
Remove accidental debug 2024-07-14 13:40:50 +12:00
coletdjnz
ccd7d28680
skip tests if using pypy 2024-07-14 13:18:25 +12:00
coletdjnz
bcf8436c68
bump some timeouts 2024-07-14 13:06:50 +12:00
coletdjnz
34ef90d1ba
try this 2024-07-14 12:40:27 +12:00
coletdjnz
55540caf6b
try this 2024-07-14 12:24:04 +12:00
coletdjnz
a0ca9dc2e4
fix 2024-07-14 12:09:33 +12:00
coletdjnz
4b3fd5f833
what about this? 2024-07-14 12:03:25 +12:00
coletdjnz
6c585b9de8
test 2024-07-14 11:50:17 +12:00
coletdjnz
86b32ee0da
Merge remote-tracking branch 'upstream/master' into networking/websockets-http-proxy
# Conflicts:
#	yt_dlp/networking/_websockets.py
2024-07-14 11:34:49 +12:00
coletdjnz
cdd8e33141
Merge without githooks 2024-07-14 11:33:01 +12:00
coletdjnz
443133dca2
Merge remote-tracking branch 'coletdjnz/networking/websockets-http-proxy' into networking/websockets-http-proxy 2024-07-14 11:25:53 +12:00
bashonly
556aa5161e
Merge branch 'master' into networking/websockets-http-proxy 2024-06-12 02:01:26 -05:00
coletdjnz
31c7479bef
Merge remote-tracking branch 'upstream/master' into networking/websockets-http-proxy 2024-06-03 11:41:09 +12:00
coletdjnz
e1192b1bb3
initialize fake logger 2024-06-03 11:38:06 +12:00
coletdjnz
aef80ad64c
Update yt_dlp/networking/_websockets.py
Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
2024-06-03 11:20:02 +12:00
coletdjnz
f1f1af9c67
Always send to stdout 2024-05-19 10:49:24 +12:00
coletdjnz
2c7ddb6234
Try this 2024-05-19 10:40:40 +12:00
coletdjnz
72bb0969bf
Missed something 2024-05-19 10:18:47 +12:00
coletdjnz
976cec10c9
Fix debug traffic for websockets proxy 2024-05-19 10:17:28 +12:00
coletdjnz
6dcc99ee4a
Apply for legacy framing too for safe measure 2024-05-18 18:24:27 +12:00
coletdjnz
c2718e7ff5
misc cleanup 2024-05-18 18:07:29 +12:00
coletdjnz
a275b5e65a
cleanup 2024-05-18 17:44:10 +12:00
coletdjnz
d98ab542f6
Skip WSS in TLS tests for PyPy 2024-05-18 17:42:11 +12:00
coletdjnz
0b255c531b
catch fix 2024-05-18 17:34:00 +12:00
coletdjnz
4cfce3b22f
minor fix 2024-05-18 17:28:25 +12:00
coletdjnz
d274eb1f53
Only use SSLTransport where tls-in-tls will be used 2024-05-18 17:18:21 +12:00
coletdjnz
82cceaed31
Only skip wss tests 2024-05-18 16:44:40 +12:00
coletdjnz
6282570bb2
import 2024-05-18 16:26:24 +12:00
coletdjnz
66a8530617
Skip HTTP Connect proxy tests for websockets if using PyPy 2024-05-18 16:24:36 +12:00
coletdjnz
8c0d5041df
revert 2024-05-18 15:54:38 +12:00
coletdjnz
44da2e1323
reset socket timeout before handing over to websockets 2024-05-18 15:43:43 +12:00
coletdjnz
5078692aa2
Merge remote-tracking branch 'upstream/master' into networking/websockets-http-proxy 2024-05-18 14:52:39 +12:00
coletdjnz
b4e0d5ac16
Disable apply_mask C implementation 2024-05-18 14:38:44 +12:00
coletdjnz
0423915e24
fix 2024-05-18 14:13:44 +12:00
coletdjnz
f5cfe9e00a
test: always use SSLTransport if available
(so it is used for both ends of tls-in-tls)
2024-05-18 14:02:26 +12:00
coletdjnz
c01179b581
cleanup 2024-05-18 13:54:01 +12:00
coletdjnz
101d9f53b4
always use WebSocketsSSLTransport if available
this ensures it is used for both sockets in TLS-in-TLS
2024-05-18 13:46:55 +12:00
coletdjnz
1b96519a35
No I think this is requests/urllib3 again 2024-05-18 12:53:46 +12:00
coletdjnz
018acbb93c
does this resolve unclosed socket? 2024-05-18 12:48:50 +12:00
coletdjnz
3350bdeb87
refactoring and add http erro test 2024-05-18 12:23:22 +12:00
coletdjnz
0efd83b31a
patch SSLTransport to return b'' instead of 0 as EOF
Websockets only treats b'' as EOF
2024-05-18 11:41:36 +12:00
coletdjnz
db14294b5c
cleanup after merge 2024-05-11 11:11:39 +12:00
coletdjnz
51e99b0759
Merge remote-tracking branch 'upstream/master' into networking/websockets-http-proxy
# Conflicts:
#	test/test_http_proxy.py
#	test/test_networking.py
2024-05-11 11:09:44 +12:00
coletdjnz
0d520bc008
misc fixes 2024-05-03 17:09:26 +12:00
coletdjnz
f964b72450
change docstring 2024-05-03 17:04:50 +12:00
coletdjnz
8ea52ec344
Update yt_dlp/networking/_websockets.py 2024-05-03 17:04:33 +12:00
coletdjnz
b41348b988
Fix validation tests 2024-04-06 15:59:34 +13:00
coletdjnz
833862cfbc
misc cleanup 2024-04-06 15:50:48 +13:00
coletdjnz
eecdc5870c
Merge remote-tracking branch 'coletdjnz/networking/add-http-proxy-tests' into networking/websockets-http-proxy
# Conflicts:
#	test/test_http_proxy.py
2024-04-06 15:48:12 +13:00
coletdjnz
a40e0f6c5f
misc cleanup 2024-04-06 15:47:39 +13:00
coletdjnz
01fe8e8fa6
Handle urllib3 not being available 2024-04-06 15:40:29 +13:00
coletdjnz
3999a510f7
Working websockets HTTP/S proxy 2024-04-06 15:14:59 +13:00
coletdjnz
fddf9e0577
Merge remote-tracking branch 'coletdjnz/networking/add-http-proxy-tests' into networking/websockets-http-proxy 2024-04-06 12:32:59 +13:00
coletdjnz
6c3140a8c1
try this 2024-04-04 19:34:28 +13:00
coletdjnz
41add1d7af
be gone unclosed socket 2024-04-01 14:37:21 +13:00
coletdjnz
bff727c043
Fix unclosed socket errors 2024-04-01 14:02:15 +13:00
coletdjnz
39a45d48f9
somewhat working implementation 2024-03-31 22:04:21 +13:00
coletdjnz
a14bb53ab5
remove debug 2024-03-31 16:29:44 +13:00
coletdjnz
14505063ec
cleanup 2024-03-31 16:25:07 +13:00
coletdjnz
e565e45a6f
[rh:curl_cffi] Fix HTTPS proxy support 2024-03-31 14:50:52 +13:00
coletdjnz
b44e0f8b98
[test] Add http proxy tests 2024-03-31 14:50:29 +13:00
15 changed files with 372 additions and 98 deletions

Changelog.md

@@ -4,6 +4,19 @@ # Changelog
 # To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
 -->
+### 2024.07.25
+#### Extractor changes
+- **abematv**: [Adapt key retrieval to request handler framework](https://github.com/yt-dlp/yt-dlp/commit/a3bab4752a2b3d56e5a59b4e0411bb8f695c010b) ([#10491](https://github.com/yt-dlp/yt-dlp/issues/10491)) by [bashonly](https://github.com/bashonly)
+- **facebook**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/1a34a802f44a1dab8f642c79c3cc810e21541d3b) ([#10531](https://github.com/yt-dlp/yt-dlp/issues/10531)) by [bashonly](https://github.com/bashonly)
+- **mlbtv**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f0993391e6052ec8f7aacc286609564f226943b9) ([#10515](https://github.com/yt-dlp/yt-dlp/issues/10515)) by [bashonly](https://github.com/bashonly)
+- **tiktok**: [Fix and deprioritize JSON subtitles](https://github.com/yt-dlp/yt-dlp/commit/2f97779f335ac069ecccd9c7bf81abf4a83cfe7a) ([#10516](https://github.com/yt-dlp/yt-dlp/issues/10516)) by [bashonly](https://github.com/bashonly)
+- **vimeo**: [Fix chapters extraction](https://github.com/yt-dlp/yt-dlp/commit/a0a1bc3d8d8e3bb9a48a06e835815a0460e90e77) ([#10544](https://github.com/yt-dlp/yt-dlp/issues/10544)) by [bashonly](https://github.com/bashonly)
+- **youtube**: [Fix `n` function name extraction for player `3400486c`](https://github.com/yt-dlp/yt-dlp/commit/713b4cd18f00556771af8cfdd9cea6cc1a09e948) ([#10542](https://github.com/yt-dlp/yt-dlp/issues/10542)) by [bashonly](https://github.com/bashonly)
+#### Misc. changes
+- **build**: [Pin `setuptools` version](https://github.com/yt-dlp/yt-dlp/commit/e046db8a116b1c320d4785daadd48ea0b22a3987) ([#10493](https://github.com/yt-dlp/yt-dlp/issues/10493)) by [bashonly](https://github.com/bashonly)
 ### 2024.07.16
 #### Core changes

test/conftest.py

@@ -23,7 +23,7 @@ class HandlerWrapper(handler):
         RH_KEY = handler.RH_KEY
 
         def __init__(self, **kwargs):
-            super().__init__(logger=FakeLogger, **kwargs)
+            super().__init__(logger=FakeLogger(), **kwargs)
 
     return HandlerWrapper

test/test_http_proxy.py

@@ -7,10 +7,12 @@
 import random
 import ssl
 import threading
+import time
 from http.server import BaseHTTPRequestHandler
-from socketserver import ThreadingTCPServer
+from socketserver import BaseRequestHandler, ThreadingTCPServer
 
 import pytest
+import platform
 
 from test.helper import http_server_port, verify_address_availability
 from test.test_networking import TEST_DIR
@@ -46,6 +48,11 @@ def do_proxy_auth(self, username, password):
         except Exception:
             return self.proxy_auth_error()
 
+        if auth_username == 'http_error':
+            self.send_response(404)
+            self.end_headers()
+            return False
+
         if auth_username != (username or '') or auth_password != (password or ''):
             return self.proxy_auth_error()
         return True
@@ -119,6 +126,16 @@ def _io_refs(self, value):
         def shutdown(self, *args, **kwargs):
             self.socket.shutdown(*args, **kwargs)
 
+        def _wrap_ssl_read(self, *args, **kwargs):
+            res = super()._wrap_ssl_read(*args, **kwargs)
+            if res == 0:
+                # Websockets does not treat 0 as an EOF, rather only b''
+                return b''
+            return res
+
+        def getsockname(self):
+            return self.socket.getsockname()
+
 else:
     SSLTransport = None
@@ -128,7 +145,40 @@ def __init__(self, request, *args, **kwargs):
         certfn = os.path.join(TEST_DIR, 'testcert.pem')
         sslctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
         sslctx.load_cert_chain(certfn, None)
-        if isinstance(request, ssl.SSLSocket):
+        if SSLTransport:
+            request = SSLTransport(request, ssl_context=sslctx, server_side=True)
+        else:
+            request = sslctx.wrap_socket(request, server_side=True)
+        super().__init__(request, *args, **kwargs)
+
+
+class WebSocketProxyHandler(BaseRequestHandler):
+    def __init__(self, *args, proxy_info=None, **kwargs):
+        self.proxy_info = proxy_info
+        super().__init__(*args, **kwargs)
+
+    def handle(self):
+        import websockets.sync.server
+        self.request.settimeout(None)
+        protocol = websockets.ServerProtocol()
+        connection = websockets.sync.server.ServerConnection(socket=self.request, protocol=protocol, close_timeout=10)
+        try:
+            connection.handshake(timeout=5.0)
+            for message in connection:
+                if message == 'proxy_info':
+                    connection.send(json.dumps(self.proxy_info))
+        except Exception as e:
+            print(f'Error in websocket proxy: {e}')
+        finally:
+            connection.close(code=1001)
+
+
+class WebSocketSecureProxyHandler(WebSocketProxyHandler):
+    def __init__(self, request, *args, **kwargs):
+        certfn = os.path.join(TEST_DIR, 'testcert.pem')
+        sslctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
+        sslctx.load_cert_chain(certfn, None)
+        if isinstance(request, ssl.SSLSocket) and SSLTransport:
             request = SSLTransport(request, ssl_context=sslctx, server_side=True)
         else:
             request = sslctx.wrap_socket(request, server_side=True)
@@ -197,7 +247,7 @@ def proxy_server(proxy_server_class, request_handler, bind_ip=None, **proxy_serv
     finally:
         server.shutdown()
         server.server_close()
-        server_thread.join(2.0)
+        server_thread.join()
 
 
 class HTTPProxyTestContext(abc.ABC):
@@ -205,7 +255,9 @@ class HTTPProxyTestContext(abc.ABC):
     REQUEST_PROTO = None
 
     def http_server(self, server_class, *args, **kwargs):
-        return proxy_server(server_class, self.REQUEST_HANDLER_CLASS, *args, **kwargs)
+        server = proxy_server(server_class, self.REQUEST_HANDLER_CLASS, *args, **kwargs)
+        time.sleep(1)
+        return server
 
     @abc.abstractmethod
     def proxy_info_request(self, handler, target_domain=None, target_port=None, **req_kwargs) -> dict:
@@ -234,9 +286,30 @@ def proxy_info_request(self, handler, target_domain=None, target_port=None, **re
         return json.loads(handler.send(request).read().decode())
 
 
+class HTTPProxyWebSocketTestContext(HTTPProxyTestContext):
+    REQUEST_HANDLER_CLASS = WebSocketProxyHandler
+    REQUEST_PROTO = 'ws'
+
+    def proxy_info_request(self, handler, target_domain=None, target_port=None, **req_kwargs):
+        request = Request(f'{self.REQUEST_PROTO}://{target_domain or "127.0.0.1"}:{target_port or "40000"}', **req_kwargs)
+        handler.validate(request)
+        ws = handler.send(request)
+        ws.send('proxy_info')
+        proxy_info = ws.recv()
+        ws.close()
+        return json.loads(proxy_info)
+
+
+class HTTPProxyWebSocketSecureTestContext(HTTPProxyWebSocketTestContext):
+    REQUEST_HANDLER_CLASS = WebSocketSecureProxyHandler
+    REQUEST_PROTO = 'wss'
+
+
 CTX_MAP = {
     'http': HTTPProxyHTTPTestContext,
     'https': HTTPProxyHTTPSTestContext,
+    'ws': HTTPProxyWebSocketTestContext,
+    'wss': HTTPProxyWebSocketSecureTestContext,
 }
@@ -272,6 +345,14 @@ def test_http_bad_auth(self, handler, ctx):
                 assert exc_info.value.response.status == 407
                 exc_info.value.response.close()
 
+    def test_http_error(self, handler, ctx):
+        with ctx.http_server(HTTPProxyHandler, username='http_error', password='test') as server_address:
+            with handler(proxies={ctx.REQUEST_PROTO: f'http://http_error:test@{server_address}'}) as rh:
+                with pytest.raises(HTTPError) as exc_info:
+                    ctx.proxy_info_request(rh)
+                assert exc_info.value.response.status == 404
+                exc_info.value.response.close()
+
     def test_http_source_address(self, handler, ctx):
         with ctx.http_server(HTTPProxyHandler) as server_address:
             source_address = f'127.0.0.{random.randint(5, 255)}'
@@ -314,7 +395,13 @@ def test_http_with_idn(self, handler, ctx):
     'handler,ctx', [
         ('Requests', 'https'),
         ('CurlCFFI', 'https'),
+        ('Websockets', 'ws'),
+        ('Websockets', 'wss'),
     ], indirect=True)
+@pytest.mark.skip_handler_if(
+    'Websockets', lambda request:
+    platform.python_implementation() == 'PyPy',
+    'Tests are flaky with PyPy, unknown reason')
 class TestHTTPConnectProxy:
     def test_http_connect_no_auth(self, handler, ctx):
         with ctx.http_server(HTTPConnectProxyHandler) as server_address:
@@ -341,6 +428,16 @@ def test_http_connect_bad_auth(self, handler, ctx):
                 with pytest.raises(ProxyError):
                     ctx.proxy_info_request(rh)
 
+    @pytest.mark.skip_handler(
+        'Requests',
+        'bug in urllib3 causes unclosed socket: https://github.com/urllib3/urllib3/issues/3374',
+    )
+    def test_http_connect_http_error(self, handler, ctx):
+        with ctx.http_server(HTTPConnectProxyHandler, username='http_error', password='test') as server_address:
+            with handler(verify=False, proxies={ctx.REQUEST_PROTO: f'http://http_error:test@{server_address}'}) as rh:
+                with pytest.raises(ProxyError):
+                    ctx.proxy_info_request(rh)
+
     def test_http_connect_source_address(self, handler, ctx):
         with ctx.http_server(HTTPConnectProxyHandler) as server_address:
             source_address = f'127.0.0.{random.randint(5, 255)}'
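
For context, the round trip the new `ws`/`wss` test contexts perform boils down to opening a WebSocket connection, sending the literal message `proxy_info` and decoding the JSON reply. A hedged, standalone sketch of that shape follows; the real tests go through yt-dlp's request handlers and the fixtures above rather than calling websockets directly, and `127.0.0.1:40000` is just the placeholder target used by `proxy_info_request`:

import json

import websockets.sync.client

# Talk to a WebSocketProxyHandler-style server directly (no proxy) and ask it
# to report what it saw. Illustrative only; assumes such a server is listening
# on the placeholder address.
with websockets.sync.client.connect('ws://127.0.0.1:40000') as ws:
    ws.send('proxy_info')
    proxy_info = json.loads(ws.recv())
    print(proxy_info)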

test/test_networking.py

@@ -407,7 +407,7 @@ def test_percent_encode(self, handler):
         '/redirect_dotsegments_absolute',
     ])
     def test_remove_dot_segments(self, handler, path):
-        with handler(verbose=True) as rh:
+        with handler() as rh:
             # This isn't a comprehensive test,
             # but it should be enough to check whether the handler is removing dot segments in required scenarios
             res = validate_and_send(rh, Request(f'http://127.0.0.1:{self.http_port}{path}'))
@@ -1224,8 +1224,8 @@ class HTTPSupportedRH(ValidationRH):
             ('socks5h', False),
         ]),
         ('Websockets', 'ws', [
-            ('http', UnsupportedRequest),
-            ('https', UnsupportedRequest),
+            ('http', False),
+            ('https', False),
             ('socks4', False),
             ('socks4a', False),
             ('socks5', False),
@@ -1318,8 +1318,8 @@ class HTTPSupportedRH(ValidationRH):
         ('Websockets', False, 'ws'),
     ], indirect=['handler'])
     def test_no_proxy(self, handler, fail, scheme):
-        run_validation(handler, fail, Request(f'{scheme}://', proxies={'no': '127.0.0.1,github.com'}))
-        run_validation(handler, fail, Request(f'{scheme}://'), proxies={'no': '127.0.0.1,github.com'})
+        run_validation(handler, fail, Request(f'{scheme}://example.com', proxies={'no': '127.0.0.1,github.com'}))
+        run_validation(handler, fail, Request(f'{scheme}://example.com'), proxies={'no': '127.0.0.1,github.com'})
 
     @pytest.mark.parametrize('handler,scheme', [
         ('Urllib', 'http'),

test/test_websockets.py

@@ -216,7 +216,9 @@ def handle(self):
         protocol = websockets.ServerProtocol()
         connection = websockets.sync.server.ServerConnection(socket=self.request, protocol=protocol, close_timeout=0)
         connection.handshake()
-        connection.send(json.dumps(self.socks_info))
+        for message in connection:
+            if message == 'socks_info':
+                connection.send(json.dumps(self.socks_info))
         connection.close()

yt_dlp/YoutubeDL.py

@@ -4174,15 +4174,15 @@ def urlopen(self, req):
                         'Use --enable-file-urls to enable at your own risk.', cause=ue) from ue
                 if (
                     'unsupported proxy type: "https"' in ue.msg.lower()
-                    and 'requests' not in self._request_director.handlers
-                    and 'curl_cffi' not in self._request_director.handlers
+                    and 'Requests' not in self._request_director.handlers
+                    and 'CurlCFFI' not in self._request_director.handlers
                 ):
                     raise RequestError(
                         'To use an HTTPS proxy for this request, one of the following dependencies needs to be installed: requests, curl_cffi')
 
                 elif (
                     re.match(r'unsupported url scheme: "wss?"', ue.msg.lower())
-                    and 'websockets' not in self._request_director.handlers
+                    and 'Websockets' not in self._request_director.handlers
                 ):
                     raise RequestError(
                         'This request requires WebSocket support. '
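
These error paths matter because `YoutubeDL.urlopen()` is also the entry point for WebSocket requests. A hedged sketch of the user-facing behaviour this branch is working towards — the endpoint and proxy address are placeholders, and routing `ws`/`wss` URLs through an HTTP(S) proxy only works once the websockets handler advertises that support:

import yt_dlp

with yt_dlp.YoutubeDL({'proxy': 'http://127.0.0.1:3128'}) as ydl:
    # urlopen() accepts a URL or a yt_dlp.networking.Request; for ws/wss URLs
    # it is expected to return a WebSocket response with send()/recv()/close().
    ws = ydl.urlopen('wss://echo.example.org/socket')
    ws.send('ping')
    print(ws.recv())
    ws.close()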

yt_dlp/extractor/_extractors.py

@@ -2169,10 +2169,7 @@
     TV5UnisVideoIE,
 )
 from .tv24ua import TV24UAVideoIE
-from .tva import (
-    TVAIE,
-    QubIE,
-)
+from .tva import TVAIE
 from .tvanouvelles import (
     TVANouvellesArticleIE,
     TVANouvellesIE,

yt_dlp/extractor/dplay.py

@@ -934,7 +934,7 @@ class TLCIE(DiscoveryPlusBaseIE):
 
 
 class DiscoveryPlusIE(DiscoveryPlusBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?discoveryplus\.com/(?!it/)(?:(?P<country>[a-z]{2})/)?video(?:/sport)?' + DPlayBaseIE._PATH_REGEX
+    _VALID_URL = r'https?://(?:www\.)?discoveryplus\.com/(?!it/)(?:(?P<country>[a-z]{2})/)?video(?:/sport|/olympics)?' + DPlayBaseIE._PATH_REGEX
     _TESTS = [{
         'url': 'https://www.discoveryplus.com/video/property-brothers-forever-home/food-and-family',
         'info_dict': {
@@ -958,6 +958,9 @@ class DiscoveryPlusIE(DiscoveryPlusBaseIE):
     }, {
         'url': 'https://www.discoveryplus.com/gb/video/sport/eurosport-1-british-eurosport-1-british-sport/6-hours-of-spa-review',
         'only_matching': True,
+    }, {
+        'url': 'https://www.discoveryplus.com/gb/video/olympics/dplus-sport-dplus-sport-sport/rugby-sevens-australia-samoa',
+        'only_matching': True,
     }]
 
     _PRODUCT = None
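
A quick, self-contained check of the widened pattern. The prefix regex and the three URLs are copied from the diff; `DPlayBaseIE._PATH_REGEX` is omitted here, so this only exercises the `/video(/sport|/olympics)?` part:

import re

PREFIX = r'https?://(?:www\.)?discoveryplus\.com/(?!it/)(?:(?P<country>[a-z]{2})/)?video(?:/sport|/olympics)?'

for url in (
    'https://www.discoveryplus.com/video/property-brothers-forever-home/food-and-family',
    'https://www.discoveryplus.com/gb/video/sport/eurosport-1-british-eurosport-1-british-sport/6-hours-of-spa-review',
    'https://www.discoveryplus.com/gb/video/olympics/dplus-sport-dplus-sport-sport/rugby-sevens-australia-samoa',
):
    # All three should print True with the updated pattern
    print(bool(re.match(PREFIX, url)), url)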

yt_dlp/extractor/tva.py

@@ -1,60 +1,29 @@
 import functools
 import re
 
+from .brightcove import BrightcoveNewIE
 from .common import InfoExtractor
 from ..utils import float_or_none, int_or_none, smuggle_url, strip_or_none
 from ..utils.traversal import traverse_obj
 
 
 class TVAIE(InfoExtractor):
-    _VALID_URL = r'https?://videos?\.tva\.ca/details/_(?P<id>\d+)'
+    IE_NAME = 'tvaplus'
+    IE_DESC = 'TVA+'
+    _VALID_URL = r'https?://(?:www\.)?tvaplus\.ca/(?:[^/?#]+/)*[\w-]+-(?P<id>\d+)(?:$|[#?])'
     _TESTS = [{
-        'url': 'https://videos.tva.ca/details/_5596811470001',
-        'info_dict': {
-            'id': '5596811470001',
-            'ext': 'mp4',
-            'title': 'Un extrait de l\'épisode du dimanche 8 octobre 2017 !',
-            'uploader_id': '5481942443001',
-            'upload_date': '20171003',
-            'timestamp': 1507064617,
-        },
-        'params': {
-            # m3u8 download
-            'skip_download': True,
-        },
-        'skip': 'HTTP Error 404: Not Found',
-    }, {
-        'url': 'https://video.tva.ca/details/_5596811470001',
-        'only_matching': True,
-    }]
-    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/5481942443001/default_default/index.html?videoId=%s'
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        return {
-            '_type': 'url_transparent',
-            'id': video_id,
-            'url': smuggle_url(self.BRIGHTCOVE_URL_TEMPLATE % video_id, {'geo_countries': ['CA']}),
-            'ie_key': 'BrightcoveNew',
-        }
-
-
-class QubIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?qub\.ca/(?:[^/]+/)*[0-9a-z-]+-(?P<id>\d+)'
-    _TESTS = [{
-        'url': 'https://www.qub.ca/tvaplus/tva/alerte-amber/saison-1/episode-01-1000036619',
+        'url': 'https://www.tvaplus.ca/tva/alerte-amber/saison-1/episode-01-1000036619',
         'md5': '949490fd0e7aee11d0543777611fbd53',
         'info_dict': {
             'id': '6084352463001',
             'ext': 'mp4',
-            'title': 'Ép 01. Mon dernier jour',
+            'title': 'Mon dernier jour',
             'uploader_id': '5481942443001',
             'upload_date': '20190907',
             'timestamp': 1567899756,
             'description': 'md5:9c0d7fbb90939420c651fd977df90145',
             'thumbnail': r're:https://.+\.jpg',
-            'episode': 'Ép 01. Mon dernier jour',
+            'episode': 'Mon dernier jour',
             'episode_number': 1,
             'tags': ['alerte amber', 'alerte amber saison 1', 'surdemande'],
             'duration': 2625.963,
@@ -64,23 +33,36 @@ class QubIE(InfoExtractor):
             'channel': 'TVA',
         },
     }, {
-        'url': 'https://www.qub.ca/tele/video/lcn-ca-vous-regarde-rev-30s-ap369664-1009357943',
-        'only_matching': True,
+        'url': 'https://www.tvaplus.ca/tva/le-baiser-du-barbu/le-baiser-du-barbu-886644190',
+        'info_dict': {
+            'id': '6354448043112',
+            'ext': 'mp4',
+            'title': 'Le Baiser du barbu',
+            'uploader_id': '5481942443001',
+            'upload_date': '20240606',
+            'timestamp': 1717694023,
+            'description': 'md5:025b1219086c1cbf4bc27e4e034e8b57',
+            'thumbnail': r're:https://.+\.jpg',
+            'episode': 'Le Baiser du barbu',
+            'tags': ['fullepisode', 'films'],
+            'duration': 6053.504,
+            'series': 'Le Baiser du barbu',
+            'channel': 'TVA',
+        },
     }]
-    # reference_id also works with old account_id(5481942443001)
-    # BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/5813221784001/default_default/index.html?videoId=ref:%s'
+    _BC_URL_TMPL = 'https://players.brightcove.net/5481942443001/default_default/index.html?videoId={}'
 
     def _real_extract(self, url):
         entity_id = self._match_id(url)
         webpage = self._download_webpage(url, entity_id)
-        entity = self._search_nextjs_data(webpage, entity_id)['props']['initialProps']['pageProps']['fallbackData']
+        entity = self._search_nextjs_data(webpage, entity_id)['props']['pageProps']['staticEntity']
         video_id = entity['videoId']
         episode = strip_or_none(entity.get('name'))
 
         return {
             '_type': 'url_transparent',
-            'url': f'https://videos.tva.ca/details/_{video_id}',
-            'ie_key': TVAIE.ie_key(),
+            'url': smuggle_url(self._BC_URL_TMPL.format(video_id), {'geo_countries': ['CA']}),
+            'ie_key': BrightcoveNewIE.ie_key(),
             'id': video_id,
             'title': episode,
             'episode': episode,
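
A hedged usage sketch of the reworked extractor: the TVA+ URL below comes from the test data in the diff, and extraction is expected to hand off to BrightcoveNew through the smuggled player URL. Network access and geo availability (CA) apply, so treat this as illustrative only:

import yt_dlp

with yt_dlp.YoutubeDL({'quiet': True}) as ydl:
    info = ydl.extract_info(
        'https://www.tvaplus.ca/tva/alerte-amber/saison-1/episode-01-1000036619',
        download=False)
    # Expected per the test above: id 6084352463001, title 'Mon dernier jour'
    print(info.get('id'), info.get('title'))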

yt_dlp/extractor/unsupported.py

@@ -49,6 +49,7 @@ class KnownDRMIE(UnsupportedInfoExtractor):
         r'amazon\.(?:\w{2}\.)?\w+/gp/video',
         r'music\.amazon\.(?:\w{2}\.)?\w+',
         r'(?:watch|front)\.njpwworld\.com',
+        r'qub\.ca/vrai',
     )
 
     _TESTS = [{
@@ -149,6 +150,9 @@ class KnownDRMIE(UnsupportedInfoExtractor):
     }, {
         'url': 'https://front.njpwworld.com/p/s_series_00563_16_bs',
         'only_matching': True,
+    }, {
+        'url': 'https://www.qub.ca/vrai/l-effet-bocuse-d-or/saison-1/l-effet-bocuse-d-or-saison-1-bande-annonce-1098225063',
+        'only_matching': True,
     }]
 
     def _real_extract(self, url):

yt_dlp/networking/_helper.py

@@ -1,5 +1,6 @@
 from __future__ import annotations
 
+import base64
 import contextlib
 import functools
 import os
@@ -9,8 +10,9 @@
 import typing
 import urllib.parse
 import urllib.request
+from http.client import HTTPConnection, HTTPResponse
 
-from .exceptions import RequestError, UnsupportedRequest
+from .exceptions import ProxyError, RequestError, UnsupportedRequest
 from ..dependencies import certifi
 from ..socks import ProxyType, sockssocket
 from ..utils import format_field, traverse_obj
@@ -285,3 +287,65 @@ def create_connection(
     # Explicitly break __traceback__ reference cycle
     # https://bugs.python.org/issue36820
     err = None
+
+
+class NoCloseHTTPResponse(HTTPResponse):
+    def begin(self):
+        super().begin()
+        # Revert the default behavior of closing the connection after reading the response
+        if not self._check_close() and not self.chunked and self.length is None:
+            self.will_close = False
+
+
+def create_http_connect_connection(
+        proxy_host,
+        proxy_port,
+        connect_host,
+        connect_port,
+        timeout=None,
+        ssl_context=None,
+        source_address=None,
+        username=None,
+        password=None,
+        debug=False,
+):
+    proxy_headers = dict()
+
+    if username is not None or password is not None:
+        proxy_headers['Proxy-Authorization'] = 'Basic ' + base64.b64encode(
+            f'{username or ""}:{password or ""}'.encode()).decode('utf-8')
+
+    conn = HTTPConnection(proxy_host, port=proxy_port, timeout=timeout)
+    conn.set_debuglevel(int(debug))
+    conn.response_class = NoCloseHTTPResponse
+
+    if hasattr(conn, '_create_connection'):
+        conn._create_connection = create_connection
+
+    if source_address is not None:
+        conn.source_address = (source_address, 0)
+
+    try:
+        conn.connect()
+        if ssl_context:
+            conn.sock = ssl_context.wrap_socket(conn.sock, server_hostname=proxy_host)
+        conn.request(
+            method='CONNECT',
+            url=f'{connect_host}:{connect_port}',
+            headers=proxy_headers)
+        response = conn.getresponse()
+    except OSError as e:
+        conn.close()
+        raise ProxyError('Unable to connect to proxy', cause=e) from e
+
+    if response.status == 200:
+        sock = conn.sock
+        conn.sock = None
+        response.fp = None
+        return sock
+    else:
+        conn.close()
+        response.close()
+        raise ProxyError(f'Got HTTP Error {response.status} with CONNECT: {response.reason}')
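
A hedged usage sketch of the new helper: it speaks plain HTTP `CONNECT` to the proxy (optionally over TLS when `ssl_context` is given) and hands back the raw tunnelled socket on a 200 response. The proxy address and credentials below are placeholders; the parameter names come from the diff:

from yt_dlp.networking._helper import create_http_connect_connection

# Tunnel a TCP connection to example.com:443 through a CONNECT-capable proxy
# assumed (hypothetically) to be listening on 127.0.0.1:3128.
sock = create_http_connect_connection(
    proxy_host='127.0.0.1',
    proxy_port=3128,
    connect_host='example.com',
    connect_port=443,
    timeout=10,
    username='user',   # optional; sent as a Proxy-Authorization: Basic header
    password='pass',
    debug=False,
)
# The CONNECT handshake is already done at this point; the caller owns the
# socket from here (TLS wrapping for the target, sending data, closing it).
sock.close()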

yt_dlp/networking/_requests.py

@@ -243,14 +243,14 @@ def __init__(self, logger, *args, **kwargs):
     def emit(self, record):
         try:
             msg = self.format(record)
-        except Exception:
-            self.handleError(record)
-        else:
             if record.levelno >= logging.ERROR:
                 self._logger.error(msg)
             else:
                 self._logger.stdout(msg)
+
+        except Exception:
+            self.handleError(record)
 
 
 @register_rh
 class RequestsRH(RequestHandler, InstanceStoreMixin):
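
For reference, the pattern used here (and by the `WebsocketsLoggingHandler` added in `_websockets.py` below) is a plain `logging.Handler` subclass that formats each record and forwards it to yt-dlp's logger. A generic, simplified illustration that just prints instead — the handler name and target logger are placeholders:

import logging


class PrintLoggingHandler(logging.Handler):
    """Forward formatted log records elsewhere (here: stdout via print)"""

    def emit(self, record):
        try:
            msg = self.format(record)
        except Exception:
            self.handleError(record)
        else:
            print(msg)


logger = logging.getLogger('urllib3')
logger.addHandler(PrintLoggingHandler())
logger.setLevel(logging.DEBUG)
logger.debug('this record is routed through the custom handler')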

yt_dlp/networking/_websockets.py

@@ -5,10 +5,11 @@
 import io
 import logging
 import ssl
-import sys
+import urllib.parse
 
 from ._helper import (
     create_connection,
+    create_http_connect_connection,
     create_socks_proxy_socket,
     make_socks_proxy_opts,
     select_proxy,
@@ -21,9 +22,10 @@
     RequestError,
     SSLError,
     TransportError,
+    UnsupportedRequest,
 )
 from .websocket import WebSocketRequestHandler, WebSocketResponse
-from ..dependencies import websockets
+from ..dependencies import urllib3, websockets
 from ..socks import ProxyError as SocksProxyError
 from ..utils import int_or_none
@@ -36,6 +38,20 @@
 if websockets_version < (12, 0):
     raise ImportError('Only websockets>=12.0 is supported')
 
+urllib3_supported = False
+urllib3_version = tuple(int_or_none(x, default=0) for x in urllib3.__version__.split('.')) if urllib3 else None
+if urllib3_version and urllib3_version >= (1, 26, 17):
+    urllib3_supported = True
+
+# Disable apply_mask C implementation
+# Seems to help reduce "Fatal Python error: Aborted" in CI
+with contextlib.suppress(Exception):
+    import websockets.frames
+    import websockets.legacy.framing
+    import websockets.utils
+    websockets.frames.apply_mask = websockets.legacy.framing.apply_mask = websockets.utils.apply_mask
+
 import websockets.sync.client
 from websockets.uri import parse_uri
@@ -53,6 +69,22 @@
 websockets.sync.connection.Connection.recv_events_exc = None
 
 
+class WebsocketsLoggingHandler(logging.Handler):
+    """Redirect websocket logs to our logger"""
+
+    def __init__(self, logger, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self._logger = logger
+
+    def emit(self, record):
+        try:
+            msg = self.format(record)
+        except Exception:
+            self.handleError(record)
+        else:
+            self._logger.stdout(msg)
+
+
 class WebsocketsResponseAdapter(WebSocketResponse):
 
     def __init__(self, ws: websockets.sync.client.ClientConnection, url):
@@ -98,7 +130,7 @@ class WebsocketsRH(WebSocketRequestHandler):
     https://github.com/python-websockets/websockets
     """
     _SUPPORTED_URL_SCHEMES = ('wss', 'ws')
-    _SUPPORTED_PROXY_SCHEMES = ('socks4', 'socks4a', 'socks5', 'socks5h')
+    _SUPPORTED_PROXY_SCHEMES = ('socks4', 'socks4a', 'socks5', 'socks5h', 'http', 'https')
     _SUPPORTED_FEATURES = (Features.ALL_PROXY, Features.NO_PROXY)
     RH_NAME = 'websockets'
@@ -107,13 +139,24 @@ def __init__(self, *args, **kwargs):
         self.__logging_handlers = {}
         for name in ('websockets.client', 'websockets.server'):
             logger = logging.getLogger(name)
-            handler = logging.StreamHandler(stream=sys.stdout)
-            handler.setFormatter(logging.Formatter(f'{self.RH_NAME}: %(message)s'))
+            handler = WebsocketsLoggingHandler(logger=self._logger)
+            handler.setFormatter(logging.Formatter(f'{self.RH_NAME}: [{name}] %(message)s'))
             self.__logging_handlers[name] = handler
             logger.addHandler(handler)
             if self.verbose:
                 logger.setLevel(logging.DEBUG)
 
+    def _validate(self, request):
+        super()._validate(request)
+        proxy = select_proxy(request.url, self._get_proxies(request))
+        if (
+            proxy
+            and urllib.parse.urlparse(proxy).scheme.lower() == 'https'
+            and urllib.parse.urlparse(request.url).scheme.lower() == 'wss'
+            and not urllib3_supported
+        ):
+            raise UnsupportedRequest('WSS over HTTPS proxy requires a supported version of urllib3')
+
     def _check_extensions(self, extensions):
         super()._check_extensions(extensions)
         extensions.pop('timeout', None)
@@ -126,6 +169,41 @@ def close(self):
         for name, handler in self.__logging_handlers.items():
             logging.getLogger(name).removeHandler(handler)
 
+    def _make_sock(self, proxy, url, timeout):
+        create_conn_kwargs = {
+            'source_address': (self.source_address, 0) if self.source_address else None,
+            'timeout': timeout,
+        }
+        parsed_url = parse_uri(url)
+        parsed_proxy_url = urllib.parse.urlparse(proxy)
+        if proxy:
+            if parsed_proxy_url.scheme.startswith('socks'):
+                socks_proxy_options = make_socks_proxy_opts(proxy)
+                return create_connection(
+                    address=(socks_proxy_options['addr'], socks_proxy_options['port']),
+                    _create_socket_func=functools.partial(
+                        create_socks_proxy_socket, (parsed_url.host, parsed_url.port), socks_proxy_options),
+                    **create_conn_kwargs,
+                )
+            elif parsed_proxy_url.scheme in ('http', 'https'):
+                return create_http_connect_connection(
+                    proxy_port=parsed_proxy_url.port,
+                    proxy_host=parsed_proxy_url.hostname,
+                    connect_port=parsed_url.port,
+                    connect_host=parsed_url.host,
+                    timeout=timeout,
+                    ssl_context=self._make_sslcontext() if parsed_proxy_url.scheme == 'https' else None,
+                    source_address=self.source_address,
+                    username=parsed_proxy_url.username,
+                    password=parsed_proxy_url.password,
+                    debug=self.verbose,
+                )
+
+        return create_connection(
+            address=(parsed_url.host, parsed_url.port),
+            **create_conn_kwargs,
+        )
+
     def _send(self, request):
         timeout = self._calculate_timeout(request)
         headers = self._merge_headers(request.headers)
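
The new `_make_sock()` picks the transport purely from the configured proxy's URL scheme. A toy restatement of that routing (the function name and return strings below are placeholders, not yt-dlp code):

import urllib.parse


def choose_transport(proxy_url):
    scheme = urllib.parse.urlparse(proxy_url).scheme if proxy_url else None
    if scheme and scheme.startswith('socks'):
        return 'SOCKS tunnel: create_connection() + create_socks_proxy_socket()'
    if scheme in ('http', 'https'):
        return 'CONNECT tunnel: create_http_connect_connection()'
    return 'direct: create_connection()'


for proxy in (None, 'socks5://127.0.0.1:1080', 'http://127.0.0.1:3128', 'https://127.0.0.1:3128'):
    print(f'{proxy!s:>26} -> {choose_transport(proxy)}')
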
@@ -135,35 +213,21 @@ def _send(self, request):
         if cookie_header:
             headers['cookie'] = cookie_header
 
-        wsuri = parse_uri(request.url)
-        create_conn_kwargs = {
-            'source_address': (self.source_address, 0) if self.source_address else None,
-            'timeout': timeout,
-        }
         proxy = select_proxy(request.url, self._get_proxies(request))
         try:
-            if proxy:
-                socks_proxy_options = make_socks_proxy_opts(proxy)
-                sock = create_connection(
-                    address=(socks_proxy_options['addr'], socks_proxy_options['port']),
-                    _create_socket_func=functools.partial(
-                        create_socks_proxy_socket, (wsuri.host, wsuri.port), socks_proxy_options),
-                    **create_conn_kwargs,
-                )
-            else:
-                sock = create_connection(
-                    address=(wsuri.host, wsuri.port),
-                    **create_conn_kwargs,
-                )
-            ssl_ctx = self._make_sslcontext(legacy_ssl_support=request.extensions.get('legacy_ssl'))
+            ssl_context = None
+            sock = self._make_sock(proxy, request.url, timeout)
+            if parse_uri(request.url).secure:
+                ssl_context = WebsocketsSSLContext(self._make_sslcontext(legacy_ssl_support=request.extensions.get('legacy_ssl')))
             conn = websockets.sync.client.connect(
                 sock=sock,
                 uri=request.url,
                 additional_headers=headers,
                 open_timeout=timeout,
                 user_agent_header=None,
-                ssl_context=ssl_ctx if wsuri.secure else None,
-                close_timeout=0,  # not ideal, but prevents yt-dlp hanging
+                ssl_context=ssl_context,
+                close_timeout=0.1,  # not ideal, but prevents yt-dlp hanging
             )
             return WebsocketsResponseAdapter(conn, url=request.url)
@@ -187,3 +251,43 @@ def _send(self, request):
             ) from e
         except (OSError, TimeoutError, websockets.exceptions.WebSocketException) as e:
             raise TransportError(cause=e) from e
+
+
+if urllib3_supported:
+    from urllib3.util.ssltransport import SSLTransport
+
+    class WebsocketsSSLTransport(SSLTransport):
+        """
+        Modified version of urllib3 SSLTransport to support additional operations used by websockets
+        """
+        def setsockopt(self, *args, **kwargs):
+            self.socket.setsockopt(*args, **kwargs)
+
+        def shutdown(self, *args, **kwargs):
+            self.unwrap()
+            self.socket.shutdown(*args, **kwargs)
+
+        def _wrap_ssl_read(self, *args, **kwargs):
+            res = super()._wrap_ssl_read(*args, **kwargs)
+            if res == 0:
+                # Websockets does not treat 0 as an EOF, rather only b''
+                return b''
+            return res
+else:
+    WebsocketsSSLTransport = None
+
+
+class WebsocketsSSLContext:
+    """
+    Dummy SSL Context for websockets which returns a WebsocketsSSLTransport instance
+    for wrap socket when using TLS-in-TLS.
+    """
+
+    def __init__(self, ssl_context: ssl.SSLContext):
+        self.ssl_context = ssl_context
+
+    def wrap_socket(self, sock, server_hostname=None):
+        if isinstance(sock, ssl.SSLSocket) and WebsocketsSSLTransport:
+            return WebsocketsSSLTransport(sock, self.ssl_context, server_hostname=server_hostname)
+        return self.ssl_context.wrap_socket(sock, server_hostname=server_hostname)
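
The `WebsocketsSSLContext`/`WebsocketsSSLTransport` pair exists because the standard library cannot stack `wrap_socket()` on a socket that is already an `ssl.SSLSocket` (the wss-over-HTTPS-proxy case), whereas urllib3's `SSLTransport` drives the inner TLS session through in-memory BIOs and relays its ciphertext over the outer TLS connection. A minimal sketch of that underlying technique, written against the standard library only (illustrative, not the urllib3 or yt-dlp implementation):

import ssl


def tls_in_tls_handshake(outer_sock: ssl.SSLSocket, server_hostname: str) -> ssl.SSLObject:
    """Handshake an inner TLS session whose ciphertext travels over outer_sock."""
    ctx = ssl.create_default_context()
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    inner = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    while True:
        try:
            inner.do_handshake()
            break
        except ssl.SSLWantReadError:
            # Flush our handshake bytes over the (already encrypted) outer
            # socket, then feed the peer's reply back into the inner TLS object
            if outgoing.pending:
                outer_sock.sendall(outgoing.read())
            incoming.write(outer_sock.recv(16384))
    if outgoing.pending:
        outer_sock.sendall(outgoing.read())
    return inner  # inner.read()/inner.write() now carry the application data

After the handshake, reads and writes still need the same relay loop; that bookkeeping is what `SSLTransport` packages behind a socket-like interface, with the small patches above (`_wrap_ssl_read` returning `b''`, `setsockopt`, `shutdown`) covering the extra calls websockets makes.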

yt_dlp/networking/websocket.py

@@ -1,8 +1,9 @@
 from __future__ import annotations
 
 import abc
+import urllib.parse
 
-from .common import RequestHandler, Response
+from .common import RequestHandler, Response, register_preference
 
 
 class WebSocketResponse(Response):
@@ -21,3 +22,10 @@ def recv(self):
 
 class WebSocketRequestHandler(RequestHandler, abc.ABC):
     pass
+
+
+@register_preference(WebSocketRequestHandler)
+def websocket_preference(_, request):
+    if urllib.parse.urlparse(request.url).scheme in ('ws', 'wss'):
+        return 200
+    return 0
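
`register_preference(WebSocketRequestHandler)` registers the bonus for WebSocket-capable handlers, so for `ws://`/`wss://` URLs they are tried ahead of the generic HTTP handlers. A toy model of that selection (simplified; the class names and scoring function below are placeholders, not the actual RequestDirector logic):

import urllib.parse


class FakeHTTPHandler:
    pass


class FakeWebSocketHandler:
    pass


def score(handler, url):
    # Mirrors websocket_preference(): +200 for WebSocket handlers on ws/wss URLs
    if isinstance(handler, FakeWebSocketHandler) and urllib.parse.urlparse(url).scheme in ('ws', 'wss'):
        return 200
    return 0


handlers = [FakeHTTPHandler(), FakeWebSocketHandler()]
best = max(handlers, key=lambda h: score(h, 'wss://example.com/socket'))
assert isinstance(best, FakeWebSocketHandler)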

yt_dlp/version.py

@@ -1,8 +1,8 @@
 # Autogenerated by devscripts/update-version.py
 
-__version__ = '2024.07.16'
+__version__ = '2024.07.25'
 
-RELEASE_GIT_HEAD = '89a161e8c62569a662deda1c948664152efcb6b4'
+RELEASE_GIT_HEAD = 'f0993391e6052ec8f7aacc286609564f226943b9'
 
 VARIANT = None
@@ -12,4 +12,4 @@
 ORIGIN = 'yt-dlp/yt-dlp'
 
-_pkg_version = '2024.07.16'
+_pkg_version = '2024.07.25'