Motivation:
Currently, Netty cannot handle HTTP/2 Preface messages if the client used the Prior knowledge technique. In Prior knowledge, the client sends an HTTP/2 preface message immediately after finishing TLS Handshake. But in Netty, when TLS Handshake is finished, ALPNHandler is triggered to configure the pipeline. And between these 2 operations, if an HTTP/2 preface message arrives, it gets dropped.
Modification:
Buffer messages until we are done with the ALPN handling.
Result:
Fixes#11403.
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
__Motivation__
Add support for GMSSL protocol to SslUtils.
__Modification__
Modify `SslUtils.getEncryptedPacketLength(ByteBuf buffer, int offset)` to get packet length when protocol is GMSSL.
Modify `SslUtils.getEncryptedPacketLength(ByteBuffer buffer)` to get packet length when protocol is GMSSL.
__Result__
`SslUtils.getEncryptedPacketLength` now supports GMSSL protocol. Fixes https://github.com/netty/netty/issues/11406
__Motivation__
`LoggingHandler` misses a constructor variant that only takes `ByteBufFormat`
__Modification__
Added the missing constructor variant.
__Result__
`LoggingHandler` can be constructed with `ByteBufFormat` only.
Co-authored-by: Nitesh Kant <nitesh_kant@apple.com>
Motivation:
We need to ensure we always "consumed" all alerts etc via SSLEngine.wrap(...) before we teardown the engine. Failing to do so may lead to a situation where the remote peer will not be able to see the actual cause of the handshake failure but just see the connection being closed.
Modifications:
Correctly return HandshakeStatus.NEED_WRAP when we need to wrap some data first before we shutdown the engine because of a handshake failure.
Result:
Fixes https://github.com/netty/netty/issues/11388
Motivation:
At the moment BoringSSL doesnt support explicit set the TLSv1.3 ciphers that should be used. If TLSv1.3 should be used it just enables all ciphers. We should better log if the user tries to explicit set a specific ciphers and using BoringSSL to inform the user that what is tried doesnt really work.
Modifications:
Log if the user tries to not use all TLSv1.3 ciphers and use BoringSSL
Result:
Easier for the user to understand why always all TLSv1.3 ciphers are enabled when using BoringSSL
Co-authored-by: Trustin Lee <trustin@gmail.com>
Motivation:
This special case implementation of Promise / Future requires the implementations responsible for completing the promise to have knowledge of this class to provide value. It also requires that the implementations are able to provide intermediate status while the work is being done. Even throughout the core of Netty it is not really supported most of the times and so just brings more complexity without real gain.
Let's remove it completely which is better then only support it sometimes.
Modifications:
Remove Progressive* API
Result:
Code cleanup.... Fixes https://github.com/netty/netty/issues/8519
Motivation:
Sometime in the past we introduced the concept of Void*Promise. As it turned out this was not a good idea at all as basically each handler in the pipeline need to be very careful to correctly handle this. We should better just remove this "optimization".
Modifications:
- Remove Void*Promise and all the related APIs
- Remove tests which were related to Void*Promise
Result:
Less error-prone API
Motivation:
In this issue(https://github.com/netty/netty/issues/11349 ),IpSubnetFilterRule needs to support ipv6 reserved addresses, such as 8000::, but the current implementation does not support
Modification:
Added support for default rule
Result:
Fixes https://github.com/netty/netty/issues/11349
Signed-off-by: xingrufei <xingrufei@sogou-inc.com>
Motivation:
We have currently two test-failures in master. Let's disable these and then open a PR with a fix once we know why. This way we can make progress in master
Modifications:
Disable the two failing tests
Result:
Master builds again
Motivation:
We should have a default case in every switch block.
Modification:
Add default block in IdleStateHandler
Result:
Cleanup
Signed-off-by: xingrufei <xingrufei@sogou-inc.com>
Motivation:
We should better fail the build if we can't load the OpenSSL library to ensure we not introduce a regression at some point related to native library loading
Modifications:
Remove usages of assumeTrue and let the tests fail if we cant load the native lib
Result:
Ensure we not regress
Motivation:
`SelfSignedCertificate` creates a certificate and private key files and store them in a temporary directory. However, if the certificate uses a wildcard hostname that uses asterisk *, e.g. `*.shieldblaze.com`, it'll throw an error because * is not a valid character in the file system.
Modification:
Replace the asterisk with 'x'
Result:
Fixes#11240
Motivation:
We introduced the ability to offload certain operations to an executor that may take some time to complete. At the moment this is not enabled by default when using the openssl based SSL provider. Let's enable it by default as we have this support for some while now and didnt see any issues yet. This will also make things less confusing and more consistent with the JDK based provider.
Modifications:
Use true as default value for io.netty.handler.ssl.openssl.useTasks.
Result:
Offloading works with openssl based SSL provider as well by default
Motivation:
TLSv1 and TLSv1.1 is considered insecure. Let's follow the JDK and disable these by default
Modifications:
- Disable TLSv1 and TLSv1.1 by default when using OpenSSL.
- Add unit tests
Result:
Use only strong TLS versions by default when using OpenSSL
Motivation:
Conscrypt not correctly filters out non support TLS versions which may lead to test failures.
Related to https://github.com/google/conscrypt/issues/1013
Modifications:
- Bump up to latest patch release
- Add workaround
Result:
No more test failures caused by conscrypt
Motivation:
`PlatformDependent#normalizedOs()` already caches normalized variant of
the value of `os.name` system property. Instead of inconsistently
normalizing it in every case, use the utility method.
Modifications:
- `PlatformDependent`: `isWindows0()` and `isOsx0()` use `NORMALIZED_OS`;
- `PlatformDependent#normalizeOs(String)` define `darwin` as `osx`;
- `OpenSsl#loadTcNative()` does not require `equalsIgnoreCase` bcz `os`
is already normalized;
- Epoll and KQueue: `Native#loadNativeLibrary()` use `normalizedOs()`;
- Use consistent `Locale.US` for lower case conversion of `os.name`;
- `MacOSDnsServerAddressStreamProvider#loadNativeLibrary()` uses
`PlatformDependent.isOsx()`;
Result:
Consistent approach for `os.name` parsing.
Motivation:
We should add explicit null checks so its easier for people to understand why it throws.
Modification:
Add explicit checkNotNull(...)
Result:
Easier to understand for users why it fails.
Signed-off-by: xingrufei <xingrufei@sogou-inc.com>
Co-authored-by: xingrufei <xingrufei@sogou-inc.com>
Motivation:
We've seen (very rare) flaky test failures due to timeouts.
They are too rare to analyse properly, but a theory is that on overloaded, small cloud CI instances, it can sometimes take a surprising amount of time to start a thread.
It could be that the event loop thread is getting an unlucky schedule, and takes seconds to start, causing the timeouts to elapse.
Modification:
Increase the initial timeouts in the SSLEngineTest, that could end up waiting for the event loop thread to start.
Also fix a few simple warnings from Intellij.
Result:
Hopefully we will not see these tests be flaky again.
Motivation:
ReferenceCountedOpenSslEngine may unwrap data and complete the handshake
in a single unwrap() call. However it may return HanshakeStatus of
HandshakeStatus of NEED_UNWRAP instead of FINISHED. This may result in
the SslHandler sending the unwrapped data up the pipeline before
notifying that the handshake has completed, and result in out-of-order
events.
Modifications:
- if ReferenceCountedOpenSslEngine handshake status is NEED_UNWRAP and
produced data, or NEED_WRAP and consumed some data, we should call
handshake() to get the current state.
Result:
ReferenceCountedOpenSslEngine correctly indicates when the handshake has
finished if at the same time data was produced or consumed.
Motivation:
It turned out we didnt run the openssl tests on the CI when we used the non-static version of netty-tcnative.
Modifications:
- Upgrade netty-tcnative to fix segfault when using shared openssl
- Adjust tests to only run session cache tests when openssl supports it
- Fix some more tests to only depend on KeyManager if the underlying openssl version supports it
Result:
Run all openssl test on the CI even when shared library is used
Motivation:
In the latest version of BouncyCastle, BCJSSE:'TLSv1.3' is now a supported protocol for both client and server. So should consider enabling TLSv1.3 when TLSv1.3 is available
Modification:
This pr is to enable TLSv1.3 when using BouncyCastle ALPN support, please review this pr,thanks
Result:
Enable TLSv1.3 when using BouncyCastle ALPN support
Signed-off-by: xingrufei <xingrufei@sogou-inc.com>
Co-authored-by: xingrufei <xingrufei@sogou-inc.com>
Motivation:
NullChecks resulting in a NullPointerException or IllegalArgumentException, numeric ranges (>0, >=0) checks, not empty strings/arrays checks must never be anonymous but with the parameter or variable name which is checked. They must be specific and should not be done with an "OR-Logic" (if a == null || b == null) throw new NullPointerEx.
Modifications:
* import static relevant checks
* Replace manual checks with ObjectUtil methods
Result:
All checks needed are done with ObjectUtil, some exception texts are improved.
Fixes#11170
Motivation:
Under Android it was not possible to load a specific web page. It might be related to the (missing?) ALPN of the internal TLS implementation. BouncyCastle as a replacement works but this was not supported so far by Netty.
BouncyCastle also has the benefit to be a pure Java solution, all the other providers (OpenSSL, Conscrypt) require native libraries which are not available under Android at least.
Modification:
BouncyCastleAlpnSslEngine.java and support classes have been added. It is relying on the JDK code, hence some support classes had to be opened to prevent code duplication.
Result:
BouncyCastle can be used as TLS provider.
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
Motivation:
We are increasingly running in environments where Unsafe, setAccessible, etc. are not available.
When debug logging is enabled, we log a complete stack trace every time one of these initialisations fail.
Seeing these stack traces can cause people unnecessary concern.
For instance, people might have alerts that are triggered by a stack trace showing up in logs, regardless of its log level.
Modification:
We continue to print debug log messages on the result of our initialisations, but now we only include the full stack trace is _trace_ logging (or FINEST, or equivalent in whatever logging framework is configured) is enabled.
Result:
We now only log these initialisation stack traces when the lowest possible log level is enabled.
Fixes#7817
Motivation:
SslHandler invokes channel.read() during the handshake process. For some
channel implementations (e.g. LocalChannel) this may result in re-entry
conditions into unwrap. Unwrap currently defers updating the input
buffer indexes until the unwrap method returns to avoid intermediate
updates if not necessary, but this may result in unwrapping the same
contents multiple times which leads to handshake failures [1][2].
[1] ssl3_get_record:decryption failed or bad record mac
[2] ssl3_read_bytes:sslv3 alert bad record mac
Modifications:
- SslHandler#unwrap updates buffer indexes on each iteration so that if
reentry scenario happens the correct indexes will be visible.
Result:
Fixes https://github.com/netty/netty/issues/11146
Motivation:
SslHandler has many independent boolean member variables. They can be
collapsed into a single variable to save memory.
Modifications:
- SslHandler boolean state consolidated into a single short variable.
Result:
Savings of 8 bytes per SslHandler (which is per connection) observed on
OpenJDK.
Motivation:
`SslHandler#unwrap` may produce `SslHandshakeCompletionEvent` if it
receives `close_notify` alert. This alert indicates that the engine is
closed and no more data are expected in the pipeline. However, it fires
the event before the last data chunk. As the result, further handlers
may loose data if they handle `SslHandshakeCompletionEvent`.
This issue was not visible before #11133 because we did not write
`close_notify` alert reliably.
Modifications:
- Add tests to reproduce described behavior;
- Move `notifyClosePromise` after fire of the last `decodeOut`;
Result:
`SslHandshakeCompletionEvent` correctly indicates that the engine is
closed and no more data are expected on the pipeline.
Motivation:
We should avoid blocking in the event loop as much as possible.
The InputStream.read() is a blocking method, and we don't need to call it if available() returns a positive number.
Modification:
Bypass calling InputStream.read() if available() returns a positive number.
Result:
Fewer blocking calls in the event loop, in general, when ChunkedStream is used.
Motivation:
SslHandler's wrap method notifies the handshakeFuture and sends a
SslHandshakeCompletionEvent user event down the pipeline before writing
the plaintext that has just been wrapped. It is possible the application
may write as a result of these events and re-enter into wrap to write
more data. This will result in out of sequence data and result in alerts
such as SSLV3_ALERT_BAD_RECORD_MAC.
Modifications:
- SslHandler wrap should write any pending data before notifying
promises, generating user events, or anything else that may create a
re-entry scenario.
Result:
SslHandler will wrap/write data in the same order.
Motivation:
SslHandler owns the responsibility to flush non-application data
(e.g. handshake, renegotiation, etc.) to the socket. However when
TCP Fast Open is supported but the client_hello cannot be written
in the SYN the client_hello may not always be flushed. SslHandler
may not wrap/flush previously written/flushed data in the event
it was not able to be wrapped due to NEED_UNWRAP state being
encountered in wrap (e.g. peer initiated renegotiation).
Modifications:
- SslHandler to flush in channelActive() if TFO is enabled and
the client_hello cannot be written in the SYN.
- SslHandler to wrap application data after non-application data
wrap and handshake status is FINISHED.
- SocketSslEchoTest only flushes when writes are done, and waits
for the handshake to complete before writing.
Result:
SslHandler flushes handshake data for TFO, and previously flushed
application data after peer initiated renegotiation finishes.
Motivation:
At the moment we don't support session caching on the client side at all when using the native SSL implementation. We should at least allow to enable it.
Modification:
Allow to enable session cache for client side but disable ti by default due a JDK bug atm.
Result:
Be able to cache sessions on the client side when using native SSL implementation .
Motivation:
Creating certificates from a byte[] while lazy parse it is general useful and is also needed by https://github.com/netty/netty-incubator-codec-quic/pull/141
Modifications:
Move classes, rename these and make them public
Result:
Be able to reuse code
Motivation:
Some of the features we want to support can only be supported by some of the SslContext implementations. We should allow to configure these in a consistent way the same way as we do it with Channel / ChannelOption
Modifications:
- Add SslContextOption and add builder methods that take these
- Add OpenSslContextOption and define two options there which are specific to openssl
Result:
More flexible configuration and implementation of SslContext
Motivation:
In WriteTimeoutHandler we did make the assumption that the executor which is used to schedule the timeout is the same that is backing the write promise. This may not be true which will cause concurrency issues
Modifications:
Ensure we are on the right thread when try to modify the doubly-linked-list and if not schedule it on the right thread.
Result:
Fixes https://github.com/netty/netty/issues/11053
Motivation:
We need to ensure that we call queue.remove() before we cal writeAndFlush() as this operation may cause an event that also touches the queue and remove from it. If we miss to do so we may see NoSuchElementExceptions.
Modifications:
- Call queue.remove() before calling writeAndFlush(...)
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/11046
This brings forward the build and release automation changes from 4.1 (#10879, #10883, #10884, #10886, #10888, #10889, #10893, #10900, #10933, #10945, #10966, #10968, #11002, and #11019) to 5.0.
Details are as follows:
* Use Github workflows for CI (#10879)
Motivation:
We should just use GitHub Actions for the CI
Modifications:
- Adjust docker / docker compose files
- Add different workflows and jobs to deploy and build the project
Result:
Don't depend on external CI services
* Fix non leak build condition
* Only use build and deploy workflows for 4.1 for now
* Add deploy job for cross compiled aarch64 (#10883)
Motivation:
We should also deploy snapshots for our cross compiled native jars.
Modifications:
- Add job and docker files for deploying cross compiled native jars
- Ensure we map the maven cache into our docker containers
Result:
Deploy aarch64 jars and re-use cache
* Use correct docker-compose file to deploy cross compiled artifacts
* Use correct docker-compose task to deploy for cross compiled artifacts
* Split pr and normal build (#10884)
Motivation:
We should better use seperate workflows for PR and normal builds
Modifications:
- Split workflows
- Better cache reuse
Result:
Cleanup
* Only deploy snapshots for one arch
Motivation:
We need to find a way to deploy SNAPSHOTS for different arch with the same timestamp. Otherwise it will cause problems.
See https://github.com/netty/netty/issues/10887
Modification:
Skip all other deploys then x86_64
Result:
Users are able to use SNAPSHOTS for x86_6
* Use maven cachen when running analyze job (#10888)
Motivation:
To prevent failures to problems while downloading dependencies we shoud cache these
Modifications:
Add maven cache
Result:
No more failures due problems while downloading dependencies
* Also include one PR job that uses boringssl (#10886)
Motivation:
When validating PRs we should also at least run one job that uses boringssl
Modifications:
- Add job that uses boringssl
- Cleanup docker compose files
- Fix buffer leak in test
Result:
Also run with boringssl when PRs are validated
* Use matrix for job configurations (#10889)
Motivation:
We can use the matrix feature to define our jobs. This reduces a lot of config
Modification:
Use job matrix
Result:
Easier to maintain
* Correctly deploy artifacts that are build on different archs (#10893)
Motivation:
We need to take special care when deploying snapshots as we need to generate the jars in multiple steps
Modifications:
- Use the nexus staging pluging to stage jars locally in multiple steps
- Add extra job that will merge these staged jars and deploy these
Result:
Fixes https://github.com/netty/netty/issues/10887
* Dont use cron for PRs
Motivation:
It doesnt make sense to use cron for PRs
Modifications:
Remove cron config
Result:
Cleanup
* We run all combinations when validate the PR, let's just use one type for normal push
Motivation:
Let us just only use one build config when building the 4.1 branch.
Modifications:
As we already do a full validation when doing the PR builds we can just only use one build config for pushes to the "main" branches
Result:
Faster build times
* Update action-docker-layer-caching (#10900)
Motivation:
We are three releases behind.
Modifications:
Update to latest version
Result:
Use up-to-date action-docker-layer-caching version
* Verify we can load native modules and add job that verifies on aarch64 as well (#10933)
Motivation:
As shown in the past we need to verify we actually can load the native as otherwise we may introduce regressions.
Modifications:
- Add new maven module which tests loading of native modules
- Add job that will also test loading on aarch64
Result:
Less likely to introduce regressions related to loading native code in the future
* Let script fail if one command fail (#10945)
Motivation:
We should use `set -e` to ensure we fail the script if one command fails.
Modifications:
Add set -e to script
Result:
Fail fast
* Use action to report unit test errors (#10966)
Motivation:
To make it easier to understand why the build fails lets use an action that will report which unit test failed
Modifications:
- Replace custom script with action-surefire-report
Result:
Easier to understand test failures
* Use custom script to check for build failures (#10968)
Motivation:
It turns out we can't use the action to check for build failures as it can't be used when a PR is done from a fork. Let's just use our simple script.
Modifications:
- Replace action with custom script
Result:
Builds for PRs that are done via forks work again.
* Publish test results after PR run (#11002)
Motivation:
To make it easier to understand why a build failed let us publish the rest results
Modifications:
Use a new workflow to be able to publish the test reports
Result:
Easier to understand why a PR did fail
* Fix test reports name
* Add workflow to cut releases (#11019)
Motivation:
Doing releases manually is error-prone, it would be better if we could do it via a workflow
Modification:
- Add workflow to cut releases
- Add related scripts
Result:
Be able to easily cut a release via a workflow
* Update build for master branch
Motivation:
The build changes were brought forward from 4.1, and contain many things specific to 4.1.
Modification:
Changed baseline Java version from 8 to 11, and changed branch references from "4.1" to "master".
Result:
Builds should now work for the master branch.
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
Motivation:
To make it possible to experiment with alternative buffer implementations, we need a way to abstract away the concrete buffers used throughout most of the Netty pipelines, while still having a common currency for doing IO in the end.
Modification:
- Introduce an ByteBufConvertible interface, that allow arbitrary objects to convert themselves into ByteBuf objects.
- Every place in the code, where we did an instanceof check for ByteBuf, we now do an instanceof check for ByteBufConvertible.
- ByteBuf itself implements ByteBufConvertible, and returns itself from the asByteBuf method.
Result:
It is now possible to use Netty with alternative buffer implementations, as long as they can be converted to ByteBuf.
This has been verified elsewhere, with an alternative buffer implementation.
Motivation:
The `!fastOpen` part of `active || !fastOpen` is always false.
Modification:
- Remove `!fastOpen` and keep only `active` as a `flushAtEnd` flag for
`startHandshakeProcessing`;
- Update comment;
Result:
Simplified `flushAtEnd` flag computation in `SslHandler#handlerAdded`.
Support TCP Fast Open for clients and make SslHandler take advantage
Motivation:
- TCP Fast Open allow us to send a small amount of data along side the initial SYN packet when establishing a TCP connection.
- The TLS Client Hello packet is small enough to fit in there, and is also idempotent (another requirement for using TCP Fast Open), so if we can save a round-trip when establishing TLS connections when using TFO.
Modification:
- Add support for client-side TCP Fast Open for Epoll, and also lowers the Linux kernel version requirements to 3.6.
- When adding the SslHandler to a pipeline, if TCP Fast Open is enabled for the channel (and the channel is not already active) then start the handshake early by writing it to the outbound buffer.
- An important detail to note here, is that the outbound buffer is not flushed at this point, like it would for normal handshakes. The flushing happens later as part of establishing the TCP connection.
Result:
- It is now possible for clients (on epoll) to open connections with TCP Fast Open.
- The SslHandler automatically detects when this is the case, and now send its Client Hello message as part of the initial data in the TCP Fast Open flow when available, saving a round-trip when establishing TLS connections.
Co-authored-by: Colin Godsey <crgodsey@gmail.com>
Motivation:
The testGlobalWriteThrottle is flaky and failed our build multiple times now. Lets disable it for now until we had time to investigate
Modifications:
Disable flaky test
Result:
Less failures during build
Motivation:
At the moment we always set SSL_OP_NO_TICKET when building our context. The problem with this is that this also disables resumption for TLSv1.3 in BoringSSL as it only supports stateless resumption for TLSv1.3 which uses tickets.
We should better clear this option when TLSv1.3 is enabled to be able to resume sessions. This is also inline with the OpenJDK which enables this for TLSv1.3 by default as well.
Modifications:
Check for enabled protocols and if TLSv1.3 is set clear SSL_OP_NO_TICKET.
Result:
Be able to resume sessions for TLSv1.3 when using BoringSSL.
Motivation:
File.createTempFile(String, String)` will create a temporary file in the system temporary directory if the 'java.io.tmpdir'. The permissions on that file utilize the umask. In a majority of cases, this means that the file that java creates has the permissions: `-rw-r--r--`, thus, any other local user on that system can read the contents of that file.
This can be a security concern if any sensitive data is stored in this file.
This was reported by Jonathan Leitschuh <jonathan.leitschuh@gmail.com> as a security problem.
Modifications:
Use Files.createTempFile(...) which will use safe-defaults when running on java 7 and later. If running on java 6 there isnt much we can do, which is fair enough as java 6 shouldnt be considered "safe" anyway.
Result:
Create temporary files with sane permissions by default.
Motivation:
It was not 100% clear who is responsible calling close() on the InputStream.
Modifications:
Clarify javadocs.
Result:
Related to https://github.com/netty/netty/issues/10974
Co-authored-by: Chris Vest <christianvest_hansen@apple.com>
Motivation:
TLS_FALSE_START slightly changes the "flow" during handshake which may cause suprises for the end-user. We should better disable it by default again and later add a way to enable it for the user.
Modification:
This reverts commit 514d349e1f.
Result:
Restore "old flow" during TLS handshakes.
Motivation:
We didnt correctly filter out TLSv1.3 ciphers when TLSv1.3 is not enabled.
Modifications:
- Filter out ciphers that are not supported due the selected TLS version
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/10911
Co-authored-by: Bryce Anderson <banderson@twitter.com>
Motivation:
If the given port is already bound, the PcapWriteHandlerTest will sometimes fail.
Modification:
Use a dynamic port using `0`, which is more reliable
Result:
Less Flaky
Motivation:
We should override the get*ApplicationProtocol() methods in ReferenceCountedOpenSslEngine to make it easier for users to obtain the selected application protocol
Modifications:
Add missing overrides
Result:
Easier for the user to get the selected application protocol (if any)
Motivation:
We should expose some methods as protected to make it easier to write custom SslContext implementations.
This will be reused by the code for https://github.com/netty/netty-incubator-codec-quic/issues/97
Modifications:
- Add protected to some static methods which are useful for sub-classes
- Remove some unused methods
- Move *Wrapper classes to util package and make these public
Result:
Easier to write custom SslContext implementations
Motivation:
We need to ensure we always drain the error stack when a callback throws as otherwise we may pick up the error on a different SSL instance which uses the same thread.
Modifications:
- Correctly drain the error stack if native method throws
- Add a unit test which failed before the change
Result:
Always drain the error stack
Motivation:
When using the JDKs SSLEngineImpl with TLSv1.3 it sometimes returns HandshakeResult.FINISHED multiple times. This can lead to have SslHandshakeCompletionEvents to be fired multiple times.
Modifications:
- Keep track of if we notified before and if so not do so again if we use TLSv1.3
- Add unit test
Result:
Consistent usage of events
Motivation:
We can make use of internalNioBuffer(...) if we cant access the memoryAddress. This at least will reduce the object creations.
Modifications:
Use internalNioBuffer(...) and so reduce the GC
Result:
Less object creation if we can't access the memory address.
Motivation:
https in xmlns URIs does not work and will let the maven release plugin fail:
```
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.779 s
[INFO] Finished at: 2020-11-10T07:45:21Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-release-plugin:2.5.3:prepare (default-cli) on project netty-parent: Execution default-cli of goal org.apache.maven.plugins:maven-release-plugin:2.5.3:prepare failed: The namespace xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" could not be added as a namespace to "project": The namespace prefix "xsi" collides with an additional namespace declared by the element -> [Help 1]
[ERROR]
```
See also https://issues.apache.org/jira/browse/HBASE-24014.
Modifications:
Use http for xmlns
Result:
Be able to use maven release plugin
Motivation:
Sometimes it would be helpful to easily detect if an operation failed due the SSLEngine already be closed.
Modifications:
Add special exception that is used when the engine was closed
Result:
Easier to detect a failure caused by a closed exception
Motivation:
FingerprintTrustManagerFactory can only use SHA-1 that is considered
insecure.
Modifications:
- Updated FingerprintTrustManagerFactory to accept a stronger hash algorithm.
- Remove the constructors that still use SHA-1.
- Added a test for FingerprintTrustManagerFactory.
Result:
A user can now configure FingerprintTrustManagerFactory to use a
stronger hash algorithm.
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
Motivation:
HTTP is a plaintext protocol which means that someone may be able
to eavesdrop the data. To prevent this, HTTPS should be used whenever
possible. However, maintaining using https:// in all URLs may be
difficult. The nohttp tool can help here. The tool scans all the files
in a repository and reports where http:// is used.
Modifications:
- Added nohttp (via checkstyle) into the build process.
- Suppressed findings for the websites
that don't support HTTPS or that are not reachable
Result:
- Prevent using HTTP in the future.
- Encourage users to use HTTPS when they follow the links they found in
the code.
Motivation:
In the master branch we fail fire* operations on the ChannelHandlerContext once the handler was removed. This is by design as it is "unspecified" what the semantics could be after the handler was removed and may lead to very hard to debug problems. Because of this we need to select the right ChannelHandlerContext for firing the event.
Modifications:
Choose a valid ChannelHandlerContext based on the state of the context of the handler
Result:
No more test failures
Motivation:
junit deprecated Assert.assertThat(...)
Modifications:
Use MatcherAssert.assertThat(...) as replacement for deprecated method
Result:
Less deprecation warnings
Motivation:
We can filter out `null` rules while initializing the instance of `RuleBasedIpFilter` so we don't have to keep checking for `null` rules while iterating through `rules` array in `for loop` which is just a waste of CPU cycles.
Modification:
Added `null` rule check inside the constructor.
Result:
No more wasting CPU cycles on check the `null` rule each time in `for loop` and makes the overall operation more faster.
Motivation:
LGTM reports multiple issues. They need to be triaged,
and real ones should be fixed.
Modifications:
- Fixed multiple issues reported by LGTM, such as redundant conditions,
resource leaks, typos, possible integer overflows.
- Suppressed false-positives.
- Added a few testcases.
Result:
Fixed several possible issues, get rid of false alarms in the LGTM report.
Motivation:
Users may want to do special actions when onComplete(...) was called and depend on these once they receive the SniCompletionEvent
Modifications:
Switch order and so call onLookupComplete(...) before we fire the event
Result:
Fixes https://github.com/netty/netty/issues/10655
Motivation:
Java 16 will come around eventually anyway, and this makes it easier for people to experiment with Early Access builds.
Modification:
- Added Maven profiles for JDK 16 to relevant pom files.
- Removed the `--add-exports java.base/sun.security.x509=ALL-UNNAMED` argument when running tests; we've not needed it since the Java11-as-baseline PR landed.
Result:
Netty now builds on JDK 16 pre-releases (provided they've not broken compatibility in some way).
Raise the Netty 5 minimum required Java version to Java 11.
Motivation:
Java 11 has been out for some time, and Netty 5 is still some ways out.
There are also many good features in Java 11 that we wish to use, such as VarHandles, var-keyword, and the module system.
There is no reason for Netty 5 to not require Java 11, since Netty 4.x will still be supported for the time being.
Modification:
Remove everything in the pom files related to Java versions older than Java 11.
Remove the animal-sniffer plug-in and rely on the `--release` compiler flag instead.
Remove docker files related to Java versions older than Java 11.
Remove the copied SCTP APIs -- we should test this commit independently on Windows.
Remove the OpenJdkSelfSignedCertGenerator.java file and just always use Bouncy Castle for generating self-signed certificates for testing.
Make netty-testsuite tests pass by including Bouncy Castle as a test dependency, so we're able to generate our self-signed certificate.
Result:
Java 11 is now the minimum required Java version.
Motivation:
We should implement the Closeable method to properly close `OutputStream` and `PcapWriteHandler`. So whenever `handlerRemoved(ChannelHandlerContext)` is called or the user wants to stop the Pcap writes into `OutputStream`, we have a proper method to close it otherwise writing data to close `OutputStream` will result in `IOException`.
Modification:
Implemented `Closeable` in `PcapWriteHandler` which calls `PcapWriter#close` and closes `OutputStream` and stops Pcap writes.
Result:
Better handling of Pcap writes.
Motivation:
We wish to use Unsafe as little as possible, and Java 8 allows us
to take some short-cuts or play some tricks with generics,
for the purpose of working around having to declare all checked
exceptions. Ideally all checked exceptions would be declared, but
the code base is not ready for that yet.
Modification:
The call to UNSAFE.throwException has been removed, so when we need
that feature, we instead use the generic exception trick.
In may cases, Java 8 allows us to throw Throwable directly. This
happens in cases where no exception is declared to be thrown in a
scope.
Finally, some warnings have also been fixed, and some imports have
been reorganised and cleaned up while I was modifying the files
anyway.
Result:
We no longer use Unsafe for throwing any exceptions.
Motivation:
We need to take the Provider into account as well when trying to detect if TLSv1.3 is used by default / supported
Modifications:
- Change utility method to respect provider as well
- Change testcode
Result:
Less error-prone tests
Motivation:
Some JDKs dissallow the usage of keysizes < 2048, so we should not use such small keysizes in tests.
This showed up on fedora 32:
```
Caused by: java.security.cert.CertPathValidatorException: Algorithm constraints check failed on keysize limits. RSA 1024bit key used with certificate: CN=tlsclient. Usage was tls client
at sun.security.util.DisabledAlgorithmConstraints$KeySizeConstraint.permits(DisabledAlgorithmConstraints.java:817)
at sun.security.util.DisabledAlgorithmConstraints$Constraints.permits(DisabledAlgorithmConstraints.java:419)
at sun.security.util.DisabledAlgorithmConstraints.permits(DisabledAlgorithmConstraints.java:167)
at sun.security.provider.certpath.AlgorithmChecker.check(AlgorithmChecker.java:326)
at sun.security.provider.certpath.PKIXMasterCertPathValidator.validate(PKIXMasterCertPathValidator.java:125)
... 23 more
```
Modifications:
Replace hardcoded keys / certs with SelfSignedCertificate
Result:
No test-failures related to small key sizes anymore.
Motivation:
We should stop as soon as we were able to set the key material on the server side as otherwise we may select keymaterial that "belongs" to a less prefered cipher. Beside this it also is just useless work.
We also need to propagate the exception when it happens during key material selection on the client side so openssl will produce the right alert.
Modifications:
- Stop once we were able to select a key material on the server side
- Ensure we not call choose*Alias more often then needed
- Propagate exceptions during selection of the keymaterial on the client side.
Result:
Less overhead and more correct behaviour
Motivation:
We need to let openssl know that we failed to find the key material so it will produce an alert for the remote peer to consume. Beside this we also need to ensure we wrap(...) until we produced everything as otherwise the remote peer may see partial data when an alert is produced in multiple steps.
Modifications:
- Correctly throw if we could not find the keymaterial
- wrap until we produced everything
- Add test
Result:
Correctly handle the case when key material could not be found
Motivation:
Calling chooseServerAlias(...) may be expensive so we should ensure we not call it multiple times for the same auth methods.
Modifications:
Remove duplicated from authMethods before trying to call chooseServerAlias(...)
Result:
Less performance overhead during key material selection
Motivation:
Following Javadoc standard
Modification:
Change from `@param KeyManager` to `@param keyManager`
Result:
The `@param` matches the actual parameter variable name
Motivation:
Write TCP and UDP packets into Pcap `OutputStream` which helps a lot in debugging.
Modification:
Added TCP and UDP Pcap writer.
Result:
New handler can write packets into an `OutputStream`, e.g. a file that can be opened with Wireshark.
Fixes#10385.
Motivation:
This reverts commit 7bf1ffb2d4 as it turns out it introduced a big performance regression.
Modifications:
Revert 7bf1ffb2d4
Result:
Performance of TLS is back to normal
Motivation:
`IpSubnetFilter` uses Binary Search for IP Address search which is fast if we have large set of IP addresses to filter.
Modification:
Added `IpSubnetFilter` which takes `IpSubnetFilterRule` for filtering.
Result:
Faster IP address filter.
Motivation:
`RuleBasedIpFilter` had JavaDoc `{@link #channelRejected(ChannelHandlerContext, SocketAddress)}` instead of `{@link AbstractRemoteAddressFilter#channelRejected(ChannelHandlerContext, SocketAddress)}`.
Modification:
Added `AbstractRemoteAddressFilter` reference.
Result:
Fixed JavaDoc error and made documentation more clear.
Motivation:
Right now after a SslMasterKeyHandler is added to the pipeline,
it also needs to be enabled via a system property explicitly. In
some environments where the handler is conditionally added to
the pipeline this is redundant and a bit confusing.
Modifications:
This changeset keeps the default behavior, but allows child
implementations to tweak the way on how it detects that it
should actually handle the event when it is being raised.
Also, removed a private static that is not used in the wireshark
handler.
Result:
Child implementations can be more flexible in deciding when
and how the handler should perform its work (without changing
any of the defaults).
Motivation:
To reduce latency and RTTs we should use TLS False Start when jdkCompatibilityMode is not required and its supported
Modifications:
Use SSL_MODE_ENABLE_FALSE_START when jdkCompatibilityMode is false
Result:
Less RTTs and so lower latency when TLS False Start is supported
Motivation:
How we init our static fields in Conscrypt was kind of error-prone and may even lead to NPE later on when methods were invoked out of order.
Modifications:
- Move all the init code to a static block
- Remove static field which is not needed anymore
Result:
Cleanup and also fixes https://github.com/netty/netty/issues/10413
Motivation:
fd0d06e introduced the usage of MethodHandles and so also introduced some usages of invokeExact(...). Unfortunally when calling this method we missed to also cast the return value in the static init block of JdkAlpnSslUtils which lead to an exception to be thrown as the JVM did assume the return value is void which is not true.
Modifications:
Correctly cast the return value of invokeExact(...)
Result:
ALPN can be used in master with JDK again
Motiviation:
When TLSv1.3 was introduced almost 2 years ago, it was decided to disable it by default, even when it's supported by the underlying TLS implementation.
TLSv13 is pretty stable now in Java (out of the box in Java 11, OpenJSSE for Java 8, BoringSSL and OpenSSL) and may be enabled by default.
Modifications:
Ensure TLSv13 is enabled by default when the underyling JDK SSLEngine implementation enables it as well
Result:
TLSv1.3 is now enabled by default, so users don't have to explicitly enable it.
Co-authored-by: Stephane Landelle <slandelle@gatling.io>
Motivation:
AlgorithmId.sha256WithRSAEncryption_oid was removed in JDK15 and later so we should not depend on it as otherwise we will see compilation errors
Modifications:
Replace AlgorithmId.sha256WithRSAEncryption_oid usage with specify the OID directly
Result:
Compiles on JDK15+
Motivation:
Replace Class.getClassLoader with io.netty.util.internal.PlatformDependent.getClassLoader in Openssl so it also works when a SecurityManager is in place
Modification:
Replace Class.getClassLoader with io.netty.util.internal.PlatformDependent.getClassLoader in Openssl
Result:
No issues when a SecurityManager is in place
Motivation:
TLSv1.3 is not strictly limited to Java11+ anymore as different vendors backported TLSv1.3 to Java8 as well. We should ensure we make the detection of if TLSv1.3 is supported not depend on the Java version that is used.
Modifications:
- Add SslProvider.isTlsv13Supported(...) and use it in tests to detect if we should run tests against TLSv1.3 as well
- Adjust testcase to work on latest JDK 8 release as well
Result:
Correct detection of TLSv1.3 support even if Java version < 11.
Motivation:
At the moment we don't support session caching for client side when using native SSLEngine implementation and our implementation of SSLSessionContext is incomplete.
Modification:
- Consume netty-tcnative changes to be able to cache session in an external cache
- Add and adjust unit tests to test session caching
- Add an in memory session cache that is hooked into native SSLEngine
Result:
Support session caching on the client and server side
Motivation:
jdk.tls.client.enableSessionTicketExtension property must be respect by OPENSSL and OPENSSL_REFCNT SslProvider to ensure a consistent behavior. Due a bug this was not the case and it only worked for OPENSSL_REFCNT but not for OPENSSL.
Modifications:
Move the property check into static method that is used by both
Result:
Correctly respect jdk.tls.client.enableSessionTicketExtension
Motivation:
At the moment we may report BUFFER_OVERFLOW when wrap(...) fails to consume data but still prodce some. This is not correct and we should better report NEED_WRAP as we already have produced some data (for example tickets). This way the user will try again without increasing the buffer size which is more correct and may reduce the number of allocations
Modifications:
Return NEED_WRAP when we produced some data but not consumed any.
Result:
Fix ReferenceCountedOpenSslEngine.wrap(...) state machine
**Motivation:**
We are interested in building Netty libraries with Ahead-of-time compilation with GraalVM. We saw there was [prior work done on this](https://github.com/netty/netty/search?q=graalvm&unscoped_q=graalvm). We want to introduce a change which will unblock GraalVM support for applications using netty and `netty-tcnative`.
This solves the error [that some others encounter](https://github.com/oracle/graal/search?q=UnsatisfiedLinkError+sslOpCipherServerPreference&type=Issues):
```
Exception in thread "main" java.lang.UnsatisfiedLinkError: io.grpc.netty.shaded.io.netty.internal.tcnative.NativeStaticallyReferencedJniMethods.sslOpCipherServerPreference()I [symbol: Java_io_grpc_netty_shaded_io_netty_internal_tcnative_NativeStaticallyReferencedJniMethods_sslOpCipherServerPreference or Java_io_grpc_netty_shaded_io_netty_internal_tcnative_NativeStaticallyReferencedJniMethods_sslOpCipherServerPreference__]
```
**Modification:**
The root cause of the issue is that in the tcnative libraries, the [`SSL.java` class](783a8b6b69/openssl-dynamic/src/main/java/io/netty/internal/tcnative/SSL.java (L67)) references a native call in the static initialization of the class - the method `sslOpCipherServerPreference()` on line 67 is used to initialize the static variable. But you see that `OpenSsl` also uses[ `SSL.class` to check if `netty-tcnative` is on the classpath before loading the native library](cbe238a95b/handler/src/main/java/io/netty/handler/ssl/OpenSsl.java (L123)).
So this is the problem because in ahead-of-time compilation, when it references the SSL class, it will try to load those static initializers and make the native library call, but it cannot do so because the native library has not been loaded yet since the `SSL` class is being referenced to check if the library should be loaded in the first place.
**Solution:** So the proposed solution here is to just choose a different class in the `tcnative` package which does not make a native library call during static initialization. I just chose `SSLContext` if this is OK.
This should have no negative effects other than unblocking the GraalVM use-case.
**Result:**
It fixes the unsatisfied link error. It will fix error for future users; it is a common error people encounter.
Motivation:
There was a new netty-tcnative release which we should use. Beside this the SSLErrorTest was quite fragile and so should be adjusted.
Modifications:
Update netty-tcnative and adjust test
Result:
Use latest netty-tcnative release
Motivation:
BoringSSL behaves differently then OpenSSL and not include any TLS1.3 ciphers in the returned array when calling SSL_get_ciphers(...). This is due the fact that it also not allow to explicit configure which are supported and which not for TLS1.3. To mimic the behaviour expected by the SSLEngine API we should workaround this.
Modifications:
- Add a unit test that verifies enabled protocols / ciphers
- Add special handling for BoringSSL and tls1.3
Result:
Make behaviour consistent
Motivation:
We did not correctly preserve the original cause of the SSLException when all the following are true:
* SSL_read failed
* handshake was finished
* some outbound data was produced durigin SSL_read (for example ssl alert) that needs to be picked up by wrap(...)
Modifications:
Ensure we correctly preserve the original cause of the SSLException and rethrow it once we produced all data via wrap(...)
Result:
Be able to understand why we see an error in all cases
Motivation:
Due a bug in our test we may dropped data on the floor which are generated during handshaking (or slightly after). This could lead to corrupt state in the engine itself and so fail tests. This is especially true for TLS1.3 which generates the sessions on the server after the "actual handshake" is done.
Modifications:
Contine with wrap / unwrap until all data was consumed
Result:
Correctly feed all data to the engine during testing
Motivation:
When ApplicationProtocolNegotiationHandler is in the pipeline we should expect that its handshakeFailure(...) method will be able to completly handle the handshake error. At the moment this is not the case as it only handled SslHandshakeCompletionEvent but not the exceptionCaught(...) that is also triggered in this case
Modifications:
- Call handshakeFailure(...) in exceptionCaught and so fix double notification.
- Add testcases
Result:
Fixes https://github.com/netty/netty/issues/10342
Motivation:
I did not correctly adjust the code before cherry-pick, causing a compilation error in the test
Modifications:
Use ChannelHandler
Result:
No more compilation error
The current FLowControlHandler keeps a flag to track whether a read() call is pending.
This could lead to a scenario where you call read multiple times when the queue is empty,
and when the FlowControlHandler Queue starts getting messages, channelRead will be fired only once,
when we should've fired x many times, once for each time the handlers downstream called read().
Modifications:
Minor change to replace the boolean flag with a counter and adding a unit test for this scenario.
Result:
I used TDD, so I wrote the test, made sure it's failing, then updated the code and re-ran the test
to make sure it's successful after the changes.
Co-authored-by: Kareem Ali <kali@localhost.localdomain>
Motivation:
9b7e091 added a special SSLHandshakeException sub-class to signal handshake timeouts but we missed to update a testcase to directly assert the type of the exception.
Modifications:
Assert directly that SslHandshakeTimeoutException is used
Result:
Test cleanup
Motivation:
It seems that `testSwallowedReadComplete(...)` may fail with an AssertionError sometimes after my tests. The relevant stack trace is as follows:
```
java.lang.AssertionError: expected:<IdleStateEvent(READER_IDLE, first)> but was:<null>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at io.netty.handler.flow.FlowControlHandlerTest.testSwallowedReadComplete(FlowControlHandlerTest.java:478)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:89)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:41)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:542)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:770)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:464)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:210)
```
Obviously the `readerIdleTime` of `IdleStateHandler` and the thread sleep time before `EmbeddedChannel.runPendingTasks` are both 100ms. And if `userEvents.poll()` happened before `userEvents.add(...)` or no `IdleStateEvent` fired at all, this test case would fail.
Modification:
Sleep for a little more time before running all pending tasks in the `EmbeddedChannel`.
Result:
Fix the problem of small probability of failure.