Motivation
A memory leak related to DNS resolution was reported in #9634,
specifically linked to the TCP retry fallback functionality that was
introduced relatively recently. Upon inspection it's apparent that there
are some error paths where the original UDP response might not be fully
released, and more significantly the TCP response actually leaks every
time on the fallback success path.
It turns out that a bug in the unit test meant that the intended TCP
fallback path was not actually exercised, so it did not expose the main
leak in question.
Modifications
- Fix DnsNameResolverTest#testTruncated0 dummy server fallback logic to
first read transaction id of retried query and use it in replayed
response
- Adjust semantic of internal DnsQueryContext#finish method to always
take refcount ownership of passed in envelope
- Reorder some logic in DnsResponseHandler fallback handling to verify
the context of the response is expected, and ensure that the query
response are either released or propagated in all cases. This also
reduces a number of redundant retain/release pairings
Result
Fixes#9634
Motivation:
Classes `AbstractHttp2StreamChannel.Http2StreamChannelConfig`
and `DnsNameResolver.AddressedEnvelopeAdapter` may be static:
it doesn't reference its enclosing instance.
Modification:
Add `static` modifier.
Result:
Prevents a possible memory leak and uses less memory per class instance.
Motivation:
We currently try to access the the domain search list via reflection on windows which will print a illegal access warning when using Java9 and later.
Modifications:
Add a guard against the used java version.
Result:
Fixes https://github.com/netty/netty/issues/9500.
Motivation:
It is possible that the user uses a too big EDNS0 setting for the MTU and so we may receive a truncated datagram packet. In this case we should try to detect this and retry via TCP if possible
Modifications:
- Fix detecting of incomplete records
- Mark response as truncated if we did not consume the whole packet
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/9365
Motivation:
We should only ever close the underlying tcp socket once we received the envelope to ensure we never race in the test.
Modifications:
- Only close socket once we received the envelope
- Set REUSE_ADDR
Result:
More robust test
Motivation:
testTruncatedWithTcpFallback was flacky as we may end up closing the socket before we could read all data. We should only close the socket after we succesfully read all data.
Modifications:
Move socket.close() to finally block
Result:
Fix flaky test and so make the CI more stable again.
Motivation:
We should only try to use reflection to access default nameservers when using Java8 and lower as otherwise we will produce an Illegal reflective access warning like:
WARNING: Illegal reflective access by io.netty.resolver.dns.DefaultDnsServerAddressStreamProvider
Modifications:
Add Java version check before try to use reflective access.
Result:
No more warning when Java9+ is used.
Motivation:
OOME is occurred by increasing suppressedExceptions because other libraries call Throwable#addSuppressed. As we have no control over what other libraries do we need to ensure this can not lead to OOME.
Modifications:
Only use static instances of the Exceptions if we can either dissable addSuppressed or we run on java6.
Result:
Not possible to OOME because of addSuppressed. Fixes https://github.com/netty/netty/issues/9151.
Motivation:
Sometimes DNS responses can be very large which mean they will not fit in a UDP packet. When this is happening the DNS server will set the TC flag (truncated flag) to tell the resolver that the response was truncated. When a truncated response was received we should allow to retry via TCP and use the received response (if possible) as a replacement for the truncated one.
See https://tools.ietf.org/html/rfc7766.
Modifications:
- Add support for TCP fallback by allow to specify a socketChannelFactory / socketChannelType on the DnsNameResolverBuilder. If this is set to something different then null we will try to fallback to TCP.
- Add decoder / encoder for TCP
- Add unit tests
Result:
Support for TCP fallback as defined by https://tools.ietf.org/html/rfc7766 when using DnsNameResolver.
Motivation:
https://github.com/netty/netty/pull/9021 did apply some changes to filter out duplicates InetAddress when calling resolveAll(...) to mimic JDK behaviour. Unfortunally this also introduced a regression as we should not filter duplicates when the user explicit calls resolveAll(DnsQuestion).
Modifications:
- Only filter duplicates if resolveAll(String) is used
- Add unit test
Result:
Fixes regressions introduces by https://github.com/netty/netty/pull/9021
Motivation:
075cf8c02e introduced a change to allow resolve(...) to notify as soon as the preferred record was resolved. This works great but we should also allow the user to configure that we want to do the same for resolveAll(...), which means we should be able to notify as soon as all records for a preferred record were resolved.
Modifications:
- Add a new DnsNameResolverBuilder method to allow configure this (use false as default to not change default behaviour)
- Add unit test
Result:
Be able to speed up resolving.
Motivation:
At the moment resolve(...) does just delegate to resolveAll(...) and so will only notify the future once all records were resolved. This is wasteful as we are only interested in the first record anyway. We should notify the promise as soon as one record that matches the preferred record type is resolved.
Modifications:
- Introduce DnsResolveContext.isCompleteEarly(...) to be able to detect once we should early notify the promise.
- Make use of this early detecting if resolve(...) is called
- Remove FutureListener which could lead to IllegalReferenceCountException due double releases
- add unit test
Result:
Be able to notify about resolved host more quickly.
Motivation:
We did not correctly calculate the new ttl as we did forget to add `this.`
Modifications:
Add .this and so correctly calculate the TTL
Result:
Use correct TTL for authoritative nameservers when updating these.
Motivation:
To closely mimic what the JDK does we should not try to resolve AAAA records if the system itself does not support IPv6 at all as it is impossible to connect to this addresses later on. In this case we need to use ResolvedAddressTypes.IPV4_ONLY.
Modifications:
Add static method to detect if IPv6 is supported and if not use ResolvedAddressTypes.IPV4_ONLY.
Result:
More consistent behaviour between JDK and our resolver implementation.
Motivation:
At the moment we basically drop all non prefered addresses when calling DnsNameResolver.resolveAll(...). This is just incorrect and was introduced by 4cd39cc4b3. More correct is to still retain these but sort the returned List to have the prefered addresses on the beginning of the List. This also ensures resolve(...) will return the correct return type.
Modifications:
- Introduce PreferredAddressTypeComperator which we use to sort the List so it will contain the preferred address type first.
- Add unit test to verify behaviour
Result:
Include not only preferred addresses in the List that is returned by resolveAll(...)
Motivation:
We should not throw check exceptions when the user calls sync*() but should better wrap it in a CompletionException to make it easier for people to reason about what happens.
Modifications:
- Change sync*() to throw CompletionException
- Adjust tests
- Add some more tests
Result:
Fixes https://github.com/netty/netty/issues/8521.
Motivation:
During investigating some other bug I noticed that we log with warn level if we fail to notify the promise due the fact that it is already full-filled. This is not correct and missleading as there is nothing wrong with it in general. A promise may already been fullfilled because we did multiple queries and one of these was successful.
Modifications:
- Change log level to trace
- Add unit test which before did log with warn level but now does with trace level.
Result:
Less missleading noise in the log.
Motivation:
DnsNameResolver#resolveAll(String) may return duplicate results in the event that the original hostname DNS response includes an IP address X and a CNAME that ends up resolving the same IP address X. This behavior is inconsistent with the JDK’s resolver and is unexpected to retrun a List with duplicate entries from a resolveAll(..) call.
Modifications:
- Filter out duplicates
- Add unit test
Result:
More consistent and less suprising behavior
Motivation:
As we now us java8 as minimum java version we can deprecate ChannelInboundHandlerAdapter / ChannelOutboundHandlerAdapter and just move the default implementations into the interfaces. This makes things a bit more flexible for the end-user and also simplifies the class-hierarchy.
Modifications:
- Mark ChannelInboundHandlerAdapter and ChannelOutboundHandlerAdapter as deprecated
- Add default implementations to ChannelInboundHandler / ChannelOutboundHandler
- Refactor our code to not use ChannelInboundHandlerAdapter / ChannelOutboundHandlerAdapter anymore
Result:
Cleanup class-hierarchy and make things a bit more flexible.
Motivation:
We did not have any unit tests that queries for TXT records.
Modifications:
Add unit test to query TXT records.
Result:
More test-coverage.
Motivation:
We can just use Objects.requireNonNull(...) as a replacement for ObjectUtil.checkNotNull(....)
Modifications:
- Use Objects.requireNonNull(...)
Result:
Less code to maintain.
Motivation:
We can use lambdas now as we use Java8.
Modification:
use lambda function for all package, #8751 only migrate transport package.
Result:
Code cleanup.
Motivation:
As netty 4.x supported Java 6 we had various if statements to check for java versions < 8. We can remove these now.
Modification:
Remove unnecessary if statements that check for java versions < 8.
Result:
Cleanup code.
* Decouble EventLoop details from the IO handling for each transport to allow easy re-use of code and customization
Motiviation:
As today extending EventLoop implementations to add custom logic / metrics / instrumentations is only possible in a very limited way if at all. This is due the fact that most implementations are final or even package-private. That said even if these would be public there are the ability to do something useful with these is very limited as the IO processing and task processing are very tightly coupled. All of the mentioned things are a big pain point in netty 4.x and need improvement.
Modifications:
This changeset decoubled the IO processing logic from the task processing logic for the main transport (NIO, Epoll, KQueue) by introducing the concept of an IoHandler. The IoHandler itself is responsible to wait for IO readiness and process these IO events. The execution of the IoHandler itself is done by the SingleThreadEventLoop as part of its EventLoop processing. This allows to use the same EventLoopGroup (MultiThreadEventLoupGroup) for all the mentioned transports by just specify a different IoHandlerFactory during construction.
Beside this core API change this changeset also allows to easily extend SingleThreadEventExecutor / SingleThreadEventLoop to add custom logic to it which then can be reused by all the transports. The ideas are very similar to what is provided by ScheduledThreadPoolExecutor (that is part of the JDK). This allows for example things like:
* Adding instrumentation / metrics:
* how many Channels are registered on an SingleThreadEventLoop
* how many Channels were handled during the IO processing in an EventLoop run
* how many task were handled during the last EventLoop / EventExecutor run
* how many outstanding tasks we have
...
...
* Implementing custom strategies for choosing the next EventExecutor / EventLoop to use based on these metrics.
* Use different Promise / Future / ScheduledFuture implementations
* decorate Runnable / Callables when submitted to the EventExecutor / EventLoop
As a lot of functionalities are folded into the MultiThreadEventLoopGroup and SingleThreadEventLoopGroup this changeset also removes:
* AbstractEventLoop
* AbstractEventLoopGroup
* EventExecutorChooser
* EventExecutorChooserFactory
* DefaultEventLoopGroup
* DefaultEventExecutor
* DefaultEventExecutorGroup
Result:
Fixes https://github.com/netty/netty/issues/8514 .
Motivation:
Custom Netty ThreadLocalRandom and ThreadLocalRandomProvider classes are no longer needed and can be removed.
Modification:
Remove own ThreadLocalRandom
Result:
Less code to maintain
Motivation:
PlatformDependent.newConcurrentHashMap() is no longer needed so it could be easily removed and new ConcurrentHashMap<>() inlined instead of invoking PlatformDependent.newConcurrentHashMap().
Modification:
Use ConcurrentHashMap provided by the JDK directly.
Result:
Less code to maintain.
Motivation:
We can use the diamond operator these days.
Modification:
Use diamond operator whenever possible.
Result:
More modern code and less boiler-plate.
Motivation:
When using multiple nameservers and a nameserver respond with NXDOMAIN we should only fail the query if the nameserver in question is authoritive or no nameservers are left to try.
Modifications:
- Try next nameserver if NXDOMAIN was returned but the nameserver is not authoritive
- Adjust testcase to respect correct behaviour.
Result:
Fixes https://github.com/netty/netty/issues/8261
Motiviation:
Because of how we implemented the registration / deregistration of an EventLoop it was not possible to wrap an EventLoop implementation and use it with a Channel.
Modification:
- Introduce EventLoop.Unsafe which is responsible for the actual registration.
- Move validation of EventLoop / Channel combo to the EventLoop
- Add unit test that verifies that wrapping works
Result:
Be able to wrap an EventLoop and so add some extra functionality.
Motivation:
At the moment it’s possible to have a Channel in Netty that is not registered / assigned to an EventLoop until register(...) is called. This is suboptimal as if the Channel is not registered it is also not possible to do anything useful with a ChannelFuture that belongs to the Channel. We should think about if we should have the EventLoop as a constructor argument of a Channel and have the register / deregister method only have the effect of add a Channel to KQueue/Epoll/... It is also currently possible to deregister a Channel from one EventLoop and register it with another EventLoop. This operation defeats the threading model assumptions that are wide spread in Netty, and requires careful user level coordination to pull off without any concurrency issues. It is not a commonly used feature in practice, may be better handled by other means (e.g. client side load balancing), and therefore we propose removing this feature.
Modifications:
- Change all Channel implementations to require an EventLoop for construction ( + an EventLoopGroup for all ServerChannel implementations)
- Remove all register(...) methods from EventLoopGroup
- Add ChannelOutboundInvoker.register(...) which now basically means we want to register on the EventLoop for IO.
- Change ChannelUnsafe.register(...) to not take an EventLoop as parameter (as the EventLoop is supplied on custruction).
- Change ChannelFactory to take an EventLoop to create new Channels and introduce ServerChannelFactory which takes an EventLoop and one EventLoopGroup to create new ServerChannel instances.
- Add ServerChannel.childEventLoopGroup()
- Ensure all operations on the accepted Channel is done in the EventLoop of the Channel in ServerBootstrap
- Change unit tests for new behaviour
Result:
A Channel always has an EventLoop assigned which will never change during its life-time. This ensures we are always be able to call any operation on the Channel once constructed (unit the EventLoop is shutdown). This also simplifies the logic in DefaultChannelPipeline a lot as we can always call handlerAdded / handlerRemoved directly without the need to wait for register() to happen.
Also note that its still possible to deregister a Channel and register it again. It's just not possible anymore to move from one EventLoop to another (which was not really safe anyway).
Fixes https://github.com/netty/netty/issues/8513.
Motivation:
We do not correctly detect loops when follow CNAMEs and so may try to follow it without any success.
Modifications:
- Correctly detect CNAME loops
- Do not cache CNAME entries which point to itself
- Add unit test.
Result:
Fixes https://github.com/netty/netty/issues/8687.
Motivation:
Andoid does not contain javax.naming.* so we should not try to use it to prevent a NoClassDefFoundError on init.
Modifications:
Only try to use javax.naming.* to retrieve nameservers when not using Android.
Result:
Fixes https://github.com/netty/netty/issues/8654.
Motivation:
ByteBuf supports “marker indexes”. The intended use case for these is if a speculative operation (e.g. decode) is in process the user can “mark” and interface and refer to it later if the operation isn’t successful (e.g. not enough data). However this is rarely used in practice,
requires extra memory to maintain, and introduces complexity in the state management for derived/pooled buffer initialization, resizing, and other operations which may modify reader/writer indexes.
Modifications:
Remove support for marking and adjust testcases / code.
Result:
Fixes https://github.com/netty/netty/issues/8535.
Motivation:
Most of the maven modules do not explicitly declare their
dependencies and rely on transitivity, which is not always correct.
Modifications:
For all maven modules, add all of their dependencies to pom.xml
Result:
All of the (essentially non-transitive) depepdencies of the modules are explicitly declared in pom.xml
Motivation:
Some of transports support gathering writes when using datagrams. For example this is the case for EpollDatagramChannel. We should minimize the calls to flush() to allow making efficient usage of sendmmsg in this case.
Modifications:
- minimize flush() operations when we query for multiple address types.
- reduce GC by always directly schedule doResolveAll0(...) on the EventLoop.
Result:
Be able to use sendmmsg internally in the DnsNameResolver.
Motivation:
We should refresh the DNS configuration each 5 minutes to be able to detect changes done by the user. This is inline with what OpenJDK is doing
Modifications:
Refresh config every 5 minutes.
Result:
Be able to consume changes made by the user.
Motivation:
It should be possible to build a DnsNameResolver with a null resolvedAddressTypes, defaulting then to DEFAULT_RESOLVE_ADDRESS_TYPES (see line 309).
Sadly, `preferredAddressType` is then called on line 377 with the original parameter instead of the instance attribute, causing an NPE when it's null.
Modification:
Call preferredAddressType with instance attribuet instead of constructor parameter.
Result:
No more NPE
Motivation:
ba594bcf4a added a utility to parse searchdomains defined in /etc/resolv.conf but did not correctly handle the case when multiple are defined that are seperated by either whitespace or tab.
Modifications:
- Correctly parse multiple entries
- Add unit test.
Result:
Correctly parse multiple searchdomain entries.
* Use AuthoritativeDnsServerCache for creating the new redirect stream.
Motivation:
At the moment if a user wants to provide custom sorting of the nameservers used for redirects it needs to be implemented in two places. This is more complicated as it needs to be.
Modifications:
- Just delegate to the AuthoritativeDnsServerCache always as we fill it before we call newRedirectDnsServerStream anyway.
Result:
Easier way for the user to implement custom sorting.
* Add cache for CNAME mappings resolved during lookup of DNS entries.
Motivation:
If the CNAMEd hostname is backed by load balancing component, typically the final A or AAAA DNS records have small TTL. However, the CNAME record itself is setup with longer TTL.
For example:
* x.netty.io could be CNAMEd to y.netty.io with TTL of 5 min
* A / AAAA records for y.netty.io has a TTL of 0.5 min
In current Netty implementation, original hostname is saved in resolved cached with the TTL of final A / AAAA records. When that cache entry expires, Netty recursive resolver sends at least two queries — 1st one to be resolved as CNAME record and the 2nd one to resolve the hostname in CNAME record.
If CNAME record was cached, only the 2nd query would be needed most of the time. 1st query would be needed less frequently.
Modifications:
Add a new CnameCache that will be used to cache CNAMEs and so may reduce queries.
Result:
Less queries needed when CNAME is used.
Motivation
Applications should not depend on internal packages with Java 9 and later. This cause a warning now, but will break in future versions of Java.
Modification
This change adds methods to UnixResolverDnsServerAddressStreamProvider (following after #6844) that parse /etc/resolv.conf for domain and search entries. Then DnsNameResolver does not need to rely on sun.net.dns.ResolverConfiguration to do this.
Result
Fixes#8318. Furthermore, at least in my testing with Java 11, this also makes multiple search entries work properly (previously I was only getting the first entry).