Reputation: 11
We have an API Gateway that uses Keycloak to verify a given access token. Now everything works fine, except for when Keycloak cannot be reached. If I manually turn off Keycloak after the Gateway has started and I try to use my access token, then it tries to authenticate for 3 to 5 seconds after which it throws an error in the console and returns a 500 error response.
There are three things I would like to manage here. Firstly I want to make sure it doesn't take 3 to 5 seconds for each request, since the long duration could make a DDoS attack way easier. Secondly I want to log a custom message in the console and not the entire error that is thrown as shown below. Thirdly I would like to return a custom message to the user instead of just a 500 error, but that isn't a must.
This is the error I get in the console when Keycloak is offline:
ERROR 93 --- [gateway] [or-http-epoll-1] a.w.r.e.AbstractErrorWebExceptionHandler : [4e3e745d] 500 Server Error for HTTP GET "/user"
java.lang.IllegalStateException: Could not obtain the keys
at org.springframework.security.oauth2.jwt.NimbusReactiveJwtDecoder$JwkSetUriReactiveJwtDecoderBuilder.lambda$processor$12(NimbusReactiveJwtDecoder.java:448) ~[spring-security-oauth2-jose-6.3.1.jar:6.3.1]
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
*__checkpoint ⇢ org.springframework.security.web.server.authentication.AuthenticationWebFilter [DefaultWebFilterChain]
*__checkpoint ⇢ org.springframework.security.web.server.context.ReactorContextWebFilter [DefaultWebFilterChain]
*__checkpoint ⇢ org.springframework.security.web.server.header.HttpHeaderWriterWebFilter [DefaultWebFilterChain]
*__checkpoint ⇢ org.springframework.security.config.web.server.ServerHttpSecurity$ServerWebExchangeReactorContextWebFilter [DefaultWebFilterChain]
*__checkpoint ⇢ org.springframework.security.web.server.WebFilterChainProxy [DefaultWebFilterChain]
*__checkpoint ⇢ HTTP GET "/user" [ExceptionHandlingWebHandler]
Original Stack Trace:
at org.springframework.security.oauth2.jwt.NimbusReactiveJwtDecoder$JwkSetUriReactiveJwtDecoderBuilder.lambda$processor$12(NimbusReactiveJwtDecoder.java:448) ~[spring-security-oauth2-jose-6.3.1.jar:6.3.1]
at reactor.core.publisher.Mono.lambda$onErrorMap$28(Mono.java:3854) ~[reactor-core-3.6.7.jar:3.6.7]
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:94) ~[reactor-core-3.6.7.jar:3.6.7]
...
Caused by: org.springframework.web.reactive.function.client.WebClientRequestException: Failed to resolve 'project-keycloak' [A(1)]
at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:136) ~[spring-webflux-6.1.10.jar:6.1.10]
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
*__checkpoint ⇢ Request to GET http://project-keycloak:8080/realms/test/protocol/openid-connect/certs [DefaultWebClient]
Original Stack Trace:
at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:136) ~[spring-webflux-6.1.10.jar:6.1.10]
at reactor.core.publisher.MonoErrorSupplied.subscribe(MonoErrorSupplied.java:55) ~[reactor-core-3.6.7.jar:3.6.7]
at reactor.core.publisher.Mono.subscribe(Mono.java:4568) ~[reactor-core-3.6.7.jar:3.6.7]
...
Caused by: java.net.UnknownHostException: Failed to resolve 'project-keycloak' [A(1)]
at io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1151) ~[netty-resolver-dns-4.1.111.Final.jar:4.1.111.Final]
at io.netty.resolver.dns.DnsResolveContext.tryToFinishResolve(DnsResolveContext.java:1098) ~[netty-resolver-dns-4.1.111.Final.jar:4.1.111.Final]
at io.netty.resolver.dns.DnsResolveContext.query(DnsResolveContext.java:457) ~[netty-resolver-dns-4.1.111.Final.jar:4.1.111.Final]
at io.netty.resolver.dns.DnsResolveContext.onResponse(DnsResolveContext.java:688) ~[netty-resolver-dns-4.1.111.Final.jar:4.1.111.Final]
at io.netty.resolver.dns.DnsResolveContext.access$500(DnsResolveContext.java:69) ~[netty-resolver-dns-4.1.111.Final.jar:4.1.111.Final]
at io.netty.resolver.dns.DnsResolveContext$2.operationComplete(DnsResolveContext.java:515) ~[netty-resolver-dns-4.1.111.Final.jar:4.1.111.Final]
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:625) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:105) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
at io.netty.resolver.dns.DnsQueryContext.trySuccess(DnsQueryContext.java:345) ~[netty-resolver-dns-4.1.111.Final.jar:4.1.111.Final]
at io.netty.resolver.dns.DnsQueryContext.finishSuccess(DnsQueryContext.java:336) ~[netty-resolver-dns-4.1.111.Final.jar:4.1.111.Final]
at io.netty.resolver.dns.DnsNameResolver$DnsResponseHandler.channelRead(DnsNameResolver.java:1384) ~[netty-resolver-dns-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.111.Final.jar:4.1.111.Final]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) ~[netty-codec-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1407) ~[netty-transport-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[netty-transport-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:918) ~[netty-transport-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.epoll.EpollDatagramChannel.processPacket(EpollDatagramChannel.java:662) ~[netty-transport-classes-epoll-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.epoll.EpollDatagramChannel.recvmsg(EpollDatagramChannel.java:697) ~[netty-transport-classes-epoll-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.epoll.EpollDatagramChannel.access$300(EpollDatagramChannel.java:56) ~[netty-transport-classes-epoll-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.epoll.EpollDatagramChannel$EpollDatagramChannelUnsafe.epollInReady(EpollDatagramChannel.java:536) ~[netty-transport-classes-epoll-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:501) ~[netty-transport-classes-epoll-4.1.111.Final.jar:4.1.111.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:399) ~[netty-transport-classes-epoll-4.1.111.Final.jar:4.1.111.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:994) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na]
Caused by: io.netty.resolver.dns.DnsErrorCauseException: Query failed with NXDOMAIN
at io.netty.resolver.dns.DnsResolveContext.onResponse(..)(Unknown Source) ~[netty-resolver-dns-4.1.111.Final.jar:4.1.111.Final]
I tried implementing a controller advice with an exception handler, but that doesn't work since it is an API Gateway.
I also tried implementing a custom ServerAuthenticationEntryPoint, but the commence method does not seem to execute whenever an error is thrown. Doesn't matter whether I put it in the oauth2ResourceServer part of my security config or the exceptionHandling part.
Does anyone know how I can handle the scenario when Keycloak is offline or maybe give me some advice on the best way to use Keycloak with Spring Boot.
Upvotes: 0
Views: 95
Reputation: 11
A coworker of mine came with the idea of adding a health check in every request using the Keycloak API health endpoint. This could be a good idea, but personally I do not like the extra latency in every request for a scenario that does not have a high chance of happening.
After some consideration we came to the conclusion that the best way to handle this, is by making sure the cloud environment is setup in a way that allows for high availability and quick recovery so that this has a very small chance of ever being an issue.
However, if anyone knows how to handle this error, then we would still like to know so that we could handle the small chance of the Gateway not being able to use Keycloak.
Upvotes: 0