Shoner Sul

Reputation: 127

java.util.concurrent.TimeoutException in Spring Cloud Gateway Circuit Breaker

After successfully setting up routing in Spring Cloud Gateway, I tried to add a Circuit Breaker to handle connection timeouts to downstream services. However, it behaves somewhat differently from what I expect.

Version info (I have to use Java 8):

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.7.3</version>
    </parent>

    <properties>
        <java.version>1.8</java.version>
        <spring-cloud.version>2021.0.5</spring-cloud.version>
    </properties>

application.yml to reproduce the situation:

spring:
  cloud:
    gateway:
      routes:
        - id: test
          order: 0
          uri: http://{ANY_SITE_CAUSING_CONNECTION_TIMEOUT}:8080/
          predicates:
            - Path=/**
          filters:
            - name: CircuitBreaker
              args:
                name: cbTest
                fallbackUri: forward:/fallback

management:
  # (actuator settings for checking circuit breaker status omitted)

resilience4j:
  circuitbreaker:
    configs:
      default:
        registerHealthIndicator: true
        minimumNumberOfCalls: 10
    instances:
      cbTest:
        baseConfig: default

With a fallback method to see the exception:

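import java.util.Objects;

import org.springframework.cloud.gateway.support.ServerWebExchangeUtils;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Mono;

@RestController
@RequestMapping("/fallback")
public class FallbackController {
    @GetMapping
    public Mono<Void> fallback(ServerWebExchange exchange) {
        // The CircuitBreaker filter stores the exception that triggered the fallback in this attribute.
        Throwable t = exchange.getAttribute(ServerWebExchangeUtils.CIRCUITBREAKER_EXECUTION_EXCEPTION_ATTR);
        Objects.requireNonNull(t).printStackTrace();
        // Complete with an empty Mono rather than returning null from a reactive handler.
        return Mono.empty();
    }
}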

This results in an unexpected java.util.concurrent.TimeoutException, shown below, and the call finishes in 1 second instead of waiting for the connection attempt to time out.

java.util.concurrent.TimeoutException: Did not observe any item or terminal signal within 1000ms in 'circuitBreaker' (and no fallback has been configured)
    at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.handleTimeout(FluxTimeout.java:295)
    at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.doTimeout(FluxTimeout.java:280)
    at reactor.core.publisher.FluxTimeout$TimeoutTimeoutSubscriber.onNext(FluxTimeout.java:419)
    at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)
    at reactor.core.publisher.MonoDelay$MonoDelayRunnable.propagateDelay(MonoDelay.java:271)
    at reactor.core.publisher.MonoDelay$MonoDelayRunnable.run(MonoDelay.java:286)
    at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
    at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
    at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
    at java.util.concurrent.FutureTask.run(FutureTask.java)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

The Circuit Breaker also records the corresponding event:

http://localhost:8080/actuator/circuitbreakerevents

{
    "circuitBreakerEvents": [
        {
            "circuitBreakerName": "cbTest",
            "type": "ERROR",
            "creationTime": "2023-02-14T10:21:12.633+09:00[Asia/Seoul]",
            "errorMessage": "java.util.concurrent.TimeoutException: Did not observe any item or terminal signal within 1000ms in 'circuitBreaker' (and no fallback has been configured)",
            "durationInMs": 1000,
            "stateTransition": null
        }
    ]
}

Without the "filters" part of the route config, I see a connection timeout, and the client tries to connect for 30 seconds as expected. So I suspect the CircuitBreaker filter is the cause.

io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection timed out: no further information: /{ANY_SITE_CAUSING_CONNECTION_TIMEOUT}:8080
    Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException: 
Error has been observed at the following site(s):
    *__checkpoint ⇢ org.springframework.cloud.gateway.filter.WeightCalculatorWebFilter [DefaultWebFilterChain]
    *__checkpoint ⇢ org.springframework.boot.actuate.metrics.web.reactive.server.MetricsWebFilter [DefaultWebFilterChain]
    *__checkpoint ⇢ HTTP GET "/x" [ExceptionHandlingWebHandler]
Original Stack Trace:
Caused by: java.net.ConnectException: Connection timed out: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_292]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) ~[na:1.8.0_292]
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337) ~[netty-transport-4.1.79.Final.jar:4.1.79.Final]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334) ~[netty-transport-4.1.79.Final.jar:4.1.79.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:710) ~[netty-transport-4.1.79.Final.jar:4.1.79.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658) ~[netty-transport-4.1.79.Final.jar:4.1.79.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584) ~[netty-transport-4.1.79.Final.jar:4.1.79.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) ~[netty-transport-4.1.79.Final.jar:4.1.79.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.79.Final.jar:4.1.79.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.79.Final.jar:4.1.79.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.79.Final.jar:4.1.79.Final]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_292]

I wonder why the Netty exception is replaced by java.util.concurrent.TimeoutException after 1000 ms. I tried setting spring.cloud.gateway.httpclient.connect-timeout and spring.cloud.gateway.httpclient.response-timeout to larger values, but the result did not change. Per-route timeout settings did not help either.

Upvotes: 2

Views: 2206

Answers (1)

Ahmed

Reputation: 81

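The 1000 ms limit comes from the Resilience4j TimeLimiter that the CircuitBreaker filter applies (its default timeout-duration is 1 second), not from the gateway's httpclient timeouts. For me, I just used one of the following time-limiter configurations: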

resilience4j:
   timelimiter:
     configs:
       default:
         timeout-duration: 5s
         cancel-running-future: false

or

@Bean
public TimeLimiterRegistry timeLimiterRegistry() {
    return TimeLimiterRegistry.of(TimeLimiterConfig.custom()
            .timeoutDuration(Duration.ofSeconds(5))
            .cancelRunningFuture(false)
            .build());
}

or

@Bean
public Customizer<ReactiveResilience4JCircuitBreakerFactory> defaultCustomizer() {
    return factory -> factory.configureDefault(id -> new Resilience4JConfigBuilder(id)
            .timeLimiterConfig(TimeLimiterConfig.custom()
                    .timeoutDuration(Duration.ofSeconds(5))
                    .cancelRunningFuture(false)
                    .build())
            .build());
}
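
To illustrate why the connection error in the question never reaches the fallback, here is a minimal plain-Java sketch of the same race. It is not the gateway's actual code path: the gateway applies the limit through Reactor's timeout operator, while this sketch swaps in Future.get with a timeout so it runs on Java 8 without any Spring, Reactor, or Resilience4j dependency. The class name TimeLimitDemo, the helpers slowConnect and observedFailure, and the scaled-down durations are all assumptions made up for the illustration.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeLimitDemo {

    // Hypothetical stand-in for a downstream call whose connect attempt
    // fails with ConnectException after connectMillis.
    public static Callable<String> slowConnect(long connectMillis) {
        return () -> {
            Thread.sleep(connectMillis);
            throw new ConnectException("Connection timed out");
        };
    }

    // Runs the call under a time limit and returns the class name of the
    // exception the caller observes (or the result if the call succeeds).
    public static String observedFailure(Callable<String> call, long limitMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = executor.submit(call);
            try {
                return future.get(limitMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The limit fired first, so the underlying error is never seen.
                future.cancel(true);
                return e.getClass().getName();
            } catch (ExecutionException e) {
                // The call failed within the limit, so its own error surfaces.
                return e.getCause().getClass().getName();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return e.getClass().getName();
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Limit (100 ms) shorter than the connect failure (500 ms).
        System.out.println(observedFailure(slowConnect(500), 100));
        // prints java.util.concurrent.TimeoutException

        // Limit (2000 ms) longer than the connect failure (100 ms).
        System.out.println(observedFailure(slowConnect(100), 2000));
        // prints java.net.ConnectException
    }
}
```

By the same reasoning, the time-limiter timeout would need to exceed the connect timeout (30 seconds in the question) for the ConnectException to show up as the recorded error instead of the TimeoutException.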

Upvotes: 0
