Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vertx-http event loop threads may be blocked from Camel error handler #6856

Open
janwesterkamp opened this issue Dec 6, 2024 · 18 comments
Open
Assignees
Labels
area/vertx-http bug Something isn't working

Comments

@janwesterkamp
Copy link

Bug description

I am using Camel Quarkus (latest/3.16.0) with REST DSL defined REST API in combination with some long running, blocking routes (usind JNI) when using HTTP PUT and in some configurations on HTTP POST too, but not on GET and HEAD. With (Camel) Quarkus REST switching to the reacive version of RestEasy by default, this generates warning exceptions about blocking thread(s).

I tried to use the @Blocking annotation for the REST DSL configuration/definition method or using .outType(String.class) in the fluent API to enforce the usage of a worker thread - without success.
I also could not find any documentation how to control the tread type in REST DSL, i.e. to use a worker tread or a virtual thread instead of the default I/O tread.

Is there a way to control the thread type?
If not, this should be fixed in my opinion, as potential workarounds like declaring a Jakarta EE REST API to set the return type or add the annotation and start Camel routes from there would be worse and switching back to RestEasy Classic seams to be difficult too (and would turn all my entry points to not make benefit of reactive behaviour or virtual threads).

I think, a REST DSL way to control the entry method thread type is missing and/or the documentation about how to do it or at least how to work around it in the context of Camel Quarkus.

@janwesterkamp janwesterkamp added the bug Something isn't working label Dec 6, 2024
@jamesnetherton
Copy link
Contributor

Can you share some example code of what you're trying to do?

If you're using the REST DSL in conjunction with the camel-quarkus-platform-http extension, then all Camel routing is done on a thread from the worker pool.

@janwesterkamp
Copy link
Author

In the comments you can see my two tries to enforce a worker thread:

    //@Blocking
    private void createAPIRestAdapter() {
        restConfiguration()
          .clientRequestValidation(true)
          .enableCORS(true)
          .inlineRoutes(false)
          .apiContextPath("/api/openapi")
          .apiProperty("api.title", "Data Processing API")
          .apiProperty("api.version","0.1.0")
          .apiProperty("api.description", "REST API.");
       
        rest("/api")
           .put("/{productId}").description("Creates an image from product ID")
            .id("data-processing-put")
            //.outType(String.class)
            .param().name("productId").type(RestParamType.path).description("Product ID.").dataType("String").endParam()
            .param().name("Accept").type(RestParamType.header).defaultValue(MediaType.IMAGE_PNG).description("Image type").endParam()
            .produces(MediaType.IMAGE_PNG)
            .responseMessage().code(200).message(HttpStatusCode._200_MESSAGE).endResponseMessage()
            .responseMessage().code(201).message(HttpStatusCode._201_MESSAGE).endResponseMessage()
            .responseMessage().code(204).message(HttpStatusCode._204_MESSAGE).endResponseMessage()
            .responseMessage().code(400).message(HttpStatusCode._400_MESSAGE).endResponseMessage()
            .responseMessage().code(404).message(HttpStatusCode._404_MESSAGE).endResponseMessage()
            .responseMessage().code(405).message(HttpStatusCode._405_MESSAGE).endResponseMessage()
            .responseMessage().code(406).message(HttpStatusCode._406_MESSAGE).endResponseMessage()
            .responseMessage().code(415).message(HttpStatusCode._415_MESSAGE).endResponseMessage()
            .responseMessage().code(500).message(HttpStatusCode._500_MESSAGE).endResponseMessage()
            .responseMessage().code(501).message(HttpStatusCode._501_MESSAGE).endResponseMessage()
            .to(DIRECT_PUT_MAIN_ROUTE)
          .get("/{productId}/{dataType}").description("Fetches an image from product ID.")
            .id("data-processing-get")
            .param().name("productId").type(RestParamType.path).description("Product ID.").dataType("String").endParam()
            .param().name("Accept").type(RestParamType.header).defaultValue(MediaType.IMAGE_PNG).description("Image type").endParam()
            .produces(MediaType.IMAGE_PNG)
            .responseMessage().code(200).message(HttpStatusCode._200_MESSAGE).endResponseMessage()
            .responseMessage().code(204).message(HttpStatusCode._204_MESSAGE).endResponseMessage()
            .responseMessage().code(400).message(HttpStatusCode._400_MESSAGE).endResponseMessage()
            .responseMessage().code(404).message(HttpStatusCode._404_MESSAGE).endResponseMessage()
            .responseMessage().code(405).message(HttpStatusCode._405_MESSAGE).endResponseMessage()
            .responseMessage().code(406).message(HttpStatusCode._406_MESSAGE).endResponseMessage()
            .responseMessage().code(415).message(HttpStatusCode._415_MESSAGE).endResponseMessage()
            .responseMessage().code(500).message(HttpStatusCode._500_MESSAGE).endResponseMessage()
            .to(DIRECT_GET_MAIN_ROUTE)
        ;
    }

To keep it simple, I omittet some complexitiy of my use case like POST and HEAD and other variants of the API.
But in the best case, having a way to only asign a worker thread to the PUT (not in GET) would be nice to benefit from reactivity in the GET calls.

I will test to add camel-quarkus-platform-http - instead of the camel-quarkus-vertx-http extension, right?
These are a shortend list of my enxtensions incurrently in use:

 ...
  <properties>
    <compiler-plugin.version>3.13.0</compiler-plugin.version>
    <maven.compiler.release>21</maven.compiler.release>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    <quarkus.platform.artifact-id>quarkus-bom</quarkus.platform.artifact-id>
    <quarkus.platform.group-id>io.quarkus.platform</quarkus.platform.group-id>
    <quarkus.platform.version>3.17.3</quarkus.platform.version>
    <surefire-plugin.version>3.5.0</surefire-plugin.version>
    <jandex-plugin.version>3.2.0</jandex-plugin.version>
  </properties>
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>${quarkus.platform.group-id}</groupId>
        <artifactId>${quarkus.platform.artifact-id}</artifactId>
        <version>${quarkus.platform.version}</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
      <dependency>
        <groupId>${quarkus.platform.group-id}</groupId>
        <artifactId>quarkus-camel-bom</artifactId>
        <version>${quarkus.platform.version}</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.camel.quarkus</groupId>
      <artifactId>camel-quarkus-log</artifactId>
    </dependency>
    <dependency>
      <groupId>org.apache.camel.quarkus</groupId>
      <artifactId>camel-quarkus-direct</artifactId>
    </dependency>
    <dependency>
      <groupId>org.apache.camel.quarkus</groupId>
      <artifactId>camel-quarkus-bean</artifactId>
    </dependency>
    <dependency>
      <groupId>org.apache.camel.quarkus</groupId>
      <artifactId>camel-quarkus-openapi-java</artifactId>
    </dependency>
    <dependency>
      <groupId>org.apache.camel.quarkus</groupId>
      <artifactId>camel-quarkus-rest</artifactId>
    </dependency>
    <dependency>
      <groupId>org.apache.camel.quarkus</groupId>
      <artifactId>camel-quarkus-vertx-http</artifactId>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-arc</artifactId>
    </dependency>
    <!-- CamelQuarkusTestSupport-->
    <dependency>
      <groupId>org.apache.camel.quarkus</groupId>
      <artifactId>camel-quarkus-junit5</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-junit5</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>io.rest-assured</groupId>
      <artifactId>rest-assured</artifactId>
      <scope>test</scope>
    </dependency>
  </dependencies>
 ...

@jamesnetherton
Copy link
Contributor

I will test to add camel-quarkus-platform-http - instead of the camel-quarkus-vertx-http extension, right?

Looks like you don't need to change anything. You get camel-quarkus-platform-http with camel-quarkus-rest. It's the HTTP consumer. camel-quarkus-vertx-http is the HTTP producer - E.g invoking an HTTP service.

What do the direct endpoints do? Do they use vertx-http endpoints to invoke an HTTP endpoint that is slow to respond?

@janwesterkamp
Copy link
Author

janwesterkamp commented Dec 6, 2024

I will test to add camel-quarkus-platform-http - instead of the camel-quarkus-vertx-http extension, right?

Looks like you don't need to change anything. You get camel-quarkus-platform-http with camel-quarkus-rest. It's the HTTP consumer. camel-quarkus-vertx-http is the HTTP producer - E.g invoking an HTTP service.

What do the direct endpoints do? Do they use vertx-http endpoints to invoke an HTTP endpoint that is slow to respond?

Yes, in the case of the PUT method, data is fetched via REST from another microservice and processed with a long running JNI call for a result - so I might need to keep that.

But just adding the camel-quarkus-platform-http additionally the problem with log enties like
WARN [io.ver.cor.imp.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-eventloop-thread-2,5,main] has been blocked for 2098 ms, time limit is 2000 ms: io.vertx.core.VertxException: Thread blocked
remains, unfortunatelly.

@jamesnetherton
Copy link
Contributor

If possible, try swapping out camel-quarkus-vertx-http for camel-quarkus-http.

If the issue stops happening, then we know the problem is likely isolated to the vertx-http component.

@janwesterkamp
Copy link
Author

If possible, try swapping out camel-quarkus-vertx-http for camel-quarkus-http.

If the issue stops happening, then we know the problem is likely isolated to the vertx-http component.

I removed the additional camel-quarkus-platform-http extension first, then also replaced the camel-quarkus-vertx-http with the camel-quarkus-http extension - now it works! Nice shot!

So camel-quarkus-vertx-http is responsible for the warnings.

@jamesnetherton
Copy link
Contributor

So camel-quarkus-vertx-http is responsible for the warnings.

Thanks for confirming 👍.

I'll try to take a look at reproducing & fixing the problem next week.

@janwesterkamp
Copy link
Author

So camel-quarkus-vertx-http is responsible for the warnings.

Thanks for confirming 👍.

I'll try to take a look at reproducing & fixing the problem next week.

Thanks a lot so far for beeing so responsive, @jamesnetherton - have a nice weekend!

@jamesnetherton jamesnetherton self-assigned this Dec 9, 2024
@janwesterkamp
Copy link
Author

Hi @jamesnetherton,
I did further integration testing with variants using the mostly integration tests from the according test suite (+1000 tests, full test run takes about 2h):

Adding camel-quarkus-platform-http even as first Maven dependency does not change anything in any combination of extensions.

Replacing the camel-quarkus-vertx-http with the camel-quarkus-http extension removes the warnings from the blog, but the tests start to fail after approximately 10 minutes. Some of the PUT tests fail because of not having the Content-Lenght header set (null instead of the response body size) and others fail because of returning HTTP Status Code 500.

Setting the Content-Lenght header manually failed for other methods like POST, GET and HEAD in the past and setting that manually did not succeed - so I deactivated this check on these methods already and calclate the value from the response body only for now. However, for PUT it works with camel-quarkus-vertx-http out of the box, but with camel-quarkus-http it fails too. May be this is a separate issue...

I have more concerns about the internal server errors (HTTP Status Code 500) - this forces me to stay with camel-quarkus-vertx-http and a bloaded log for now.

So it looks like the log messages came from the use of camel-quarkus-vertx-http, but the issue is more complex...

@jamesnetherton
Copy link
Contributor

I tried briefly to reproduce the issue with camel-quarkus-vertx-http last Friday but was not able to.

Can you explain a bit more about your usage of vertx-http?

  • Do you send large payloads or get a large response?
  • Do you use the vertxHttpBinding option?

@janwesterkamp
Copy link
Author

Yes, i.e.in the PUT calls I do process large source images, fetched form another service, to a result image and they can have multiple 100 MB (at the moment up to ca 500 MB, may be 2 GB in the future of remote sensing data). So loading and processing them can take time, even when they run on the same node and cached locally. I am using an local object store - MinIO via camel-quarkus-minio and quarkus-minio extensions in my local environment and the corresponding GET, HEAD and in some configurations the POST methods just fetch from the object stoe the cached result and return it to the caller. I ommitted that detail for simplicity before.

I do not make use of a custom vertxHttpBinding at the moment.

@jamesnetherton
Copy link
Contributor

It'd help a lot if you could capture the stack trace for where the event loop is being blocked.

This should enable more detail to be output from the BlockedThreadChecker:

void onComponentAdd(@Observes ComponentAddEvent event) {
    if (event.getComponent() instanceof VertxHttpComponent component) {
        VertxOptions vertxOptions = new VertxOptions();
        vertxOptions.setWarningExceptionTime(TimeUnit.SECONDS.toNanos(1L));
        component.setVertxOptions(vertxOptions);
    }
}

@jamesnetherton
Copy link
Contributor

Actually, ignore the previous comment, I think it just needs a config option:

quarkus.vertx.warning-exception-time = 1s

@janwesterkamp
Copy link
Author

@jamesnetherton,
from my higer level view, it looks like the Camel Quarkus camel-quarkus-rest usage defaults to use a reactive version/configuration of RESTEasy and adding camel-quarkus-platform-http has not effect at all.

When doing REST client calls via camel-quarkus-http resources get blocked, which results in test failures when the thread limit is reached.

When doing REST client calls via camel-quarkus-vertx-http resources get blocked too, but they are handeled in some way including writing warings into the log. Blocking could result from long runing IO fetcheing the source data and/or with the JNI.

Here is the result - GDAL is the processing library getting called via JNI and there the warning exceptions starts:

2024-12-09 13:01:57,577 WARN  [io.ver.cor.imp.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-eventloop-thread-2,5,main] has been blocked for 2622 ms, time limit is 2000 ms: io.vertx.core.VertxException: Thread blocked
	at org.gdal.gdal.gdalJNI.AllRegister(Native Method)
	at org.gdal.gdal.gdal.AllRegister(gdal.java:684)
	at eu.sensoraktor.platform.service.data.processing.CiUtmJp2ToWgs84.process(CiUtmJp2ToWgs84.java:31)
	at eu.sensoraktor.platform.service.data.processing.Routes$29.process(Routes.java:2566)
	at org.apache.camel.support.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:65)
	at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler$RedeliveryTask.doRun(RedeliveryErrorHandler.java:808)
	at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler$RedeliveryTask.run(RedeliveryErrorHandler.java:714)
	at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.doRun(DefaultReactiveExecutor.java:199)
	at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.executeReactiveWork(DefaultReactiveExecutor.java:189)
	at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.tryExecuteReactiveWork(DefaultReactiveExecutor.java:166)
	at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.schedule(DefaultReactiveExecutor.java:148)
	at org.apache.camel.impl.engine.DefaultReactiveExecutor.schedule(DefaultReactiveExecutor.java:54)
	at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler$RedeliveryTask.lambda$doRun$1(RedeliveryErrorHandler.java:813)
	at org.apache.camel.component.vertx.http.VertxHttpProducer.lambda$createResultHandler$1(VertxHttpProducer.java:132)
	at io.vertx.ext.web.client.impl.HttpContext.handleDispatchResponse(HttpContext.java:402)
	at io.vertx.ext.web.client.impl.HttpContext.execute(HttpContext.java:384)
	at io.vertx.ext.web.client.impl.HttpContext.next(HttpContext.java:362)
	at io.vertx.ext.web.client.impl.HttpContext.fire(HttpContext.java:329)
	at io.vertx.ext.web.client.impl.HttpContext.dispatchResponse(HttpContext.java:291)
	at io.vertx.ext.web.client.impl.HttpContext.lambda$null$7(HttpContext.java:512)
	at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:270)
	at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:252)
	at io.vertx.core.impl.ContextInternal.lambda$runOnContext$0(ContextInternal.java:50)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:566)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:1583)

It also looks like the configuration is ignored - it says 2000ms instead of 1s configured...

@jamesnetherton
Copy link
Contributor

Ok - so it's blocked from the error handler. Need to think how the component should handle that...

@jamesnetherton
Copy link
Contributor

This will be fixed in Camel 4.8.3 / Camel 4.10.0.

@janwesterkamp
Copy link
Author

Great & many thanks!

@jamesnetherton should I open a separate isssue for the Content-Length header adding problem - then I will do that soon and describe it in more detail there...

@jamesnetherton
Copy link
Contributor

should I open a separate isssue for the Content-Length header adding problem

Yes, please. Either here or over in the Camel core project (I assume it's a pure Camel issue unrelated to Quarkus):

https://issues.apache.org/

@jamesnetherton jamesnetherton changed the title Camel REST DSL misses control about reactive/virtual thread type usage vertx-http event loop threads may be blocked from Camel error handler Jan 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/vertx-http bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants