Alejandro

Reputation: 213

How to properly compile/package a Task for Spring Cloud Data Flow

I compiled the following example from the Spring Cloud Task Samples in IntelliJ IDEA 2020.1.3, because I need to use multiple datasources: https://github.com/mminella/spring-cloud-task/tree/master/spring-cloud-task-samples/multiple-datasources

I then packaged it as a JAR using mvn package, copied it to the directory from which I launched docker-compose (as per the official Spring Cloud Data Flow (SCDF) instructions for local deployment), and registered it by running the following command inside the SCDF shell: app register --type task --name multiple-datasources --uri file://root/scdf/multiple-datasources-2.3.0-SNAPSHOT.jar.

I then created the task in SCDF as shown:

[Screenshot: Creating the multiple-datasources task in SCDF 1/2]

[Screenshot: Creating the multiple-datasources task in SCDF 2/2]

When I try to run the task from the dashboard, the execution instance shows no start time or end time, and none appears anywhere in the dashboard.

The log retrieved from the SCDF shell with task execution log <instance> shows many errors. This seems to be the most relevant part:

2020-07-14 02:38:14.403  INFO 63 --- [           main] i.spring.MultipleDataSourcesApplication  : Starting MultipleDataSourcesApplication v2.3.0-SNAPSHOT on 5856acfa7c62 with PID 63 (/root/scdf/multiple-datasources-2.3.0-SNAPSHOT.jar started by root in /tmp/289541567048/multiple-datasources-9c75a131-4ea9-40ff-ac42-44729162e6f5)
2020-07-14 02:38:14.407  INFO 63 --- [           main] i.spring.MultipleDataSourcesApplication  : No active profile set, falling back to default profiles: default
2020-07-14 02:38:17.242  INFO 63 --- [           main] o.s.j.d.e.EmbeddedDatabaseFactory        : Starting embedded database: url='jdbc:h2:mem:testdb;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=false', username='sa'
2020-07-14 02:38:18.145  INFO 63 --- [           main] o.s.j.d.e.EmbeddedDatabaseFactory        : Starting embedded database: url='jdbc:hsqldb:mem:testdb', username='sa'
2020-07-14 02:38:18.810 DEBUG 63 --- [           main] o.s.c.t.c.SimpleTaskAutoConfiguration    : Using io.spring.configuration.CustomTaskConfigurer TaskConfigurer
2020-07-14 02:38:18.823 DEBUG 63 --- [           main] o.s.c.t.c.DefaultTaskConfigurer          : No EntityManager was found, using DataSourceTransactionManager
2020-07-14 02:38:18.928 DEBUG 63 --- [           main] o.s.c.t.r.s.TaskRepositoryInitializer    : Initializing task schema for h2 database
2020-07-14 02:38:19.036 ERROR 63 --- [           main] o.s.c.t.listener.TaskLifecycleListener   : An event to end a task has been received for a task that has not yet started.
2020-07-14 02:38:19.036  WARN 63 --- [           main] s.c.a.AnnotationConfigApplicationContext : Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to start bean 'taskLifecycleListener'; nested exception is java.lang.IllegalArgumentException: Invalid TaskExecution, ID 31 not found
2020-07-14 02:38:19.036  INFO 63 --- [           main] o.s.j.d.e.EmbeddedDatabaseFactory        : Shutting down embedded database: url='jdbc:h2:mem:testdb;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=false'
2020-07-14 02:38:19.245  INFO 63 --- [           main] o.s.j.d.e.EmbeddedDatabaseFactory        : Shutting down embedded database: url='jdbc:hsqldb:mem:testdb'
2020-07-14 02:38:19.258 ERROR 63 --- [           main] o.s.c.t.listener.TaskLifecycleListener   : An event to end a task has been received for a task that has not yet started.
2020-07-14 02:38:19.264  INFO 63 --- [           main] ConditionEvaluationReportLoggingListener : 

Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.
2020-07-14 02:38:19.273 ERROR 63 --- [           main] o.s.boot.SpringApplication               : Application run failed

The following errors in particular stand out to me:

2020-07-14 02:38:19.036  WARN 63 --- [           main] s.c.a.AnnotationConfigApplicationContext : Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to start bean 'taskLifecycleListener'; nested exception is java.lang.IllegalArgumentException: Invalid TaskExecution, ID 31 not found
2020-07-14 02:38:19.258 ERROR 63 --- [           main] o.s.c.t.listener.TaskLifecycleListener   : An event to end a task has been received for a task that has not yet started.
2020-07-14 02:38:19.264  INFO 63 --- [           main] ConditionEvaluationReportLoggingListener : 

Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.

The example has the following lines in application.properties:

spring.application.name=Demo Multiple DataSources Task
logging.level.org.springframework.cloud.task=DEBUG

so if I'm not mistaken, debug should already be enabled.

Concretely my questions are:

1) What might I be overlooking or doing wrong, given that this is an example and it's not running even without modifications?

2) What can I do to properly enable DEBUG?

Thank you

PS: The example's pom.xml already includes the H2 database dependency suggested in the answer to "Registering Custom Spring Cloud Task with Spring Cloud Data Flow".

I have not tried to re-create the example with a current Spring Initializr project. However, I did build a simple hello-world task with the most recent Initializr and got the exact same error, so I do not think Initializr is the cause.

I have yet to try the last suggestion, overriding. But given that this is an official example, should I really need to override the default configuration?

PPS: I know my installation of SCDF is working properly because I was able to run a pre-packaged timestamp programme from the example at: https://cloud.spring.io/spring-cloud-task-app-starters/

Upvotes: 1

Views: 1489

Answers (2)

vinhphu3000

Reputation: 81

I had the same issue: every time I started the batch job, the Task Execution ID increased.

I queried the task execution table in the database used by my batch job and found nothing related to that execution ID.

I tried deleting the batch job and re-importing the application from the Docker registry, but the Task Execution ID kept increasing.

I then reviewed the SCDF server-config.yaml file that was used to install SCDF on Kubernetes and found that SCDF was storing its own data in the MySQL system database (the one named mysql), which is surprising:

https://github.com/spring-cloud/spring-cloud-Dataflow/blob/main/src/kubernetes/server/server-config.yaml

url: jdbc:mysql://${MYSQL_SERVICE_HOST}:${MYSQL_SERVICE_PORT}/mysql

I queried the mysql system database and found the task execution table there, containing the execution IDs.

Problem solved: SCDF saves the Task Execution ID in its own database, while our batch application uses a different database, and that mismatch causes the issue. As a temporary fix, I changed the batch application's database to SCDF's database, that is, the mysql system database.

From: app.batch.spring.datasource.url=jdbc:mysql://mysql:3306/task?useSSL=false

To: app.batch.spring.datasource.url=jdbc:mysql://mysql:3306/mysql?useSSL=false

Conclusion: SCDF needs its own database for managing task executions, and your batch application (besides its own batch database) must also point to that SCDF database in order to work. Also, when installing SCDF on Kubernetes, you should change SCDF's database to something other than the mysql system database.
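
To illustrate the mismatch described above, here is a minimal plain-Java sketch. It does not use Spring or SCDF at all: the two maps stand in for the SCDF database and the task's own database, and every class and method name is hypothetical. It assumes, as found above, that the server records the execution ID in its own database before the task starts, and that the task then looks that ID up in whatever database it is configured with.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Plain-Java model (no Spring involved) of the lookup that fails with
 * "Invalid TaskExecution, ID 31 not found". All names are hypothetical.
 */
public class TaskExecutionLookupSketch {

    /** Stand-in for a task execution table: execution ID -> status. */
    public static Map<Long, String> newTaskTable() {
        return new HashMap<>();
    }

    /** Server side (assumed): record the execution before launch and hand out its ID. */
    public static long createExecution(Map<Long, String> serverDb, long id) {
        serverDb.put(id, "CREATED");
        return id;
    }

    /** Task side (assumed): on startup, find the execution the task was told about. */
    public static String startExecution(Map<Long, String> taskDb, long executionId) {
        if (!taskDb.containsKey(executionId)) {
            throw new IllegalArgumentException(
                "Invalid TaskExecution, ID " + executionId + " not found");
        }
        taskDb.put(executionId, "STARTED");
        return taskDb.get(executionId);
    }

    public static void main(String[] args) {
        Map<Long, String> scdfDb = newTaskTable();
        Map<Long, String> ownDb = newTaskTable(); // e.g. the task's separate database

        long id = createExecution(scdfDb, 31L);

        // Task pointed at its own, empty database: the ID is unknown there.
        try {
            startExecution(ownDb, id);
        } catch (IllegalArgumentException e) {
            System.out.println("separate database: " + e.getMessage());
        }

        // Task pointed at the server's database: the lookup succeeds.
        System.out.println("shared database: " + startExecution(scdfDb, id));
    }
}
```

In this model the only thing that changes between the failing and the working call is which table the task reads, which is the same change as switching the datasource URL from the task database to SCDF's database.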

This sample shows how to make your batch application work with two datasources at the same time:

https://github.com/spring-cloud/spring-cloud-task/tree/main/spring-cloud-task-samples/multiple-datasources

This issue has more information on the problem:

https://github.com/spring-io/dataflow.spring.io/issues/161

Background:

I followed this guide to install the SCDF server (it drove me crazy):

https://dataflow.spring.io/docs/installation/kubernetes/kubectl/#deploy-data-flow-server

which points to

https://github.com/spring-cloud/spring-cloud-dataflow/blob/main/src/kubernetes/server/server-config.yaml

And this guide

https://dataflow.spring.io/docs/batch-developer-guides/batch/spring-batch/

explains how to build a batch application with the MySQL URL jdbc:mysql://localhost:3306/task?useSSL=false (the task database), but it does not explain how SCDF handles the Task Execution ID.

I think Spring should update the documentation and explain the SCDF task execution process in more detail.

Upvotes: 2

Ilayaperumal Gopinathan

Reputation: 4179

This looks like an issue caused by the task application not using the same database as Spring Cloud Data Flow. You need to make sure that:

  • The datasource configuration of the task is the same as that of SCDF.
  • The version of the JDBC driver dependency in your task app is compatible with the version of SCDF you use.

You can read some relevant documentation here

Upvotes: 1
