basu76

Reputation: 481

ETL Spring batch, Spring cloud data flow (SCDF)

We have a use case where data can be sourced from different sources (a database, a file, etc.), transformed, and stored to various sinks (Cassandra, a database, or a file). We would want the ability to split the jobs and do parallel loads - looks like Spring Batch RemoteChunking provides that ability.

I am new to SCDF and Spring Batch and am wondering about the best way to use them.

Is there a way to provide configuration for these jobs (source connection details, table, and query), and can this be done through a UI (the SCDF Server UI)? Is it possible to compose the flow?

This will run on Kubernetes and our applications are deployed through Jenkins pipeline.

Upvotes: 1

Views: 416

Answers (1)

Mahmoud Ben Hassine

Reputation: 31710

We would want the ability to split the jobs and do parallel loads - looks like Spring Batch RemoteChunking provides that ability.

I don't think you need remote chunking; you can instead run parallel jobs, where each job handles one ETL process (for a particular file or database table).

Is there a way to provide configuration for these jobs (source connection details, table and query)

Yes, they can be configured like any regular Spring Batch job.
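As a sketch of what that configuration could look like (the class, property names such as `source.datasource.*` and `source.query`, and the reader bean are illustrative assumptions, not from the question), the connection details and query can be externalized as standard Spring properties and injected into the job configuration:

```java
import java.util.Map;
import javax.sql.DataSource;

import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.batch.item.database.builder.JdbcCursorItemReaderBuilder;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.cloud.task.configuration.EnableTask;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.ColumnMapRowMapper;

// Hypothetical configuration: property names are assumptions for illustration.
@Configuration
@EnableTask
public class EtlJobConfiguration {

    // Source connection details bound from source.datasource.url, .username, etc.
    @Bean
    @ConfigurationProperties(prefix = "source.datasource")
    public DataSource sourceDataSource() {
        return DataSourceBuilder.create().build();
    }

    // The query itself comes from a property, so each deployment (or task
    // launch) can point the same job at a different table/query.
    @Bean
    public JdbcCursorItemReader<Map<String, Object>> reader(
            DataSource sourceDataSource,
            @Value("${source.query}") String query) {
        return new JdbcCursorItemReaderBuilder<Map<String, Object>>()
                .name("sourceReader")
                .dataSource(sourceDataSource)
                .sql(query)
                .rowMapper(new ColumnMapRowMapper())
                .build();
    }

    // Writer, step, and job beans omitted; they are parameterized the same way.
}
```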

and can this be done through a UI (SCDF Server UI)?

If you expose them as configurable properties of your job, you can specify them in the UI when you launch the task.
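For example (task name and property names here are hypothetical), the same values the dashboard exposes at launch time can also be passed from the SCDF shell:

```
dataflow:> task launch etl-task --arguments "--source.query='select * from orders' --source.table=orders"
```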

Is it possible to compose the flow?

Yes, this is possible with Composed Task.
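As an illustration (the task names below are hypothetical), a composed task is defined with the SCDF task DSL, where `&&` sequences tasks and `<a || b>` runs them in parallel - which also covers the parallel-load requirement from the question:

```
dataflow:> task create etl-flow --definition "<db-load || file-load> && cassandra-sink"
dataflow:> task launch etl-flow
```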

Upvotes: 2
