awenclaw

Reputation: 533

Run java -jar inside AWS Glue job

I have a relatively simple task, but I'm struggling to find the best mix of AWS services to accomplish it:

  1. I have a simple Java program (provided by a third party; I can't modify it, only use it) that I can run anywhere with java -jar --target-location "path on local disc". Once executed, the program creates a CSV file on the local disk in the path defined by --target-location.
  2. Once the file is created, I need to upload it to S3.

Currently I do this on a dedicated EC2 instance with Java installed: the first step is covered by java -jar ... and the second by an aws s3 cp ... command.

I'm looking for a better way of doing that (preferably serverless). I'm wondering whether the two steps above can be accomplished with an AWS Glue job of type Python Shell. The second step (copying the local file to S3) I can most likely cover with boto3, but I'm not sure about the first (executing java -jar).
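
For reference, the two steps could be sketched in Python roughly as follows. This is only a sketch under stated assumptions: it presumes a java executable is on the PATH of whatever environment runs it (not confirmed here for a Glue Python Shell job), and jar_path, bucket, and key_prefix are hypothetical parameters, since the jar name and the bucket are not given above.

```python
import subprocess
from pathlib import Path


def build_java_command(jar_path, target_location):
    """Return the argv list for: java -jar <jar> --target-location <dir>."""
    # jar_path is a hypothetical placeholder for the third-party jar.
    return ["java", "-jar", str(jar_path), "--target-location", str(target_location)]


def run_jar_and_upload(jar_path, target_location, bucket, key_prefix=""):
    """Run the jar, then upload every CSV found in target_location to S3."""
    # check=True raises CalledProcessError if the jar exits with a non-zero status.
    subprocess.run(build_java_command(jar_path, target_location), check=True)

    # Imported lazily so the command builder can be used without boto3 installed.
    import boto3

    s3 = boto3.client("s3")
    uploaded = []
    for csv_file in sorted(Path(target_location).glob("*.csv")):
        key = f"{key_prefix}{csv_file.name}"
        s3.upload_file(str(csv_file), bucket, key)  # (Filename, Bucket, Key)
        uploaded.append(key)
    return uploaded
```

Passing the command as a list rather than a single string avoids shell quoting problems when the target path contains spaces.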


Am I forced to use an EC2 instance, or do you see a smarter way with AWS Glue?
Or would the most effective approach be to build a Docker image (containing these two instructions), register it in ECR, and run it with AWS Batch?

Upvotes: 0

Views: 1759

Answers (1)

justthink

Reputation: 459

I'm looking for a better way of doing that (preferably serverless).

I cannot tell whether a serverless option is better; however, an EC2 instance will do the job just fine. Assuming you have CentOS on your instance, you can do it in either of the following ways.

aaPanel GUI

Some web panels offer cron-scheduled tasks, such as backing up files from a local directory to an S3 bucket. I will use aaPanel as an example.

Install aaPanel

Install AWS S3 plugin

Configure the credentials in the plugin.

Cron

Add a scheduled task to back up files from "path on local disc" to AWS S3.

Rclone

A web panel goes beyond the scope of this question. Rclone is another useful tool I use to back up files from local disk to OneDrive, S3, etc.

Installation

curl https://rclone.org/install.sh | sudo bash

Sync

Sync a directory to the remote bucket, deleting any excess files in the bucket.

rclone sync -i /home/local/directory remote:bucket
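
If the transfer needs to run from a script rather than by hand, a minimal Python sketch might look like the one below. The helper names and the dry_run and delete_extra parameters are hypothetical, introduced only for illustration. It relies on two rclone behaviours: copy never deletes files on the destination while sync does, and --dry-run only reports what would change. The -i flag from the command above is left out because it prompts for confirmation, which does not suit an unattended job.

```python
import subprocess


def build_rclone_command(local_dir, remote, dry_run=True, delete_extra=False):
    """Return the argv list for an rclone transfer.

    delete_extra=True uses `sync`, which removes remote files that are
    missing locally; otherwise `copy` is used, which never deletes.
    """
    action = "sync" if delete_extra else "copy"
    cmd = ["rclone", action]
    if dry_run:
        cmd.append("--dry-run")  # only report what would be transferred
    cmd += [str(local_dir), remote]
    return cmd


def transfer(local_dir, remote, **options):
    """Run the transfer; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_rclone_command(local_dir, remote, **options), check=True)
```

For example, transfer("/home/local/directory", "remote:bucket", dry_run=False, delete_extra=True) would run the same sync as above without the interactive prompt.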

Upvotes: 1
