Reputation: 1680
I want to create a Quartz job that reads .csv files and moves each file once it has been processed. I tried this:
@Override
public void execute(JobExecutionContext context) {
File directoryPath = new File("C:\\csv\\nov");
// Create a new subfolder called "processed" into source directory
try {
Files.createDirectory(Path.of(directoryPath.getAbsolutePath() + "/processed"));
} catch (IOException e) {
throw new RuntimeException(e);
}
FilenameFilter textFileFilter = (dir, name) -> {
String lowercaseName = name.toLowerCase();
if (lowercaseName.endsWith(".csv")) {
return true;
} else {
return false;
}
};
// List of all the csv files
File filesList[] = directoryPath.listFiles(textFileFilter);
System.out.println("List of the text files in the specified directory:");
Optional<File> csvFile = Arrays.stream(filesList).findFirst();
File file = csvFile.get();
for(File file : filesList) {
try {
List<CsvLine> beans = new CsvToBeanBuilder(new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16))
.....
.build()
.parse();
for(CsvLine item: beans){
....... sql queries
Optional<ProcessedWords> isFound = processedWordsService.findByKeyword(item.getKeyword());
......................................
}
} catch (Exception e){
e.printStackTrace();
}
// Move here file into new subdirectory when file processing is finished
Path copied = Paths.get(file.getAbsolutePath() + "/processed");
Path originalPath = file.toPath();
try {
Files.move(originalPath, copied, StandardCopyOption.REPLACE_EXISTING);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}
The folder processed is created when the job starts, but I get this exception:
2022-11-17 23:12:51.470 ERROR 16512 --- [cessor_Worker-4] org.quartz.core.JobRunShell : Job DEFAULT.keywordPostJobDetail threw an unhandled Exception:
java.lang.RuntimeException: java.nio.file.FileSystemException: C:\csv\nov\11_42_33.csv -> C:\csv\nov\processed\11_42_33.csv: The process cannot access the file because it is being used by another process
at com.wordscore.engine.processor.ImportCsvFilePostJob.execute(ImportCsvFilePostJob.java:127) ~[main/:na]
at org.quartz.core.JobRunShell.run(JobRunShell.java:202) ~[quartz-2.3.2.jar:na]
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) ~[quartz-2.3.2.jar:na]
Caused by: java.nio.file.FileSystemException: C:\csv\nov\11_42_33.csv -> C:\csv\nov\processed\11_42_33.csv: The process cannot access the file because it is being used by another process
at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92) ~[na:na]
at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) ~[na:na]
at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:403) ~[na:na]
at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) ~[na:na]
at java.base/java.nio.file.Files.move(Files.java:1432) ~[na:na]
at com.wordscore.engine.processor.ImportCsvFilePostJob.execute(ImportCsvFilePostJob.java:125) ~[main/:na]
... 2 common frames omitted
Do you know how I can release the file and move it into a subdirectory?
EDIT: Updated code with try-with-resources:
@Override
public void execute(JobExecutionContext context) {
File directoryPath = new File("C:\\csv\\nov");
// Create a new subfolder called "processed" into source directory
try {
Path path = Path.of(directoryPath.getAbsolutePath() + "/processed");
if (!Files.exists(path) || !Files.isDirectory(path)) {
Files.createDirectory(path);
}
} catch (IOException e) {
throw new RuntimeException(e);
}
FilenameFilter textFileFilter = (dir, name) -> {
String lowercaseName = name.toLowerCase();
if (lowercaseName.endsWith(".csv")) {
return true;
} else {
return false;
}
};
// List of all the csv files
File filesList[] = directoryPath.listFiles(textFileFilter);
System.out.println("List of the text files in the specified directory:");
Optional<File> csvFile = Arrays.stream(filesList).findFirst();
File file = csvFile.get();
for(File file : filesList) {
try {
try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)){
List<CsvLine> beans = new CsvToBeanBuilder(br)
......
.build()
.parse();
for (CsvLine item : beans) {
.....
if (isFound.isPresent()) {
.........
}}
} catch (Exception e){
e.printStackTrace();
}
// Move here file into new subdirectory when file processing is finished
Path copied = Paths.get(file.getAbsolutePath() + "/processed");
Path originalPath = file.toPath();
try {
Files.move(originalPath, copied, StandardCopyOption.REPLACE_EXISTING);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
Quartz config:
@Configuration
public class SchedulerConfig {
private static final Logger LOG = LoggerFactory.getLogger(SchedulerConfig.class);
private ApplicationContext applicationContext;
@Autowired
public SchedulerConfig(ApplicationContext applicationContext) {
this.applicationContext = applicationContext;
}
@Bean
public JobFactory jobFactory() {
AutowiringSpringBeanJobFactory jobFactory = new AutowiringSpringBeanJobFactory();
jobFactory.setApplicationContext(applicationContext);
return jobFactory;
}
@Bean
public SchedulerFactoryBean schedulerFactoryBean(Trigger simpleJobTrigger) throws IOException {
SchedulerFactoryBean schedulerFactory = new SchedulerFactoryBean();
schedulerFactory.setQuartzProperties(quartzProperties());
schedulerFactory.setWaitForJobsToCompleteOnShutdown(true);
schedulerFactory.setAutoStartup(true);
schedulerFactory.setTriggers(simpleJobTrigger);
schedulerFactory.setJobFactory(jobFactory());
return schedulerFactory;
}
@Bean
public SimpleTriggerFactoryBean simpleJobTrigger(@Qualifier("keywordPostJobDetail") JobDetail jobDetail,
@Value("${simplejob.frequency}") long frequency) {
LOG.info("simpleJobTrigger");
SimpleTriggerFactoryBean factoryBean = new SimpleTriggerFactoryBean();
factoryBean.setJobDetail(jobDetail);
factoryBean.setStartDelay(1000);
factoryBean.setRepeatInterval(frequency);
factoryBean.setRepeatCount(4); // factoryBean.setRepeatCount(SimpleTrigger.REPEAT_INDEFINITELY);
return factoryBean;
}
@Bean
public JobDetailFactoryBean keywordPostJobDetail() {
JobDetailFactoryBean factoryBean = new JobDetailFactoryBean();
factoryBean.setJobClass(ImportCsvFilePostJob.class);
factoryBean.setDurability(true);
return factoryBean;
}
public Properties quartzProperties() throws IOException {
PropertiesFactoryBean propertiesFactoryBean = new PropertiesFactoryBean();
propertiesFactoryBean.setLocation(new ClassPathResource("/quartz.properties"));
propertiesFactoryBean.afterPropertiesSet();
return propertiesFactoryBean.getObject();
}
}
quartz.properties:
org.quartz.scheduler.instanceName=wordscore-processor
org.quartz.scheduler.instanceId=AUTO
org.quartz.threadPool.threadCount=5
org.quartz.jobStore.class=org.quartz.simpl.RAMJobStore
As you can see, I want to have 5 threads in order to execute 5 jobs in parallel. Do you know how I can process the files without this exception?
Upvotes: 1
Views: 3455
Reputation: 579
Assume we have File file = new File("c:/test.txt"), and print the following paths:
Path copied = Paths.get(file.getAbsolutePath() + "/processed");
Path originalPath = file.toPath();
We will get the result:
copied: C:\test.txt\processed
originalPath: C:\test.txt
So it's incorrect. You should build the target from the parent path, plus the processed folder, plus the file name:
Path copied = Paths.get(file.getParentFile().getAbsolutePath() + "/processed/" + file.getName());
Path originalPath = file.toPath();
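The same target can also be built with the Path API instead of string concatenation. The following is a minimal sketch, not the original poster's code; the class name ProcessedPathDemo and the method processedTarget are made up for illustration, and relative paths are used so that it behaves the same on any operating system.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, for illustration only.
class ProcessedPathDemo {

    // Builds <parent>/processed/<file name> for the given source file.
    static Path processedTarget(Path source) {
        // resolveSibling replaces the last element (the file name) with "processed";
        // resolve then appends the original file name to that directory.
        return source.resolveSibling("processed").resolve(source.getFileName());
    }

    public static void main(String[] args) {
        Path source = Paths.get("csv", "nov", "11_42_33.csv");
        // Appending to the full path treats the file itself as a directory.
        System.out.println("wrong: " + Paths.get(source + "/processed"));
        System.out.println("right: " + processedTarget(source));
    }
}
```

Working with Path objects avoids having to choose a separator character by hand.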
Upvotes: 1
Reputation: 4211
I’m pretty sure that the file is being locked by the file reader that you create but never close in the following line:
List<CsvLine> beans = new CsvToBeanBuilder(new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16))
Refactor your code so that you have that reader in a try finally block or close it explicitly.
The unintuitive behavior you might see is that those files are released at seemingly random times. This is because when the garbage collector frees up those readers, they will then release the files. Clean them up explicitly instead.
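As a rough sketch of that advice, the reader below is closed by try-with-resources before the move is attempted. This is an illustration under assumptions: the class and method names are invented, plain line reading stands in for the CsvToBeanBuilder call, and UTF-8 is used instead of the UTF-16 from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, for illustration only.
class CloseThenMoveDemo {

    // Reads all lines, closes the reader, and only then moves the file.
    static List<String> readThenMove(Path source, Path processedDir) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader when the block exits,
        // whether it exits normally or with an exception.
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        // The reader is closed at this point, so this code no longer holds the file open.
        Files.createDirectories(processedDir);
        Files.move(source, processedDir.resolve(source.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("csv-demo");
        Path csv = Files.write(dir.resolve("sample.csv"),
                List.of("keyword", "java"), StandardCharsets.UTF_8);
        System.out.println(readThenMove(csv, dir.resolve("processed")));
    }
}
```

If another thread or process still has the file open, the move can still fail on Windows; the sketch only removes the lock held by the reading code itself.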
Upvotes: 0
Reputation: 53421
Although I agree completely with the answer and comments of @rzwitserloot, note the following in your error stack trace:
java.nio.file.FileSystemException: C:\csv\nov\07_06_26.csv -> C:\csv\nov\07_06_26.csv\processed: The process cannot access the file because it is being used by another process
You are trying to move your file to the backup directory, but note that the target path is wrong: C:\csv\nov\07_06_26.csv\processed in the example.
Please, try the following:
@Override
public void execute(JobExecutionContext context) {
File directoryPath = new File("C:\\csv\\nov");
// Create a new subfolder called "processed" into source directory
// Hold a reference to the processed files directory path, we will
// use it later
Path processedDirectoryPath;
try {
processedDirectoryPath = Path.of(directoryPath.getAbsolutePath() + "/processed");
if (!Files.exists(processedDirectoryPath) || !Files.isDirectory(processedDirectoryPath)) {
Files.createDirectory(processedDirectoryPath);
}
} catch (IOException e) {
throw new RuntimeException(e);
}
FilenameFilter textFileFilter = (dir, name) -> {
String lowercaseName = name.toLowerCase();
if (lowercaseName.endsWith(".csv")) {
return true;
} else {
return false;
}
};
// List of all the csv files
File filesList[] = directoryPath.listFiles(textFileFilter);
System.out.println("List of the text files in the specified directory:");
for(File file : filesList) {
try {
try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)){
List<CsvLine> beans = new CsvToBeanBuilder(br)
......
.build()
.parse();
for (CsvLine item : beans) {
.....
if (isFound.isPresent()) {
.........
}}
} catch (Exception e){
e.printStackTrace();
}
// Move here file into new subdirectory when file processing is finished
// In my opinion, here is the error:
// Path copied = Paths.get(file.getAbsolutePath() + "/processed");
Path originalPath = file.toPath();
try {
// Note the use of the path we defined before
Files.move(originalPath, processedDirectoryPath.resolve(originalPath.getFileName()),
StandardCopyOption.REPLACE_EXISTING);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}
If you need to increase the throughput of files processed, you could try splitting them into batches, for instance by a pattern in their names such as a month name or a job number. A simple solution could be to use the JobExecutionContext provided to every job to pass in some split criteria. Those criteria are then used in your FilenameFilter, so that every job processes only a certain portion of the files. I think this solution is preferable to any kind of locking or similar mechanism.
For example, consider the following:
@Override
public void execute(JobExecutionContext context) {
File directoryPath = new File("C:\\csv\\nov");
// Create a new subfolder called "processed" into source directory
// Hold a reference to the processed files directory path, we will
// use it later
Path processedDirectoryPath;
try {
processedDirectoryPath = Path.of(directoryPath.getAbsolutePath() + "/processed");
if (!Files.exists(processedDirectoryPath) || !Files.isDirectory(processedDirectoryPath)) {
Files.createDirectory(processedDirectoryPath);
}
} catch (IOException e) {
throw new RuntimeException(e);
}
// We obtain the file processing criteria using a job parameter
JobDataMap data = context.getJobDetail().getJobDataMap();
String filenameProcessingCriteria = data.getString("FILENAME_PROCESSING_CRITERIA");
// Use the provided criteria to restrict the files that this job
// will process
FilenameFilter textFileFilter = (dir, name) -> {
String lowercaseName = name.toLowerCase();
if (lowercaseName.endsWith(".csv") && lowercaseName.contains(filenameProcessingCriteria)) {
return true;
} else {
return false;
}
};
// List of all the csv files
File filesList[] = directoryPath.listFiles(textFileFilter);
System.out.println("List of the text files in the specified directory:");
for(File file : filesList) {
try {
try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)){
List<CsvLine> beans = new CsvToBeanBuilder(br)
......
.build()
.parse();
for (CsvLine item : beans) {
.....
if (isFound.isPresent()) {
.........
}}
} catch (Exception e){
e.printStackTrace();
}
// Move here file into new subdirectory when file processing is finished
// In my opinion, here is the error:
// Path copied = Paths.get(file.getAbsolutePath() + "/processed");
Path originalPath = file.toPath();
try {
// Note the use of the path we defined before
Files.move(originalPath, processedDirectoryPath.resolve(originalPath.getFileName()),
StandardCopyOption.REPLACE_EXISTING);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}
You need to pass the required parameter to your jobs:
JobDetail job1 = ...;
job1.getJobDataMap().put("FILENAME_PROCESSING_CRITERIA", "job1pattern");
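The filtering part of this idea can be tried without Quartz. The sketch below is only an illustration: CriteriaFilterDemo and csvFilterFor are invented names, and the file names are made-up examples. It lowercases both the file name and the criteria and uses contains, so a match at the very start of the name is also accepted.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Locale;

// Hypothetical demo class, for illustration only.
class CriteriaFilterDemo {

    // Returns a filter that accepts .csv files whose name contains the given criteria.
    static FilenameFilter csvFilterFor(String criteria) {
        String needle = criteria.toLowerCase(Locale.ROOT);
        return (dir, name) -> {
            String lower = name.toLowerCase(Locale.ROOT);
            return lower.endsWith(".csv") && lower.contains(needle);
        };
    }

    public static void main(String[] args) {
        FilenameFilter filter = csvFilterFor("job1");
        File dir = new File(".");
        // Made-up file names: only the first one carries the "job1" marker.
        System.out.println(filter.accept(dir, "job1_11_42_33.csv"));
        System.out.println(filter.accept(dir, "job2_11_42_33.csv"));
    }
}
```

Each job would then receive a different criteria value, so that no two jobs pick up the same file.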
An even simpler approach, based on the same idea, could be to split the files into different folders and pass the name of the folder to process as a job parameter:
@Override
public void execute(JobExecutionContext context) {
// We obtain the directory path as a job parameter
JobDataMap data = context.getJobDetail().getJobDataMap();
String directoryPathName = data.getString("DIRECTORY_PATH_NAME");
File directoryPath = new File(directoryPathName);
// Create a new subfolder called "processed" into source directory
// Hold a reference to the processed files directory path, we will
// use it later
Path processedDirectoryPath;
try {
processedDirectoryPath = Path.of(directoryPath.getAbsolutePath() + "/processed");
if (!Files.exists(processedDirectoryPath) || !Files.isDirectory(processedDirectoryPath)) {
Files.createDirectory(processedDirectoryPath);
}
} catch (IOException e) {
throw new RuntimeException(e);
}
FilenameFilter textFileFilter = (dir, name) -> {
String lowercaseName = name.toLowerCase();
if (lowercaseName.endsWith(".csv")) {
return true;
} else {
return false;
}
};
// List of all the csv files
File filesList[] = directoryPath.listFiles(textFileFilter);
System.out.println("List of the text files in the specified directory:");
for(File file : filesList) {
try {
try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)){
List<CsvLine> beans = new CsvToBeanBuilder(br)
......
.build()
.parse();
for (CsvLine item : beans) {
.....
if (isFound.isPresent()) {
.........
}}
} catch (Exception e){
e.printStackTrace();
}
// Move here file into new subdirectory when file processing is finished
// In my opinion, here is the error:
// Path copied = Paths.get(file.getAbsolutePath() + "/processed");
Path originalPath = file.toPath();
try {
// Note the use of the path we defined before
Files.move(originalPath, processedDirectoryPath.resolve(originalPath.getFileName()),
StandardCopyOption.REPLACE_EXISTING);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}
And pass a different folder to every different job:
JobDetail job1 = ...;
job1.getJobDataMap().put("DIRECTORY_PATH_NAME", "C:\\csv\\nov");
Please consider refactoring your code and defining methods for file processing, file backup, etc.; it will make your code easier to understand and maintain.
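One possible shape for such a refactoring is sketched below, using only java.nio. All names (CsvImportSteps, importAll, and the helper methods) are hypothetical, and processFile merely counts lines as a placeholder for the real parsing and database work.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

// Hypothetical class, for illustration only.
class CsvImportSteps {

    // Creates <sourceDir>/processed if it does not exist yet and returns its path.
    static Path ensureProcessedDirectory(Path sourceDir) throws IOException {
        return Files.createDirectories(sourceDir.resolve("processed"));
    }

    // Lists regular files in sourceDir whose name ends with .csv (case-insensitive).
    static List<Path> listCsvFiles(Path sourceDir) throws IOException {
        List<Path> csvFiles = new ArrayList<>();
        DirectoryStream.Filter<Path> isCsv = p -> Files.isRegularFile(p)
                && p.getFileName().toString().toLowerCase(Locale.ROOT).endsWith(".csv");
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(sourceDir, isCsv)) {
            for (Path entry : entries) {
                csvFiles.add(entry);
            }
        }
        return csvFiles;
    }

    // Placeholder for the real work: it only counts lines, then closes the file.
    static long processFile(Path csv) throws IOException {
        try (Stream<String> lines = Files.lines(csv)) {
            return lines.count();
        }
    }

    // Moves one file into the processed directory, keeping its name.
    static Path moveToProcessed(Path csv, Path processedDir) throws IOException {
        return Files.move(csv, processedDir.resolve(csv.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
    }

    // Processes and moves every CSV file; returns how many files were moved.
    static int importAll(Path sourceDir) throws IOException {
        Path processedDir = ensureProcessedDirectory(sourceDir);
        int moved = 0;
        for (Path csv : listCsvFiles(sourceDir)) {
            processFile(csv);
            moveToProcessed(csv, processedDir);
            moved++;
        }
        return moved;
    }
}
```

The job's execute method would then reduce to a single call such as importAll(Paths.get("C:\\csv\\nov")), wrapped in whatever exception handling the job needs.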
Upvotes: 2
Reputation: 3948
Look at this line in the error message:
Caused by: java.lang.RuntimeException: java.nio.file.FileSystemException: C:\csv\nov\07_06_26.csv -> C:\csv\nov\07_06_26.csv\processed: The process cannot access the file because it is being used by another process
I think you want to move the file from C:\csv\nov to C:\csv\nov\processed, so you have to change the following line:
Path copied = Paths.get(file.getAbsolutePath() + "/processed");
to
Path copied = Paths.get(file.getParent(), "processed", file.getName());
because file.getAbsolutePath() returns the complete path, including the name of the file, and the target passed to Files.move must itself end with the file name.
Upvotes: 1
Reputation: 103254
new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)
This part creates a resource. A resource is an object that represents an underlying heavy thing, a thing that you can have very few of. In this case, it represents an underlying OS file handle.
You must always safely close these. There are really only 2 ways to do it correctly: use try-with-resources, or make your own class AutoCloseable so that the code that uses instances of this class can use try-with-resources.
try (var br = new FileReader(file, StandardCharsets.UTF_16)) {
List<CsvLine> beans = new CsvToBeanBuilder(br)
.....
.build()
.parse();
}
That is the answer here.
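For the other case, where the resource has to live longer than a single block, a class can own the reader and implement AutoCloseable itself. The following is a minimal sketch under assumptions: CsvSource and nextLine are invented names, and raw line reading stands in for real CSV parsing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical wrapper class, for illustration only.
class CsvSource implements AutoCloseable {

    // The owned resource; it is released in close().
    private final BufferedReader reader;

    CsvSource(Path file, Charset charset) throws IOException {
        this.reader = Files.newBufferedReader(file, charset);
    }

    // Returns the next raw line, or null at the end of the file.
    String nextLine() throws IOException {
        return reader.readLine();
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".csv");
        Files.write(file, List.of("keyword", "java"), StandardCharsets.UTF_8);
        // Callers use the wrapper exactly like any other resource.
        try (CsvSource source = new CsvSource(file, StandardCharsets.UTF_8)) {
            System.out.println(source.nextLine());
        }
        // The reader is closed here, so the file can be deleted or moved.
        Files.delete(file);
    }
}
```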
Upvotes: 3