Peter Penzov

Reputation: 1680

The process cannot access the file because it is being used by another process when file is moved

I want to create a Quartz job that reads .csv files and moves each file once it has been processed. I tried this:

@Override
public void execute(JobExecutionContext context) {

    File directoryPath = new File("C:\\csv\\nov");
    // Create a new subfolder called "processed" into source directory
    try {
        Files.createDirectory(Path.of(directoryPath.getAbsolutePath() + "/processed"));
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    FilenameFilter textFileFilter = (dir, name) -> {
        String lowercaseName = name.toLowerCase();
        if (lowercaseName.endsWith(".csv")) {
            return true;
        } else {
            return false;
        }
    };
    // List of all the csv files
    File filesList[] = directoryPath.listFiles(textFileFilter);
    System.out.println("List of the text files in the specified directory:");

    Optional<File> csvFile = Arrays.stream(filesList).findFirst();
    File file = csvFile.get();
  
    for(File file : filesList) {

        try {
            List<CsvLine> beans = new CsvToBeanBuilder(new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16))
                    .....
                    .build()
                    .parse();

            for(CsvLine item: beans){

                    ....... sql queries

                    Optional<ProcessedWords> isFound = processedWordsService.findByKeyword(item.getKeyword());

                    ......................................
            }

        } catch (Exception e){
            e.printStackTrace();
        }

        // Move here file into new subdirectory when file processing is finished
        Path copied = Paths.get(file.getAbsolutePath() + "/processed");
        Path originalPath = file.toPath();
        try {
            Files.move(originalPath, copied, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

The processed folder is created when the job starts, but I get this exception:

        2022-11-17 23:12:51.470 ERROR 16512 --- [cessor_Worker-4] org.quartz.core.JobRunShell              : Job DEFAULT.keywordPostJobDetail threw an unhandled Exception: 

java.lang.RuntimeException: java.nio.file.FileSystemException: C:\csv\nov\11_42_33.csv -> C:\csv\nov\processed\11_42_33.csv: The process cannot access the file because it is being used by another process
    at com.wordscore.engine.processor.ImportCsvFilePostJob.execute(ImportCsvFilePostJob.java:127) ~[main/:na]
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202) ~[quartz-2.3.2.jar:na]
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) ~[quartz-2.3.2.jar:na]
Caused by: java.nio.file.FileSystemException: C:\csv\nov\11_42_33.csv -> C:\csv\nov\processed\11_42_33.csv: The process cannot access the file because it is being used by another process
    at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92) ~[na:na]
    at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) ~[na:na]
    at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:403) ~[na:na]
    at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) ~[na:na]
    at java.base/java.nio.file.Files.move(Files.java:1432) ~[na:na]
    at com.wordscore.engine.processor.ImportCsvFilePostJob.execute(ImportCsvFilePostJob.java:125) ~[main/:na]
    ... 2 common frames omitted

Do you know how I can release the file and move it into a sub directory?

EDIT: Updated code with try-with-resources:

@Override
public void execute(JobExecutionContext context) {

    File directoryPath = new File("C:\\csv\\nov");
    // Create a new subfolder called "processed" into source directory
    try {
        Path path = Path.of(directoryPath.getAbsolutePath() + "/processed");
        if (!Files.exists(path) || !Files.isDirectory(path)) {
            Files.createDirectory(path);
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    FilenameFilter textFileFilter = (dir, name) -> {
        String lowercaseName = name.toLowerCase();
        if (lowercaseName.endsWith(".csv")) {
            return true;
        } else {
            return false;
        }
    };
    // List of all the csv files
    File filesList[] = directoryPath.listFiles(textFileFilter);
    System.out.println("List of the text files in the specified directory:");
    
    Optional<File> csvFile = Arrays.stream(filesList).findFirst();
    File file = csvFile.get();
     
    for(File file : filesList) {

        try {
            try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)){
                List<CsvLine> beans = new CsvToBeanBuilder(br)
                        ......
                        .build()
                        .parse();

            for (CsvLine item : beans) {

                .....
                if (isFound.isPresent()) {
                    .........
        }}

        } catch (Exception e){
            e.printStackTrace();
        }

        // Move here file into new subdirectory when file processing is finished
        Path copied = Paths.get(file.getAbsolutePath() + "/processed");
        Path originalPath = file.toPath();
        try {
            Files.move(originalPath, copied, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    
}

Quartz config:

@Configuration
public class SchedulerConfig {

    private static final Logger LOG = LoggerFactory.getLogger(SchedulerConfig.class);

    private ApplicationContext applicationContext;

    @Autowired
    public SchedulerConfig(ApplicationContext applicationContext) {
        this.applicationContext = applicationContext;
    }

    @Bean
    public JobFactory jobFactory() {
        AutowiringSpringBeanJobFactory jobFactory = new AutowiringSpringBeanJobFactory();
        jobFactory.setApplicationContext(applicationContext);
        return jobFactory;
    }

    @Bean
    public SchedulerFactoryBean schedulerFactoryBean(Trigger simpleJobTrigger) throws IOException {

        SchedulerFactoryBean schedulerFactory = new SchedulerFactoryBean();
        schedulerFactory.setQuartzProperties(quartzProperties());
        schedulerFactory.setWaitForJobsToCompleteOnShutdown(true);
        schedulerFactory.setAutoStartup(true);
        schedulerFactory.setTriggers(simpleJobTrigger);
        schedulerFactory.setJobFactory(jobFactory());
        return schedulerFactory;
    }

    @Bean
    public SimpleTriggerFactoryBean simpleJobTrigger(@Qualifier("keywordPostJobDetail") JobDetail jobDetail,
                                                     @Value("${simplejob.frequency}") long frequency) {
        LOG.info("simpleJobTrigger");

        SimpleTriggerFactoryBean factoryBean = new SimpleTriggerFactoryBean();
        factoryBean.setJobDetail(jobDetail);
        factoryBean.setStartDelay(1000);
        factoryBean.setRepeatInterval(frequency);
        factoryBean.setRepeatCount(4); //         factoryBean.setRepeatCount(SimpleTrigger.REPEAT_INDEFINITELY);
        return factoryBean;
    }

    @Bean
    public JobDetailFactoryBean keywordPostJobDetail() {
        JobDetailFactoryBean factoryBean = new JobDetailFactoryBean();
        factoryBean.setJobClass(ImportCsvFilePostJob.class);
        factoryBean.setDurability(true);
        return factoryBean;
    }

    public Properties quartzProperties() throws IOException {
        PropertiesFactoryBean propertiesFactoryBean = new PropertiesFactoryBean();
        propertiesFactoryBean.setLocation(new ClassPathResource("/quartz.properties"));
        propertiesFactoryBean.afterPropertiesSet();
        return propertiesFactoryBean.getObject();
    }
}

quartz.properties:

org.quartz.scheduler.instanceName=wordscore-processor
org.quartz.scheduler.instanceId=AUTO
org.quartz.threadPool.threadCount=5
org.quartz.jobStore.class=org.quartz.simpl.RAMJobStore

As you can see, I want to have 5 threads in order to execute 5 jobs in parallel. Do you know how I can process the files without this exception?

Upvotes: 1

Views: 3455

Answers (5)

Melron

Reputation: 579

Assuming we have File file = new File("c:/test.txt") and print the following paths:

Path copied = Paths.get(file.getAbsolutePath() + "/processed");
Path originalPath = file.toPath();

We will get the result:

copied: C:\test.txt\processed
originalPath: C:\test.txt

So it's incorrect. You should build the target from the parent path, plus the processed folder, plus the file name.

Path copied = Paths.get(file.getParentFile().getAbsolutePath() + "/processed/" + file.getName());
Path originalPath = file.toPath();
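
The same target can also be built with the Path API instead of string concatenation. This is only a minimal sketch; the class ProcessedPaths and the helper processedTarget are hypothetical names, not part of the original code:

```java
import java.nio.file.Path;

final class ProcessedPaths {

    // Hypothetical helper: maps <dir>/<name> to <dir>/processed/<name>.
    static Path processedTarget(Path source) {
        // resolveSibling swaps the last path element for "processed", so the
        // folder sits next to the file; resolve then appends the file name.
        return source.resolveSibling("processed").resolve(source.getFileName());
    }

    public static void main(String[] args) {
        Path source = Path.of("csv", "nov", "11_42_33.csv");
        System.out.println(source + " -> " + processedTarget(source));
    }
}
```

Working with Path elements avoids mixing "/" and "\" separators by hand.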

Upvotes: 1

Luke Machowski

Reputation: 4211

I’m pretty sure that the file is being locked by the file reader that you create but never close in the following line:

List<CsvLine> beans = new CsvToBeanBuilder(new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16))

Refactor your code so that you have that reader in a try finally block or close it explicitly.

The unintuitive behavior you might see is that those files are released at seemingly random times. This is because when the garbage collector frees up those readers, they will then release the files. Clean them up explicitly instead.
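
As a rough illustration of closing the reader explicitly before the move, here is a minimal sketch. It assumes plain line reading in place of the CSV parsing (to stay self-contained) and UTF-8 input rather than the UTF-16 used in the question; the class ReadThenMove and its method are hypothetical names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

final class ReadThenMove {

    // Hypothetical helper: reads every line, closes the reader, then moves the file.
    static List<String> readThenMove(Path source, Path processedDir) throws IOException {
        List<String> lines = new ArrayList<>();
        BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            // Runs on success and on failure, so the file handle is always released.
            reader.close();
        }
        // The reader is closed at this point, so the move does not compete
        // with a handle held by this method.
        Files.createDirectories(processedDir);
        Files.move(source, processedDir.resolve(source.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("csv-demo");
        Path csv = Files.writeString(dir.resolve("sample.csv"), "keyword\nfoo\n");
        System.out.println(readThenMove(csv, dir.resolve("processed")));
    }
}
```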

Upvotes: 0

jccampanero

Reputation: 53421

Although I agree completely with the answer and comments of @rzwitserloot, note the following in your error stack trace:

java.nio.file.FileSystemException: C:\csv\nov\07_06_26.csv -> C:\csv\nov\07_06_26.csv\processed: The process cannot access the file because it is being used by another process

You are trying to move your file to the backup directory, but note that you are moving it to the wrong path: C:\csv\nov\07_06_26.csv\processed in this example.

Please, try the following:

@Override
public void execute(JobExecutionContext context) {

    File directoryPath = new File("C:\\csv\\nov");
    // Create a new subfolder called "processed" into source directory
    // Hold a reference to the processed files directory path, we will
    // use it later
    Path processedDirectoryPath;
    try {
        processedDirectoryPath = Path.of(directoryPath.getAbsolutePath() + "/processed");
        if (!Files.exists(processedDirectoryPath) || !Files.isDirectory(processedDirectoryPath)) {
            Files.createDirectory(processedDirectoryPath);
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    FilenameFilter textFileFilter = (dir, name) -> {
        String lowercaseName = name.toLowerCase();
        if (lowercaseName.endsWith(".csv")) {
            return true;
        } else {
            return false;
        }
    };
    // List of all the csv files
    File filesList[] = directoryPath.listFiles(textFileFilter);
    System.out.println("List of the text files in the specified directory:");
    for(File file : filesList) {

        // try-with-resources closes the reader (and releases the file handle)
        // before the move below is attempted
        try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)) {
            List<CsvLine> beans = new CsvToBeanBuilder(br)
                    ......
                    .build()
                    .parse();

            for (CsvLine item : beans) {

                .....
                if (isFound.isPresent()) {
                    .........
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        // Move here file into new subdirectory when file processing is finished
        // In my opinion, here is the error:
        // Path copied = Paths.get(file.getAbsolutePath() + "/processed");
        Path originalPath = file.toPath();
        try {
            // Note the use of the path we defined before
            Files.move(originalPath, processedDirectoryPath.resolve(originalPath.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

If you need to increase the throughput of files processed, you could try splitting them into batches, for instance by a pattern in their names such as a month name or a job number. A simple solution is to use the JobExecutionContext provided to every job to carry some split criteria. Those criteria are then used in your FilenameFilter, so every job processes only a portion of the files that need to be processed. I think this solution is preferable to any kind of locking or similar mechanism.

For example, consider the following:

@Override
public void execute(JobExecutionContext context) {

    File directoryPath = new File("C:\\csv\\nov");
    // Create a new subfolder called "processed" into source directory
    // Hold a reference to the processed files directory path, we will
    // use it later
    Path processedDirectoryPath;
    try {
        processedDirectoryPath = Path.of(directoryPath.getAbsolutePath() + "/processed");
        if (!Files.exists(processedDirectoryPath) || !Files.isDirectory(processedDirectoryPath)) {
            Files.createDirectory(processedDirectoryPath);
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    // We obtain the file processing criteria using a job parameter
    JobDataMap data = context.getJobDetail().getJobDataMap();
    String filenameProcessingCriteria = data.getString("FILENAME_PROCESSING_CRITERIA");
    // Use the provided criteria to restrict the files that this job
    // will process 
    FilenameFilter textFileFilter = (dir, name) -> {
        String lowercaseName = name.toLowerCase();
        // contains() also matches a pattern at the very start of the file name
        return lowercaseName.endsWith(".csv")
                && lowercaseName.contains(filenameProcessingCriteria.toLowerCase());
    };
    // List of all the csv files
    File filesList[] = directoryPath.listFiles(textFileFilter);
    System.out.println("List of the text files in the specified directory:");
    for(File file : filesList) {

        // try-with-resources closes the reader (and releases the file handle)
        // before the move below is attempted
        try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)) {
            List<CsvLine> beans = new CsvToBeanBuilder(br)
                    ......
                    .build()
                    .parse();

            for (CsvLine item : beans) {

                .....
                if (isFound.isPresent()) {
                    .........
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        // Move here file into new subdirectory when file processing is finished
        // In my opinion, here is the error:
        // Path copied = Paths.get(file.getAbsolutePath() + "/processed");
        Path originalPath = file.toPath();
        try {
            // Note the use of the path we defined before
            Files.move(originalPath, processedDirectoryPath.resolve(originalPath.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

You need to pass the required parameter to your jobs:

JobDetail job1 = ...;
job1.getJobDataMap().put("FILENAME_PROCESSING_CRITERIA", "job1pattern");

An even simpler approach, based on the same idea, is to split the files into different folders and pass the name of the folder to be processed as a job parameter:

@Override
public void execute(JobExecutionContext context) {

    // We obtain the directory path as a job parameter
    JobDataMap data = context.getJobDetail().getJobDataMap();
    String directoryPathName = data.getString("DIRECTORY_PATH_NAME");

    File directoryPath = new File(directoryPathName);
    // Create a new subfolder called "processed" into source directory
    // Hold a reference to the processed files directory path, we will
    // use it later
    Path processedDirectoryPath;
    try {
        processedDirectoryPath = Path.of(directoryPath.getAbsolutePath() + "/processed");
        if (!Files.exists(processedDirectoryPath) || !Files.isDirectory(processedDirectoryPath)) {
            Files.createDirectory(processedDirectoryPath);
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    FilenameFilter textFileFilter = (dir, name) -> {
        String lowercaseName = name.toLowerCase();
        if (lowercaseName.endsWith(".csv")) {
            return true;
        } else {
            return false;
        }
    };
    // List of all the csv files
    File filesList[] = directoryPath.listFiles(textFileFilter);
    System.out.println("List of the text files in the specified directory:");
    for(File file : filesList) {

        // try-with-resources closes the reader (and releases the file handle)
        // before the move below is attempted
        try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)) {
            List<CsvLine> beans = new CsvToBeanBuilder(br)
                    ......
                    .build()
                    .parse();

            for (CsvLine item : beans) {

                .....
                if (isFound.isPresent()) {
                    .........
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        // Move here file into new subdirectory when file processing is finished
        // In my opinion, here is the error:
        // Path copied = Paths.get(file.getAbsolutePath() + "/processed");
        Path originalPath = file.toPath();
        try {
            // Note the use of the path we defined before
            Files.move(originalPath, processedDirectoryPath.resolve(originalPath.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

And pass a different folder to every different job:

JobDetail job1 = ...;
job1.getJobDataMap().put("DIRECTORY_PATH_NAME", "C:\\csv\\nov");

Please consider refactoring your code and defining separate methods for file processing, file backup, etc. It will make your code easier to understand and maintain.
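
One possible shape for such a refactoring is sketched below. It is not the original job: the class CsvImportSteps and its method names are hypothetical, the parsing and SQL work is replaced by a line count, and UTF-8 input is assumed so the example stays self-contained:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class CsvImportSteps {

    // Lists the regular files in dir whose name ends with ".csv" (case-insensitive).
    static List<Path> listCsvFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".csv"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Placeholder for the parsing and database work elided in the original.
    static void processFile(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            System.out.println(file.getFileName() + ": " + lines.count() + " lines");
        }
    }

    // Moves one file into <dir>/processed, creating that folder if needed.
    static Path moveToProcessed(Path file) throws IOException {
        Path processedDir = file.resolveSibling("processed");
        Files.createDirectories(processedDir); // no error if it already exists
        return Files.move(file, processedDir.resolve(file.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
    }

    // Each file is fully processed (and its stream closed) before it is moved.
    static void importAll(Path dir) throws IOException {
        for (Path file : listCsvFiles(dir)) {
            processFile(file);
            moveToProcessed(file);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("csv-import");
        Files.writeString(dir.resolve("a.csv"), "keyword\nfoo\n");
        importAll(dir);
    }
}
```

With this split, the job's execute method would only need to obtain the directory and call importAll.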

Upvotes: 2

Reporter

Reputation: 3948

Note this line in the error message:

Caused by: java.lang.RuntimeException: java.nio.file.FileSystemException: C:\csv\nov\07_06_26.csv -> C:\csv\nov\07_06_26.csv\processed: The process cannot access the file because it is being used by another process

I think you want to move the file from C:\csv\nov to C:\csv\nov\processed, so you have to change the following line:

Path copied = Paths.get(file.getAbsolutePath() + "/processed");

to

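Path copied = Paths.get(file.getParent(), "processed", file.getName());

because file.getAbsolutePath() returns the complete path, including the file name. The target must also end with the file name; otherwise the move targets the processed directory itself.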

Upvotes: 1

rzwitserloot

Reputation: 103254

new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)

This part creates a resource. A resource is an object that represents an underlying heavy thing - a thing that you can have very few of. In this case, it represents an underlying OS file handle.

You must always safely close these. There are really only 2 ways to do it correctly:

  • Use try-with-resources
  • Save it to a field, and make your class AutoCloseable so that code using instances of this class can use try-with-resources

try (var br = new FileReader(file, StandardCharsets.UTF_16)) {
  List<CsvLine> beans = new CsvToBeanBuilder(br)
                    .....
                    .build()
                    .parse();
}

Is the answer.
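
The second option in the list above (holding the resource in a field and implementing AutoCloseable) could look roughly like the following sketch. The class CsvSource is hypothetical, and plain line reading with UTF-8 stands in for the CSV parsing:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: owns a reader as a field and exposes close() to its callers.
final class CsvSource implements AutoCloseable {

    private final BufferedReader reader;

    CsvSource(Path file) throws IOException {
        this.reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
    }

    // Reads the remaining lines; real code would hand the reader to a CSV parser.
    List<String> readLines() throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    @Override
    public void close() throws IOException {
        reader.close(); // releases the underlying file handle
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".csv");
        Files.writeString(file, "keyword\nfoo\n");
        try (CsvSource source = new CsvSource(file)) {
            System.out.println(source.readLines());
        }
        // The handle is released here, so the file can be moved or deleted.
        Files.delete(file);
    }
}
```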

Upvotes: 3
