CodeGame

Reputation: 1

Flat file Item reader, custom record delimiter

I need to parse a flat file in the following format:

column1|column2|column3$#
data1|data2|data3$#

Where:

| - field delimiter (pipe)
$# - record delimiter

Flat file item reader: I tried a custom record separator policy in which I override isEndOfRecord() and, as in SuffixRecordSeparatorPolicy, provide setSuffix(), but with no luck. The reader does not recognize $# as the record delimiter, and I get a FlatFileParseException.

I have also found the univocity parser, which supports a custom record delimiter. However, I am not sure how to apply its CsvParserSettings in my flat file reader method.

private CsvParser csvParserSetting(BeanListProcessor<Employee> rowProcessor) {
    CsvParserSettings settings = new CsvParserSettings();
    settings.getFormat().setLineSeparator("$#");//$#
    settings.getFormat().setDelimiter("|");//|
    settings.setIgnoreLeadingWhitespaces(true);
    settings.setNumberOfRowsToSkip(1);
    settings.setProcessor(rowProcessor);
    CsvParser parser = new CsvParser(settings);
    return parser;
}

@Bean
@StepScope
public FlatFileItemReader<Employee> myReader() throws FileNotFoundException {
    BeanListProcessor<Employee> rowProcessor = new BeanListProcessor<Employee>(Employee.class);
    CsvParser parser = csvParserSetting(rowProcessor);
    Request request = MappingUtil.requestMap.get("myRequest");
    InputStream inputStream = awsClient.getInputStreamObject(request.getFileKeyPath());
    CustomRecordSeparatorPolicy customRecordSeparatorPolicy = new CustomRecordSeparatorPolicy();
    // customRecordSeparatorPolicy.isEndOfRecord(record)
    FlatFileItemReader<Employee> reader = new FlatFileItemReader<>();
    reader.setResource(new InputStreamResource(inputStream));

    reader.setName("filreader");
    reader.setLinesToSkip(1);
    // customRecordSeparatorPolicy.setSuffix("$#");
    // reader.setRecordSeparatorPolicy(customRecordSeparatorPolicy);

    // reader.setRecordSeparatorPolicy(recordSeparatorPolicy);
    reader.setLineMapper(new DefaultLineMapper<Employee>() {{
        setLineTokenizer(new DelimitedLineTokenizer() {{
            setNames(MyConstats.FIELDS);
            setDelimiter("|");
        }});
        setFieldSetMapper(new BeanWrapperFieldSetMapper<Employee>() {{
            setTargetType(Employee.class);
        }});
    }});
    return reader;
}

import org.springframework.batch.item.file.separator.RecordSeparatorPolicy;

public class CustomSuffixRecordSeparatorPolicy implements RecordSeparatorPolicy {

    public static final String DEFAULT_SUFFIX = "$#";
    private String suffix = DEFAULT_SUFFIX;
    private boolean ignoreWhitespace = false;

    public void setSuffix(String suffix) {
        this.suffix = suffix;
    }

    public void setIgnoreWhitespace(boolean ignoreWhitespace) {
        this.ignoreWhitespace = ignoreWhitespace;
    }

    /* Earlier attempt based on the field count:
    @Override
    public boolean isEndOfRecord(String record) {
        int fieldCount = record.split("|").length;
        // String recordvalue[] = record.split("\\|");
        if (fieldCount == 126) {
            return true;
        } else {
            return false;
        }
    }*/

    // A record is complete once the accumulated text ends with the suffix.
    @Override
    public boolean isEndOfRecord(String line) {
        if (line == null) {
            return true;
        }
        String trimmed = ignoreWhitespace ? line.trim() : line;
        return trimmed.endsWith(suffix);
    }

    // Strips the trailing suffix from the completed record.
    @Override
    public String postProcess(String record) {
        if (record == null) {
            return null;
        }
        int end = record.lastIndexOf(suffix);
        // Guard: substring(0, -1) would throw if the suffix is absent.
        return end >= 0 ? record.substring(0, end) : record;
    }

    @Override
    public String preProcess(String record) {
        return record;
    }
}

header1|header2|header3$#
value1|value2|value3$#value11|value22|value33

header1|header2|header3$#value1|value2|value3$#value11|value22|value33

header1|header2|header3$#
value1|value2|value3$#
value11|value22|value33$#

In the first iteration, it parses the header line correctly. On the second read, it takes the line value1|value2|value3$#value11|value22|value33 as a whole, so the records are not split and cannot be told apart one by one.

Finally, it fails in this method of FlatFileItemReader:

private String applyRecordSeparatorPolicy(String line) throws IOException {
    String record = line;
    while (line != null && !recordSeparatorPolicy.isEndOfRecord(record)) {
        line = this.reader.readLine();
        if (line == null) {
            if (StringUtils.hasText(record)) {
                // A record was partially complete since it hasn't ended but
                // the line is null
                throw new FlatFileParseException("Unexpected end of file before record complete", record, lineCount);
            }
            else {
                // Record has no text but it might still be post processed
                // to something (skipping preProcess since that was already
                // done)
                break;
            }
        }
        else {
            lineCount++;
        }
        record = recordSeparatorPolicy.preProcess(record) + line;
    }
    return recordSeparatorPolicy.postProcess(record);

}
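
The loop above only ever appends further physical lines (from this.reader.readLine()) to the current record; it never splits one line into several records, which would explain why value1|value2|value3$#value11|value22|value33 comes back as a single record. One possible way around this, sketched here with only the JDK, is to split the input on the $# delimiter directly and ignore where the line breaks fall. RecordSplitter and readRecords are hypothetical names for illustration, not part of Spring Batch or univocity, and the sketch assumes the whole file fits in memory.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spring Batch or univocity.
class RecordSplitter {

    /**
     * Splits the whole input on a literal record delimiter, regardless of
     * where physical line breaks fall. Reads everything eagerly into a list.
     */
    static List<String> readRecords(Reader input, String recordDelimiter) {
        // Quote the delimiter so "$" is not treated as a regex anchor, and
        // swallow one optional line break directly after it.
        Pattern delimiter = Pattern.compile(Pattern.quote(recordDelimiter) + "\\R?");
        List<String> records = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(delimiter);
            while (scanner.hasNext()) {
                String record = scanner.next();
                if (!record.trim().isEmpty()) { // skip blank records
                    records.add(record);
                }
            }
        }
        return records;
    }
}
```

Each returned string is one record without the $# suffix, so it could then be handed to a pipe-delimited tokenizer. Whether this is wrapped in a custom ItemReader or used to rewrite the file with one record per line before the FlatFileItemReader step is a design choice this sketch leaves open.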

This is the end-of-record method I am trying now. It fails when the headers and the values are on the same line (header1|header2...|$#values|...). In my case the file has 126 headers, then $#, then 126 values, then $#, then 126 values, and so on.

private int getPipeCount(String s) {
    String tmp = s;
    int index = -1;
    int count = 0;
    while ((index = tmp.indexOf("|")) != -1) {
        tmp = tmp.substring(index + 1);
        count++;
    }
    return count;
}

public boolean isEndOfRecord(String line) {
    return getPipeCount(line) == 126;
}
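
A small sketch may help show why a count-based check is fragile here. A record with n fields contains n - 1 pipes unless every record ends with a trailing pipe, and when two records share a physical line the count covers both of them, so it can jump past the expected value without ever equalling it. FieldCounter and countFields are hypothetical names for illustration only. Note also that String.split takes a regular expression, so a bare "|" is read as an alternation rather than a literal pipe; Pattern.quote avoids that.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class FieldCounter {

    /** Counts pipe-delimited fields, keeping trailing empty fields. */
    static int countFields(String text) {
        // Pattern.quote makes "|" a literal; limit -1 keeps trailing empties.
        return text.split(Pattern.quote("|"), -1).length;
    }
}
```

With the three-column sample data, a single record yields 3 fields, while the line value1|value2|value3$#value11|value22|value33 yields 5, because value3$#value11 is counted as one field.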

Upvotes: 0

Views: 1049

Answers (0)
