Emax

Reputation: 51

How can you create a map or list of substring index positions in Java?

I am parsing many lines from a text file. The lines are fixed-width, but the layout depends on how the line begins (for example "0301...."). Other lines begin with 11, 34, etc., and each prefix means the line is split differently.

Example: if the line starts with "03", it is split like this:

name = line.substring(2, 10);
surname = line.substring(11, 21);
id = line.substring(22, 34);
address = line.substring(35, 46);

Another example: if the line starts with "24", it is split like this:

name = line.substring(5, 15);
salary = line.substring(35, 51);
empid = line.substring(22, 34);
department = line.substring(35, 46);

So I have many substrings assigned to many strings, which are then written to a new CSV file.

My question: is there an easy way to store the coordinates (indexes) of each substring and refer to them later? For example:

name = (2,10);
surname = (11,21);

... etc.

Or is there an alternative to using substrings? Thank you!

Upvotes: 0

Views: 938

Answers (3)

SKumar

Reputation: 2030

We can also use regex patterns and streams to achieve this.

Say we have a text file like this:

03SomeNameSomeSurname
24SomeName10000

The regex pattern uses named groups to attach an attribute name to each parsed piece of text. So, the pattern for the first line is:

^03(?<name>.{8})(?<surname>.{11})

The code is:

import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class FixedWidthRegexParser {

    public static void main(String[] args) {

        // Fixed-width file lines
        List<String> fileLines = List.of(
                "03SomeNameSomeSurname",
                "24SomeName10000"
        );
        // All regex patterns for the specific file
        List<Pattern> patternList = List.of(
                Pattern.compile("^03(?<name>.{8})(?<surname>.{11})"), // Regex for 03SomeNameSomeSurname
                Pattern.compile("^24(?<name>.{8})(?<salary>.{5})")); // Regex for 24SomeName10000

        // Pattern for finding the group names inside a regex
        Pattern groupNamePattern = Pattern.compile("\\?<([a-zA-Z0-9]*)>");

        List<List<String>> output = fileLines.stream().map(
                line -> patternList.stream() // Stream over the pattern list
                        .map(pattern -> pattern.matcher(line)) // Create a matcher for this line and pattern
                        .filter(matcher -> matcher.find()) // Keep only the matchers that actually match
                        .map(matcher ->
                                groupNamePattern.matcher(matcher.pattern().toString()).results() // Find the group names in the regex
                                        .map(groupNameMatchResult -> groupNameMatchResult.group(1) + "=" + matcher.group(groupNameMatchResult.group(1))) // Build "groupName=matchedValue"
                                        .collect(Collectors.joining(","))) // Join the pairs with commas
                        .collect(Collectors.toList())
        ).collect(Collectors.toList());

        System.out.println(output);
    }
}

The output pairs each attribute name with its parsed value, as a list of strings for each input line:

[[name=SomeName,surname=SomeSurname], [name=SomeName,salary=10000]]
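
If the field names are known in advance, the named groups can also be read directly with Matcher.group(String), without scanning the regex text for group names. A minimal sketch, assuming only the "03" pattern from above; the class name NamedGroupDemo and the method parse03 are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupDemo {

    // Same "03" pattern as above: 8 characters of name, 11 of surname
    static final Pattern TYPE_03 = Pattern.compile("^03(?<name>.{8})(?<surname>.{11})");

    // Hypothetical helper: returns {name, surname}, or null if the line is not a "03" line
    static String[] parse03(String line) {
        Matcher m = TYPE_03.matcher(line);
        if (!m.find()) {
            return null;
        }
        // Named groups are looked up by the name given inside (?<...>)
        return new String[] { m.group("name"), m.group("surname") };
    }

    public static void main(String[] args) {
        String[] fields = parse03("03SomeNameSomeSurname");
        System.out.println(fields[0] + "," + fields[1]); // prints SomeName,SomeSurname
    }
}
```

This trades the generic group-name discovery for simpler code, at the cost of one small method per line type.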

Upvotes: 0

Gryphon

Reputation: 384

You could try something like this. I'll leave the bounds checking and optimization to you, but as a first pass...

public static void main( String[] args ) {

    Map<String, Map<String,IndexDesignation>> substringMapping = new HashMap<>();

    // Put all the designations of how to map here

    substringMapping.put( "03", new HashMap<>());
    substringMapping.get( "03" ).put( "name", new IndexDesignation(2,10));
    substringMapping.get( "03" ).put( "surname", new IndexDesignation(11,21));

    // This determines which mapping value to use
    Map<String,IndexDesignation> indexDesignationMap = substringMapping.get(args[0].substring(0,2));

    // This holds the results
    Map<String, String> resultsMap = new HashMap<>();

    // Make sure we actually have a map to use
    if ( indexDesignationMap != null ) {
        // Now take this particular map designation and turn it into the resulting map of name to values

        for ( Map.Entry<String,IndexDesignation> mapEntry : indexDesignationMap.entrySet() ) {
            resultsMap.put(mapEntry.getKey(), args[0].substring(mapEntry.getValue().startIndex,
                    mapEntry.getValue().endIndex));
        }
    }

    // Print out the results (and you can assign to another object here as needed)
    System.out.println( resultsMap );
}

// Could also just use a list of two elements instead of this
static class IndexDesignation {
    int startIndex;
    int endIndex;
    public IndexDesignation( int startIndex, int endIndex ) {
        this.startIndex = startIndex;
        this.endIndex = endIndex;
    }
}
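
Since the question ends with writing CSV, here is a rough variation on the same idea that keeps the fields in a fixed order and adds the bounds checking mentioned above. It is only a sketch: the names FixedWidthLayouts, Field and toCsv are invented for this example, only the "03" layout from the question is filled in, and values are not CSV-escaped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FixedWidthLayouts {

    // Hypothetical helper type: a field name plus its substring bounds
    static final class Field {
        final String name;
        final int start; // inclusive
        final int end;   // exclusive, same convention as String.substring

        Field(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    // Line prefix -> ordered list of fields (a List keeps the CSV column order stable)
    static final Map<String, List<Field>> LAYOUTS = new HashMap<>();
    static {
        LAYOUTS.put("03", List.of(
                new Field("name", 2, 10),
                new Field("surname", 11, 21),
                new Field("id", 22, 34),
                new Field("address", 35, 46)));
        // Other prefixes ("24", "11", ...) would be registered the same way
    }

    // Returns the trimmed field values joined by commas, or null for an unknown prefix
    static String toCsv(String line) {
        if (line.length() < 2) {
            return null;
        }
        List<Field> layout = LAYOUTS.get(line.substring(0, 2));
        if (layout == null) {
            return null;
        }
        List<String> values = new ArrayList<>();
        for (Field f : layout) {
            // Clamp the bounds so a short line does not throw
            int start = Math.min(f.start, line.length());
            int end = Math.min(f.end, line.length());
            values.add(line.substring(start, end).trim());
        }
        return String.join(",", values);
    }

    public static void main(String[] args) {
        // Build a padded sample line that matches the "03" offsets
        String line = String.format("03%-8s %-10s %-12s %-11s", "John", "Smith", "A123", "Main St");
        System.out.println(toCsv(line)); // prints John,Smith,A123,Main St
    }
}
```

Adding a new line type then means adding one more entry to LAYOUTS rather than writing another block of substring calls.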

Upvotes: 1

Aziz Sonawalla

Reputation: 2502

Create a class called Line and store these objects rather than the string:

class Line {

  int[] name;
  int[] surname;
  int[] id;
  int[] address;

  String line;

  public Line(String line) {
    this.line = line;

    String startCode = line.substring(0, 2); // two-character prefix such as "03" or "24"
    switch(startCode) {
      case "03":
        this.name = new int[]{2, 10};
        this.surname = new int[]{11, 21};
        this.id = new int[]{22, 34};
        this.address = new int[]{35, 46};
        break;
      case "24":
        // same thing with different indices
        break;
      // add more cases
    }
  }

  public String getName() {
    return this.line.substring(this.name[0], this.name[1]);
  }

  public String getSurname() {
    return this.line.substring(this.surname[0], this.surname[1]);
  }

  public String getId() {
    return this.line.substring(this.id[0], this.id[1]);
  }

  public String getAddress() {
    return this.line.substring(this.address[0], this.address[1]);
  }
}

Then:

String line = "03 ...";

Line parsed = new Line(line);
parsed.getName();
parsed.getSurname();
...

If you're going to retrieve the name, surname, etc. multiple times from the same Line object, you can cache each value the first time it is requested so that substring is not called repeatedly.
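
One way the caching could look, shown for the name field only. This is a sketch, not part of the class above: CachedLine is a made-up name and the bounds are the "03" ones from the question.

```java
public class CachedLine {

    private final String line;
    private final int[] name = {2, 10}; // "03" bounds for name, taken from the question
    private String cachedName;          // stays null until getName() is first called

    public CachedLine(String line) {
        this.line = line;
    }

    public String getName() {
        if (cachedName == null) {
            // Only the first call pays for substring; later calls reuse the stored value
            cachedName = line.substring(name[0], name[1]);
        }
        return cachedName;
    }

    public static void main(String[] args) {
        // Sample line padded so that the name occupies indexes 2 to 9
        CachedLine parsed = new CachedLine(String.format("03%-8sSmith", "John"));
        System.out.println("[" + parsed.getName() + "]");
        System.out.println(parsed.getName() == parsed.getName()); // prints true: same cached object
    }
}
```

The same pattern would repeat for surname, id and address.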

Upvotes: 1
