Tapashi Talukdar

Reputation: 77

String parsing with space

I am reading a file where each line looks something like this:

EmpId:6428 EmpName:Josh Classes:[Math, English, Bio, Art, comp]

I want the EmpId, EmpName and Classes. I am splitting the line on spaces, which also splits the list of classes, so for the classes I only get Classes:[Math, instead of the entire list. How can I split the line so that the class list stays intact? Thanks

private static class EmpResource {
        private String empId;
        private String empName;
        private List<String> classes;

        public EmpResource(final String line) {
            String[] strs = line.split(" ");
            this.empId = strs[0].split(":")[1];
            this.empName = strs[1].split(":")[1];
            String classes = strs[2].split(":")[1];
            convertToClassList(classes);
        }

        void setClasses(List<String> classes) {
            this.classes = classes;
        }

        private void convertToClassList(String classes) {
            if (!"null".equals(classes)) {
                // Strip the surrounding brackets, then split the remaining list.
                String replace = classes.replaceAll("^\\[|]$", "");
                setClasses(new ArrayList<>(Arrays.asList(replace.split(", "))));
            }
        }
    }

Expected Result:

empId 6428
empName Josh 
List<String> classes [Math,English,Bio,Art,comp]

Actual Result:

empId 6428
empName Josh
List<String> classes [Math,

Upvotes: 0

Views: 118

Answers (4)

nishantc1527

Reputation: 376

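If line wasn't final, you could do line = line.replace(", ", ",");, but since it is, you need a temp string. Strings are immutable, so replace already returns a new string and no explicit copy is needed. Replace with "," rather than "" so that the commas separating the classes are kept.

String temp = line.replace(", ", ",");

Then you can do what you did before without any trouble, since there are no other spaces to interfere. Note that the classes are now separated by "," rather than ", ", so convertToClassList has to split on "," as well.

String[] strs = temp.split(" "); // Make sure it's temp, since temp is the one you changed.
this.empId = strs[0].split(":")[1];
this.empName = strs[1].split(":")[1];
String classes = strs[2].split(":")[1]; // "[Math,English,Bio,Art,comp]"
convertToClassList(classes);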

Upvotes: 0

DodgyCodeException

Reputation: 6123

Use String.split with a limit.

Then you only split the line into 3 strings, so the classes will all be together in the last string.

String line = "EmpId:6428 EmpName:Josh Classes:[Math, English, Bio, Art, comp]";
String[] strs = line.split(" ", 3);
System.out.println(strs[2]);

Output:

Classes:[Math, English, Bio, Art, comp]
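To get from that last string to the list the question asks for, something along these lines might work. This is only a minimal sketch: the class name LimitSplitSketch and the helper valueAfterColon are made up for illustration, and it assumes the name contains no space and the list is never empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LimitSplitSketch {

    // Hypothetical helper: returns the text after the first ':' of a "Label:value" token.
    static String valueAfterColon(String token) {
        return token.substring(token.indexOf(':') + 1);
    }

    // Assumes the layout "EmpId:<id> EmpName:<single word> Classes:[a, b, c]".
    static List<String> parseClasses(String line) {
        // The limit of 3 stops splitting after the second space,
        // so the whole class list stays in parts[2].
        String[] parts = line.split(" ", 3);
        String bracketed = valueAfterColon(parts[2]);
        // Strip the leading '[' and the trailing ']'.
        String inner = bracketed.replaceAll("^\\[|]$", "");
        // Split on a comma followed by optional whitespace.
        return new ArrayList<>(Arrays.asList(inner.split(",\\s*")));
    }

    public static void main(String[] args) {
        String line = "EmpId:6428 EmpName:Josh Classes:[Math, English, Bio, Art, comp]";
        System.out.println(parseClasses(line)); // prints [Math, English, Bio, Art, comp]
    }
}
```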

As FedericoklezCulloca pointed out in the comment, the above doesn't work if the name contains a space (e.g. first name last name). A more robust way to do it is to look for the specific labels, as in the following code which does it using a regex:

private static final Pattern LINE_PATTERN =
        Pattern.compile("EmpId:(.*) EmpName:(.*) Classes:\\[(.*)\\]");

public void test() {
    String line = "EmpId:6428 EmpName:Josh Adams Classes:[Math, English, Bio, Art, comp]";
    Matcher lineMatcher = LINE_PATTERN.matcher(line);
    if (lineMatcher.matches()) {
        System.out.println("EmpId   = " + lineMatcher.group(1));
        System.out.println("Name    = " + lineMatcher.group(2));
        System.out.println("Classes = " + lineMatcher.group(3));
    }
}

Output:

EmpId   = 6428
Name    = Josh Adams
Classes = Math, English, Bio, Art, comp

Upvotes: 1

Conffusion

Reputation: 4475

Apparently you know each line has an EmpId, EmpName and Classes part, so why not use a single regex which matches the whole line:

public static void main(String[] args) {
    Pattern p=Pattern.compile("EmpId:(.*) EmpName:(.*) Classes:\\[(.*)\\]");
    String input="EmpId:6428 EmpName:Josh Classes:[Math, English, Bio, Art, comp]";
    Matcher m=p.matcher(input);
    if(m.matches())
    {
        System.out.println("empId:"+m.group(1));
        System.out.println("empName:"+m.group(2));
        System.out.println("Classes:"+m.group(3));
        String[] classes=m.group(3).split(", ");
        System.out.println("classes:'"+classes[1]+"'");

    } else
        System.err.println("no match");
}

Upvotes: 1

Mena

Reputation: 48404

As indicated in the comment, a working but dirty fix to avoid splitting inside your comma-separated "classes" list would be to make the initial split conditional on no comma preceding the whitespace.

For instance, you can use a negative lookbehind, (?<!,), to split only if the space isn't preceded by a comma.

Example

String test = "EmpId:6428 EmpName:Josh Classes:[Math, English, Bio, Art, comp]";
System.out.println(test.split("(?<!,) ")[2]);

Output

Classes:[Math, English, Bio, Art, comp]

Generally speaking though, you might want to consider implementing your own parser if the syntax gets more complicated.

Regex can only take you so far before it backfires.
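As a rough idea of what a hand-written parser could look like, here is a minimal sketch that locates the three labels with indexOf instead of splitting. The names EmpLineParser and Emp are invented for this example, and it assumes the labels always appear in this order and that the line ends with the closing bracket of the list.

```java
import java.util.ArrayList;
import java.util.List;

public class EmpLineParser {

    // Illustrative value holder; not taken from the question's code.
    static final class Emp {
        final String id;
        final String name;
        final List<String> classes;

        Emp(String id, String name, List<String> classes) {
            this.id = id;
            this.name = name;
            this.classes = classes;
        }
    }

    static Emp parse(String line) {
        final String idLabel = "EmpId:";
        final String nameLabel = " EmpName:";
        final String classesLabel = " Classes:[";

        // Find where each labelled part starts.
        int nameAt = line.indexOf(nameLabel);
        int classesAt = line.indexOf(classesLabel, nameAt + 1);
        int close = line.lastIndexOf(']');
        if (!line.startsWith(idLabel) || nameAt < 0 || classesAt < nameAt
                || close < classesAt + classesLabel.length()) {
            throw new IllegalArgumentException("Unrecognized line: " + line);
        }

        // Everything between two labels belongs to the first one,
        // so a name such as "Josh Adams" keeps its space.
        String id = line.substring(idLabel.length(), nameAt);
        String name = line.substring(nameAt + nameLabel.length(), classesAt);
        String inner = line.substring(classesAt + classesLabel.length(), close);

        List<String> classes = new ArrayList<>();
        for (String item : inner.split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {   // an empty list "[]" yields no entries
                classes.add(trimmed);
            }
        }
        return new Emp(id, name, classes);
    }

    public static void main(String[] args) {
        Emp emp = parse("EmpId:6428 EmpName:Josh Adams Classes:[Math, English, Bio, Art, comp]");
        System.out.println(emp.id + " / " + emp.name + " / " + emp.classes);
    }
}
```

Unlike the split-based versions, a malformed line fails with an explicit exception here rather than an ArrayIndexOutOfBoundsException, which tends to be easier to diagnose when the file format drifts.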

Upvotes: 1
