dwwilson66

Reputation: 7074

Java's split method has leading blank records that I can't suppress

I'm parsing an input file that has multiple keywords preceded by a +. The + is my delimiter in a split, with individual tokens being written to an array. The resulting array includes a blank record in the [0] position.

I suspect that split is taking the "nothing" before the first token and populating project[0], then moving on to subsequent tokens which all show up as correct.

The documentation says that this method has a limit parameter:

If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

I also found this post on SO, but the proposed solution of removing the leading delimiter (I used substring(1) to create a temp field) yielded the same blank record for me.

Code and output appear below. Any tips would be appreciated.

import java.util.regex.*;
import java.io.*;
import java.nio.file.*;
import java.lang.*;
//
public class eadd
{
    public static void main(String args[])
    {
        String projStrTemp = "";
        String projString = "";
        String[] project = new String[10];
        int contextSOF = 0;
        int projStringSOF = 0;
        int projStringEOF = 0;
       //
        String inputLine = "foo foofoo foo foo @bar.com +foofoofoo +foo1 +foo2 +foo3";
        contextSOF = inputLine.indexOf("@");
        int tempCalc = (inputLine.indexOf("+")) ;
        if (tempCalc == -1) {
            projStrTemp = "+Uncategorized";
        } else {
            projStringSOF = inputLine.indexOf("+",contextSOF);
            projStrTemp = inputLine.trim().substring(projStringSOF).trim();
        }
        project = projStrTemp.split("\\+");
       //
        System.out.println(projStrTemp+"\n"+projString);
        for (int j = 0; j < project.length; j++) {
            System.out.println("Project[" + j + "] " + project[j]);
        }
    }
}

CONSOLE OUTPUT: 
+foofoofoo +foo1 +foo2 +foo3

Project[0]
Project[1] foofoofoo
Project[2] foo1
Project[3] foo2
Project[4] foo3

Upvotes: 0

Views: 875

Answers (3)

Petr

Reputation: 63419

One simple solution would be to remove the first + from the string. This way, it won't split before the first keyword:

projStrTemp = inputLine.trim().substring(projStringSOF + 1).trim();

Edit: Personally, I'd go for a more robust solution using regular expressions. This finds all keywords preceded by +. It also requires that the + is either preceded by whitespace or at the start of the line, so that expressions like 3+4 aren't matched.

String inputLine = "+foo 3+4 foofoo foo foo @bar.com +foofoofoo +foo1 +foo2 +foo3";
Pattern re = Pattern.compile("(\\s|^)\\+(\\w+)");
Matcher m = re.matcher(inputLine);
while (m.find()) {
    System.out.println(m.group(2));
}
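
If you need the keywords in a collection rather than printed, the same idea can be wrapped in a small helper. This is only a sketch: the class name KeywordExtractor and the method extractKeywords are hypothetical names of mine, and the "Uncategorized" fallback is an assumption modeled on the "+Uncategorized" default in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is illustrative only.
class KeywordExtractor {
    // A '+' counts as a keyword marker only at the start of the input
    // or directly after whitespace, so "3+4" is not matched.
    private static final Pattern KEYWORD = Pattern.compile("(?:\\s|^)\\+(\\w+)");

    static List<String> extractKeywords(String inputLine) {
        List<String> keywords = new ArrayList<>();
        Matcher m = KEYWORD.matcher(inputLine);
        while (m.find()) {
            keywords.add(m.group(1)); // group 1 is the word after the '+'
        }
        if (keywords.isEmpty()) {
            // Assumption: mirror the question's "+Uncategorized" default.
            keywords.add("Uncategorized");
        }
        return keywords;
    }

    public static void main(String[] args) {
        String inputLine = "+foo 3+4 foofoo foo foo @bar.com +foofoofoo +foo1 +foo2 +foo3";
        // prints [foo, foofoofoo, foo1, foo2, foo3]
        System.out.println(extractKeywords(inputLine));
    }
}
```

Because the matcher only ever captures non-empty words, there is no leading blank element to suppress.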

Upvotes: 1

Lo Juego

Reputation: 1325

+foofoofoo +foo1 +foo2 +foo3

The split method splits the string around matches of the given delimiter (+), so the first element of the array is an empty string, giving five elements in total. If you want the text that comes before the first +, read it from inputLine rather than from the processed projStrTemp, which is the substring starting at (and including) the first +.
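
A small sketch may make the asymmetry clearer: with the default limit of 0, split discards trailing empty strings but keeps a leading one. The class name SplitEmptyStrings and the helper splitOnPlus are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; the name is illustrative only.
class SplitEmptyStrings {
    // Thin wrapper so the limit argument is explicit in each call.
    static String[] splitOnPlus(String s, int limit) {
        return s.split("\\+", limit);
    }

    public static void main(String[] args) {
        // Leading delimiter: the empty string before the first '+' is kept.
        System.out.println(Arrays.toString(splitOnPlus("+a+b", 0)));  // [, a, b]
        // Trailing delimiter, limit 0: the trailing empty string is dropped.
        System.out.println(Arrays.toString(splitOnPlus("a+b+", 0)));  // [a, b]
        // Negative limit: the trailing empty string is kept.
        System.out.println(Arrays.toString(splitOnPlus("a+b+", -1))); // [a, b, ]
    }
}
```

So the limit parameter only affects what happens at the end of the array; it does not remove the blank element at index 0.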

Upvotes: 0

ᴇʟᴇvᴀтᴇ

Reputation: 12781

Change:

projStrTemp = inputLine.trim().substring(projStringSOF).trim();

to:

projStrTemp = inputLine.trim().substring(projStringSOF + 1).trim();

If you have a leading delimiter, your array will start with a blank element. It might be worthwhile for you to experiment with split() without all the other baggage.

public static void main(String[] args) {
    String s = "an+example";

    String[] items = s.split("\\+");
    for (int i = 0; i < items.length; i++) {
        System.out.println(i + " = " + items[i]);
    }
}

With String s = "an+example"; it produces:

0 = an
1 = example

Whereas String s = "+an+example"; produces:

0 = 
1 = an
2 = example
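
Putting the two observations together, here is a minimal sketch of a helper that drops one leading + before splitting. ProjectSplitter and splitProjects are hypothetical names, and splitting on "\\s*\\+\\s*" (so that tokens come back without surrounding spaces) is my own assumption about what you want, not something your original code does.

```java
import java.util.Arrays;

// Hypothetical helper class; the name is illustrative only.
class ProjectSplitter {
    static String[] splitProjects(String projStr) {
        String s = projStr.trim();
        // Drop a single leading delimiter so split() has nothing
        // empty to report before the first token.
        if (s.startsWith("+")) {
            s = s.substring(1).trim();
        }
        // "".split(...) would return one empty element, so handle it here.
        if (s.isEmpty()) {
            return new String[0];
        }
        // Assumption: also swallow whitespace around each '+'.
        return s.split("\\s*\\+\\s*");
    }

    public static void main(String[] args) {
        String[] project = splitProjects("+foofoofoo +foo1 +foo2 +foo3");
        // prints [foofoofoo, foo1, foo2, foo3]
        System.out.println(Arrays.toString(project));
    }
}
```

Note that this splits on every +, so it is only suitable for a string that has already been reduced to the keyword list.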

Upvotes: 2
