Reputation: 69
I am trying to split the i-th element of a String array using a regex that matches four or more spaces.
I found a lot of information here and on other sites, and came up with:
String[] parts = titlesAuthor[i].split("    ");
The split should happen between the title and the author's name. That gap either contains four or more spaces or does not exist at all.
Example:
titlesAuthor[0] = Investigational drugs for autonomic dysfunction in Parkinson's disease          Perez-Lloret S
After running the above split, parts[0] comes up empty and parts[1] has the complete string.
Please help!
Code:
for (int i = 0; i < nodes.getLength(); i++) {
    Element element = (Element) nodes.item(i);
    NodeList title = element.getElementsByTagName("TEXT");
    line = (Element) title.item(0);
    titlesAuthor[i] = getCharacterDataFromElement(line);
    System.out.println(titlesAuthor[i]);
    parts = titlesAuthor[i].split("    "); // regex: four literal spaces
    System.out.println(parts[0]);
    System.out.println(parts[1]);
}
Upvotes: 0
Views: 178
Reputation: 19237
To match four or more spaces, put a + quantifier after the fourth space, so that the last space can repeat:
String[] parts = titlesAuthor[i].split("    +");
or:
String[] parts = titlesAuthor[i].split(" {4,}");
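As a minimal sketch of the `{4,}` variant, the snippet below splits a sample line on any run of four or more spaces. The class name `SplitOnSpaceRuns`, the helper `splitTitleAuthor`, and the assumption that the title and author are separated by ten spaces are illustrative only and do not come from the question's code.

```java
import java.util.Arrays;

class SplitOnSpaceRuns {
    // Hypothetical helper: splits on any run of four or more consecutive spaces.
    static String[] splitTitleAuthor(String line) {
        return line.split(" {4,}");
    }

    public static void main(String[] args) {
        // Assumed input: title and author separated by ten spaces.
        String line = "Investigational drugs for autonomic dysfunction in Parkinson's disease"
                + "          "
                + "Perez-Lloret S";
        String[] parts = splitTitleAuthor(line);
        // The whole run of spaces is consumed by a single match,
        // so the array holds exactly the title and the author.
        System.out.println(Arrays.toString(parts));
    }
}
```

If a line contains no run of four or more spaces, `split` returns a one-element array holding the whole line, so `parts[1]` should only be read after checking `parts.length`.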
Update: it looks like your XML doesn't look exactly the way you think it does. In the code you provided, add:
System.out.println(i + ":" + titlesAuthor[i] + ";");
and you'll see some spaces or newlines at the beginning.
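A small sketch of why leading whitespace matters: when the string starts with a run of spaces that the regex matches, `split` returns an empty string as the first element. Calling `trim()` before splitting is one possible remedy. It is an assumption on my part, not something the question's code does, and the class name `TrimBeforeSplit`, the helper `splitTrimmed`, and the sample text are hypothetical.

```java
import java.util.Arrays;

class TrimBeforeSplit {
    // Hypothetical helper: strips leading/trailing whitespace (including
    // newlines) before splitting on runs of four or more spaces.
    static String[] splitTrimmed(String raw) {
        return raw.trim().split(" {4,}");
    }

    public static void main(String[] args) {
        // Assumed input: four leading spaces, as XML character data often has.
        String raw = "    Some title    Some Author";

        // Without trimming, the leading run matches and parts[0] is "".
        String[] untrimmed = raw.split(" {4,}");
        System.out.println(Arrays.toString(untrimmed));

        // With trimming, parts[0] is the title.
        String[] trimmed = splitTrimmed(raw);
        System.out.println(Arrays.toString(trimmed));
    }
}
```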
Upvotes: 0
Reputation: 704
In your example, your code splits wherever it finds four consecutive spaces. The String that you are splitting above has ten consecutive spaces between:
"disease Perez".
Thus, two splits happen inside that run of spaces. Pretend "#" is a space:
Investigational drugs for autonomic dysfunction in Parkinson's disease|SPLIT||SPLIT|##Perez-Lloret S
Your split will result in:
{[Investigational drugs for autonomic dysfunction in Parkinson's disease], [], [##Perez-Lloret S]}
because your code found two instances of four spaces. parts[1] is an empty string (not null) because there was nothing between the two splits.
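The behaviour described above can be reproduced with a short sketch. It assumes, as this answer does, that exactly ten spaces separate the two parts. The class name `FourSpaceSplit`, the helper `splitOnFourSpaces`, and the shortened sample text are made up for illustration.

```java
import java.util.Arrays;

class FourSpaceSplit {
    // Hypothetical helper: splits on exactly four literal spaces per match.
    static String[] splitOnFourSpaces(String line) {
        return line.split("    ");
    }

    public static void main(String[] args) {
        // Assumed input: ten spaces between the two parts.
        String line = "disease" + "          " + "Perez-Lloret S";
        String[] parts = splitOnFourSpaces(line);
        // Two matches of four spaces each consume eight of the ten spaces.
        // The element between the two matches is an empty string, and the
        // last element keeps the two leftover spaces.
        System.out.println(parts.length);
        System.out.println(Arrays.toString(parts));
    }
}
```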
Hope this helps!
Upvotes: 0