Luke Vo

Reputation: 20788

String.split to split data lines doesn't work correctly

I use VB.NET to create data for my game (for Android, written in Java). This is what it looks like:

5;0000000100011100010000000;2;2
5;1000001100010000000000000;0,1;0,1

where each line is a level. In VB.NET, I create each new line with the vbNewLine constant (I think its ASCII code is 13), then use IO.File.WriteAllText to write the data to the file.

In my game in Java, I use \n to split the levels:

String[] levelData = rawData.split("\n");

However, when processing the data, each levelData entry has a "new line" at the end. For example, levelData[0] is 5;00...2;2<new line>, which causes an Integer.parseInt exception. I then debugged and found this:

rawData.charAt(31) //It's a \r, not \n

So I changed the split call:

String[] levelData = rawData.split("\r");

But now levelData[1] is <newline>5....

What exactly do I have to do to solve this problem? And please explain how "new line" works in Java strings.

Upvotes: 1

Views: 1380

Answers (4)

mwuersch

Reputation: 256

Why don't you use a Scanner to read your file line by line instead of splitting it yourself?

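import java.io.File;
import java.util.Scanner;

// new Scanner(File) throws FileNotFoundException, so declare or catch it
Scanner sc = new Scanner(new File("levels.text"));
while (sc.hasNextLine()) {
  String nextLine = sc.nextLine(); // the line terminator is already stripped
  if (nextLine.length() > 0) { // you could even use Java regexes to validate the format of every line
    String[] levelElements = nextLine.split(";");
    // ...
  }
}
sc.close();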
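
As a rough sketch of why this sidesteps the problem: Scanner.nextLine() treats \r\n, a lone \n, and a lone \r as line terminators and returns the line without them. The class name ScannerLevels and the helper readLevels are made up for illustration, and the sample data is the two levels from the question joined with \r\n in memory instead of being read from a file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerLevels {
    // Hypothetical helper: collects the non-empty lines of rawData.
    // nextLine() returns each line without its terminator.
    static List<String> readLevels(String rawData) {
        List<String> levels = new ArrayList<>();
        try (Scanner sc = new Scanner(rawData)) {
            while (sc.hasNextLine()) {
                String line = sc.nextLine();
                if (line.length() > 0) {
                    levels.add(line);
                }
            }
        }
        return levels;
    }

    public static void main(String[] args) {
        // Sample data from the question, with Windows-style line endings.
        String rawData = "5;0000000100011100010000000;2;2\r\n"
                       + "5;1000001100010000000000000;0,1;0,1\r\n";
        for (String level : readLevels(rawData)) {
            String[] levelElements = level.split(";");
            System.out.println(levelElements.length + " fields: " + level);
        }
    }
}
```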

Upvotes: 2

user425367

Reputation:

Most probably the problem comes from the VB code you showed.

I create new line by vbNewLine constant (I think its ASCII code is 13)

First verify this for certain, then look up what code 13 is! Here is a general ASCII table.

Code 13 is a carriage return (CR) and is represented in Java as \r

Code 10 is a line feed (LF) and is represented in Java as \n

A good tip would be to read up a little on newlines. It's completely fu**ed up: Windows and Linux use different ways of representing a new line.

  • CR+LF: Microsoft Windows, DEC TOPS-10, RT-11 and most other early non-Unix and non-IBM OSes, CP/M, MP/M, DOS (MS-DOS, PC-DOS, etc.), Atari TOS, OS/2, Symbian OS, Palm OS
  • LF: Multics, Unix and Unix-like systems (GNU/Linux, AIX, Xenix, Mac OS X, FreeBSD, etc.), BeOS, Amiga, RISC OS, and others.
  • CR: Commodore 8-bit machines, Acorn BBC, TRS-80, Apple II family, Mac OS up to version 9 and OS-9
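
To check which of these conventions a given piece of text uses, something like the following could help. It is only a sketch: the class name LineEndings and the method detect are invented for this example, and it reports only the first line terminator it finds.

```java
public class LineEndings {
    // Hypothetical helper: names the first line terminator found in text.
    static String detect(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                // A CR immediately followed by LF is the Windows convention.
                boolean lfFollows = i + 1 < text.length() && text.charAt(i + 1) == '\n';
                return lfFollows ? "CR+LF" : "CR";
            }
            if (c == '\n') {
                return "LF";
            }
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println((int) '\r'); // prints 13
        System.out.println((int) '\n'); // prints 10

        // The two levels from the question, joined the way Windows would.
        String rawData = "5;0000000100011100010000000;2;2\r\n"
                       + "5;1000001100010000000000000;0,1;0,1";
        System.out.println(detect(rawData)); // prints CR+LF
    }
}
```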

Upvotes: 3

Rune FS

Reputation: 21752

vbNewLine is platform dependent. On Windows a newline consists of two characters, \r followed by \n, and not just \n.

Upvotes: 1

Naved

Reputation: 4138

I suppose that the vbNewLine constant puts both characters, "\r\n", at the end of each line, and hence one character is left over when you split on only one of them. Try splitting on both.
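
A minimal sketch of splitting on both, assuming the file really does use \r\n: String.split takes a regular expression, so "\r?\n" matches either \r\n or a lone \n. The class name SplitLevels and the helper splitLevels are made up for illustration. On Java 8 and later the regex "\\R" (any line break) should work as well.

```java
public class SplitLevels {
    // Hypothetical helper: splits on CR+LF or a lone LF.
    // split() also drops trailing empty strings, so a final line ending is harmless.
    static String[] splitLevels(String rawData) {
        return rawData.split("\r?\n");
    }

    public static void main(String[] args) {
        String rawData = "5;0000000100011100010000000;2;2\r\n"
                       + "5;1000001100010000000000000;0,1;0,1\r\n";

        // The symptoms from the question:
        System.out.println(rawData.split("\n")[0].endsWith("\r"));   // prints true
        System.out.println(rawData.split("\r")[1].startsWith("\n")); // prints true

        // Splitting on both characters leaves clean lines.
        String[] levelData = splitLevels(rawData);
        String[] fields = levelData[0].split(";");
        System.out.println(Integer.parseInt(fields[3])); // prints 2
    }
}
```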

Upvotes: 6
