cybertextron

Reputation: 10971

Parsing without ignoring whitespace - Java

I have the following string input (from a netstat -a command):

Proto RefCnt Flags       Type       State         I-Node   Path
unix  2      [ ]         DGRAM                    11453    /run/systemd/shutdownd
unix  2      [ ]         DGRAM                    7644     /run/systemd/notify
unix  2      [ ]         DGRAM                    7646     /run/systemd/cgroups-agent
unix  5      [ ]         DGRAM                    7657     /run/systemd/journal/socket
unix  14     [ ]         DGRAM                    7659     /dev/log
unix  3      [ ]         STREAM     CONNECTED     16620
unix  3      [ ]         STREAM     CONNECTED     16621

I'm trying to split each line of that string like this:

// lines is an array representing each line above
for (int i = 0; i < lines.length; i++) {
    String[] tokens = lines[i].split("\\s+");
}

I want tokens to be an array of 7 entries [Proto, RefCnt, Flags, Type, State, I-Node, Path]. Instead, the brackets under Flags are split into two separate tokens and the empty State column is dropped:

["unix", "2", "[", "]", "DGRAM", "11453", "/run/systemd/shutdownd"]

instead of

["unix", "2", "[]", "DGRAM", "", "11453", "/run/systemd/shutdownd"]

How can I fix my regex to produce the correct output?

Upvotes: 0

Views: 46

Answers (1)

Aleksandr Podkutin

Reputation: 2580

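Require at least two whitespace characters per separator, so the single space inside [ ] is no longer split. The upper bound of 16 matters as well: in your sample the gap across the empty State column is wider than 16 spaces, so it is consumed as two separators with an empty token between them. Note that the Flags token comes out as "[ ]" (with the space), not "[]". Try splitting like this: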

String[] tokens = lines[i].split("\\s{2,16}+");
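To check what this split produces, here is a minimal, self-contained sketch. The class name `NetstatSplitDemo` and the helper `splitColumns` are made up for illustration, and the sample row is rebuilt with an explicit 20-space gap between DGRAM and the I-Node value, which is an assumption read off the column alignment in the question.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the original question or answer.
class NetstatSplitDemo {

    // Split on runs of 2 to 16 whitespace characters (possessive quantifier).
    static String[] splitColumns(String line) {
        return line.split("\\s{2,16}+");
    }

    public static void main(String[] args) {
        // Assumption: 9 spaces after "[ ]" and 20 spaces across the empty State column.
        String line = "unix  2      [ ]" + " ".repeat(9) + "DGRAM"
                + " ".repeat(20) + "11453    /run/systemd/shutdownd";

        String[] tokens = splitColumns(line);

        // The 20-space gap is matched as 16 spaces plus 4 spaces,
        // which leaves an empty token where State would be.
        System.out.println(tokens.length);           // prints 7
        System.out.println(Arrays.toString(tokens)); // prints [unix, 2, [ ], DGRAM, , 11453, /run/systemd/shutdownd]
    }
}
```

This only works while the gap over an empty column stays between 18 and 32 spaces, so treat the bound of 16 as tuned to this particular output rather than as a general rule. `String.repeat` needs Java 11 or later.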

Alternatively, as @revo suggests, add lookarounds so that whitespace sitting directly between [ and ] is never treated as a separator:

String[] tokens = lines[i].split("(?<!\\[)\\s{2,16}+(?!\\])");
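Both patterns depend on the exact number of spaces in each gap. If that turns out to be too fragile, a different technique (not the regex split suggested above) is to slice each row at the offsets where the column names start in the header line. The sketch below assumes the header is always present, that every value starts under its header and fits within the column, and it uses a hypothetical class `NetstatColumns` with a hypothetical method `parseRow`.

```java
import java.util.Arrays;

// Hypothetical fixed-width parser; assumes values are aligned under the header names.
class NetstatColumns {

    static final String[] NAMES = {"Proto", "RefCnt", "Flags", "Type", "State", "I-Node", "Path"};

    // Cut the row at the positions where each column name starts in the header.
    static String[] parseRow(String header, String row) {
        String[] out = new String[NAMES.length];
        for (int i = 0; i < NAMES.length; i++) {
            int start = header.indexOf(NAMES[i]);
            int end = (i + 1 < NAMES.length) ? header.indexOf(NAMES[i + 1]) : row.length();
            // Clamp to the row length so short rows (no Path) yield empty strings.
            start = Math.min(start, row.length());
            end = Math.min(Math.max(end, start), row.length());
            out[i] = row.substring(start, end).trim();
        }
        return out;
    }

    public static void main(String[] args) {
        // Column widths are an assumption taken from the sample in the question.
        String fmt = "%-6s%-7s%-12s%-11s%-14s%-9s%s";
        String header = String.format(fmt, (Object[]) NAMES);
        String row = String.format(fmt, "unix", "2", "[ ]", "DGRAM", "", "11453", "/run/systemd/shutdownd");

        System.out.println(Arrays.toString(parseRow(header, row)));
    }
}
```

This always returns 7 entries, with an empty string for a blank State or a missing Path, but it breaks if a value is wider than its column and pushes the following values out of alignment.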

Upvotes: 1
