Reputation: 6292
I have a CSV file with the following data format:
123,"12.5","0.6","15/9/2012 12:11:19"
These numbers are:
I want to extract these data from the line.
I have tried the regular expression:
String line = "123,\"12.5\",\"0.6\",\"15/9/2012 12:11:19\"";
Pattern pattern = Pattern.compile("(\\W?),\"([\\d\\.\\-]?)\",\"([\\d\\.\\-]?)\",\"([\\W\\-\\:]?)\"");
Scanner scanner = new Scanner(line);
if(scanner.hasNext(pattern)) {
...
}else{
// Always ends up here
}
It looks like my pattern is not correct, as it always goes to the else branch. What did I do wrong? Can someone suggest a solution for this?
Many thanks.
Upvotes: 0
Views: 79
Reputation: 2423
This is a possible solution to your situation:
String line = "123,\"12.5\",\"0.6\",\"15/9/2012 12:11:19\"";
Pattern pattern = Pattern.compile("([0-9]+),\\\"([0-9.]+)\\\",\\\"([0-9.]+)\\\",\\\"([0-9/:\\s]+)\\\"");
Scanner scanner = new Scanner(line);
scanner.useDelimiter("\n");
if(scanner.hasNext(pattern)) {
MatchResult result = scanner.match();
System.out.println("1st: " + result.group(1));
System.out.println("2nd: " + result.group(2));
System.out.println("3rd: " + result.group(3));
System.out.println("4th: " + result.group(4));
}else{
System.out.println("There");
}
Note that ? means 0 or 1 occurrences, whereas + means 1 or more. Observe the use of 0-9 for digits; you can also use \d if you like. Because the date field contains a space, you must also change the scanner's delimiter, for example with scanner.useDelimiter("\n"); otherwise the default whitespace delimiter splits the line into two tokens.
The output of this snippet is:
1st: 123
2nd: 12.5
3rd: 0.6
4th: 15/9/2012 12:11:19
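If the Scanner is not otherwise needed, the same pattern can be applied to the line directly with a Matcher, which sidesteps the token delimiter entirely. This is a minimal sketch of that alternative, not the approach used in the snippet above; the class name LineMatcherSketch and the helper extract are made up for illustration, and the redundant escapes before the quotes have been dropped from the pattern.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineMatcherSketch {
    // Same four groups as the pattern above: integer, two decimals, date and time.
    private static final Pattern LINE = Pattern.compile(
            "([0-9]+),\"([0-9.]+)\",\"([0-9.]+)\",\"([0-9/:\\s]+)\"");

    // Hypothetical helper: returns the four captured fields,
    // or null if the whole line does not match the pattern.
    static String[] extract(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String line = "123,\"12.5\",\"0.6\",\"15/9/2012 12:11:19\"";
        // prints [123, 12.5, 0.6, 15/9/2012 12:11:19]
        System.out.println(Arrays.toString(extract(line)));
    }
}
```

Matcher.matches() requires the entire line to match, so a malformed line yields null instead of a partial result.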
Upvotes: 0
Reputation: 159754
Regular expressions are very cumbersome for this type of work.
I suggest using a CSV library such as OpenCSV instead.
The library can parse each line into a String array, and individual entries can then be parsed as required. Here is an OpenCSV example for the specific problem:
CSVReader reader = new CSVReader(new FileReader("yourfile.csv"));
String [] nextLine;
while ((nextLine = reader.readNext()) != null) {
int orderNumber = Integer.parseInt(nextLine[0]);
double price = Double.parseDouble(nextLine[1]);
double discountRate = Double.parseDouble(nextLine[2]);
...
}
Full documentation and examples can be found in the OpenCSV documentation.
Upvotes: 1
Reputation: 124215
From the documentation, scanner.hasNext(pattern):
Returns true if the next complete token matches the specified pattern.
But the next token is 123,"12.5","0.6","15/9/2012 because by default the scanner splits tokens on whitespace.
There are also a few problems with your regex. First, every character class is followed by ?, which means zero or one; you should use * (zero or more) or + (one or more). Second, the pattern starts with \\W, which matches only non-word characters and therefore excludes digits.
If you really want to use a scanner and a regex, then try
Pattern.compile("(\\d+),\"([^\"]+)\",\"([^\"]+)\",\"([^\"]+)\"");
and change the delimiter to the line separator with
scanner.useDelimiter(System.lineSeparator());
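To see the tokenization issue in isolation, the sketch below lists the tokens a Scanner produces for the sample line, first with the default delimiter and then with a newline delimiter. The class name ScannerTokenSketch and the helper tokens are hypothetical, and "\n" is used as the delimiter for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerTokenSketch {
    // Hypothetical helper: collects every token the Scanner yields.
    // A null delimiterRegex keeps the default delimiter (whitespace).
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            if (delimiterRegex != null) {
                scanner.useDelimiter(delimiterRegex);
            }
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "123,\"12.5\",\"0.6\",\"15/9/2012 12:11:19\"";
        // Default delimiter: the space inside the date splits the line in two.
        System.out.println(tokens(line, null).size());  // prints 2
        // Newline delimiter: the whole line is a single token.
        System.out.println(tokens(line, "\n").size());  // prints 1
    }
}
```

Since hasNext(pattern) tests only the next token, the pattern can match the full line only in the second case.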
Upvotes: 0
Reputation: 7353
? in a regex means "zero or one occurrence". You probably wanted to use + instead (one or more) so that each group can capture all the digits, points, colons, etc.
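The difference between the two quantifiers is easy to check with Pattern.matches, which tests a whole string against a regex. This is only an illustrative sketch; the class name QuantifierSketch and the helper matchesWhole are made up, and the character class [\d.] is a simplified version of the one in the question.

```java
import java.util.regex.Pattern;

class QuantifierSketch {
    // Hypothetical helper: true if the entire input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // "?" allows at most one character from the class.
        System.out.println(matchesWhole("[\\d.]?", "1"));     // prints true
        System.out.println(matchesWhole("[\\d.]?", "12.5"));  // prints false
        // "+" allows one or more characters from the class.
        System.out.println(matchesWhole("[\\d.]+", "12.5"));  // prints true
        System.out.println(matchesWhole("[\\d.]+", ""));      // prints false
    }
}
```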
Upvotes: 0
Reputation: 11939
A regex seems overcomplicated here. Try splitting on the most obvious common delimiter between the elements, which is the comma. Perhaps something like this:
final String info = "123,\"12.5\",\"0.6\",\"15/9/2012 12:11:19\"";
final String[] split = info.split(",");
final int orderNumber = Integer.parseInt(split[0]);
final double price = Double.parseDouble(split[1].replace("\"", ""));
final double discountRate = Double.parseDouble(split[2].replace("\"", ""));
final String date = split[3].replace("\"", "");
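The snippet above leaves the date as a String. If a real date value is needed, one option is to parse it with java.time, as in the sketch below. The format d/M/yyyy H:mm:ss (day first) is an assumption based on the single sample line, and the class name SplitDateSketch and the helper parseDate are hypothetical. Note also that splitting on "," only works as long as no quoted field itself contains a comma.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class SplitDateSketch {
    // Assumed format, inferred from the sample value "15/9/2012 12:11:19".
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("d/M/yyyy H:mm:ss");

    // Hypothetical helper: strips the surrounding quotes and parses the date.
    static LocalDateTime parseDate(String quotedField) {
        return LocalDateTime.parse(quotedField.replace("\"", ""), FORMAT);
    }

    public static void main(String[] args) {
        String info = "123,\"12.5\",\"0.6\",\"15/9/2012 12:11:19\"";
        String[] split = info.split(",");
        LocalDateTime date = parseDate(split[3]);
        System.out.println(date);  // prints 2012-09-15T12:11:19
    }
}
```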
Upvotes: 1