Reputation: 1211
I was trying to extract my data from a string using a regular expression.
My data looks like:
12 170 0.11918
170 12 0.11918
12 182 0.06361
182 12 0.06361
12 198 0.05807
198 12 0.05807
12 242 0.08457
242 12 0.08457
11 30 0.08689
30 11 0.08689
The problem is that the amount of whitespace between the numbers varies.
All in all, I want to extract two integers and one double from each line, so I tried to use regular expressions.
Pattern p = Pattern.compile("(([0-9]+.[0-9]*)|([0-9]*.[0-9]+)|([0-9]+))");
Matcher m = p.matcher(" 6 7781 0.01684000");
while (m.find()) {
System.out.println(m.group(0));
}
I know my regular expression doesn't work. Can anyone suggest a suitable regular expression so that I can work with the data, or any other approach?
Upvotes: 1
Views: 650
Reputation: 1993
// Requires: import java.util.Locale; import java.util.Scanner;
String s = " 12 170 0.11918\n" + "170 12 0.11918 \n"
+ " 12 182 0.06361\n" + "182 12 0.06361 \n"
+ " 12 198 0.05807\n" + "198 12 0.05807 \n"
+ " 12 242 0.08457\n" + "242 12 0.08457 \n"
+ " 11 30 0.08689\n" + " 30 11 0.08689 \n";
String[] lines = s.split("\\n");
for( String line : lines ) {
Scanner scan = new Scanner(line);
scan.useDelimiter("\\s+");
scan.useLocale(Locale.ENGLISH);
System.out.println(scan.nextInt());
System.out.println(scan.nextInt());
System.out.println(scan.nextDouble());
}
I would use a Scanner for this problem.
Upvotes: 0
Reputation:
Something like this (fix up the float part as needed) -
# raw: (?m)^\h*(\d+)\h+(\d+)\h+(\d*\.\d+)
# quoted: "(?m)^\\h*(\\d+)\\h+(\\d+)\\h+(\\d*\\.\\d+)"
(?m) # Multi-line modifier
^ # BOL
\h* # optional, horizontal whitespace
( \d+ ) # (1), int
\h+ # required, horizontal whitespace
( \d+ ) # (2), int
\h+ # required, horizontal whitespace
( \d* \. \d+ ) # (3), float
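A minimal sketch of how the quoted pattern above could be applied from Java. The class name RowRegexDemo, the Row holder and the parseRows helper are made up for illustration, and \h is assumed to be available (Java 8 or later):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RowRegexDemo {
    // The quoted pattern from above; \h (horizontal whitespace) needs Java 8+.
    static final Pattern ROW =
            Pattern.compile("(?m)^\\h*(\\d+)\\h+(\\d+)\\h+(\\d*\\.\\d+)");

    // Hypothetical holder for one parsed line: two ints and one double.
    static final class Row {
        final int first;
        final int second;
        final double value;

        Row(int first, int second, double value) {
            this.first = first;
            this.second = second;
            this.value = value;
        }
    }

    // Hypothetical helper: returns one Row for every line the pattern matches.
    static List<Row> parseRows(String text) {
        List<Row> rows = new ArrayList<>();
        Matcher m = ROW.matcher(text);
        while (m.find()) {
            rows.add(new Row(
                    Integer.parseInt(m.group(1)),
                    Integer.parseInt(m.group(2)),
                    Double.parseDouble(m.group(3))));
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "  12   170  0.11918\n170  12 0.11918\n";
        for (Row r : parseRows(text)) {
            System.out.println(r.first + " " + r.second + " " + r.value);
        }
    }
}
```

Double.parseDouble always expects a "." as the decimal separator, so no locale handling is needed here.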
Upvotes: 0
Reputation: 6580
Check http://txt2re.com/index-java.php3?s=%2012%20170%200.11918&11&5&12&4&13&1
You're probably interested in int1, int2 and float1 below.
// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void main(String[] args)
{
String txt=" 12 170 0.11918";
String re1="(\\s+)"; // White Space 1
String re2="(\\d+)"; // Integer Number 1
String re3="(\\s+)"; // White Space 2
String re4="(\\d+)"; // Integer Number 2
String re5="(\\s+)"; // White Space 3
String re6="([+-]?\\d*\\.\\d+)(?![-+0-9\\.])"; // Float 1
Pattern p = Pattern.compile(re1+re2+re3+re4+re5+re6,Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
Matcher m = p.matcher(txt);
if (m.find())
{
String ws1=m.group(1);
String int1=m.group(2);
String ws2=m.group(3);
String int2=m.group(4);
String ws3=m.group(5);
String float1=m.group(6);
System.out.print("("+ws1.toString()+")"+"("+int1.toString()+")"+"("+ws2.toString()+")"+"("+int2.toString()+")"+"("+ws3.toString()+")"+"("+float1.toString()+")"+"\n");
}
}
Upvotes: 1
Reputation: 1248
I recommend using a Scanner.
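Scanner scanner = new Scanner(line);
scanner.useDelimiter("\\s+"); // one or more whitespace characters, since the spacing varies
scanner.useLocale(Locale.ENGLISH); // so nextDouble() accepts "." as the decimal separator
int int1 = scanner.nextInt();
int int2 = scanner.nextInt();
double double1 = scanner.nextDouble();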
Upvotes: 1
Reputation: 195269
Why not read each line and do a line.trim().split("\\s+")? If your project already uses Guava, the Splitter could be used too.
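A minimal sketch of the trim-and-split idea, using only the JDK. The class name SplitDemo and the tokens helper are made up for illustration, and each line is assumed to hold exactly two integers followed by one decimal number:

```java
class SplitDemo {
    // Hypothetical helper: trims the line, then splits on runs of whitespace.
    static String[] tokens(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String[] lines = { "  12   170  0.11918", "170 12    0.11918  " };
        for (String line : lines) {
            String[] t = tokens(line);
            int first = Integer.parseInt(t[0]);
            int second = Integer.parseInt(t[1]);
            double value = Double.parseDouble(t[2]);
            System.out.println(first + " " + second + " " + value);
        }
    }
}
```

The trim() call matters: without it, a line that starts with whitespace would produce an empty first token.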
Upvotes: 2
Reputation: 3516
Try this:
([\d.]+)
- This will get all strings containing only digits or periods (.).
Edit:
I see you want three groups out of each line. This pattern instead ignores whitespace and grabs the three groups of numbers. The leading ^ and trailing $ ensure that you're only matching on a single line.
^\s*?([\d.]+)\s*([\d.]+)\s*?([\d.]+)\s*?$
Upvotes: 0