JavaNullPointer

Reputation: 1211

Java: Extract numbers from a string

I am trying to extract my data from a string using a regular expression.

My data looks like:

 12 170 0.11918
170  12 0.11918
 12 182 0.06361
182  12 0.06361
 12 198 0.05807
198  12 0.05807
 12 242 0.08457
242  12 0.08457
 11  30 0.08689
 30  11 0.08689

The problem here is the varying amount of whitespace between the numbers.

All in all, I want to extract two integers and one double from each line, so I tried to use regular expressions.

  Pattern p = Pattern.compile("(([0-9]+.[0-9]*)|([0-9]*.[0-9]+)|([0-9]+))");
  Matcher m = p.matcher("  6    7781     0.01684000");
  while (m.find()) {
     System.out.println(m.group(0));  
  }

I know my regular expression doesn't work. Can anyone suggest a suitable regular expression for this data, or some other approach?

Upvotes: 1

Views: 650

Answers (6)

jvecsei

Reputation: 1993

String s = " 12 170 0.11918\n" + "170  12 0.11918 \n"
         + " 12 182 0.06361\n" + "182  12 0.06361 \n"
         + " 12 198 0.05807\n" + "198  12 0.05807 \n"
         + " 12 242 0.08457\n" + "242  12 0.08457 \n"
         + " 11  30 0.08689\n" + " 30  11 0.08689 \n";

String[] lines = s.split("\\n");

for (String line : lines) {
    Scanner scan = new Scanner(line);
    scan.useDelimiter("\\s+");
    scan.useLocale(Locale.ENGLISH);
    System.out.println(scan.nextInt());
    System.out.println(scan.nextInt());
    System.out.println(scan.nextDouble());
}

I would use a Scanner for this problem.

Upvotes: 0

user557597

Reputation:

Something like this (fix up the float part as needed):

 # raw:  (?m)^\h*(\d+)\h+(\d+)\h+(\d*\.\d+)
 # quoted: "(?m)^\\h*(\\d+)\\h+(\\d+)\\h+(\\d*\\.\\d+)"

 (?m)             # Multi-line modifier
 ^                # BOL
 \h*              # optional, horizontal whitespace
 ( \d+ )          # (1), int
 \h+              # required, horizontal whitespace
 ( \d+ )          # (2), int
 \h+              # required, horizontal whitespace
 ( \d* \. \d+ )   # (3), float
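
A minimal sketch of how the quoted pattern might be applied from Java, assuming Java 8 or later (where `\h` is supported by `java.util.regex`). The class name `RowRegexDemo`, the nested `Row` type, and the helper `parseRows` are made up for illustration and are not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RowRegexDemo {
    // The quoted pattern from above; (?m) lets ^ match at the start of every line.
    static final Pattern ROW =
            Pattern.compile("(?m)^\\h*(\\d+)\\h+(\\d+)\\h+(\\d*\\.\\d+)");

    // Hypothetical holder for one parsed line: two ints and one double.
    static final class Row {
        final int first;
        final int second;
        final double value;

        Row(int first, int second, double value) {
            this.first = first;
            this.second = second;
            this.value = value;
        }
    }

    // Hypothetical helper: collects one Row per line that matches the pattern.
    static List<Row> parseRows(String text) {
        List<Row> rows = new ArrayList<>();
        Matcher m = ROW.matcher(text);
        while (m.find()) {
            rows.add(new Row(
                    Integer.parseInt(m.group(1)),
                    Integer.parseInt(m.group(2)),
                    Double.parseDouble(m.group(3))));
        }
        return rows;
    }

    public static void main(String[] args) {
        String data = " 12 170 0.11918\n170  12 0.11918\n 11  30 0.08689\n";
        for (Row r : parseRows(data)) {
            System.out.println(r.first + " " + r.second + " " + r.value);
        }
    }
}
```

`Double.parseDouble` always expects a `.` as the decimal separator, so this sketch does not depend on the default locale.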

Upvotes: 0

Leo

Reputation: 6580

Check http://txt2re.com/index-java.php3?s=%2012%20170%200.11918&11&5&12&4&13&1

You're probably interested in int1, int2 and float1 below.

 public static void main(String[] args)
  {
    String txt=" 12 170 0.11918";

    String re1="(\\s*)";    // White Space 1 (optional: some lines start with a digit)
    String re2="(\\d+)";    // Integer Number 1
    String re3="(\\s+)";    // White Space 2
    String re4="(\\d+)";    // Integer Number 2
    String re5="(\\s+)";    // White Space 3
    String re6="([+-]?\\d*\\.\\d+)(?![-+0-9\\.])";  // Float 1

    Pattern p = Pattern.compile(re1+re2+re3+re4+re5+re6,Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    Matcher m = p.matcher(txt);
    if (m.find())
    {
        String ws1=m.group(1);
        String int1=m.group(2);
        String ws2=m.group(3);
        String int2=m.group(4);
        String ws3=m.group(5);
        String float1=m.group(6);
        System.out.print("("+ws1+")"+"("+int1+")"+"("+ws2+")"+"("+int2+")"+"("+ws3+")"+"("+float1+")"+"\n");
    }
  }

Upvotes: 1

Hypino

Reputation: 1248

I recommend using a Scanner.

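Scanner scanner = new Scanner(line);
// Split on runs of whitespace; a single-space delimiter would yield
// empty tokens wherever the data has two spaces in a row.
scanner.useDelimiter("\\s+");
// Parse "0.11918" with a '.' decimal separator regardless of the default locale.
scanner.useLocale(Locale.ENGLISH);
int int1 = scanner.nextInt();
int int2 = scanner.nextInt();
double double1 = scanner.nextDouble();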

Upvotes: 1

Kent

Reputation: 195269

Why not read each line and do a line.trim().split("\\s+")? If your project already uses Guava, its Splitter could be used too.
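
A minimal sketch of the trim-and-split idea, using sample lines from the question. The class name `SplitDemo` and the helper `splitFields` are made up for illustration. The sketch assumes every line holds exactly three fields and does no error handling.

```java
public class SplitDemo {
    // Hypothetical helper: strips leading/trailing whitespace, then splits
    // on runs of whitespace so that "170  12" still yields two fields.
    static String[] splitFields(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String[] lines = {" 12 170 0.11918", "170  12 0.11918", " 11  30 0.08689"};
        for (String line : lines) {
            String[] parts = splitFields(line);
            int first = Integer.parseInt(parts[0]);
            int second = Integer.parseInt(parts[1]);
            double value = Double.parseDouble(parts[2]);
            System.out.println(first + " " + second + " " + value);
        }
    }
}
```

The `trim()` call matters here: without it, a line with leading whitespace would produce an empty first element from `split`.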

Upvotes: 2

bdx

Reputation: 3516

Try this:

([\d.]+) - This will get all strings containing only digits or periods (.).

Edit:

I see you want three groups out of each line. This pattern instead ignores whitespace and grabs the three groups of numbers. The leading ^ and trailing $ anchor the match to the start and end of the input, so apply it to one line at a time.

^\s*?([\d.]+)\s*([\d.]+)\s*?([\d.]+)\s*?$

Upvotes: 0
