tzippy

Reputation: 6648

Extract two double Values from String using RegEx in Java

I am reading a file line by line and need to extract the latitude and longitude from each line. This is what the lines can look like:

DE  83543   Rott am Inn Bayern  BY  Oberbayern      Landkreis Rosenheim 47.983  12.1278 
DE  21147   Hamburg Hamburg HH          Kreisfreie Stadt Hamburg    53.55   10  

What is certain is that the only dots surrounded by digits are the ones in the doubles. Unfortunately, some values have no dot at all, so it is probably best to look for the numbers starting from the end of the string.

Thanks for your help!

Upvotes: 6

Views: 4681

Answers (6)

Uros Majeric

Reputation: 456

I think this is the right pattern for getting the latitude and longitude out of a string that consists entirely of a pair such as 45.23423,15.23423 (with or without a space after the comma).

It is based on aioobe's answer:

Pattern p = Pattern.compile("^(\\d+\\.?\\d*),\\s?(\\d+\\.?\\d*)$");
Matcher m = p.matcher(s1);
if (m.matches()) {
    System.out.println("Lat: " + Double.parseDouble(m.group(1)));
    System.out.println("Long: " + Double.parseDouble(m.group(2)));
}

cheers

Upvotes: 0

Shervin Asgari

Reputation: 24517

If you can use java.lang.String#split(), this works:

//Split by tab
String values[] = myTextLineByLine.split("\t");
List<String> list = Arrays.asList(values);
//Reverse the list so that longitude and latitude are the first two elements
Collections.reverse(list);

String longitude = list.get(0);
String latitude = list.get(1);

Upvotes: 4

polygenelubricants

Reputation: 384016

This solution uses Scanner.findWithinHorizon and capturing groups:

    import java.util.*;
    import java.util.regex.*;
    //...

    String text = 
        "DE  83543 Blah blah blah 47.983  12.1278\n" +
        "DE\t21147 100% hamburger beef for 4.99 53.55 10\n";

    Scanner sc = new Scanner(text);
    Pattern p = Pattern.compile(
        "(\\w+) (\\d+) (.*) (decimal) (decimal)"
            .replace("decimal", "\\d+(?:\\.\\d+)?")
            .replace(" ", "\\s+")
    );
    while (sc.findWithinHorizon(p, 0) != null) {
        MatchResult mr = sc.match();
        System.out.printf("[%s|%s] %-30s [%.4f:%.4f]%n",
            mr.group(1),
            mr.group(2),
            mr.group(3),
            Double.parseDouble(mr.group(4)),
            Double.parseDouble(mr.group(5))
        );
    }

This prints:

[DE|83543] Blah blah blah                 [47.9830:12.1278]
[DE|21147] 100% hamburger beef for 4.99   [53.5500:10.0000]

Note the meta-regex approach of using replace to generate the "final" regex. This is done for readability of the "big picture" pattern.
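To make the replace step concrete, here is a minimal sketch that only builds the pattern string and prints it. The class name MetaRegexDemo and the helper expand are made up for this illustration; the template and the two replacements are the ones from the code above.

```java
class MetaRegexDemo {
    // Hypothetical helper: expands the readable template into the real regex.
    // String.replace works on literal text, so no extra escaping is needed here.
    static String expand(String template) {
        return template
            .replace("decimal", "\\d+(?:\\.\\d+)?") // a number with an optional fraction
            .replace(" ", "\\s+");                  // every space becomes "one or more whitespace"
    }

    public static void main(String[] args) {
        // Prints: (\w+)\s+(\d+)\s+(.*)\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)
        System.out.println(expand("(\\w+) (\\d+) (.*) (decimal) (decimal)"));
    }
}
```

The order of the two replacements does not matter in this case, because the decimal sub-pattern contains no spaces.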

Upvotes: 0

npinti

Reputation: 52205

I have tried this:

    public static void main(String[] args)
    {
        String str  = "DE 83543   Rott am Inn Bayern  BY  Oberbayern  Landkreis Rosenheim 47.983  12.1278";
        String str1 = "DE  21147   Hamburg Hamburg HH          Kreisfreie Stadt Hamburg    53.55   10  ";

        // Process str1 first, then str, to match the order of the output below
        for (String line : new String[] { str1, str }) {
            // Split on runs of spaces or tabs; trailing empty strings are discarded
            String[] fields = line.split("[ \t]+");
            double latitude = Double.parseDouble(fields[fields.length - 2]);
            double longitude = Double.parseDouble(fields[fields.length - 1]);

            System.out.println(latitude + ", " + longitude);
        }
    }

It splits the string on runs of whitespace (spaces or tabs). Since the coordinates are always the last two elements, it can print them without any problem. Below is the output:

53.55, 10.0

47.983, 12.1278

Upvotes: 0

aioobe

Reputation: 421310

    Pattern p = Pattern.compile(".*?(\\d+\\.?\\d*)\\s+(\\d+\\.?\\d*)\\s*");
    Matcher m = p.matcher(s1);
    if (m.matches()) {
        // In the sample data the latitude comes first, then the longitude
        System.out.println("Lat: " + Double.parseDouble(m.group(1)));
        System.out.println("Long: " + Double.parseDouble(m.group(2)));
    }

  1. .*? eats characters reluctantly
  2. (\\d+\\.?\\d*) some digits, an optional decimal point, some more digits
  3. \\s+ at least one white-space character (such as a tab character)
  4. (\\d+\\.?\\d*) some digits, an optional decimal point, some more digits
  5. \\s* optional trailing white-space, because matches() has to consume the whole line and the sample lines end in white-space
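The question suggests looking for the numbers from the end of the string. A minimal sketch of that variant, assuming it is acceptable to use find() with a pattern anchored at the end instead of matches(): the class name TrailingCoordinates and the method extract are made up for this example, and negative coordinates are not handled because the sample data has none.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingCoordinates {
    // Two numbers with an optional fractional part, separated by whitespace,
    // followed only by optional whitespace up to the end of the input.
    private static final Pattern TAIL = Pattern.compile(
        "(\\d+(?:\\.\\d+)?)\\s+(\\d+(?:\\.\\d+)?)\\s*$");

    // Hypothetical helper: returns {latitude, longitude}, or null if the line
    // does not end in two numbers.
    static double[] extract(String line) {
        Matcher m = TAIL.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2))
        };
    }

    public static void main(String[] args) {
        double[] c = extract("DE  21147   Hamburg Hamburg HH          Kreisfreie Stadt Hamburg    53.55   10  ");
        System.out.println(c[0] + ", " + c[1]); // prints 53.55, 10.0
    }
}
```

Because the pattern is tied to the end of the line, other numbers earlier in the line (the postal code, for instance) cannot be picked up as coordinates.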

Upvotes: 0

Andreas Dolk

Reputation: 114837

Is it a tab-separated (CSV-style) table? Then I'd suggest looking at String#split and simply taking the last two fields of the resulting String array.

Anyway, even if it is not CSV, split on whitespace and take the last two fields of the String array. Those are the lat/lon values, and you can convert them with Double#parseDouble.
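A minimal sketch of that idea, assuming every line ends with the latitude followed by the longitude. The class name LastTwoFields and the method parseLatLon are made up for this example.

```java
import java.util.Arrays;

class LastTwoFields {
    // Hypothetical helper: returns {latitude, longitude} parsed from the last
    // two whitespace-separated fields of a line.
    static double[] parseLatLon(String line) {
        // trim() drops leading and trailing whitespace so that split() does not
        // produce an empty first field
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 2) {
            throw new IllegalArgumentException("Expected at least two fields: " + line);
        }
        double lat = Double.parseDouble(fields[fields.length - 2]);
        double lon = Double.parseDouble(fields[fields.length - 1]);
        return new double[] { lat, lon };
    }

    public static void main(String[] args) {
        String line = "DE\t21147\tHamburg\tHamburg\tHH\t\t\tKreisfreie Stadt Hamburg\t53.55\t10\t";
        System.out.println(Arrays.toString(parseLatLon(line))); // prints [53.55, 10.0]
    }
}
```

Double.parseDouble throws a NumberFormatException if one of the last two fields is not a number, so malformed lines fail loudly instead of producing wrong coordinates.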

Upvotes: 3
