R11

Reputation: 49

RSS FEED - data parsing

How can I retrieve the location from the following parsed data?

    <description>Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0</description>

These details are inside the description tag, and the descriptions have already been parsed into an array list. How do I get just the location out of it?

Upvotes: 2

Views: 303

Answers (4)

jdweng

Reputation: 34421

Use a dictionary along with a regex (this snippet is C#):

            string pattern = @"(?'key'[^:]+):\s+(?'value'.*)";
            string input = "Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0";
            string[] splitArray = input.Split(new char[] { ';' });

            Dictionary<string, string> dict = splitArray.Select(x => Regex.Match(x, pattern))
                .GroupBy(x => x.Groups["key"].Value.Trim(), y => y.Groups["value"].Value.Trim())
                .ToDictionary(x => x.Key, y => y.FirstOrDefault());

            string location = dict["Location"];

Or run the regex over the whole string, without splitting first:

            string pattern = @"(?'key'[^:]+):\s+(?'value'[^;]+);?";
            string input = "Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0";
            MatchCollection matches = Regex.Matches(input, pattern);
            Dictionary<string, string> dict = matches.Cast<Match>()
                .GroupBy(x => x.Groups["key"].Value.Trim(), y => y.Groups["value"].Value.Trim())
                .ToDictionary(x => x.Key, y => y.FirstOrDefault());

            string location = dict["Location"];
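
Assuming the question targets Java, as the other answers do, the same key/value idea could be sketched roughly as follows. This is not a line-by-line port: it swaps the regex for String.split with a limit of 2, and the class name DescriptionFields and the method parse are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original answer.
class DescriptionFields {

    // Turns "key: value ; key: value ; ..." into a map that keeps field order.
    static Map<String, String> parse(String description) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String part : description.split(";")) {
            // Split on the first colon only, so a value like "04:23:32" stays intact.
            String[] keyValue = part.split(":", 2);
            if (keyValue.length == 2) {
                // putIfAbsent keeps the first value if a key is repeated.
                fields.putIfAbsent(keyValue[0].trim(), keyValue[1].trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String input = "Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0";
        System.out.println(parse(input).get("Location")); // prints BLACKFORD,PERTH/KINROSS
    }
}
```

Building the whole map is more work than a single lookup needs, but it pays off if the other fields (depth, magnitude) are wanted as well.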

Upvotes: 0

Arvind Kumar Avinash

Reputation: 78945

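You can use the regex (?<=Location: ).*?(?= ;) to find and extract the required match. The look-behind (?<=Location: ) starts the match right after "Location: ", the lazy .*? takes as few characters as possible, and the look-ahead (?= ;) stops it just before the next " ;".

Solution using Stream API (Matcher.results() requires Java 9 or later):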

import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        String str = "<description>Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0</description>";
        
        List<String> list = Pattern.compile("(?<=Location: ).*?(?= ;)")
                                    .matcher(str)
                                    .results()
                                    .map(MatchResult::group)
                                    .collect(Collectors.toList());
        
        System.out.println(list);
    }
}

Output:

[BLACKFORD,PERTH/KINROSS]

Non-Stream solution:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String str = "<description>Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0</description>";
        Matcher matcher = Pattern.compile("(?<=Location: ).*?(?= ;)").matcher(str);

        List<String> list = new ArrayList<>();
        while (matcher.find()) {
            list.add(matcher.group());
        }

        System.out.println(list);
    }
}

Output:

[BLACKFORD,PERTH/KINROSS]
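
Since the question says the descriptions are already in an array list, here is a minimal sketch of applying the same pattern to every entry. The class name LocationExtractor, the method locations, and the second sample description are assumptions made up for illustration. Entries without a Location field are simply skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class LocationExtractor {

    // Compiled once and reused for every description.
    private static final Pattern LOCATION = Pattern.compile("(?<=Location: ).*?(?= ;)");

    // Returns the location of each description that has one, in input order.
    static List<String> locations(List<String> descriptions) {
        List<String> result = new ArrayList<>();
        for (String description : descriptions) {
            Matcher matcher = LOCATION.matcher(description);
            if (matcher.find()) {
                result.add(matcher.group());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> descriptions = Arrays.asList(
                "Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0",
                // Made-up second entry, for illustration only.
                "Origin date/time: Tue, 30 Mar 2021 10:00:00 ; Location: EXAMPLE TOWN ; Lat/long: 0.000,0.000 ; Depth: 1 km ; Magnitude: 1.0");
        System.out.println(locations(descriptions)); // prints [BLACKFORD,PERTH/KINROSS, EXAMPLE TOWN]
    }
}
```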

Upvotes: 4

Big Guy

Reputation: 334

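Try splitting on the semicolons first, then on the colon:

String desc = "Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0";
String[] parts = desc.split(";");

for ( String part : parts )
{
    // Match on the key itself, not on any occurrence of the word
    if ( part.trim().startsWith("Location:") )
    {
        // A limit of 2 keeps any later colons inside the value
        String[] keyValue = part.split(":", 2);

        System.out.println("***************** Location is: '" + keyValue[1].trim() + "'");

        break;
    }
}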

Upvotes: -1

David Brossard

Reputation: 13832

If all you get is

Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0

you will have to either (a) find the standard that defines this format, if there is one, or (b) do it yourself, i.e. look at the structure and decide how to parse it based on that.

Simple way with split()

It seems you can use the split() method on a String using separator " ; ". That should give you an array of length 5.

You could then assume Location is always in the second position or simply iterate over the array until you find the string that starts with Location.

Example

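public class Location {
    public static void main(String[] args) {
        String rawData = "Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0\r\n";
        // Fields are separated by " ; "; Location is assumed to be the second one
        String[] dataArray = rawData.split(" ; ");
        // dataArray[1] is "Location: BLACKFORD,PERTH/KINROSS"; drop the prefix to keep only the value
        System.out.println(dataArray[1].substring("Location: ".length()));
    }
}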
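
The other option mentioned above, iterating until a field starts with Location, avoids relying on the field's position. A minimal sketch, where the class name LocationFinder and the method findLocation are made up for illustration:

```java
import java.util.Optional;

// Hypothetical helper, not part of the original answer.
class LocationFinder {

    // Returns the value of the first field that starts with "Location:", if any.
    static Optional<String> findLocation(String rawData) {
        String prefix = "Location:";
        // trim() drops a trailing line break; the regex also swallows spaces around ';'.
        for (String field : rawData.trim().split("\\s*;\\s*")) {
            if (field.startsWith(prefix)) {
                return Optional.of(field.substring(prefix.length()).trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String rawData = "Origin date/time: Mon, 29 Mar 2021 04:23:32 ; Location: BLACKFORD,PERTH/KINROSS ; Lat/long: 56.284,-3.759 ; Depth: 7 km ; Magnitude: 1.0\r\n";
        System.out.println(findLocation(rawData).orElse("Location not found"));
    }
}
```

Returning an Optional makes the "no Location field" case explicit instead of risking an ArrayIndexOutOfBoundsException on a shorter description.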

The Regular Expression Way

Alternatively, you can use a regular expression that gives you the value outright, without going through the steps I just described. The value you are looking for is always preceded by "Location: " and followed by " ;". Have a look at a regular expression primer to get going.

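    // Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
    // The look-ahead (?= ;) stops before the separator, so " ;" is not part of the match
    Pattern pattern = Pattern.compile("(?<=Location: ).*?(?= ;)", Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(rawData);
    boolean matchFound = matcher.find();
    if (matchFound) {
      System.out.println("Match found: " + matcher.group());
    } else {
      System.out.println("Match not found");
    }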

Upvotes: 1
