Vicky

Reputation: 17375

Searching for a pattern in a string

I have an input string as below:

john is a StartDate 10\11\2012 EndDate 15\11\2012 john is a boy john is StartDate john

I want to extract the two dates, StartDate and EndDate, from the above string.

However, I cannot simply search for the word StartDate because, as seen towards the end of the string, StartDate may also appear as an independent word. I cannot take the first instance either, because there is no guarantee that the StartDate followed by the dates will always come first.

So the solution would be to search for the pattern StartDate % EndDate %, i.e. the words StartDate and EndDate together, each followed by its date.

What would be the best way to achieve this?

One solution I can think of: for every instance of the word StartDate, take the substring of the next four words (including StartDate) and search for the word EndDate in that substring. If it exists, we have the correct substring; otherwise, go to the next instance of StartDate and repeat.
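A minimal sketch of this word-window idea, assuming words are separated by whitespace and each date directly follows its keyword. The class name WindowScan and the method findDates are hypothetical, chosen only for illustration:

```java
import java.util.Arrays;

class WindowScan {
    // Returns {startDate, endDate}, or null when no StartDate is followed
    // two words later by EndDate.
    static String[] findDates(String input) {
        String[] words = input.trim().split("\\s+");
        // Stop early enough that words[i + 3] is always in range.
        for (int i = 0; i + 3 < words.length; i++) {
            if (words[i].equals("StartDate") && words[i + 2].equals("EndDate")) {
                return new String[] { words[i + 1], words[i + 3] };
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String input = "john is a StartDate 10\\11\\2012 EndDate 15\\11\\2012 "
                + "john is a boy john is StartDate john";
        System.out.println(Arrays.toString(findDates(input)));
    }
}
```

The stray StartDate near the end of the string is skipped because the word two positions after it is not EndDate. This sketch does not check that the captured words actually look like dates.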

Upvotes: 0

Views: 205

Answers (3)

Keppil

Reputation: 46209

I would go for a simple regex, since your pattern is so well defined:

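import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "john is a StartDate 10\\11\\2012 EndDate 15\\11\\2012 john is a boy john is StartDate john";
// \\S+ captures a single whitespace-free token, so a StartDate that is not
// followed by "<token> EndDate <token>" is skipped, and no trailing space is required.
Matcher matcher = Pattern.compile("StartDate (\\S+) EndDate (\\S+)").matcher(input);
String startDate = null;
String endDate = null;
if (matcher.find()) {
  startDate = matcher.group(1);
  endDate = matcher.group(2);
}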

Upvotes: 0

nhahtdh

Reputation: 56809

A quick and dirty way to extract with regex (replaceFirst):

String input = "john is a StartDate 10\\11\\2012 EndDate 15\\11\\2012 john is a boy john is StartDate john";

String startDate = input.replaceFirst(".*(StartDate \\d{1,2}\\\\\\d{1,2}\\\\\\d{4}).*", "$1");
String endDate = input.replaceFirst(".*(EndDate \\d{1,2}\\\\\\d{1,2}\\\\\\d{4}).*", "$1");

System.out.println(startDate);
System.out.println(endDate);

If you want only the dates:

String startDate = input.replaceFirst(".*StartDate (\\d{1,2}\\\\\\d{1,2}\\\\\\d{4}).*", "$1");
String endDate = input.replaceFirst(".*EndDate (\\d{1,2}\\\\\\d{1,2}\\\\\\d{4}).*", "$1");

Upvotes: 1

18bytes

Reputation: 6029

Use a regular expression to match the dates.

Regex: .*?StartDate[ ]+(\d{2}\\\d{2}\\\d{4})[ ]+EndDate[ ]+(\d{2}\\\d{2}\\\d{4}).*

  • In the above regex, the first captured group is the start date and the second captured group is the end date.
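As a rough illustration, this is how such a pattern might be applied from Java. The pattern is written as a Java string literal with every backslash doubled and with balanced parentheses. The class name PairedDateRegex and the method extract are hypothetical, and the sketch assumes two-digit day and month fields, as the regex requires:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PairedDateRegex {
    // In the literal, \\d is a digit and \\\\ is one literal backslash.
    private static final Pattern DATES = Pattern.compile(
            ".*?StartDate[ ]+(\\d{2}\\\\\\d{2}\\\\\\d{4})"
            + "[ ]+EndDate[ ]+(\\d{2}\\\\\\d{2}\\\\\\d{4}).*");

    // Returns {startDate, endDate}, or null when the input does not match.
    static String[] extract(String input) {
        Matcher m = DATES.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String input = "john is a StartDate 10\\11\\2012 EndDate 15\\11\\2012 "
                + "john is a boy john is StartDate john";
        String[] dates = extract(input);
        if (dates != null) {
            System.out.println(dates[0]);
            System.out.println(dates[1]);
        }
    }
}
```

Because the pattern starts with .*? and ends with .*, matches() can be used to test the whole string; find() on the inner part alone would work equally well.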

Refer to the following link to learn how to use regex in Java: http://docs.oracle.com/javase/tutorial/essential/regex/

Upvotes: 0
