Reputation: 3887
I have strings in the following format:
AM Kaplan, M Haenlein - Business horizons, 2010 - Elsevier
A Lenhart, K Purcell, A Smith, K Zickuhr - 2010 - pewinternet.org
And would like to extract the year.
I was using:
year = year.Substring(year.LastIndexOf(",") + 1, year.LastIndexOf("-") - 1).Trim();
But I got length errors, and this would also break when the substring needs to start at the last '-' instead of the last ','.
How can I extract the year properly?
Upvotes: 2
Views: 140
Reputation: 1041
It looks like the year appears in the last element of the comma-delimited string, but it doesn't always fall between two hyphens. What does seem consistent is that it appears right before the last hyphen. If that is always the case, this works:
// Requires: using System.Linq;
int ExtractYear(string delimitedString)
{
    // Only works if the year appears in the last comma-separated field
    // and is the second-to-last hyphen-separated subfield of that field.
    var fields = delimitedString.Split(',');
    var subfields = fields.Last().Split('-');
    // -1 denotes a bad value (no hyphen found, or the subfield is not a number)
    if (subfields.Length < 2) return -1;
    int result;
    return int.TryParse(subfields[subfields.Length - 2], out result) ? result : -1;
}
Upvotes: 0
Reputation: 236218
The following expression checks that the string has the format "authors - optionalPublisher year - site" and captures the year:
// Requires: using System.Text.RegularExpressions;
var s = "AM Kaplan, M Haenlein - Business horizons, 2010 - Elsevier";
var match = Regex.Match(s, @".+ - .*(\d{4}) - .+");
if (match.Success)
{
    var year = match.Groups[1].Value;
}
Upvotes: 2
Reputation: 18521
s = 'A Lenhart, K Purcell, A Smith, K Zickuhr - 2010 - pewinternet.org'
If the year is always in the last comma-separated element of the string and is always between two hyphens, as in the example above, then you could do something simple like:
last = s.split(',')[-1]
year = int(last.split(' - ')[1])
s.split(delimiter) transforms the string into a list object, where each element in the list is a substring of s partitioned by delimiter, which in your case are commas and hyphens.
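If you also need to handle the first format from the question, where the year follows a comma rather than a hyphen, here is a rough sketch of one way to do it. It assumes the site name is always the part after the last " - " and that the year is a four-digit number directly before it. The function name extract_year is only an illustrative choice, not something from the question.

```python
import re


def extract_year(citation):
    """Return the year as an int, or None if no year is found.

    Assumption: the string ends with " - site" and a 4-digit year
    sits immediately before that final " - ".
    """
    # Split off everything after the last " - " (the site name).
    head, sep, _site = citation.rpartition(" - ")
    if not sep:
        # No " - " separator at all, so the format is not recognised.
        return None
    # Look for a standalone 4-digit number at the end of the remainder.
    match = re.search(r"\b(\d{4})\s*$", head)
    return int(match.group(1)) if match else None


print(extract_year("AM Kaplan, M Haenlein - Business horizons, 2010 - Elsevier"))
print(extract_year("A Lenhart, K Purcell, A Smith, K Zickuhr - 2010 - pewinternet.org"))
```

Returning None instead of raising keeps the caller in control of how to treat strings that do not match the expected layout.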
Upvotes: 0