deepseapanda

Reputation: 3887

Extracting year from a string

I have strings in the following format:

AM Kaplan, M Haenlein - Business horizons, 2010 - Elsevier
A Lenhart, K Purcell, A Smith, K Zickuhr - 2010 - pewinternet.org

I would like to extract the year from each of them.

I was using:

year = year.Substring(year.LastIndexOf(",") + 1, year.LastIndexOf("-") - 1).Trim();

But this throws length errors, and it would also break when the year is preceded by '-' rather than ',' (as in the second example).

How can I extract the year properly?

Upvotes: 2

Views: 140

Answers (3)

Rick Davin

Reputation: 1041

It looks like the year appears in the last element of the comma-delimited string, but it doesn't always fall between two hyphens. What it does seem to do is appear just before the last hyphen. If that is always the case, this works:

    // Requires: using System.Linq; (for Last())
    int ExtractYear(string delimitedString)
    {
        // Only works if the year appears in the last comma-separated field of delimitedString
        // and is also the second-to-last hyphen-separated sub-field of that last field.
        var fields = delimitedString.Split(new char[] {','});
        var subfields = fields.Last().Split(new char[] {'-'});
        int result = 0;
        // -1 denotes a bad value (no hyphen in the last field, or the sub-field is not a number)
        if (subfields.Length < 2) return -1;
        return int.TryParse(subfields[subfields.Length - 2], out result) ? result : -1;
    }

Upvotes: 0

Sergey Berezovskiy

Reputation: 236218

The following expression checks that the string has the form authors - optionalPublisher year - site and captures the year:

var s = "AM Kaplan, M Haenlein - Business horizons, 2010 - Elsevier";

var match = Regex.Match(s, @".+ - .*(\d{4}) - .+");
if (match.Success)
{
    var year = match.Groups[1].Value;
}

Upvotes: 2

wflynny

Reputation: 18521

s = 'A Lenhart, K Purcell, A Smith, K Zickuhr - 2010 - pewinternet.org'

If the year is always in the last comma-separated element of the string and always comes immediately before the final ' - ', then you could do something simple like

last = s.split(',')[-1]
year = int(last.split(' - ')[-2])

s.split(delimiter) returns a list of the substrings of s that lie between occurrences of delimiter; in your case the delimiters are the comma and ' - '.
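
To make the splitting idea a bit more robust, here is a minimal sketch that wraps it in a function and checks it against both strings from the question. The name extract_year and the choice to return None on failure are my own assumptions, not something from the question. It takes the second-to-last ' - ' field of the last comma-separated element, so it does not matter whether the year is preceded by a comma or by a hyphen.

```python
def extract_year(s):
    """Return the year as an int, or None if it cannot be found.

    Assumes the year sits in the last comma-separated field,
    immediately before the final ' - ' separator.
    """
    last = s.split(',')[-1]
    parts = last.split(' - ')
    if len(parts) < 2:
        # No ' - ' separator in the last field, so there is no year to read.
        return None
    try:
        # int() ignores surrounding whitespace, so ' 2010' parses fine.
        return int(parts[-2])
    except ValueError:
        # The field before the last separator is not a number.
        return None


samples = [
    'AM Kaplan, M Haenlein - Business horizons, 2010 - Elsevier',
    'A Lenhart, K Purcell, A Smith, K Zickuhr - 2010 - pewinternet.org',
]
for sample in samples:
    print(extract_year(sample))
```

If the input might not follow this layout at all, checking for None before using the result avoids an exception further down.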

Upvotes: 0
