Andy

Reputation: 3682

Get first alphanumeric or special character in string from start index

Say I have a string such as "ma, 100" or "ma, word", or even "ma. , *+".

How can I find the position of the first character after a given index that is neither whitespace nor some form of punctuation (i.e. full stop, comma, colon, semi-colon)? So, in the last example above, I'd want to get the position of * when I pass in 1 as the start index (zero-based).

Upvotes: 1

Views: 4512

Answers (3)

Jim Mischel

Reputation: 133995

Create an array of the characters that you want to match and call String.IndexOfAny, which also accepts a start index.

For example:

const string GoodCharsStr =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz";
readonly char[] GoodChars = GoodCharsStr.ToCharArray();

string search = "ma, 100";
// Search from index 1 onward for any of the wanted characters
int position = search.IndexOfAny(GoodChars, 1);
if (position == -1)
{
    // not found
}
else
{
    // Only index into the string once we know a match exists
    char foundChar = search[position];
}

Upvotes: 4

keyboardP

Reputation: 69372

You can use a method like this:

// Requires "using System.Linq;" for punctuation.Contains(...)
public static int GetFirstNonPunctuationCharIndex(string input, int startIndex, char[] punctuation)
{
    // Start one past startIndex, because the character at startIndex itself is ignored
    for (int i = startIndex + 1; i < input.Length; i++)
    {
        if (!punctuation.Contains(input[i]) && !Char.IsWhiteSpace(input[i]))
        {
            return i;
        }
    }

    // No matching character found
    return -1;
}

You would call it by passing in the string, starting index, and an array of characters you consider to be punctuation.

string myString = @"ma. , *+";
char[] puncArray = new char[4] { '.', ',', ';', ':' };
int index = GetFirstNonPunctuationCharIndex(myString, 1, puncArray); // index is 6, the position of '*'

Normally I'd use the Char.IsPunctuation method, but it considers * to be a punctuation character, so you'll have to roll your own check like the one above.
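To see why Char.IsPunctuation doesn't fit here, a small sketch like the following can be used to check how .NET classifies individual characters. The class and method names (CharClassSketch, Classify) are made up for illustration; only Char.IsPunctuation, Char.IsSymbol and Char.IsWhiteSpace are real framework calls.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class CharClassSketch
{
    // Reports which of the built-in Char predicates matches first.
    public static string Classify(char c)
    {
        if (Char.IsPunctuation(c)) return "punctuation";
        if (Char.IsSymbol(c)) return "symbol";
        if (Char.IsWhiteSpace(c)) return "whitespace";
        return "other";
    }
}
```

With this, Classify('*') reports "punctuation" just like Classify('.') does, whereas Classify('+') reports "symbol", so the built-in predicate cannot tell * apart from the characters the question wants skipped.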

Upvotes: 1

RutledgePaulV

Reputation: 2606

You'll need to define what exactly a special character is.

If it's a non-consecutive set (according to ASCII ordering, see http://www.asciitable.com/) then you'll need to define a new allowed character set and check against that set.

Something like this should work:

// Characters that are NOT considered special
public const string allowed = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890.,";

public int RetrieveIndex(string input, int startIndex)
{
    for (var x = startIndex; x < input.Length; x++)
    {
        // The first character outside the allowed set is the one we want
        if (allowed.IndexOf(input[x]) == -1)
        {
            return x;
        }
    }

    return -1;
}

However, if it is a consecutive set as defined by the ASCII standard, just figure out which range is considered acceptable or special, convert each character to an integer, and check whether it lies within that range. This is likely to be faster than the calls to allowed.IndexOf(...).
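As a rough sketch of that range check: the bounds below (ASCII 33 to 43, i.e. '!' through '+') are an arbitrary assumption chosen only so that * and + count as special while comma and full stop do not. The class and method names are likewise made up for illustration.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class AsciiRangeSketch
{
    // Assumed "special" range: '!' (33) through '+' (43). Adjust to taste.
    private const int RangeStart = 33;
    private const int RangeEnd = 43;

    // Returns the index of the first character at or after startIndex whose
    // character code lies inside [RangeStart, RangeEnd], or -1 if none does.
    public static int IndexOfFirstInRange(string input, int startIndex)
    {
        for (int i = Math.Max(startIndex, 0); i < input.Length; i++)
        {
            int code = input[i]; // implicit char -> int conversion
            if (code >= RangeStart && code <= RangeEnd)
            {
                return i;
            }
        }

        return -1;
    }
}
```

Calling AsciiRangeSketch.IndexOfFirstInRange("ma. , *+", 1) returns 6 under these assumed bounds, because the full stop (46), comma (44) and space (32) all fall outside the range.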

Upvotes: 3
