Galwegian

Reputation: 42257

Find start index and end index of substring

I have a string like "________ _____________ occurs when a behaviour is immediately followed by the removal of an aversive ________ that increases the future frequency of the behaviour"

I want to return an array or list that has the first and last index position of each underscored area.

e.g. in my example, I'd get (0, 7, 9, 21, 101, 108)

The 6 numbers are the indexes of the start and end of the three sections of underscores: the first 'blank' starts at index 0 and ends at index 7, the second starts at index 9 and ends at index 21, etc.

This is what I've done so far, but I'm stuck. It only returns the start index of each occurrence of searchString:

public List<int> GetPositions(string source, string searchString)
{
    List<int> ret = new List<int>();
    int len = searchString.Length;
    int start = -len;
    while (true)
    {
        start = source.IndexOf(searchString, start + len);
        if (start == -1)
        {
            break;
        }
        else
        {
            ret.Add(start);
        }
    }
    return ret;
}

Upvotes: 2

Views: 4411

Answers (3)

user7396598

Reputation: 1289

You can find all of these using the various overloads of string.IndexOf().

You can get the start of the first "blank" with:

sourceString.IndexOf('_');

Then the end of the first blank with:

sourceString.IndexOf("_ ");

The start of the second "blank" with:

sourceString.IndexOf('_', endBlank1Index + 1);

The end of the second "blank" with:

sourceString.IndexOf("_ ", startBlank2Index);

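Repeat until IndexOf() returns -1, which means no other occurrences are found. Note that searching for "_ " assumes every blank is followed by a space, so a blank at the very end of the string, or one followed by punctuation, needs separate handling.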
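The steps above could be put into a loop along these lines. This is only a sketch: the class and method names (BlankFinder, GetBlankPositions) are made up for illustration, and it assumes every blank is followed by a space, except possibly a last blank that is the final run of underscores in the string.

```csharp
using System;
using System.Collections.Generic;

public static class BlankFinder
{
    // Hypothetical helper (not from the answer): repeats the two IndexOf calls
    // until no further underscore is found.
    // Assumption: each blank is followed by a space, or is the last blank in the string.
    public static List<int> GetBlankPositions(string source)
    {
        var positions = new List<int>();
        int start = source.IndexOf('_');
        while (start != -1)
        {
            // "_ " begins at the last underscore of the current blank.
            int end = source.IndexOf("_ ", start, StringComparison.Ordinal);
            if (end == -1)
            {
                // No "_ " left: assume this is the final blank and it ends
                // at the last underscore in the string.
                end = source.LastIndexOf('_');
            }
            positions.Add(start);
            positions.Add(end);
            // Look for the start of the next blank after the current one.
            start = source.IndexOf('_', end + 1);
        }
        return positions;
    }

    public static void Main()
    {
        var positions = GetBlankPositions("___ a _____ b __");
        Console.WriteLine(string.Join(", ", positions)); // 0, 2, 6, 10, 14, 15
    }
}
```

If blanks can be followed by punctuation in the middle of the string, the "_ " search will skip ahead to a later blank, so one of the other approaches is probably safer in that case.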

Upvotes: 2

Kobi

Reputation: 138147

You can use a simple regular expression for that:

var matches = Regex.Matches(s, "_+");
var result = new List<int>();
foreach(Match m in matches)
{
    result.Add(m.Index);
    result.Add(m.Index + m.Length - 1);
}
Console.WriteLine(String.Join(", ", result));

Working example: https://dotnetfiddle.net/GX9MXR

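If you want to avoid underscores within words (e.g. in snake_case identifiers), you can also use @"\b_+\b". This works because an underscore counts as a word character, so \b only matches where a run of underscores is bordered by non-word characters or the ends of the string.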
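To see the difference between the two patterns, here is a small sketch that runs both against a string containing an underscore inside a word. The UnderscoreRuns class and its Find method are hypothetical names used only for this comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnderscoreRuns
{
    // Hypothetical helper: returns start/end index pairs for every match of the pattern.
    public static List<int> Find(string s, string pattern)
    {
        var result = new List<int>();
        foreach (Match m in Regex.Matches(s, pattern))
        {
            result.Add(m.Index);                 // first index of the run
            result.Add(m.Index + m.Length - 1);  // last index of the run
        }
        return result;
    }

    public static void Main()
    {
        string s = "snake_case ____ here";
        // "_+" also picks up the single underscore inside "snake_case".
        Console.WriteLine(string.Join(", ", Find(s, "_+")));       // 5, 5, 11, 14
        // With word boundaries only the standalone run is reported.
        Console.WriteLine(string.Join(", ", Find(s, @"\b_+\b")));  // 11, 14
    }
}
```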

Upvotes: 4

stuartd

Reputation: 73303

This seems to do what you want, if you're averse to regular expressions:

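public List<int> GetUnderscorePositions(string source)
{
    List<int> positions = new List<int>();
    bool withinUnderscore = false;

    for (int i = 0; i < source.Length; i++)
    {
        bool isUnderscore = source[i] == '_';
        if (isUnderscore && !withinUnderscore)
        {
            // First underscore of a run: record the start index.
            withinUnderscore = true;
            positions.Add(i);
        }
        else if (!isUnderscore && withinUnderscore)
        {
            // First character after a run: the run ended at the previous index.
            withinUnderscore = false;
            positions.Add(i - 1);
        }
    }

    // A run that reaches the end of the string still needs its end index.
    if (withinUnderscore)
    {
        positions.Add(source.Length - 1);
    }

    return positions;
}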

Upvotes: 3
