I'm looking for a good way to find a set of search strings in a text and split the text around the matches. For example, given this string:
string str = "you androids don't exactly cover for each other in times of stress. i think you're right it would seem we lack a specific talent you humans possess i believe it's called empathy";
and the search strings, for example:
var sList = new List<string> {"for each other", "talent", "you humans"};
Splitting the original string around the found strings should give this result:
you androids don't exactly cover
for each other
in times of stress. i think you're right it would seem we lack a specific
talent
you humans
possess i believe it's called empathy
If the same text occurs in two different search strings (here, "you"):
var sList = new List<string> {"for each other", "other in", "talent", "you humans", "you"};
The correct output should be this:
you
androids don't exactly cover
for each other
other in
times of stress. i think you're right it would seem we lack a specific
talent
you
you humans
possess i believe it's called empathy
Upvotes: 1
Views: 99
Reputation: 4860
You can use regular expressions to find every occurrence of each search string, mark which characters those matches cover (so overlapping matches are handled), and then treat the uncovered ranges as the gaps in between:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string str = "you androids don't exactly cover for each other in times of stress. i think youre right it would seem we lack a specific talent you humans possess i believe it's called empathy";
        var sList = new List<string> {"for each other", "other in", "talent", "you humans", "you"};

        // chRangeMap[i] is true when character i is covered by at least one match.
        // A new bool[] is already all false, so no explicit initialization is needed.
        var chRangeMap = new bool[str.Length];

        // Find every whole-word occurrence of every search string.
        var matchedTokenMap = sList
            .Select(i => "\\b" + Regex.Escape(i) + "\\b")
            .SelectMany(p => new Regex(p).Matches(str).Cast<Match>())
            .Select(m => new
            {
                StartIndex = m.Index,
                EndIndex = m.Index + m.Length,
                Length = m.Length
            })
            .ToList(); // materialize so the map below is complete before the gaps are computed

        // Mark the characters covered by each match (overlapping matches simply merge).
        foreach (var r in matchedTokenMap)
        {
            for (var i = r.StartIndex; i < r.EndIndex; ++i) chRangeMap[i] = true;
        }

        // Add the uncovered ranges (the gaps), then order everything by position;
        // when two tokens start at the same index, the shorter one comes first.
        var fullTokenized =
            matchedTokenMap.Concat(
                GetArrayRanges(chRangeMap, false)
                    .Select(r => new
                    {
                        StartIndex = r.Item1,
                        EndIndex = r.Item2,
                        Length = r.Item2 - r.Item1
                    })
            )
            .OrderBy(k => k.StartIndex).ThenBy(sk => sk.Length);

        foreach (var token in fullTokenized)
        {
            WriteTrimmed(str.Substring(token.StartIndex, token.Length));
        }
    }

    // Prints the trimmed string, skipping tokens that are only whitespace.
    private static void WriteTrimmed(string str)
    {
        str = str.Trim();
        if (!string.IsNullOrWhiteSpace(str))
        {
            Console.WriteLine(str);
        }
    }

    // Yields (start, end) for each maximal run of seekValue in array; end is exclusive.
    private static IEnumerable<Tuple<int, int>> GetArrayRanges(bool[] array, bool seekValue)
    {
        int? rangeStart = null;
        for (var i = 0; i < array.Length; ++i)
        {
            if (array[i] == seekValue)
            {
                if (!rangeStart.HasValue)
                {
                    rangeStart = i;
                }
            }
            else
            {
                if (rangeStart.HasValue)
                {
                    yield return Tuple.Create(rangeStart.Value, i);
                    rangeStart = null;
                }
            }
        }
        // Close a run that extends to the end of the array.
        if (rangeStart.HasValue)
        {
            yield return Tuple.Create(rangeStart.Value, array.Length);
        }
    }
}
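If you need the segments as a list rather than printed, the same mark-and-fill idea can be sketched as a reusable method. This is only a sketch under a few assumptions: `PhraseSplitter` and `Split` are names made up for this illustration, it needs C# 7 value tuples, and it collects all matches into a list before it looks for gaps.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class PhraseSplitter
{
    // Returns the matched phrases and the text between them, ordered by position.
    public static List<string> Split(string text, IEnumerable<string> phrases)
    {
        // 1. Collect every whole-word occurrence of every phrase.
        var matches = phrases
            .SelectMany(p => Regex.Matches(text, @"\b" + Regex.Escape(p) + @"\b").Cast<Match>())
            .Select(m => (Start: m.Index, Length: m.Length))
            .ToList();

        // 2. Mark the characters covered by at least one match.
        var covered = new bool[text.Length];
        foreach (var m in matches)
            for (var i = m.Start; i < m.Start + m.Length; i++)
                covered[i] = true;

        // 3. Every maximal run of uncovered characters is a gap segment.
        var gaps = new List<(int Start, int Length)>();
        var gapStart = -1;
        for (var i = 0; i <= text.Length; i++)
        {
            var isGap = i < text.Length && !covered[i];
            if (isGap && gapStart < 0)
            {
                gapStart = i;
            }
            else if (!isGap && gapStart >= 0)
            {
                gaps.Add((gapStart, i - gapStart));
                gapStart = -1;
            }
        }

        // 4. Merge matches and gaps, order by start (shorter first on ties),
        //    trim each piece and drop the ones that were only whitespace.
        return matches.Concat(gaps)
            .OrderBy(s => s.Start).ThenBy(s => s.Length)
            .Select(s => text.Substring(s.Start, s.Length).Trim())
            .Where(s => s.Length > 0)
            .ToList();
    }
}
```

Called as `PhraseSplitter.Split(str, sList)` with the string and list from the program above, it should return the same segments that the program prints. Note that `\b` only anchors at word boundaries, so a phrase that starts or ends with punctuation would need a different pattern.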
Upvotes: 0
Reputation: 4981
Try this:
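// Assumes str and sList from the question, plus using System.Linq and System.Text.RegularExpressions.
List<string> parts = new List<string> { str };
// For each search string, split every existing part into (before, match, after).
// Regex.Match returns a single match, so at most one occurrence per part is split.
sList.ForEach(separator => parts = parts
    .SelectMany(part => Regex.Match(part, "(.*) ?(\\b" + Regex.Escape(separator) + "\\b) ?(.*)|(.+)")
        .Groups
        .Cast<Group>()
        .Where(group => group.Success)
        .Select(group => group.Value)
        .Skip(1)) // skip group 0, the whole match
    .ToList());
// Drop the empty pieces left where a match touched the start or end of a part.
parts = parts
    .Where(x => !string.IsNullOrWhiteSpace(x))
    .ToList();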
Output:
you
androids don't exactly cover
for each other
in times of stress. i think youre right it would seem we lack a specific
talent
you
humans
possess i believe it's called empathy
Upvotes: 1