Reputation: 1173
I am capturing a mainframe screen using C#, and I have to read the labels that correspond to the text-entry regions on the screen. Currently I am reading them from the captured image using the Tesseract OCR plugin, which returns a string. I want to split that string on certain characters in it. The characters are the following:
{ '@', '<', '>', '=', '$', '%', '&' }
A sample string to split is shown below:
first name => saran address @> my address
Is there any way to split this string with a regex into an array in the following format?
[0]: "first name"
[1]: "=> saran"
[2]: "address"
[3]: "@> my address"
Upvotes: 0
Views: 158
Reputation: 117064
This gets you very close (but not using Regex):
// Requires: using System.Collections.Generic; using System.Linq;
char[] splitters = new[] { '@', '<', '>', '=', '$', '%', '&' };
string text = "first name => saran address @> my address";
string[] results =
    text
        .Aggregate(new List<List<char>>() { new List<char>() }, (a, c) =>
        {
            var l = a.Last();
            // Start a new piece when a splitter follows non-splitter text;
            // consecutive splitters such as "=>" stay in the same piece.
            if (splitters.Contains(c) && !l.All(x => splitters.Contains(x)))
            {
                l = new List<char>() { c };
                a.Add(l);
            }
            else
            {
                l.Add(c);
            }
            return a;
        })
        .Select(x => new string(x.ToArray()))
        .ToArray();
There's just nothing in your description that says how to split "saran address". Other than that, this is tested and produces this:
first name
=> saran address
@> my address
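Since the question asked for a regex, here is a minimal sketch of how the same split might be done with Regex.Split. It is not part of the tested code above. The class name LabelSplitter, the Split helper, and the Trim/empty-piece filtering are my own assumptions for illustration. The pattern matches a zero-width position where the previous character is not a splitter and the next one is, so "=>" and "@>" stay together. Like the Aggregate version, it does not split "saran address", because nothing in the string marks that boundary.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named here only for illustration.
static class LabelSplitter
{
    // Zero-width split point: the previous character is NOT a splitter
    // and the next character IS one, so runs like "=>" stay in one piece.
    private static readonly Regex Boundary =
        new Regex(@"(?<=[^@<>=$%&])(?=[@<>=$%&])");

    public static string[] Split(string text) =>
        Boundary.Split(text)
                .Select(part => part.Trim())      // assumption: surrounding spaces are unwanted
                .Where(part => part.Length > 0)   // drop pieces that were only whitespace
                .ToArray();

    static void Main()
    {
        foreach (var part in Split("first name => saran address @> my address"))
            Console.WriteLine($"\"{part}\"");
    }
}
```

For the sample string this should give "first name", "=> saran address" and "@> my address". If you need the original spacing preserved, as in the Aggregate version, remove the Trim step.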
Upvotes: 1