Reputation: 959
I am stuck on my project and cannot get past this problem, so I would appreciate help finding a solution:
I have a string that contains some token texts, which I want to extract and put into an array list of strings. The final result should be two array lists: one of normal texts and one of token texts. Below is a sample string that contains some tokens surrounded by the opening tag "[[" and the closing tag "]]".
The first step, where the wort is prepared by mixing the starch source with hot water, is known as [[Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around [[CheckBox]], during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as [[Radio]]. This washing allows the brewer to gather [[DropDownList]] the fermentable liquid from the grains as possible.
Processing the string should produce two array lists:
Result:
Normal Text ArrayList { "The first step, where the wort is prepared by mixing the starch source with hot water, is known as ", ". Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around ", ", during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as ", ". This washing allows the brewer to gather ", " the fermentable liquid from the grains as possible." }
Token Text ArrayList { "[[Textarea]]", "[[CheckBox]]", "[[Radio]]", "[[DropDownList]]" }
Of the two array lists, the normal text array list has 5 elements, which are the texts before or after the tokens, and the token text array list has 4 elements, which are the token texts inside the string.
This can be done with cut and substring techniques, but that is too hard for a very long text, is error-prone, and sometimes does not give me what I want. If you can help with this, please post in C#, because that is what I am using for this task.
Upvotes: 0
Views: 137
Reputation: 239654
This seems to do the job (although note that at the moment, my tokens array contains the plain tokens, rather than them being wrapped with [[ and ]]):
var inp = @"The first step, where the wort is prepared by mixing the starch source with hot water, is known as [[Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around [[CheckBox]], during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as [[Radio]]. This washing allows the brewer to gather [[DropDownList]] the fermentable liquid from the grains as possible.";
var step1 = inp.Split(new string[] { "[[" }, StringSplitOptions.None);
// step1 now contains one string that goes straight into normal, followed by n strings that need to be split further.
// ToArray() materializes the query so it is not re-run for both normal and tokens (requires using System.Linq).
var step2 = step1.Skip(1).Select(a => a.Split(new string[] { "]]" }, StringSplitOptions.None)).ToArray();
// step2 now contains pairs of strings: the first of each pair is a token, the second is a normal string.
var normal = step1.Take(1).Concat(step2.Select(a => a[1])).ToArray();
var tokens = step2.Select(a => a[0]).ToArray();
This also assumes that there are no unbalanced [[ and ]] sequences in the input.
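If unbalanced delimiters are a concern, one possible alternative (not what the code above does) is to let a regular expression find only complete [[...]] pairs. The sketch below is a rough illustration under some assumptions: tokens do not span multiple lines, tokens are not nested, and the class name TokenSplitter is made up for this example. It relies on the documented behaviour that Regex.Split includes captured text in its result, so normal text and tokens alternate in the output array. A side effect is that the tokens keep their [[ and ]] wrappers.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class TokenSplitter
{
    public static (List<string> Normal, List<string> Tokens) Split(string input)
    {
        var normal = new List<string>();
        var tokens = new List<string>();

        // The capturing group makes Regex.Split return the matched tokens
        // in between the surrounding pieces of text.
        string[] parts = Regex.Split(input, @"(\[\[.*?\]\])");

        for (int i = 0; i < parts.Length; i++)
        {
            // Even positions hold normal text, odd positions hold captured tokens.
            if (i % 2 == 0) normal.Add(parts[i]);
            else tokens.Add(parts[i]);
        }
        return (normal, tokens);
    }
}
```

With this approach a stray [[ that is never closed simply stays inside the neighbouring normal text instead of causing an index error.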
The observations that went into this solution: If you were to split the string first around each [[ pair in the original text, then the first output string has already been produced. Furthermore, every string after that first one consists of a token, the ]] pair, and a normal text. E.g. the second result in step1 is: "Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around "
So, if you split these other results around the ]] pairs, then the first result is a token, and the second result is a normal string.
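To make those two observations concrete, here is a small trace of the same two-step split on a much shorter input, with the intermediate values shown in comments. The class name TwoStepSplit and the sample string are invented for this illustration, and the last step re-adds the [[ and ]] wrappers so that the tokens match the format the question asked for. Like the code above, it assumes balanced delimiters.

```csharp
using System;
using System.Linq;

// Hypothetical wrapper around the two-step split; names are illustrative only.
public static class TwoStepSplit
{
    public static (string[] Normal, string[] Tokens) Split(string inp)
    {
        // For "Mash for [[CheckBox]], then [[Radio]] the grains." this gives:
        //   "Mash for ", "CheckBox]], then ", "Radio]] the grains."
        var step1 = inp.Split(new[] { "[[" }, StringSplitOptions.None);

        // Each remaining piece becomes a { token, normal text } pair:
        //   { "CheckBox", ", then " }, { "Radio", " the grains." }
        var step2 = step1.Skip(1)
                         .Select(s => s.Split(new[] { "]]" }, StringSplitOptions.None))
                         .ToArray();

        var normal = step1.Take(1).Concat(step2.Select(p => p[1])).ToArray();

        // Re-add the delimiters that the splits removed.
        var tokens = step2.Select(p => "[[" + p[0] + "]]").ToArray();

        return (normal, tokens);
    }
}
```

Calling TwoStepSplit.Split on the short sample string yields the normal texts "Mash for ", ", then " and " the grains.", and the tokens "[[CheckBox]]" and "[[Radio]]".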
Upvotes: 1