Reputation: 295
Having a bit of a problem as I have to translate a string into a table. I'd like to remove multiple spaces, but not all of them. So the data in text comes back with lots of spaces in between like so:
SESSIONNAME USERNAME ID STATE TYPE DEVICE\r\n
services 0 Disc \r\n
console 1 Conn \r\n
alinav 2 Disc \r\n
rdp-tcp 65536 Listen \r\n
I would like to keep the \r\n values that define my rows, keep the empty values, which are legitimate under some columns, and keep the spaces that separate the columns. But I want to remove the extra spaces so they are not fed into the values.
I've tried:
output = Regex.Replace(output, @"\s{2,}", " ", RegexOptions.Multiline);
output = output.Replace("  ", " ");
But the first one just removes everything (things I need and don't need). And the second one still leaves too many spaces.
Thanks.
Upvotes: 1
Views: 929
Reputation: 21722
In your example the data is delimited by position, not by characters; is that correct? If so, you should extract by position; something like:
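// Split into rows on the line terminator; Split() with no arguments would break on every whitespace character.
foreach (string s in output.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries))
{
// Offsets and widths are illustrative. The header row and lines shorter than 63 characters need separate handling.
var sessionName = s.Substring(0, 18).Trim();
var userName = s.Substring(18, 19).Trim();
var id = Int32.Parse(s.Substring(37, 8).Trim());
var whateverType = s.Substring(45, 12).Trim();
var device = s.Substring(57, 6).Trim();
}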
Of course you need to do proper error checking, and should probably put the field widths in an array and calculate positions instead of hard-coding them as I have shown.
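As a rough sketch of the array-based variant, the following keeps the field widths in one place and derives the start positions from them. The widths are simply copied from the hard-coded offsets in the snippet and are an assumption about the real column layout; `FixedWidth`, `SplitLine` and `Parse` are hypothetical names. Each slice is clamped to the line length, so a short line yields empty fields instead of throwing.

```csharp
using System;
using System.Collections.Generic;

static class FixedWidth
{
    // Assumed column widths, taken from the offsets in the snippet; adjust to the real output.
    public static readonly int[] Widths = { 18, 19, 8, 12, 6 };

    // Cuts one line into trimmed fields by position; missing columns come back as "".
    public static string[] SplitLine(string line, int[] widths)
    {
        var fields = new string[widths.Length];
        int start = 0;
        for (int i = 0; i < widths.Length; i++)
        {
            // Clamp both ends so that lines shorter than the full width do not throw.
            int begin = Math.Min(start, line.Length);
            int length = Math.Min(widths[i], line.Length - begin);
            fields[i] = line.Substring(begin, length).Trim();
            start += widths[i];
        }
        return fields;
    }

    // Splits the raw text into rows on "\r\n" and each row into fields.
    public static List<string[]> Parse(string output, int[] widths)
    {
        var rows = new List<string[]>();
        foreach (string line in output.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries))
        {
            rows.Add(SplitLine(line, widths));
        }
        return rows;
    }
}
```

The header row comes back as plain strings like any other row, so numeric parsing of the ID column (for example with `Int32.TryParse`) can be done afterwards on the rows that need it.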
Upvotes: 2
Reputation: 477607
You can do two things:
Use a space explicitly in the regular expression; \s includes other characters (\n, \r, \t, ...) as well, thus:
output = Regex.Replace(output, @" +", " ", RegexOptions.Multiline);
Or apply the second method until convergence:
string s2 = output;
do {
output = s2;
s2 = s2.Replace("  ", " ");
} while(output != s2);
In most cases the first method will outperform the second one. This is because the first method replaces each run of spaces in a single pass, whereas the second may need several passes over the string. Regexes are in general a bit slower than simple string replacement, but if the string contains long runs of spaces, the first method will be faster.
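To make the two options concrete, here is a minimal sketch of both wrapped as helper methods. `SpaceCollapser` and its method names are hypothetical, and the pattern `" {2,}"` is one possible way to match runs of literal spaces. Because only the space character is matched, the \r\n row separators are left alone.

```csharp
using System;
using System.Text.RegularExpressions;

static class SpaceCollapser
{
    // Replaces every run of two or more literal spaces with a single space.
    // '\r', '\n' and '\t' are not matched, so row separators stay intact.
    public static string CollapseSpaces(string text)
    {
        return Regex.Replace(text, " {2,}", " ");
    }

    // Same result via repeated plain replacement until nothing changes.
    public static string CollapseSpacesByLoop(string text)
    {
        string previous;
        do
        {
            previous = text;
            text = text.Replace("  ", " ");
        } while (previous != text);
        return text;
    }
}
```

For example, `SpaceCollapser.CollapseSpaces("console     1  Conn   \r\nalinav  2  Disc\r\n")` returns `"console 1 Conn \r\nalinav 2 Disc\r\n"`. Once the padding is collapsed, an empty column can no longer be told apart from its neighbours, so this only suits data where every column is filled.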
Upvotes: 3