Reputation: 11326
I have a string[] which contains code. Each line contains some leading spaces. I need to 'unindent' the code as much as possible without changing the existing formatting.
For instance, the contents of my string[] might be
        public class MyClass
        {
            private bool MyMethod(string s)
            {
                return s == "";
            }
        }
I'd like to find a reasonably elegant and efficient method (LINQ?) to transform it to
public class MyClass
{
    private bool MyMethod(string s)
    {
        return s == "";
    }
}
To be clear I'm looking for
IEnumerable<string> UnindentAsMuchAsPossible(string[] content)
{
    return ???;
}
Upvotes: 5
Views: 659
Reputation: 79461
Building on Tim Schmelter's answer:
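static IEnumerable<string> UnindentAsMuchAsPossible(IEnumerable<string> lines, int tabWidth = 4)
{
    if (!lines.Any())
    {
        return Enumerable.Empty<string>();
    }
    // Smallest indentation over the non-empty lines; a tab counts as tabWidth columns.
    var minDistance = lines
        .Where(line => line.Length > 0)
        .Select(line => line
            .TakeWhile(Char.IsWhiteSpace)
            .Sum(c => c == '\t' ? tabWidth : 1))
        .DefaultIfEmpty(0) // Min() would throw if every line were empty
        .Min();
    var spaces = new string(' ', tabWidth);
    // Note: Replace expands every tab in the line, not only the leading ones.
    return lines
        .Select(line => line.Replace("\t", spaces))
        .Select(line => line.Substring(Math.Min(line.Length, minDistance)));
}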
This handles:
Upvotes: 4
Reputation: 25810
Use a little LINQ and Regex to find the shortest indentation, then remove that number of characters from all lines.
string[] l_lines = {
    "        public class MyClass",
    "        {",
    "            private bool MyMethod(string s)",
    "            {",
    "                return s == \"\";",
    "            }",
    "        }"
};
int l_smallestIndentation =
    l_lines.Min( s => Regex.Match( s, "^\\s*" ).Value.Length );
string[] l_result =
    l_lines.Select( s => s.Substring( l_smallestIndentation ) )
           .ToArray();
foreach ( string l_line in l_result )
    Console.WriteLine( l_line );
Prints:
public class MyClass
{
    private bool MyMethod(string s)
    {
        return s == "";
    }
}
This program will scan all strings in the array. If you can assume that the first line is the least indented, then you could improve performance by scanning only the first line:
int l_smallestIndentation =
    Regex.Match( l_lines[0], "^\\s*" ).Value.Length;
Also note that this will handle a tab character ("\t") as a single character. If there is a mix of tabs and spaces, then reversing the indent may be tricky. The easiest way to handle that would be to replace all instances of tabs with the appropriate number of spaces (often 4, though individual applications can vary wildly) before running the code above.
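The tab replacement described above could be sketched roughly as follows. `TabExpansion` and `ExpandLeadingTabs` are hypothetical names, not part of the code above, and the sketch assumes a fixed width per tab rather than advancing to the next tab stop. It touches only the leading whitespace, so tabs inside string literals or comments are left alone.

```csharp
using System;
using System.Linq;

static class TabExpansion
{
    // Hypothetical helper: replaces each tab in the leading whitespace with
    // tabWidth spaces and leaves the rest of the line untouched.
    // Assumption: every tab is worth exactly tabWidth columns.
    public static string ExpandLeadingTabs(string line, int tabWidth = 4)
    {
        // Length of the leading whitespace run (spaces, tabs, ...).
        int indentLength = line.TakeWhile(c => char.IsWhiteSpace(c)).Count();

        string indent = line.Substring(0, indentLength)
                            .Replace("\t", new string(' ', tabWidth));

        return indent + line.Substring(indentLength);
    }
}
```

Each line would be passed through this helper first, for example with `l_lines.Select( s => TabExpansion.ExpandLeadingTabs( s ) ).ToArray()`, before the smallest indentation is measured.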
It would also be possible to modify the code above to give additional weight to tabs. At that point, the regex is no longer of much use.
string[] l_lines = {
    "\t\t\tpublic class MyClass",
    "                        {",
    "                                private bool MyMethod(string s)",
    "                                {",
    "        \t        \t\treturn s == \"\";",
    "                                }",
    "\t\t\t}"
};
int l_tabWeight = 8;
int l_smallestIndentation =
    l_lines.Min
    (
        s => s.ToCharArray()
              .TakeWhile( c => Char.IsWhiteSpace( c ) )
              .Select( c => c == '\t' ? l_tabWeight : 1 )
              .Sum()
    );
string[] l_result =
    l_lines.Select
    (
        s =>
        {
            int l_whitespaceToRemove = l_smallestIndentation;
            while ( l_whitespaceToRemove > 0 )
            {
                l_whitespaceToRemove -= s[0] == '\t' ? l_tabWeight : 1;
                s = s.Substring( 1 );
            }
            return s;
        }
    ).ToArray();
Prints (assuming your console window has a tab width of 8 like mine):
public class MyClass
{
        private bool MyMethod(string s)
        {
                return s == "";
        }
}
You may need to modify this code to work with edge-case scenarios, such as zero-length lines or lines containing only whitespace.
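One possible way to deal with those edge cases is sketched below. It is only an illustration: `BlankLineAwareUnindent` is a hypothetical name, every whitespace character (including a tab) counts as one character, and blank or whitespace-only lines are returned as empty strings, which is a design choice rather than something the code above does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class BlankLineAwareUnindent
{
    public static IEnumerable<string> Unindent(IEnumerable<string> lines)
    {
        var list = lines.ToList();

        // Only lines with visible content decide how much can be removed.
        int smallest = list
            .Where(s => !string.IsNullOrWhiteSpace(s))
            .Select(s => s.TakeWhile(c => char.IsWhiteSpace(c)).Count())
            .DefaultIfEmpty(0) // no content lines at all: remove nothing
            .Min();

        // Blank lines become empty; all other lines lose the common indent.
        return list.Select(s =>
            string.IsNullOrWhiteSpace(s) ? string.Empty : s.Substring(smallest));
    }
}
```

Because blank lines are excluded from the minimum, an empty line in the middle of the code no longer forces the smallest indentation down to zero.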
Upvotes: 2
Reputation: 36073
To match your desired method interface:
IEnumerable<string> UnindentAsMuchAsPossible(string[] content)
{
    int minIndent = content.Select(s => s.TakeWhile(c => c == ' ').Count()).Min();
    return content.Select(s => s.Substring(minIndent)).AsEnumerable();
}
This gets the minimum indent of all lines (assumes spaces only, no tabs), then strips minIndent spaces from the start of each line and returns that as IEnumerable.
Upvotes: 1
Reputation: 460108
This should work:
static IEnumerable<string> UnindentAsMuchAsPossible(IEnumerable<string> input)
{
    int minDistance = input.Min(l => l.TakeWhile(Char.IsWhiteSpace).Count());
    return input.Select(l => l.Substring(minDistance));
}
It shifts every line to the left by the same number of characters, namely the smallest amount of leading whitespace found on any line.
For example:
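// The text starts right after @" so that no empty first line (indentation 0) is produced.
// Inside a verbatim string a literal quote is written as "", hence """" for two quotes.
string testString = @"        public class MyClass
        {
            private bool MyMethod(string s)
            {
                return s == """";
            }
        }";
// Split on both line-ending styles so the result does not depend on how the source file was saved.
string[] lines = testString.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
string[] unindentedArray = UnindentAsMuchAsPossible(lines).ToArray();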
Upvotes: 3
Reputation: 203819
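Just count the number of leading spaces on the first line, and then "remove" that many characters from the start of each line. This assumes the first line is the least indented one and that no line is shorter than that indentation: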
IEnumerable<string> UnindentAsMuchAsPossible(string[] content)
{
    int spacesOnFirstLine = content[0].TakeWhile(c => c == ' ').Count();
    return content.Select(line => line.Substring(spacesOnFirstLine));
}
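If the input might contain empty lines, or lines indented less than the first one, a more defensive variant could clamp the amount removed per line. This is only a sketch under that assumption; `FirstLineUnindent` is a hypothetical name, and only spaces are treated as indentation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FirstLineUnindent
{
    public static IEnumerable<string> Unindent(string[] content)
    {
        if (content.Length == 0)
        {
            return Enumerable.Empty<string>();
        }

        int spacesOnFirstLine = content[0].TakeWhile(c => c == ' ').Count();

        return content.Select(line =>
        {
            // Never remove more than the line's own leading spaces, so
            // Substring cannot throw and no code characters are cut off.
            int leading = line.TakeWhile(c => c == ' ').Count();
            return line.Substring(Math.Min(leading, spacesOnFirstLine));
        });
    }
}
```

Lines indented less than the first line simply end up flush left, so the relative layout is only preserved when the first line really is the least indented one.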
Upvotes: 3
Reputation: 69260
This will first find the minimum indent and then remove that number of spaces from each line.
var code = new [] { " foo", " bar" };
var minIndent = code.Select(line => line.TakeWhile(ch => ch == ' ').Count()).Min();
var formatted = code.Select(line => line.Remove(0, minIndent));
It would be possible to write everything in one single expression, but while that would be more functionally elegant, I think that the minIndent variable makes the code more readable.
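For comparison, a single-expression version might look like the sketch below. `OneExpressionUnindent` is a hypothetical name, and the expression-bodied member assumes C# 6 or later. Note that the minimum is recomputed for every line, so besides being harder to read it also does quadratic work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class OneExpressionUnindent
{
    // Same logic as the three statements above, folded into one expression.
    // The inner Min() runs once per line, which the minIndent variable avoids.
    public static IEnumerable<string> Unindent(string[] code) =>
        code.Select(line => line.Remove(
            0, code.Min(l => l.TakeWhile(ch => ch == ' ').Count())));
}
```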
Upvotes: 1