Reputation: 89
I have been battling an apparently simple problem for the past couple of hours. I know the solution will use LINQ and recursion but I just cannot get there.
My sample code is below, as is the desired output (something like it, I really do not care, it is building up the hierarchy correctly that is fundamental).
Any help would help.
using System;
using System.Collections.Generic;
namespace ConsoleApp14
{
class Program
{
static string DoSomething(KeyValuePair<string, string>[] dir)
{
return ""; //something here
}
static void Main(string[] args)
{
KeyValuePair<string, string>[] dir = new[]
{
new KeyValuePair<string, string>(@"c:\one\two\three","100.txt"),
new KeyValuePair<string, string>(@"c:\one\two\three","101.txt"),
new KeyValuePair<string, string>(@"c:\one\four\five","501.txt"),
new KeyValuePair<string, string>(@"c:\one\six\","6.txt"),
new KeyValuePair<string, string>(@"c:\one\six","7.txt"),
new KeyValuePair<string, string>(@"c:\one\","1.txt"),
new KeyValuePair<string, string>(@"c:\one\six\","8.txt"),
new KeyValuePair<string, string>(@"c:\one\two","2.txt"),
new KeyValuePair<string, string>(@"c:\one\two\three\four\five\six\seven","y.txt"),
new KeyValuePair<string, string>(@"c:\one\xxx","xxx.txt")
};
// this is the output I want, rough indentation and crlf, the ordering is not important, just the hierarchy
Console.WriteLine(DoSomething(dir));
//
// one
// (1.txt)
// two
// (2.txt)
// three
// (100.txt)
// (101.txt)
// four
// five
// six
// seven
// (y.txt)
// four
// five
// (501.txt)
// six
// (6.txt)
// (7.txt)
// (8.txt)
// xxx
// (xxx.txt)
//
}
}
}
Upvotes: 2
Views: 253
Reputation: 660463
This is a data structures problem, not an algorithm problem. Once you have the right data structures, the algorithm will be straightforward.
The data structure you need is: a node is either a file or a directory:
abstract class Node {}
sealed class Directory : Node {}
sealed class File : Node {}
OK, what do we know about a node? Just that it has a name:
abstract class Node
{
public string Name { get; private set; }
public Node(string name) { this.Name = name; }
}
What do we know about a file? Just that it has a name.
sealed class File : Node
{
public File(string name) : base(name) { }
}
What do we know about a directory? It has a name and a list of child nodes:
sealed class Directory : Node
{
public Directory(string name) : base(name) { }
public List<Node> Children { get; } = new List<Node>();
We'll want to be able to add a child:
public File WithFile(string name)
{
// TODO: Is there already a child that has this name? return it.
// TODO: Otherwise add it
}
public Directory WithDirectory(string name)
// TODO: Same.
Great, now we can take a directory and add a subdirectory or a file; if one already exists, we get it back.
Now, what is your specific problem? You have a sequence of directory names and a file name, and you want to add that file to the directory. So write that!
public Directory WithDirectory(IEnumerable<string> directories)
{
Directory current = this;
foreach(string d in directories)
current = current.WithDirectory(d);
return current;
}
public File WithFile(IEnumerable<string> directories, string name)
{
return this.WithDirectory(directories).WithFile(name);
}
Now all you have to do is break up each path into a sequence of names. So your algorithm is
Directory root = new Directory("c:");
foreach(KeyValuePair pair in dir)
{
IEnumerable<string> dirs = TODO break up the key into a sequence of strings
root.WithFile(dirs, pair.Value);
}
And when you're done you have a data structure that represents your tree.
Now that you have a tree, write a method on Node:
override string ToString() => this.ToString(0);
string ToString(int indent)
// TODO can you implement this?
The key here is get the data structure right. A directory is just a name plus a list of subdirectories and files, so write that code. Once you have the data structure right, the rest follows naturally. Notice that every method I wrote is only a couple lines long. (The ones I left as TODOs are similarly very small. Implement them.) That's what you want: do one thing in each method and do it extremely well. If you find that you are writing huge long complicated methods, stop and refactor it down into many small methods that each do one clear thing.
EXERCISE: Implement a version of ToString called ToBoxyString that produces:
c:
└─one
├─(1.txt)
├─two
│ ├─(2.txt)
│ └─three
... and so on. It is not as hard as it looks; it's just a fancier indenting. Can you figure out the pattern?
Upvotes: 5
Reputation: 26926
Using some utility extension methods I like:
public static class Ext {
public static ArraySegment<T> Slice<T>(this T[] src, int start, int? count = null) => (count.HasValue ? new ArraySegment<T>(src, start, count.Value) : new ArraySegment<T>(src, start, src.Length - start));
public static string Join(this IEnumerable<string> strings, string sep) => String.Join(sep, strings.ToArray());
public static string Join(this IEnumerable<string> strings, char sep) => String.Join(sep.ToString(), strings.ToArray());
public static string Repeat(this char ch, int n) => new String(ch, n);
}
You can use LINQ to process the paths numerically, which doesn't require any recursion but isn't terribly efficient (it makes a traversal through the original array twice for each depth in the whole tree). The code looks long but it is mostly because I put in lots of comments.
static IEnumerable<string> DoSomething(KeyValuePair<string, string>[] dir) {
char[] PathSeparators = new[] { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar };
// some local utility functions
// split a path into an array of its components [drive,dir1,dir2,...]
string[] PathComponents(string p) => p.Split(PathSeparators, StringSplitOptions.RemoveEmptyEntries);
// Combine path components back into a canonical path
string PathCombine(IEnumerable<string> p) => p.Join(Path.DirectorySeparatorChar);
// return all distinct paths that are depth deep, truncating longer paths
IEnumerable<string> PathsAtDepth(IEnumerable<(string Path, string[] Components, string Filename)> dirs, int depth)
=> dirs.Select(pftuple => pftuple.Components)
.Where(pa => pa.Length > depth)
.Select(pa => PathCombine(pa.Slice(0, depth + 1)))
.Distinct();
// split path into components clean up trailing PathSeparators
var cleanDirs = dir.Select(kvp => (Path: kvp.Key.TrimEnd(PathSeparators), Components: PathComponents(kvp.Key), Filename: kvp.Value));
// find the longest path
var maxDepth = cleanDirs.Select(pftuple => pftuple.Components.Length).Max();
// ignoring drive, gather all paths at each length and the files beneath them
var pfl = Enumerable.Range(1, maxDepth)
.SelectMany(d => PathsAtDepth(cleanDirs, d) // get paths down to depth d
.Select(pathAtDepth => new {
Depth = d,
Path = pathAtDepth,
// gather all filenames for pathAtDepth d
Files = cleanDirs.Where(pftuple => pftuple.Path == pathAtDepth)
.Select(pftuple => pftuple.Filename)
.ToList()
}))
.OrderBy(dpef => dpef.Path); // sort into tree
// convert each entry into its directory path end followed by all files beneath that directory
var stringTree = pfl.SelectMany(dpf => dpf.Files.Select(f => ' '.Repeat(4 * (dpf.Depth - 1)) + $"({f})")
.Prepend(' '.Repeat(4 * (dpf.Depth - 1)) + Path.GetFileName(dpf.Path)));
return stringTree;
}
My version of DoSomething
returns an IEnumerable<string>
which you can Join
back into a single string on output if you like:
Console.WriteLine(DoSomething(dir).Join(Environment.NewLine));
Upvotes: 1
Reputation: 26926
Since my first try was so long, I decided to add this alternative as a separate answer. This version is more efficient as it goes through the dir array once.
Using some extension methods as before:
public static class Ext {
public static ArraySegment<T> Slice<T>(this T[] src, int start, int? count = null) => (count.HasValue ? new ArraySegment<T>(src, start, count.Value) : new ArraySegment<T>(src, start, src.Length - start));
public static string Join(this IEnumerable<string> strings, string sep) => String.Join(sep, strings.ToArray());
public static string Join(this IEnumerable<string> strings, char sep) => String.Join(sep.ToString(), strings.ToArray());
public static string Repeat(this char ch, int n) => new String(ch, n);
}
I process the dir array into a Lookup
which gathers all the files beneath each path. Then I can sort the paths into a tree, and for each path, add the path and the files beneath it. For each subset of the path, I add an empty path entry if it doesn't contain files when converting into a string tree.
static IEnumerable<string> DoSomething(KeyValuePair<string, string>[] dir) {
char[] PathSeparators = new[] { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar };
// some local utility functions
int PathDepth(string p) => p.Count(ch => PathSeparators.Contains(ch));
string PathToDepth(string p, int d) => p.Split(PathSeparators).Slice(0, d+1).Join(Path.DirectorySeparatorChar);
// gather distinct paths (without trailing separators) and the files beneath them
var pathsWithFiles = dir.ToLookup(d => d.Key.TrimEnd(PathSeparators), d => d.Value);
// order paths with files into tree
var pfl = pathsWithFiles.Select(pfs => new {
Path = pfs.Key, // each path
Files = pfs.ToList() // the files beneath it
})
.OrderBy(dpef => dpef.Path); // sort into tree
// convert each entry into its directory path end followed by all files beneath that directory
// add entries for each directory that has no files
var stringTree = pfl.SelectMany(pf => Enumerable.Range(1, PathDepth(pf.Path))
// find directories without files
.Where(d => !pathsWithFiles.Contains(PathToDepth(pf.Path, d)))
// and add an entry for them
.Select(d => ' '.Repeat(4 * (d-1)) + Path.GetFileName(PathToDepth(pf.Path, d)))
// then add all the files
.Concat(pf.Files.Select(f => ' '.Repeat(4 * (PathDepth(pf.Path)- 1)) + $"({f})")
// and put the top dir first
.Prepend(' '.Repeat(4 * (PathDepth(pf.Path)-1)) + Path.GetFileName(pf.Path)))
);
return stringTree;
}
Again you can call it as before:
Console.WriteLine(DoSomething(dir).Join(Environment.NewLine));
Upvotes: 0