Reputation:
I would like to read in a CSV into a dictionary, however most examples I see have the first column as keys. I would like the field names to be keys.
CSV File:
Name,Type,Classification
file1.txt,text,Secondary
file2.txt,text,Primary
Output dictionary (what I would like):
dict = {
Name: [file1.txt, file2.txt],
Type: [text, text],
Classification: [Secondary, Primary]
}
I'm not sure if I can extend this somehow:
private void LoadCSV()
{
var mydict = new Dictionary<string, List<string>>();
var lines = File.ReadAllLines(@"C:\test.csv");
string[] columnHeaders = lines[0].Split(',');
foreach (string columnHeader in columnHeaders)
{
mydict[columnHeader] = new List<string>();
}
}
Edit:
This may be doing what I want, but perhaps not very efficient:
private void LoadCSV()
{
var mydict = new Dictionary<string, List<string>>();
var lines = File.ReadAllLines(@"C:\test.csv");
string[] columnHeaders = lines[0].Split(',');
string[] allLines = lines.Skip(1).ToArray();
int count = 0;
foreach (string headerPart in columnHeaders)
{
var valuelist = new List<string>();
foreach (string line in allLines)
{
List<string> l = line.Split(',').ToList();
var element = l.ElementAt(count);
valuelist.Add(element);
}
mydict[headerPart] = valuelist;
count++;
}
}
Upvotes: 1
Views: 781
Reputation: 1311
As an alternative to the imperative style you are using, here is a way to perform this using Linq:
var csvPath = @"C:\test.csv";
var separator = ",";
var lines = File.ReadLines(csvPath);
var indexedKeys = lines
.First()
.Split(separator, StringSplitOptions.None)
.Select((k, i) =>
(
Index: i,
Key: k
));
var values = lines
.Skip(1)
.Select(l => l.Split(separator, StringSplitOptions.None))
.ToList();
var result = indexedKeys
.ToDictionary(
ik => ik.Key,
ik => values
.Select(v => v[ik.Index])
.ToList());
As for your solution efficiency (the code below Edit in your post), with a basic analysis, there are two potential improvements:
File.ReadAllLines
unnecessaryly put your whole file in memory.
Using File.ReadLine
solves that. See this and this for discussion.
You are applying the .Split()
method not once on each line, but as many time as there are columns. To fix this, your variable allines
should already contains the splited lines. The solution I propose does that, see the variable values
.
Upvotes: 1
Reputation: 10078
Good start! Now that you have column headers, you can build the dictionary. Obviously, you need to keep track of matching column numbers to column names
string[] columnHeaders = lines[0].Split(',');
// do this for each line in the file
string[] columnRows = lines.Split(',');
for (int n = 0; n++; n <= columnRows.Length)
{
mydict[columnHeaders[n]] = mydict[columnHeaders[n]].Add(columnRows[n]);
}
this way you add a new value to the List
that is value of the Dictionary
at given index
Upvotes: 0