Reputation: 148
I have thousands of lines of data in a text file I want to make easily searchable by turning it into something easier to search (I am hoping an XML or another type of large data structure, though I am not sure if it will be the best for what I have in mind).
The data looks like this for each line:
Book 31, Thomas,George, 32, 34, 154
(each book is not unique, they are indexes so book will have several different entries of whom is listed in it, and the numbers are the page they are listed)
So I am kinda of lost on how to do this, I would want to read the .txt file, trim out all the spaces and commas, I basically get how to prep the data for it, but how would I programmatically make that many elements and values in xml or populate some other large data structure?
Upvotes: 0
Views: 1203
Reputation: 11032
If you want the CSV file to make easily searchable by turning it into something easier to search, you can convert it to DataTable.
if you want data , you can use LINQ to XML to search
The following class generates both DataTable or Xml data format. You can pass delimeter ,includeHeader or use the default:
class CsvUtility
{
public DataTable Csv2DataTable(string fileName, bool includeHeader = false, char separator = ',')
{
IEnumerable<string> reader = File.ReadAllLines(fileName);
var data = new DataTable("Table");
var headers = reader.First().Split(separator);
if (includeHeader)
{
foreach (var header in headers)
{
data.Columns.Add(header.Trim());
}
reader = reader.Skip(1);
}
else
{
for (int index = 0; index < headers.Length; index++)
{
var header = "Field" + index; // headers[index];
data.Columns.Add(header);
}
}
foreach (var row in reader)
{
if (row != null) data.Rows.Add(row.Split(separator));
}
return data;
}
public string Csv2Xml(string fileName, bool includeHeader = false, char separator = ',')
{
var dt = Csv2DataTable(fileName, includeHeader, separator);
var stream = new StringWriter();
dt.WriteXml(stream);
return stream.ToString();
}
}
example to use:
CsvUtility csv = new CsvUtility();
var dt = csv.Csv2DataTable("f1.txt");
// Search for string in any column
DataRow[] filteredRows = dt.Select("Field1 LIKE '%" + "Thomas" + "%'");
//search in certain field
var filtered = dt.AsEnumerable().Where(r => r.Field<string>("Field1").Contains("Thomas"));
//generate xml
var xml= csv.Csv2Xml("f1.txt");
Console.WriteLine(xml);
/*
output of xml for your sample:
<DocumentElement>
<Table>
<Field0>Book 31</Field0>
<Field1> Thomas</Field1>
<Field2>George</Field2>
<Field3> 32</Field3>
<Field4> 34</Field4>
<Field5> 154</Field5>
</Table>
</DocumentElement>
*/
Upvotes: 1
Reputation: 17400
If your csv file does not change too much and the structure is stable, you could simply parse it to a list of objects at startup
private class BookInfo {
string title {get;set;}
string person {get;set;}
List<int> pages {get;set;}
}
private List<BookInfo> allbooks = new List<BookInfo>();
public void parse() {
var lines = File.ReadAllLines(filename); //you could also read the file line by line here to avoid reading the complete file into memory
foreach (var l in lines) {
var info = l.Split(',').Select(x=>x.Trim()).ToArray();
var b = new BookInfo {
title = info[0],
person = info[1]+", " + info[2],
pages = info.Skip(3).Select(x=> int.Parse(x)).ToList()
};
allbooks.Add(b);
}
}
Then you can easily search the allbooks
list with for instance LINQ.
EDIT
Now, that you have clarified your input, I adapted the parsing a little bit to better fit your needs.
If you want to search your booklist by either the title
or the person
more easily, you can also create a lookup on each of the properties
var titleLookup = allbooks.ToLookup(x=> x.title);
var personLookup = allbooks.ToLookup(x => x.person);
So personLookup["Thomas, George"]
will give you a list of all bookinfos that mention "Thomas, George" and titleLookup["Book 31"]
will give you a list of all bookinfos for "Book 31", ie all persons mentioned in that book.
Upvotes: 2