Reputation: 799
I have a tab delimited txt file with 500K records. I'm using the code below to read data to dataset. With 50K it works fine but 500K it gives "Exception of type 'System.OutOfMemoryException' was thrown."
What is the more efficient way to read large tab delimited data? Or how to resolve this issue? Please give me an example
public DataSet DataToDataSet(string fullpath, string file)
{
string sql = "SELECT * FROM " + file; // Read all the data
OleDbConnection connection = new OleDbConnection // Connection
("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fullpath + ";"
+ "Extended Properties=\"text;HDR=YES;FMT=Delimited\"");
OleDbDataAdapter ole = new OleDbDataAdapter(sql, connection); // Load the data into the adapter
DataSet dataset = new DataSet(); // To hold the data
ole.Fill(dataset); // Fill the dataset with the data from the adapter
connection.Close(); // Close the connection
connection.Dispose(); // Dispose of the connection
ole.Dispose(); // Get rid of the adapter
return dataset;
}
Upvotes: 5
Views: 14822
Reputation: 718
I found FileHelpers
The FileHelpers are a free and easy to use .NET library to import/export data from fixed length or delimited records in files, strings or streams.
Maybe it can help.
Upvotes: 0
Reputation: 18295
You really want to enumerate the source file and process each line at a time. I use the following
public static IEnumerable<string> EnumerateLines(this FileInfo file)
{
using (var stream = File.Open(file.FullName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (var reader = new StreamReader(stream))
{
string line;
while ((line = reader.ReadLine()) != null)
{
yield return line;
}
}
}
Then for each line you can split it using tabs and process each line at a time. This keeps memory down really low for the parsing, you only use memory if the application needs it.
Upvotes: 3
Reputation: 2317
Have you tried the TextReader?
using (TextReader tr = File.OpenText(YourFile))
{
string strLine = string.Empty;
string[] arrColumns = null;
while ((strLine = tr.ReadLine()) != null)
{
arrColumns = strLine .Split('\t');
// Start Fill Your DataSet or Whatever you wanna do with your data
}
tr.Close();
}
Upvotes: 0
Reputation: 499062
Use a stream approach with TextFieldParser
- this way you will not load the whole file into memory in one go.
Upvotes: 8