Reputation: 153
I'm not really a programmer as you'll see here, but it would be much appreciated to get some assistance to speed up this simple search:
I have some code that reads from a 10 megabyte text file and populates the relevant text to a textbox to help people at work search for part numbers. It works on a background worker and it is populating a textbox very slow and I was wondering how I could speed it up? Something like String.Join maybe?
using (System.IO.StreamReader file = new System.IO.StreamReader(@"T:\\PARTS\\DATABASE\\PARTS.txt"))
{
while ((line = file.ReadLine()) != null)
{
if ((backgroundWorker1.CancellationPending == true))
{
e.Cancel = true;
}
else if (line.Contains(partNumbersText.Text))
{
Action action = () => matchesText.Text += (line + Environment.NewLine).ToString();
matchesText.Invoke(action); // Or use BeginInvoke
}
}
}
Thank you for reading
Upvotes: 0
Views: 2377
Reputation: 3231
It is the looking through the entire file, and the contains
that is taking a long time, you should load the text up into objects that allow you to search on the part number, like a dictionary, but you did say that it will be too big, still you must be able to cache some of the numbers even doing something like this
//If there was a way to extract the parts number from each line I would do this
//but I don't know what the format is so I can't provide the code
//cache is a Dictionary>
if(!cache.ContainsKey(partsNumber.Text))
{
//then search through the file
cache.Add(partsNumber.Text,new List());
using (System.IO.StreamReader file = new System.IO.StreamReader(@"T:\\PARTS\\DATABASE\\PARTS.txt"))
{
while ((line = file.ReadLine()) != null)
{
if ((backgroundWorker1.CancellationPending == true))
{
e.Cancel = true;
}
else if (line.Contains(partNumbersText.Text))
{
cache[partNumbersText.Text].Add(line);
Action action = () => matchesText.Text += (line + Environment.NewLine).ToString();
matchesText.Invoke(action); // Or use BeginInvoke
}
}
}
}
else //this is where you will save time
{
foreach(var line in cache[partNumbersText.Text])
{
cache[partNumbersText.Text].Add(line);
Action action = () => matchesText.Text += (line + Environment.NewLine).ToString();
matchesText.Invoke(action); // Or use BeginInvoke
}
}
This isn't going to speed you up that much there are a few ways you could make your program a lot faster, the one that would make the most difference would be to make an index into the file you are searching.
Keep track of where the part number is in the file, this isn't a quick fix. What you would do is keep the locations of the lines with associated part numbers in a separate file.
Upvotes: 0
Reputation: 415690
Don't update the textbox every time you get a result. Use a StringBuilder to build your results object, and update the textbox only every so often. It's also a good idea to use the ReportProgress mechanism, like this:
using (System.IO.StreamReader file = new System.IO.StreamReader(@"T:\\PARTS\\DATABASE\\PARTS.txt"))
{
var results = new StringBuilder();
var nextUpdate = DateTime.Now.AddMilliseconds(500);
while ((line = file.ReadLine()) != null)
{
if ((backgroundWorker1.CancellationPending == true))
{
e.Cancel = true;
break;
}
if (line.Contains(partNumbersText.Text))
{
results.AppendLine(line);
}
if (DateTime.Now > nextUpdate)
{
nextUpdate = DateTime.Now.AddMilliseconds(500);
backgroundWorker1.ReportProgress(0, results.ToString());
//move this code to the ProgressChanged event
//matchesText.Invoke(() => matchesText.Text = results.ToString()); // Or use
}
}
}
Additionally, that .Contains() check on 10Mb worth of disk data sounds expensive. You can speed it up some by keeping the file loaded in memory. 10Mb is nothing on a modern system, and as long as you're careful to keep from re-loading that data in a way that creates multiple entries on the .Net Large Object Heap, this will be by far the way to go.
Upvotes: 0
Reputation: 26058
If it's a big file you are going to want to use a StringBuilder
rather than concatenation because strings are immutable under the covers so concatenation over and over becomes very expensive. Try something like this:
using (System.IO.StreamReader file = new System.IO.StreamReader(@"T:\\PARTS\\DATABASE\\PARTS.txt"))
{
StringBuilder strBlder = new StringBuilder();
while ((line = file.ReadLine()) != null)
{
if ((backgroundWorker1.CancellationPending == true))
{
e.Cancel = true;
}
else if (line.Contains(partNumbersText.Text))
{
strBlder.Append(line + Environment.NewLine);
}
}
Action action = () => matchesText.Text = strBlder.ToString()
matchesText.Invoke(action);
}
@Jim's comment, if you want to display the text as it comes you could print it out every x number of entries, so it gains some speed, but doesn't have to read the entire file before seeing anything:
const int ITERATIONS_PER_UI_UPDATE = 20;
using (System.IO.StreamReader file = new System.IO.StreamReader(@"T:\\PARTS\\DATABASE\\PARTS.txt"))
{
int count = 0;
StringBuilder strBlder = new StringBuilder();
while ((line = file.ReadLine()) != null)
{
if ((backgroundWorker1.CancellationPending == true))
{
e.Cancel = true;
}
else if (line.Contains(partNumbersText.Text))
{
strBlder.Append(line + Environment.NewLine);
}
count++;
if ((count % ITERATIONS_PER_UI_UPDATE) == 0))
{
Action action = () => matchesText.Text = strBlder.ToString()
matchesText.Invoke(action);
}
}
Action action = () => matchesText.Text = strBlder.ToString()
matchesText.Invoke(action);
}
Upvotes: 3
Reputation: 44038
change this:
matchesText.Invoke(action);
to this:
matchesText.BeginInvoke(action); //Not sure about the winforms syntax for this.
because the first one will have your Backgroundworker needlessly waiting for the UI to refresh, whereas the second one will not.
Upvotes: 1