Reputation: 139
I have a file that I read into a console application. I want to find the unique entries in it, for example:
tom
tim
tim
tom
alan
I want to count the number of unique lines in the file (3 in the example above).
I'm using .NET Framework 3.5, so I can't use System.Linq. Any suggestions, other than upgrading to .NET 4?
Upvotes: 1
Views: 2314
Reputation: 43046
Iterate through the lines in the file.
Add each line to a HashSet<string>.
Return the Count property of the HashSet<string>.
Example:
int lineCount = new HashSet<string>(File.ReadAllLines(fileName)).Count;
EDIT: I originally had File.ReadLines, since that would use less memory if you have a lot of duplicate lines. That method was introduced with .NET 4, so it's not available under the stated requirements.
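If memory is a concern on .NET 3.5, one possible workaround is to read the file one line at a time with a StreamReader and add each line to the set, which is roughly what File.ReadLines would have given you. The following is only a sketch: the class name UniqueLineCounter and the method name CountUniqueLines are made up for illustration and are not part of the framework or of the original answer.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (name chosen for illustration only).
public static class UniqueLineCounter
{
    // Counts distinct lines from any TextReader, reading one line at a time
    // so that only the distinct lines are held in memory.
    public static int CountUniqueLines(TextReader reader)
    {
        // The default comparer for HashSet<string> uses ordinal, case-sensitive equality.
        HashSet<string> seen = new HashSet<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            seen.Add(line); // Add leaves the set unchanged if the line is already present
        }
        return seen.Count;
    }

    // Convenience overload that opens the file at the given path.
    public static int CountUniqueLines(string fileName)
    {
        using (StreamReader reader = new StreamReader(fileName))
        {
            return CountUniqueLines(reader);
        }
    }
}
```

Calling UniqueLineCounter.CountUniqueLines(fileName) should then give the same count as the one-liner above, assuming the file is read with the default encoding in both cases.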
EDIT2: System.Core.dll is part of framework 3.5, so you really ought to be able to find it somewhere (the GAC, perhaps?). If you can't, however, you could achieve your goal by loading the lines into a List<string>, sorting it, and then counting items only if they are not the same as the previous item (note that this fails if any item in the list is null):
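var list = new List<string>(File.ReadAllLines(fileName));
// Sort ordinally (needs using System;) so lines that are equal under Equals end up adjacent;
// the default Sort() is culture-sensitive and does not guarantee that.
list.Sort(StringComparer.Ordinal);
var counter = 0;
string previousItem = null;
foreach (var item in list)
{
    // Skip a line that repeats the one before it.
    if (item.Equals(previousItem))
        continue;
    counter++;
    previousItem = item;
}
return counter;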
Upvotes: 3
Reputation: 838256
It's fairly simple with the LINQ extension methods Distinct and Count:
int numberOfUniqueLines = File.ReadAllLines(filename).Distinct().Count();
Regarding this:
im using framework 3.5 so cant use system.linq any suggestions?
LINQ is available in .NET 3.5: reference System.Core.dll and add using System.Linq; to the file. However, if you are using .NET 2.0, you can use a dictionary instead:
Dictionary<string, object> uniqueLines = new Dictionary<string, object>();
foreach (string line in File.ReadAllLines(filename)) {
    // Only the keys matter; the value is a placeholder.
    uniqueLines[line] = null;
}
int numberOfUniqueLines = uniqueLines.Keys.Count;
Upvotes: 4