Ken'ichi Matsuyama

Reputation: 369

Making LINQ to a dictionary into a method body

I am using two almost identical calls (the only difference is that one calls .Distinct() and the other doesn't). Is it possible to combine them into one method that I can call and change in one place?

private void Splitter1(string[] file)
{               
    tempDict = file
        .SelectMany(i => File.ReadAllLines(i)
        .SelectMany(line => line.Split(new[] { ' ', ',', '.', '?', '!', '{', '[', '(', '}', ']', ')',
        '<', '>', '-', '=', '/', '"', ';', ':', '+', '_', '*' }, StringSplitOptions.RemoveEmptyEntries))                    
        .AsParallel()
        .Select(word => word.ToLower()) 
        .Where(word => !StopWords.Contains(word))
        .Where(word => !PopulatNetworkWords.Contains(word)) 
        .Where(word => !word.All(char.IsDigit))
        .Distinct())
        .GroupBy(word => word)                    
        .ToDictionary(g => g.Key, g => g.Count());
}
private void Splitter2(string[] file)
{               
    tempDict = file
        .SelectMany(i => File.ReadAllLines(i)
        .SelectMany(line => line.Split(new[] { ' ', ',', '.', '?', '!', '{', '[', '(', '}', ']', ')',
        '<', '>', '-', '=', '/', '"', ';', ':', '+', '_', '*' }, StringSplitOptions.RemoveEmptyEntries)))                    
        .AsParallel()
        .Select(word => word.ToLower()) 
        .Where(word => !StopWords.Contains(word))
        .Where(word => !PopulatNetworkWords.Contains(word)) 
        .Where(word => !word.All(char.IsDigit))
        .GroupBy(word => word)                    
        .ToDictionary(g => g.Key, g => g.Count());
}

Upvotes: 0

Views: 105

Answers (3)

Jon Hanna

Reputation: 113242

Since the difference between the two is whether or not Distinct() is called, and since Distinct() both works on and returns an IEnumerable<T> (or works on and returns an IQueryable<T>), you can first create the appropriate IEnumerable<T>, then decide whether or not to replace it with the result of calling Distinct(), and then continue:

private void Splitter(string[] file, bool distinct)
{
  tempDict = file
    .SelectMany(i =>
    {
      IEnumerable<string> query = File.ReadAllLines(i)
        .SelectMany(line => line.Split(new[] { ' ', ',', '.', '?', '!', '{', '[', '(', '}', ']', ')',
        '<', '>', '-', '=', '/', '"', ';', ':', '+', '_', '*' }, StringSplitOptions.RemoveEmptyEntries))
        .AsParallel()
        .Select(word => word.ToLower())
        .Where(word => !StopWords.Contains(word))
        .Where(word => !PopulatNetworkWords.Contains(word))
        .Where(word => !word.All(char.IsDigit));
      // Apply Distinct() per file, as Splitter1 in the question does.
      if (distinct)
        query = query.Distinct();
      return query;
    })
    .GroupBy(word => word)
    .ToDictionary(g => g.Key, g => g.Count());
}

(Incidentally, you might find that the newer ReadLines works better than ReadAllLines, especially with large files. ReadAllLines reads all the lines into memory immediately, rather than just reading them as you use them, so it wastes a lot of memory and delays processing.)
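As a rough illustration of that difference, here is a minimal sketch of lazy reading with File.ReadLines. The class ReadLinesDemo and the helper FirstLineContaining are hypothetical names made up for this example; they are not part of the question's code.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ReadLinesDemo
{
    // Hypothetical helper: returns the first line containing `needle`, or null if none does.
    public static string FirstLineContaining(string path, string needle)
    {
        // File.ReadLines yields lines lazily, so reading stops at the first match
        // instead of loading the whole file as File.ReadAllLines would.
        return File.ReadLines(path).FirstOrDefault(line => line.Contains(needle));
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllLines(path, new[] { "alpha", "beta", "gamma" });
            Console.WriteLine(FirstLineContaining(path, "et")); // prints "beta"
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

In the question's pipeline, the same swap would mean replacing File.ReadAllLines(i) with File.ReadLines(i); whether it helps noticeably depends on the file sizes involved.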

Upvotes: 2

BJ Safdie

Reputation: 3419

Since LINQ defers execution, you can build the clauses in separate statements.

    private void Splitter1(string[] file, bool distinct)
    {
        tempDict = file
            .SelectMany(i =>
            {
                var query = File.ReadAllLines(i)
                    .SelectMany(line => line.Split(new[] { ' ', ',', '.', '?', '!', '{', '[', '(', '}', ']', ')',
                    '<', '>', '-', '=', '/', '"', ';', ':', '+', '_', '*' }, StringSplitOptions.RemoveEmptyEntries))
                    .AsParallel()
                    .Select(word => word.ToLower())
                    .Where(word => !StopWords.Contains(word))
                    .Where(word => !PopulatNetworkWords.Contains(word))
                    .Where(word => !word.All(char.IsDigit));

                // Nothing has executed yet, so the per-file query can still be extended.
                if (distinct)
                {
                    query = query.Distinct();
                }
                return query;
            })
            .GroupBy(word => word)
            // Assign the result; the original discarded it.
            .ToDictionary(g => g.Key, g => g.Count());
    }

I did not test this code, so you may need to adjust it. However, the basic idea is that deferred execution allows you to vary your query based on logic.
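To see the deferred-execution idea in isolation, here is a small self-contained sketch. The class DeferredDemo and the method CountWords are hypothetical names used only for illustration, and the input is an in-memory array rather than files.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical helper: counts words, optionally de-duplicating them first.
    public static Dictionary<string, int> CountWords(IEnumerable<string> words, bool distinct)
    {
        // Nothing runs here: Select only describes the work to be done.
        IEnumerable<string> query = words.Select(w => w.ToLowerInvariant());

        // Still nothing runs: the query is just extended with another step.
        if (distinct)
            query = query.Distinct();

        // ToDictionary enumerates the query, so execution happens here.
        return query
            .GroupBy(w => w)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        var words = new[] { "LINQ", "linq", "query" };
        Console.WriteLine(CountWords(words, distinct: false)["linq"]); // prints 2
        Console.WriteLine(CountWords(words, distinct: true)["linq"]);  // prints 1
    }
}
```

Note that applying Distinct() to the whole sequence makes every count 1, which is why the question's Splitter1 applies it per file, before the results from all files are grouped.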

Upvotes: 0

ferdinand

Reputation: 1020

Why not something like this:

    private void Splitter1(string[] file, bool useDistinct = false)
    {
        tempDict = file
            .SelectMany(i =>
            {
                var words = File.ReadAllLines(i)
                    .SelectMany(line => line.Split(new[] { ' ', ',', '.', '?', '!', '{', '[', '(', '}', ']', ')',
                    '<', '>', '-', '=', '/', '"', ';', ':', '+', '_', '*' }, StringSplitOptions.RemoveEmptyEntries))
                    .AsParallel()
                    .Select(word => word.ToLower())
                    .Where(word => !StopWords.Contains(word))
                    .Where(word => !PopulatNetworkWords.Contains(word))
                    .Where(word => !word.All(char.IsDigit));
                // Distinct() must be applied to the sequence of words, not to each word.
                return useDistinct ? words.Distinct() : words;
            })
            .GroupBy(word => word)
            .ToDictionary(g => g.Key, g => g.Count());
    }

Upvotes: 1
