Reputation: 1638
I wrote a simple program. Here's what it looks like, with some details hidden:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace routeaccounts
{
    class Program
    {
        static void Main(string[] args)
        {
            //Draw lines from source file
            var lines = File.ReadAllLines("accounts.txt").Select(p => p.Split('\t'));
            //Convert lines into accounts
            var accounts = lines.Select(p => new Account(p[0], p[1], p[2], p[3]));
            //Submit accounts to router
            var results = accounts.Select(p => RouteAccount(p));
            //Write results list to target file
            WriteResults("results.txt", results);
        }

        private static void WriteResults(string filename, IEnumerable<Result> results)
        {
            ... disk write call ...
        }

        private static Result RouteAccount(Account account)
        {
            ... service call ...
        }
    }
}
My question is this: obviously, when selecting from a data context, execution is deferred. In the first statement of the 'Main' function, I'm querying from File.ReadAllLines("accounts.txt"). Is this a bad choice? If I enumerate the final result, will this statement be executed repeatedly?
I can simply .ToArray() or grab the results ahead of time, if I know it's a problem, but I'm interested to know what's going on behind the scenes.
Upvotes: 2
Views: 439
Reputation: 110101
//File is read now, but split later.
var lines = File.ReadAllLines("accounts.txt").Select(p => p.Split('\t'));
//Accounts are new'd up later.
var accounts = lines.Select(p => new Account(p[0], p[1], p[2], p[3]));
//Accounts are Routed later.
var results = accounts.Select(p => RouteAccount(p));
//Write results list to target file
WriteResults("results.txt", results);
private static void WriteResults(string filename, IEnumerable<Result> results)
{
    //file is split, accounts are new'd up and routed by enumerating results
    List<Result> items = results.ToList();
}
Upvotes: 0
Reputation: 1500345
It's not going to read the file repeatedly, no, because that part of execution isn't deferred. File.ReadAllLines will return an array, and then the call to Select will return you a sequence... the projection will be deferred, but the reading of the file won't. That array will stay in memory until everything referring to it (directly or indirectly) is eligible for garbage collection... it won't need to reread the file.
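To see the difference between the eager read and the deferred projection, here is a minimal sketch. The class name DeferredDemo, the temp file, and the splitCalls counter are my own illustration, not part of the original program. It deletes the file right after ReadAllLines returns and then enumerates twice.

```csharp
using System.IO;
using System.Linq;

public static class DeferredDemo
{
    // Returns how many times the Split projection ran across two enumerations.
    public static int CountProjectionRuns()
    {
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "a\t1", "b\t2" });
        int splitCalls = 0;

        // ReadAllLines runs right here and returns a string[];
        // only the Select projection is deferred.
        var lines = File.ReadAllLines(path)
            .Select(p => { splitCalls++; return p.Split('\t'); });

        // The file is gone, yet enumeration below still works,
        // because the query reads from the in-memory array.
        File.Delete(path);

        lines.ToList(); // runs the projection once per line
        lines.ToList(); // runs it again: the projection is repeated, the file read is not

        return splitCalls;
    }
}
```

With two input lines, the counter ends at 4: the projection is re-run on each enumeration, while the disk is touched only once.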
On the other hand, you may want to read the results using ToList() or something similar anyway, because that way you get to find out any errors before you start to write the results. It's quite often a good idea to make sure you've got all the data you need before you start executing code with side effects (which I imagine WriteResults does). Obviously it's less efficient in terms of the amount of data needed in memory at a time though... it's a balance you'll have to weigh up yourself.
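As a rough sketch of the "materialize first, then write" idea: Route below is a hypothetical stand-in for RouteAccount, and the sink list stands in for the output file. Neither is from the original code. The point is only that ToList() forces every routing call to finish before the first side effect happens.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeFirst
{
    // Hypothetical stand-in for RouteAccount: fails on an empty account name.
    public static string Route(string account)
    {
        if (string.IsNullOrEmpty(account))
            throw new ArgumentException("empty account");
        return account.ToUpperInvariant();
    }

    // Routes everything up front; the sink (standing in for the results file)
    // is only touched if every account routed successfully.
    public static bool TryRouteAll(IEnumerable<string> accounts, List<string> sink)
    {
        List<string> results;
        try
        {
            // ToList() forces the deferred Select to run to completion here.
            results = accounts.Select(a => Route(a)).ToList();
        }
        catch (ArgumentException)
        {
            return false; // nothing has been written yet
        }

        sink.AddRange(results);
        return true;
    }
}
```

If the sequence were instead enumerated inside the write loop, a failure on the last account would leave a partially written output.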
Upvotes: 3
Reputation: 241631
Better to use File.ReadLines in .NET 4.0 to get lazy reading of the file too. As it is right now, the reading of the file is not deferred and will read the whole file into memory when File.ReadAllLines returns. This will only happen once.
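One thing to keep in mind if you switch: with File.ReadLines the lines are pulled from disk as you enumerate, so enumerating the sequence twice reads the file twice. A small sketch of that, assuming a temp file I make up for the demo (the class and method names are mine):

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LazyReadDemo
{
    // Enumerates the same File.ReadLines sequence twice, changing the file in between.
    public static int[] CountLinesAcrossTwoPasses()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllLines(path, new[] { "a\t1", "b\t2" });

            // No lines are loaded into memory yet; they are read during enumeration.
            IEnumerable<string> lines = File.ReadLines(path);

            int firstPass = lines.Count();

            // Rewrite the file after the first pass has finished.
            File.WriteAllLines(path, new[] { "a\t1", "b\t2", "c\t3" });

            // A second enumeration goes back to the disk and sees the new content.
            int secondPass = lines.Count();

            return new[] { firstPass, secondPass };
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

So with ReadLines, the .ToArray() or ToList() the question mentions becomes the way to guarantee a single pass over the file.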
Upvotes: 4