Reputation: 1377
I came across IAsyncEnumerable while testing C# 8.0 features. I found remarkable examples from Anthony Chu (https://anthonychu.ca/post/async-streams-dotnet-core-3-iasyncenumerable/). It is an async stream and a replacement for Task<IEnumerable<T>>:
// Data Access Layer.
public async IAsyncEnumerable<Product> GetAllProducts()
{
    Container container = cosmosClient.GetContainer(DatabaseId, ContainerId);
    var iterator = container.GetItemQueryIterator<Product>("SELECT * FROM c");
    while (iterator.HasMoreResults)
    {
        foreach (var product in await iterator.ReadNextAsync())
        {
            yield return product;
        }
    }
}
// Usage
await foreach (var product in productsRepository.GetAllProducts())
{
    Console.WriteLine(product);
}
I am wondering whether this can be applied to reading text files, as in the usage below, which reads a file line by line:
foreach (var line in File.ReadLines("Filename"))
{
// ...process line.
}
I really want to know how to apply async with IAsyncEnumerable<string> to the above foreach loop so that it streams while reading.
How do I implement an iterator so that I can use yield return to read line by line?
Upvotes: 8
Views: 5647
Reputation: 43845
I did some performance tests, and it seems that a large bufferSize is helpful, together with the FileOptions.SequentialScan option.
public static async IAsyncEnumerable<string> ReadLinesAsync(string filePath)
{
    using var stream = new FileStream(filePath, FileMode.Open, FileAccess.Read,
        FileShare.Read, 32768, FileOptions.Asynchronous | FileOptions.SequentialScan);
    using var reader = new StreamReader(stream);
    while (true)
    {
        var line = await reader.ReadLineAsync().ConfigureAwait(false);
        if (line == null) break;
        yield return line;
    }
}
The enumeration is not truly asynchronous, though. According to my experiments (.NET Core 3.1), the xxxAsync methods of the StreamReader class block the current thread for longer than the awaiting period of the Task they return. For example, reading a 6 MB file with the ReadToEndAsync method on my PC blocks the current thread for 120 msec before returning the task, and the task then completes in just 20 msec. So I am not sure there is much value in using these methods. Faking asynchrony is much easier by using the synchronous APIs together with the System.Linq.Async package:
IAsyncEnumerable<string> lines = File.ReadLines("SomeFile.txt").ToAsyncEnumerable();
.NET 6 update: The implementation of the asynchronous filesystem APIs has been improved in .NET 6. For experimental data with the File.ReadAllLinesAsync method, see here.
Upvotes: 3
Reputation: 81563
Exactly the same way. However, File.ReadLines has no async workload, so let's pretend there is one:
public async IAsyncEnumerable<string> SomeSortOfAwesomeness()
{
    foreach (var line in File.ReadLines("Filename.txt"))
    {
        // simulates an async workload,
        // otherwise why would we be using IAsyncEnumerable?
        // -- added due to popular demand
        await Task.Delay(100);
        yield return line;
    }
}
Or use the version below. This is just a wrapped APM workload; see Stephen Cleary's comments for clarification:
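public static async IAsyncEnumerable<string> SomeSortOfAwesomeness()
{
    using StreamReader reader = File.OpenText("Filename.txt");
    // Loop on the null result of ReadLineAsync instead of checking
    // reader.EndOfStream, which may perform a synchronous read.
    string line;
    while ((line = await reader.ReadLineAsync()) != null)
        yield return line;
}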
Usage:
await foreach (var line in SomeSortOfAwesomeness())
{
    Console.WriteLine(line);
}
Update from Stephen Cleary:
File.OpenText sadly only allows synchronous I/O; the async APIs are implemented poorly in that scenario. To open a true asynchronous file, you'd need to use a FileStream constructor passing isAsync: true or FileOptions.Asynchronous.
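One way to observe the difference described above is to inspect the FileStream.IsAsync property of each stream. The following is only a minimal sketch, not code from the comment: the class and method names are hypothetical, it assumes that the BaseStream of the reader returned by File.OpenText is a FileStream, and it uses the useAsync parameter of the FileStream constructor (the parameter the comment refers to as isAsync).

```csharp
using System;
using System.IO;

// Hypothetical helper class, introduced only for illustration.
public static class AsyncFileCheck
{
    // Reports whether the stream behind File.OpenText was opened for async I/O.
    // Assumption: BaseStream is a FileStream; if it is not, this returns false.
    public static bool OpenTextIsAsync(string path)
    {
        using StreamReader reader = File.OpenText(path);
        return reader.BaseStream is FileStream fs && fs.IsAsync;
    }

    // Opens the same file with useAsync: true, which corresponds to
    // passing FileOptions.Asynchronous, and reports IsAsync.
    public static bool ExplicitIsAsync(string path)
    {
        using var stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read,
            bufferSize: 4096, useAsync: true);
        return stream.IsAsync;
    }
}
```

Comparing the two results for the same file should show whether a given way of opening it requested asynchronous I/O. IsAsync only reflects how the stream was opened; it does not measure how long a call blocks.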
ReadLineAsync basically results in this code. As you can see, it's only the Stream APM Begin and End methods wrapped:
private Task<Int32> BeginEndReadAsync(Byte[] buffer, Int32 offset, Int32 count)
{
    return TaskFactory<Int32>.FromAsyncTrim(
        this, new ReadWriteParameters { Buffer = buffer, Offset = offset, Count = count },
        (stream, args, callback, state) => stream.BeginRead(args.Buffer, args.Offset, args.Count, callback, state), // cached by compiler
        (stream, asyncResult) => stream.EndRead(asyncResult)); // cached by compiler
}
Upvotes: 8