ZZZ

Reputation: 2812

Task fired again after WaitAll

Using HttpClient.GetAsync, or any other BCL async method, inside a LINQ Select can result in the request being fired twice.

Here is a unit test case:

[TestMethod]
public void TestTwiceShoot()
{
    List<string> items = new List<string>();
    items.Add("1");
    int k = 0;

    var tasks = items.Select(d =>
    {
        k++;
        var client = new System.Net.Http.HttpClient();
        return client.GetAsync(new Uri("http://testdevserver.ibs.local:8020/prestashop/api/products/1"));
    });

    Task.WaitAll(tasks.ToArray());

    foreach (var r in tasks)
    {

    }

    Assert.AreEqual(1, k);           
}

The test fails because k is 2. Somehow the program runs the delegate that fires GetAsync twice. Why?

If I remove foreach (var r in tasks), the test passes. Why?

[TestMethod]
public void TestTwiceShoot()
{
    List<string> items = new List<string>();
    items.Add("1");
    int k = 0;

    var tasks = items.Select(d =>
    {
        k++;
        var client = new System.Net.Http.HttpClient();
        return client.GetAsync(new Uri("http://testdevserver.ibs.local:8020/prestashop/api/products/1"));
    });

    Task.WaitAll(tasks.ToArray());

    Assert.AreEqual(1, k);

}

If I use foreach instead of items.Select, the test passes. Why?

[TestMethod]
public void TestTwiceShoot()
{
    List<string> items = new List<string>();
    items.Add("1");
    int k = 0;

    var tasks = new List<Task<System.Net.Http.HttpResponseMessage>>();
    foreach (var item in items)
    {
        k++;
        var client = new System.Net.Http.HttpClient();
        tasks.Add( client.GetAsync(new Uri("http://testdevserver.ibs.local:8020/prestashop/api/products/1")));
    };

    Task.WaitAll(tasks.ToArray());

    foreach (var r in tasks)
    {

    }

    Assert.AreEqual(1, k);

}

Apparently the enumerator returned by items.Select does not get along with the returned Task objects: as soon as I walk the enumerator, the delegate is fired again.

This test passes:

[TestMethod]
public void TestTwiceShoot()
{
    List<string> items = new List<string>();
    items.Add("1");
    int k = 0;

    var tasks = items.Select(d =>
    {
        k++;
        var client = new System.Net.Http.HttpClient();
        return client.GetAsync(new Uri("http://testdevserver.ibs.local:8020/prestashop/api/products/1"));

    });


    var tasksArray = tasks.ToArray();
    Task.WaitAll(tasksArray);

    foreach (var r in tasksArray)
    {

    }

    Assert.AreEqual(1, k);

}

Scott mentioned that the Select may run again when walking the enumerator; however, this test passes:

[TestMethod]
public void TestTwiceShoot()
{
    List<string> items = new List<string>();
    items.Add("1");
    int k = 0;

    var tasks = items.Select(d =>
    {
        k++;
        return int.Parse(d);

    });

    foreach (var r in tasks)
    {

    };

    Assert.AreEqual(1, k);

}

I guess the LINQ Select has some special treatment for Task.

After all, what is a good way of firing multiple async methods in LINQ and then examining the results after WaitAll?

Upvotes: 0

Views: 407

Answers (2)

ZZZ

Reputation: 2812

I think the problem is my misconception about how enumeration works. These tests pass:

    [TestMethod]
    public void TestTwiceShoot()
    {
        List<string> items = new List<string>();
        items.Add("1");
        int k = 0;

        var tasks = items.Select(d =>
        {
            k++;
            return int.Parse(d);

        });

        foreach (var r in tasks)
        {

        };

        foreach (var r in tasks)
        {

        };

        Assert.AreEqual(2, k);

    }

    [TestMethod]
    public void TestTwiceShoot2()
    {
        List<string> items = new List<string>();
        items.Add("1");
        int k = 0;

        var tasks = items.Where(d =>
        {
            k++;
            return true;

        });

        foreach (var r in tasks)
        {

        };

        foreach (var r in tasks)
        {

        };

        Assert.AreEqual(2, k);

    }

I had thought the LINQ statement returns an IEnumerable object that stores the results of the delegate. However, it evidently stores only a reference to the delegate (execution is deferred), so each walk of the enumerator triggers the delegate again. Therefore, it is good to use ToArray() or ToList() to get a list of results, like this one:

    [TestMethod]
    public void TestTwiceShoot2()
    {
        List<string> items = new List<string>();
        items.Add("1");
        int k = 0;

        var tasks = items.Where(d =>
        {
            k++;
            return true;

        }).ToList();

        foreach (var r in tasks)
        {

        };

        foreach (var r in tasks)
        {

        };

        Assert.AreEqual(1, k);

    }
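
The difference between the lazy query and the materialized list can be condensed into a minimal sketch with no test framework involved. The class and method names (DeferredExecutionDemo, CountLazy, CountMaterialized) are made up for illustration and are not part of the original tests:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class DeferredExecutionDemo
{
    // Enumerates a lazy Select query twice and returns how often the delegate ran.
    public static int CountLazy()
    {
        var items = new List<string> { "1" };
        int k = 0;

        // Nothing runs here: Select only records the delegate.
        IEnumerable<int> query = items.Select(d => { k++; return int.Parse(d); });

        foreach (var unused in query) { }   // first walk runs the delegate
        foreach (var unused in query) { }   // second walk runs it again

        return k;
    }

    // Same query, but materialized once with ToList() before being enumerated twice.
    public static int CountMaterialized()
    {
        var items = new List<string> { "1" };
        int k = 0;

        // ToList() walks the query exactly once and stores the results.
        List<int> results = items.Select(d => { k++; return int.Parse(d); }).ToList();

        foreach (var unused in results) { } // walks the stored list, not the query
        foreach (var unused in results) { }

        return k;
    }
}
```

With a single item in the list, CountLazy() should return 2 and CountMaterialized() should return 1, which matches the assertions in the tests above.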

Upvotes: 0

Scott Chamberlain

Reputation: 127563

It is because tasks is an IEnumerable<Task>, and each time you enumerate it, the .Select() operation is re-run. Currently you enumerate it twice: once when you call .ToArray() and once when you pass it to the foreach.

To fix the problem, just use .ToArray() as you already do, but move it earlier.

    var tasks = items.Select(d =>
    {
        k++;
        var client = new System.Net.Http.HttpClient();
        return client.GetAsync(new Uri("http://testdevserver.ibs.local:8020/prestashop/api/products/1"));

    }).ToArray(); // This makes tasks a "Task[]" instead of an IEnumerable<Task>.

    Task.WaitAll(tasks);

    foreach (var r in tasks)
    {

    };

Things like what happened to you are why Microsoft recommends that LINQ statements have no side effects (like incrementing k): it is hard to tell how many times the statement will be run, especially if the resulting IEnumerable<T> leaves your control by being returned as a result or passed to another function.
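
For the last part of the question (firing several async calls from LINQ and examining the results after WaitAll), here is a minimal sketch of the same pattern that runs without a network. FakeRequestAsync is a hypothetical stand-in for client.GetAsync, and the class name MaterializedTasksDemo is likewise made up for illustration:

```csharp
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class MaterializedTasksDemo
{
    // Assumed stand-in for client.GetAsync: completes after a short delay.
    private static async Task<int> FakeRequestAsync(string item)
    {
        await Task.Delay(10);
        return int.Parse(item);
    }

    // Starts one task per item, waits for all of them, then reads the results.
    // Returns how many tasks were started and the sum of their results.
    public static (int Started, int Sum) Run()
    {
        var items = new[] { "1", "2", "3" };
        int started = 0;

        // ToArray() enumerates the query once, so each task is started exactly once.
        Task<int>[] tasks = items.Select(d =>
        {
            started++;
            return FakeRequestAsync(d);
        }).ToArray();

        Task.WaitAll(tasks);

        // Enumerating the array does not re-run the Select delegate.
        int sum = tasks.Sum(t => t.Result);

        return (started, sum);
    }
}
```

Calling MaterializedTasksDemo.Run() should report 3 started tasks and a sum of 6. In an async method, awaiting Task.WhenAll(tasks) instead of blocking on Task.WaitAll would be the non-blocking variant of the same idea; it yields the results as an array.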

Upvotes: 5
