alexx

Reputation: 133

StackExchange.Redis: some keys have been lost while using async to insert/read data

I'm sure we're missing something very important here, so hopefully someone can point me in the right direction. Thank you in advance :)

The issue we currently experience: sometimes an asynchronous read operation does not return a hash value from the db that was written by an async operation. For example, one run can return 600 keys, the next run 598, the next one 596, and so on. We also experience the same issue with short sets (when we have up to 10 keys in a set and read 10 hash objects in the batch): sometimes we get 8 objects, sometimes 6, and once we got only 2. The issue affects async methods in about 30-40% of our operations; migrating to synchronous operations solved some of the cases, but we lost performance.

Example of our create/read batch operations

    protected void CreateBatch(Func<IBatch, List<Task>> action)
    {
        IBatch batch = Database.CreateBatch();

        List<Task> tasks = action(batch);

        batch.Execute();

        Task.WaitAll(tasks.ToArray());
    }

    protected IEnumerable<T> GetBatch<T, TRedis>(
        IEnumerable<RedisKey> keys, 
        Func<IBatch, RedisKey, Task<TRedis>> invokeBatchOperation, 
        Func<TRedis, T> buildResultItem)
    {
        IBatch batch = Database.CreateBatch();
        List<RedisKey> keyList = keys.ToList();
        List<Task> tasks = new List<Task>(keyList.Count);
        List<T> result = new List<T>(keyList.Count);

        foreach (RedisKey key in keyList)
        {
            Task task = invokeBatchOperation(batch, key).ContinueWith(
                t =>
                    {
                        T item = buildResultItem(t.Result);
                        result.Add(item);
                    });

            tasks.Add(task);
        }

        batch.Execute();
        Task.WaitAll(tasks.ToArray());

        return result;
    }

We use write operations in the following way:

private void CreateIncrementBatch(IEnumerable<DynamicDTO> dynamicDtos)
    {
        CreateBatch(
            batch =>
                {
                    List<Task> tasks = new List<Task>();

                    foreach (DynamicDTO dynamicDto in dynamicDtos)
                    {
                        string dynamicKey = KeysBuilders.Live.Dynamic.BuildDetailsKeyByIdAndVersion(
                            dynamicDto.Id, 
                            dynamicDto.Version);
                        HashEntry[] dynamicFields = _dtoMapper.MapDynamicToHashEntries(dynamicDto);

                        Task task = batch.HashSetAsync(dynamicKey, dynamicFields, CommandFlags.HighPriority);
                        tasks.Add(task);
                    }

                    return tasks;
                });
    }

We read data as a batch using the following code sample:

    IEnumerable<RedisKey> userKeys =
        GetIdsByUserId(userId).Select(x => (RedisKey) KeysBuilders.Live.Dynamic.BuildDetailsKeyByUserId(x));

    return GetBatch(userKeys, (batch, key) => batch.HashGetAllAsync(key), _dtoMapper.MapToDynamic);

We know that batch.Execute is not a synchronous (and not a truly asynchronous) operation; at the same time we need to check the status of each operation later. We plan to make many more read/write operations against the Redis server, but given this issue we're not sure whether we're on the right path.
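
For reference, here is a minimal sketch (not part of the original post) of how the per-operation status could be inspected after the batch has been flushed, reusing the same shape as the CreateBatch helper above; the method name CreateBatchChecked is hypothetical:

    // Hypothetical helper (not from the original post): same shape as CreateBatch above,
    // but it inspects the status of every queued command after the batch is flushed.
    protected void CreateBatchChecked(Func<IBatch, List<Task>> action)
    {
        IBatch batch = Database.CreateBatch();
        List<Task> tasks = action(batch);

        batch.Execute();                    // sends the queued commands to the server

        try
        {
            Task.WaitAll(tasks.ToArray());  // blocks until every command has a reply
        }
        catch (AggregateException)
        {
            // failures are examined per task below instead of being rethrown here
        }

        foreach (Task task in tasks)
        {
            if (task.IsFaulted && task.Exception != null)
            {
                // each faulted task carries the exception of its own Redis command
                Console.WriteLine(task.Exception.GetBaseException().Message);
            }
        }
    }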

Any advice/samples and pointers in the right direction are highly appreciated!

Some additional info: we're using the StackExchange.Redis client (latest stable version: 1.0.481) in an ASP.NET MVC app/worker role (.NET 4.5) to connect to and work with an Azure Redis Cache (C1, Standard). At the moment we have about 100,000 keys in the database during a small test flow, mostly hashes (following the recommendations on redis.io: each key stores up to 10 fields for different objects, and no big data or text fields are stored in a hash) and sets (mostly mappings; the biggest one can map up to 10,000 keys to the parent).

We have about 20 small writers to the cache (each writer instance writes its own subset of data and does not overlap with another; the number of keys to write per operation is up to 100 hashes). We also have one "big man" worker that makes some calculations based on the current Redis state and stores the data back into the Redis server (up to 1,200 keys to read/write for the first request, and then it works with 10,000+ keys to store and calculate). While the big man works, nobody reads or writes this exact keyspace, although the small writers continue to write some keys constantly. At the same time we have many small readers (up to 100,000) who can request their specific chunk of data (based on mappings and joins of 2 hash entities); the number of hash entities returned to a reader is about 100-500 records. Due to some restrictions in the domain model we try to store/read keys as batch operations (the biggest/longest batch can have up to 500-1,000 reads/writes of hash fields into the cache). We do not use transactions at the moment.

Upvotes: 6

Views: 2466

Answers (1)

Pavlo Denys

Reputation: 81

Maybe, instead of

    List<T> result = new List<T>(keyList.Count);

you can use something like this?

    ConcurrentBag<T> result = new ConcurrentBag<T>();

ConcurrentBag represents a thread-safe, unordered collection of objects.
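
For context, here is a minimal sketch of how the question's GetBatch could look with that swap (assuming the same Database property and delegates as in the question, plus using System.Collections.Concurrent); ConcurrentBag<T>.Add is safe to call from multiple continuations at once, unlike List<T>.Add:

    // Sketch only: the question's GetBatch with List<T> swapped for a ConcurrentBag<T>.
    // Requires: using System.Collections.Concurrent;
    protected IEnumerable<T> GetBatch<T, TRedis>(
        IEnumerable<RedisKey> keys,
        Func<IBatch, RedisKey, Task<TRedis>> invokeBatchOperation,
        Func<TRedis, T> buildResultItem)
    {
        IBatch batch = Database.CreateBatch();
        List<RedisKey> keyList = keys.ToList();
        List<Task> tasks = new List<Task>(keyList.Count);

        // thread-safe: continuations may run concurrently on different threads
        ConcurrentBag<T> result = new ConcurrentBag<T>();

        foreach (RedisKey key in keyList)
        {
            Task task = invokeBatchOperation(batch, key).ContinueWith(
                t => result.Add(buildResultItem(t.Result)));

            tasks.Add(task);
        }

        batch.Execute();
        Task.WaitAll(tasks.ToArray());

        return result;
    }

Note that ConcurrentBag does not preserve insertion order, so the returned items are not guaranteed to be in the same order as the keys.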

Upvotes: 0
