Etienne Charland
Etienne Charland

Reputation: 4034

LINQ Select Customized Random-ness

I have this code to select a random entry from a LINQ query

Random rand = new Random();
int TotalFound = Query.Count();
if (TotalFound > 0) {
    int toSkip = rand.Next(0, TotalFound);
    Video Result = Query.Skip(toSkip).Take(1).First();
    return Result;
} else
    return null;

Now, I want to add a column within Query called "Preference", which is a number between 0 and 10. A row with a Preference value of 8 would be twice as likely to be selected as a row with Preference value of 4. A value of 9.8 would be even more likely to be selected.

What would be an effective way of implementing this algorithm?

As a second step, I could add a parameter that allows 8 to be worth 3x or 4x a row set to 4, to fine-tune the results with an exponential curve instead of a linear curve.

I'm really not sure about how to go to implement this efficiently.

Here's the solution I implemented

public static int? SelectRandomId(IQueryable<Media> query, out int totalFound) {
    int? Result = null;

    // Pull list of ID and Preference from database.
    var IdList = query.Select(v => new { ID = v.MediaId, Preference = v.Preference }).ToList();
    totalFound = IdList.Count();

    // Calculate preferences average and total.
    int PreferenceCount = IdList.Where(v => v.Preference.HasValue).Count();
    int NoPreferenceCount = IdList.Count() - PreferenceCount;
    int PreferenceSum = IdList.Where(v => v.Preference.HasValue).Sum(v => PreferenceToInt(v.Preference.Value));
    // Use value 10 for every item if it is not specified for any.
    int PreferenceAvg = (PreferenceCount > 0 ? PreferenceSum / PreferenceCount : 10);
    // Videos with no preference get the average value.
    int PreferenceTotal = PreferenceSum + NoPreferenceCount * PreferenceAvg;

    // Get a random number between zero and the sum of all the preferences
    Random rand = new Random();
    int number = rand.Next(0, PreferenceTotal);

    int rollingSumOfPreferences = 0;

    // Set default value in case a value doesn't get assigned by the loop.
    if (totalFound > 0)
        Result = IdList[0].ID;

    // Select an index from the video list, but weighted by preference
    foreach (var item in IdList) {
        // Add the current item's preference to the rolling sum
        if (item.Preference.HasValue)
            rollingSumOfPreferences += PreferenceToInt(item.Preference.Value);
        else
            rollingSumOfPreferences += PreferenceAvg;

        // If we've hit or passed the random number, select this item
        if (rollingSumOfPreferences >= number) {
            Result = item.ID;
            break;
        }
    }

    return Result;
}

private static int PreferenceToInt(float preference) {
    return (int)(Math.Pow(preference, 1.2) * 100);
}

Upvotes: 1

Views: 245

Answers (2)

Rufus L
Rufus L

Reputation: 37060

I think it might work if you take a random between 0 and [the sum of all preferences]. Then you can loop through all items and store the rolling sum of item preferences in another variable. Once you hit the item where the rolling sum is equal to or greater than the random number, select that item. This should prefer the larger preferences in descending order. At least in my tests it worked!

Hopefully this code can explain it better:

private class Video
{
    public string Name { get; set; }
    public int Preference { get; set; }
}

public static void GenericTester()
{
    // Initialize array with item name and preference
    var videos = new List<Video>
    {
        new Video {Name = "ten", Preference = 10},
        new Video {Name = "nine", Preference = 9},
        new Video {Name = "eight", Preference = 8},
        new Video {Name = "seven", Preference = 7},
        new Video {Name = "six", Preference = 6},
        new Video {Name = "five", Preference = 5},
        new Video {Name = "four", Preference = 4},
        new Video {Name = "three", Preference = 3},
        new Video {Name = "two", Preference = 2},
        new Video {Name = "one", Preference = 1},
        new Video {Name = "zero", Preference = 0}
    };

    // Dictionary to store results of how many times each
    // preference was selected (for testing purposes)
    var results = new Dictionary<int, int>();
    for (int i = 0; i <= videos.Max(v => v.Preference); i++)
    {
        results[i] = 0;  // Init all items to zero
    }

    // Init random number generator
    var rand = new Random();

    for (int n = 1; n < 100000; n++)
    {
        // Get a random number between zero and the sum of all the preferences
        var number = rand.Next(0, videos.Sum(v => v.Preference));

        // Initialize index to the highest preference
        var index = videos.Max(v2 => v2.Preference);
        var rollingSumOfPreferences = 0;

        // Select an index from the video list, but weighted by preference
        foreach(var video in videos)
        {
            // Add the current item's preference to the rolling sum
            rollingSumOfPreferences += video.Preference;

            // If we've hit or passed the random number, select this item
            if (rollingSumOfPreferences >= number)
            {
                index = video.Preference;
                break;
            }
        }

        // Increment the count for the selected preference
        results[index]++;
    }

    foreach (var result in results)
    {
        Console.WriteLine("The preference value '{0}' was selected '{1}' times.", result.Key, result.Value);
    }
}

Upvotes: 1

Christian Maslen
Christian Maslen

Reputation: 1110

The Random function offers a relatively uniform distribution so on it's own it will not give you what you want. The following Eric Lippert article provides some background and a step in the right direction (since many non-uniform distributions are achieved by transforming a uniform distribution):

http://blogs.msdn.com/b/ericlippert/archive/2012/02/21/generating-random-non-uniform-data-in-c.aspx

Code that generates good statistical distributions can be pretty complex so many use third party libraries. This one is pretty good:

http://www.extremeoptimization.com/

If you do want to have a crack at your own distribution, the MSDN docs show how to override the Sample method:

http://msdn.microsoft.com/en-us/library/system.random.sample(v=vs.110).aspx

Assuming you can author or find a random distribution that meets your needs I'd do something like the following:

Random uniformDist = new Random();
YourRand yourDist = new YourRand();
int groupToSelect = yourDist.Next(1, 10);
int TotalFound = Query.Where(v => v.Preference == groupToSelect).Count();
if (TotalFound > 0) {
    int toSkip = uniformDist.Next(0, TotalFound);
    Video Result = Query.Where(v => v.Preference == groupToSelect).Skip(toSkip).Take(1).First();
    return Result;
} else
    return null;

Depending on the number of rows in your database I'd run the SQL profiler over that to make sure you can get something that performs adequately.

Upvotes: 0

Related Questions