Gokoulane Ravi

Reputation: 145

Get combinations of records from a data frame in R that satisfy a specific target

Suppose I have a data frame in R with 500 player records and the following columns:

Out of the 500 players, I want my code to give me multiple combinations of 3 players that satisfy the following criteria. It is something like a Moneyball problem.

Kindly help with this. Thank you.

Upvotes: 0

Views: 71

Answers (3)

Gokoulane Ravi

Reputation: 145

Thanks, I used a combination of both John's and James's answers.

  1. Filtered out all the players who don't satisfy the criteria, which left only 90+ players.
  2. Then picked players at random until all the variations were exhausted.
  3. Finally, computed combined metrics for each variation (set) of players to arrive at the optimized set.

The code is a bit messy, so I'd rather not post it here.
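Since the actual code was not posted, here is a minimal base R sketch of the three steps above. The columns (`runs`, `salary`), the thresholds, and the ranking rule are all assumptions made for illustration, because the real columns and criteria are not shown in the question. It also enumerates the reduced pool with `combn()` rather than picking players at random, which covers the same variations deterministically.

```r
# Hypothetical data: `runs`, `salary`, and every threshold below are
# assumptions; the question's real columns and criteria were not shown.
set.seed(1)
players <- data.frame(
  id     = 1:500,
  runs   = sample(0:120, 500, replace = TRUE),
  salary = sample(50:500, 500, replace = TRUE)
)

# Step 1: drop players who fail the per-player criterion (assumed: runs >= 100).
pool <- players[players$runs >= 100, ]

# Step 2: enumerate every trio from the reduced pool.
# Each row of `trios` holds three row positions into `pool`.
trios <- t(combn(nrow(pool), 3))

# Step 3: combined metrics for each trio.
total_runs   <- rowSums(matrix(pool$runs[trios],   ncol = 3))
total_salary <- rowSums(matrix(pool$salary[trios], ncol = 3))

result <- data.frame(
  p1 = pool$id[trios[, 1]],
  p2 = pool$id[trios[, 2]],
  p3 = pool$id[trios[, 3]],
  total_runs   = total_runs,
  total_salary = total_salary
)

# Keep trios under an assumed salary cap, then rank: most runs first,
# cheaper trio first on ties.
result <- result[result$total_salary <= 600, ]
result <- result[order(-result$total_runs, result$total_salary), ]
head(result)
```

With a pool of around 90 players this is on the order of a hundred thousand trios, so the whole table fits comfortably in memory.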

Upvotes: 1

John Garland

Reputation: 513

choose(500,3) 

This shows there are about 20.7 million (20,708,500) combinations of 3 players drawn from a pool of 500, so a complete analysis of the entire search space should be feasible in reasonable time on a modern machine.

You can generate the indices of these combinations using iterpc() and getnext() from the iterpc package, as in:
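As a rough sanity check on the size of the problem, the arithmetic can be done directly in base R. The memory figure assumes all index triples are stored at once as 4-byte integers, and the pool of 90 is only an illustrative size for a pre-filtered data set.

```r
# Number of distinct trios from 500 players.
n_trios <- choose(500, 3)
n_trios                   # 20708500

# Approximate memory (in MB) to hold every trio as three 4-byte integers.
n_trios * 3 * 4 / 1e6     # roughly 248.5

# The same count for a pre-filtered pool of 90 players (illustrative size).
choose(90, 3)             # 117480
```

Filtering before enumerating therefore shrinks the search by more than two orders of magnitude in this illustration.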

# library(iterpc)   # uncomment if not loaded
I <- iterpc(5, 3)   # iterator over the 3-element combinations of 1:5
getnext(I)          # returns the next combination on each call

You can also cut the search drastically in a number of ways: apply initial filtering criteria, and/or stop at the first solution (a while loop that exits once the criteria are met). Alternatively, you can collect and rank all solutions (loop through every combination), or take an intermediate approach and stop after n solutions. Preprocessing helps too. For example, sorting the data by ascending salary makes low-salary solutions come up first, and sorting by descending runs makes high-run solutions come up first, although the first hit in iteration order is not guaranteed to be the overall best.

NOTE: While this works fine, iterpc has since been superseded by the arrangements package, where the relevant iterator is icombinations(). getnext() is still how you retrieve successive combinations.
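The "sort first, stop at the first solution" idea can be sketched in base R without any extra package. The helper `first_trio()`, the columns, and the criterion (combined runs of at least 200) are hypothetical names and values chosen for illustration. The toy data also shows a caveat: the first hit in iteration order is cheap, but not necessarily the cheapest feasible trio.

```r
# Return the first trio (in lexicographic row order) for which is_ok() is TRUE.
# `first_trio` and `is_ok` are hypothetical helpers, not part of any package.
first_trio <- function(df, is_ok) {
  n <- nrow(df)
  if (n < 3) return(NULL)
  for (i in seq_len(n - 2)) {
    for (j in (i + 1):(n - 1)) {
      for (k in (j + 1):n) {
        trio <- df[c(i, j, k), ]
        if (is_ok(trio)) return(trio)   # early exit on the first solution
      }
    }
  }
  NULL
}

# Toy data with assumed columns.
players <- data.frame(
  id     = 1:6,
  runs   = c(40, 90, 75, 60, 95, 80),
  salary = c(100, 300, 150, 120, 400, 200)
)

# Preprocess: cheapest players first.
by_salary <- players[order(players$salary), ]

# Assumed criterion: the trio must total at least 200 runs.
is_ok <- function(trio) sum(trio$runs) >= 200

hit <- first_trio(by_salary, is_ok)
hit$id            # 1 3 2
sum(hit$salary)   # 550

# Players 4, 3 and 6 also qualify (215 runs) and cost only 470,
# so a full enumeration is still needed for a guaranteed optimum.
```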

Upvotes: 1

James Curran

Reputation: 1304

So there are choose(500,3) ways to choose 3 players, which is 20,708,500. It's not impossible to generate all these combinations; combn might do it for you, but I couldn't be bothered waiting to find out. If you do this with player IDs and then test your three conditions, this would be one way to solve your problem.

An alternative would be a Monte Carlo method. Select three players that initially satisfy your conditions. Randomly select another player who doesn't belong to the current trio and trade him in; if the resulting trio satisfies the conditions, save the combination and repeat. If you're optimizing (it's not clear, but your question has optimization in the tag), then the new player has to result in a trio that's better than the last, so if he doesn't improve your objective function (whatever it might be), you don't accept the trade.
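A minimal base R sketch of the Monte Carlo idea, under stated assumptions: the columns, the salary cap in `feasible()`, and the "total runs" `objective()` are hypothetical stand-ins for the question's unstated criteria, and replacing a randomly chosen member of the trio is one possible reading of "trade". This is a simple hill climb, so it can settle on a local optimum rather than the best trio overall.

```r
# Hypothetical data and criteria; the real ones were not shown in the question.
set.seed(42)
players <- data.frame(
  id     = 1:500,
  runs   = sample(0:120, 500, replace = TRUE),
  salary = sample(50:500, 500, replace = TRUE)
)

# Assumed constraint and objective, both taking row indices of a trio.
feasible  <- function(idx) sum(players$salary[idx]) <= 600
objective <- function(idx) sum(players$runs[idx])

# Start from a trio that satisfies the constraint: the three cheapest players.
current <- order(players$salary)[1:3]
stopifnot(feasible(current))
start_score <- objective(current)

for (step in 1:5000) {
  # Pick a random player outside the trio and trade him for a random member.
  newcomer  <- sample(setdiff(seq_len(nrow(players)), current), 1)
  candidate <- current
  candidate[sample(3, 1)] <- newcomer

  # Accept the trade only if it stays feasible and improves the objective.
  if (feasible(candidate) && objective(candidate) > objective(current)) {
    current <- candidate
  }
}

players[current, ]
```

Running the loop from several different starting trios and keeping the best result is a common way to reduce the risk of getting stuck.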

Upvotes: 1
