purpletonic
purpletonic

Reputation: 1878

How do I get the top n elements per group within a sorted list with Ramda JS

I'm fairly new to RamdaJS and functional programming, so I was hoping to understand if there's a more efficient way to achieve the following.

Suppose I have a list of footballers. Each footballer in the list has properties for their name, their team, and their monthly pay.

const footballers = [
  {
    name: "Mesut Ozil",
    team: "Arsenal",
    pay: 1400000
  },
  {
    name: "Lionel Messi",
    team: "Barcelona",
    pay: 7300000
  },
  {
    name: "Kylian Mbappe",
    team: "PSG",
    pay: 1500000
  },
  {
    name: "Alexis Sanchez",
    team: "Manchester United",
    pay: 1990000
  },
  {
    name: "Philippe Coutinho",
    team: "Barcelona",
    pay: 2000000
  },
  {
    name: "Neymar",
    team: "PSG",
    pay: 2700000
  },
  {
    name: "Luis Suarez",
    team: "Barcelona",
    pay: 2500000
  },
  {
    name: "Antoine Griezmann",
    team: "Atletico Madrid",
    pay: 2900000
  },
  {
    name: "Gareth Bale",
    team: "Real Madrid",
    pay: 2200000
  },
  {
    name: "Cristiano Ronaldo",
    team: "Juventus",
    pay: 4100000
  }
]

I wish to order this list by pay, but restrict each team to a maximum of only two players in the final list. Currently my solution has to sort list twice. At the moment it sorts at the beginning and then the final list, but it could also sort after grouping by team. I believe there's a smarter (but still readable) solution that only requires the one sort, but I'm not sure what it is.

const byPayAsc = ascend(prop('pay')) // spare, for checking
const byPayDes = descend(prop('pay'))
const sortByPay = sort(byPayDes)

const groupedByTeam = compose(values, groupBy(prop('team')))
const topTwoPlayers = map(take(2))

const topTwoHighestPaidPlayersPerTeam = pipe( 
  sortByPay,
  groupedByTeam, 
  topTwoPlayers, 
  flatten, 
  sortByPay
)

topTwoHighestPaidPlayersPerTeam(footballers)

My current investigation has unearthed a few options:

What is the idiomatic way of doing this with Ramda JS.

Upvotes: 1

Views: 473

Answers (4)

Hitmands
Hitmands

Reputation: 14179

For sake of simplicity, I'd just write a custom filter function...

const playersPerTeam = (n) => {
  const teams = {};
  
  const predicate = (({ team }) => {
    teams[team] = (teams[team] || 0) + 1;
    
    return teams[team] <= n;
  });
  
  return R.filter(predicate);
};

const fn = R.pipe(
  R.sort(R.descend(R.prop('pay'))),
  playersPerTeam(2),
);

// ------

const data = [
  {
    name: "Mesut Ozil",
    team: "Arsenal",
    pay: 1400000
  },
  {
    name: "Lionel Messi",
    team: "Barcelona",
    pay: 7300000
  },
  {
    name: "Kylian Mbappe",
    team: "PSG",
    pay: 1500000
  },
  {
    name: "Alexis Sanchez",
    team: "Manchester United",
    pay: 1990000
  },
  {
    name: "Philippe Coutinho",
    team: "Barcelona",
    pay: 2000000
  },
  {
    name: "Neymar",
    team: "PSG",
    pay: 2700000
  },
  {
    name: "Luis Suarez",
    team: "Barcelona",
    pay: 2500000
  },
  {
    name: "Antoine Griezmann",
    team: "Atletico Madrid",
    pay: 2900000
  },
  {
    name: "Gareth Bale",
    team: "Real Madrid",
    pay: 2200000
  },
  {
    name: "Cristiano Ronaldo",
    team: "Juventus",
    pay: 4100000
  },
];

console.log(
  fn(data),
);
<script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js" integrity="sha256-xB25ljGZ7K2VXnq087unEnoVhvTosWWtqXB4tAtZmHU=" crossorigin="anonymous"></script>

Upvotes: 1

Scott Sauyet
Scott Sauyet

Reputation: 50797

A single-sort solution

An approach which actually does avoid a second sort is certainly possible. Here's one version which does so:

const topTwoHighestPaidPlayersPerTeam = pipe (
  sort (descend (prop ('pay'))),
  reduce (
    ({result, counts}, player, {team} = player) => ({
      result: counts [team] >= 2 ? result : [...result, player], 
      counts: {...counts, [team]: (counts[team] || 0) + 1}
    }),
    {result: [], counts: {}}
  ),
  prop ('result')
)

const footballers = [{"name":"Mesut Ozil","team":"Arsenal","pay":1400000},{"name":"Lionel Messi","team":"Barcelona","pay":7300000},{"name":"Kylian Mbappe","team":"PSG","pay":1500000},{"name":"Alexis Sanchez","team":"Manchester United","pay":1990000},{"name":"Philippe Coutinho","team":"Barcelona","pay":2000000},{"name":"Neymar","team":"PSG","pay":2700000},{"name":"Luis Suarez","team":"Barcelona","pay":2500000},{"name":"Antoine Griezmann","team":"Atletico Madrid","pay":2900000},{"name":"Gareth Bale","team":"Real Madrid","pay":2200000},{"name":"Cristiano Ronaldo","team":"Juventus","pay":4100000}]

const result = topTwoHighestPaidPlayersPerTeam(footballers)

console .log (result)
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.min.js"></script>
<script> const {pipe, sort, descend, prop, reduce} = R                   </script>

We start with a standard sort by pay, and then iterate through them, keeping track of per-team counts alongside the results, adding the team whenever the count is less than our maximum of 2. We track this by accumulating an object with both the results and the counts. At the end, we just extract the results property. While this could be done with a filter instead of reduce, the function passed to filter would need to maintain the counts state, which seems like a bad fit for filter.

The slightly more obvious version of

      result: counts [team] < 2 ? [...result, player] : result, 

won't work because initially the team counts are undefined, and undefined < 2 yields false. We could choose this version instead if it seems more clear:

      result: (counts [team] || 0) < 2 ? [...result, player] : result, 

Why we probably shouldn't use this solution

This code does avoid a second sort. But it does so at a cost in readability, and, probably, for reasonable-sized collections, in actual performance.

A two-sort solution is still O (n log (n)); doing something twice does not change the gross performance metric. So this version is not asymptotically faster than a two-sort one. But the code is not nearly as readable as that from either Scott Christopher or Ori Drori. Unless we've measured and can point to specific bottlenecks, adding complexity to our code for performance reasons seems a total waste.

So I would recommend the solution from Ori Drori over this. And Scott Christopher also has an interesting approach as well.

But this sort of technique is still often useful for maintaining some additional state while folding a list of values.

Upvotes: 1

Scott Christopher
Scott Christopher

Reputation: 6516

What you have here is close to a merge sort, with the initial sorting performed per team and then merging the sorted lists of players.

const { descend, prop, take, sort, reduceBy, pipe, values, reduce } = R

const sorter = descend(prop('pay'))

// this is used by `R.reduceBy` below to build up the sorted list with a max size
const addToGroup = (size, compare) => (group, a) =>
  take(size, sort(sorter, [a, ...group]))

const sortByTeam = reduceBy(addToGroup(2, sorter), [], prop('team'))

// recursively merges two sorted(!) lists by the given comparator
const mergeListsBy = compare => {
  const go = (xs, ys) =>
    xs.length == 0 ? ys :
    ys.length == 0 ? xs :
    compare(xs[0], ys[0]) < 0 ? [xs[0], ...go(xs.slice(1), ys)] :
                                [ys[0], ...go(xs, ys.slice(1))]
  return go
}

const run = pipe(sortByTeam, values, reduce(mergeListsBy(sorter), []))

////

const footballers = [{"name": "Mesut Ozil", "pay": 1400000, "team": "Arsenal"}, {"name": "Lionel Messi", "pay": 7300000, "team": "Barcelona"}, {"name": "Kylian Mbappe", "pay": 1500000, "team": "PSG"}, {"name": "Alexis Sanchez", "pay": 1990000, "team": "Manchester United"}, {"name": "Philippe Coutinho", "pay": 2000000, "team": "Barcelona"}, {"name": "Neymar", "pay": 2700000, "team": "PSG"}, {"name": "Luis Suarez", "pay": 2500000, "team": "Barcelona"}, {"name": "Antoine Griezmann", "pay": 2900000, "team": "Atletico Madrid"}, {"name": "Gareth Bale", "pay": 2200000, "team": "Real Madrid"}, {"name": "Cristiano Ronaldo", "pay": 4100000, "team": "Juventus"}]

console.log(run(footballers))
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.min.js"></script>

Upvotes: 1

Ori Drori
Ori Drori

Reputation: 191986

Your idea of sorting again is valid, since grouping the players by team changes the initial sort.

If you want to skip a second sort, you'll need to keep the original index of each item, and then sort by the index anyway. so I'm not sure it's worth it.

However, for the sake of checking the idea, this snippet converts the array items to pairs of [index, player] after sorting, but before grouping by the player's team. When you flatten the groups back to an array using R.values, and R.chain (with R.take), you convert the pairs back at an object using R.fromPairs. Since ES6 traversal of integer object keys is ascending, the original order is restored, and now you get the array via calling R.values again.

const { pipe, sort, descend, prop, toPairs, groupBy, path, values, chain, take, fromPairs } = R

const topTwoHighestPaidPlayersPerTeam = pipe( 
  sort(descend(prop('pay'))), // sort descending by pay
  toPairs, // convert to pairs if [index, player object]
  groupBy(path([1, 'team'])), // group by the team
  values, // get an array of an arrays 
  chain(take(2)), // flatten and take the 1st two items of each group
  fromPairs, // convert to an object of objects with original index as key
  values // convert to an array in the correct order
)

const footballers = [{"name":"Mesut Ozil","team":"Arsenal","pay":1400000},{"name":"Lionel Messi","team":"Barcelona","pay":7300000},{"name":"Kylian Mbappe","team":"PSG","pay":1500000},{"name":"Alexis Sanchez","team":"Manchester United","pay":1990000},{"name":"Philippe Coutinho","team":"Barcelona","pay":2000000},{"name":"Neymar","team":"PSG","pay":2700000},{"name":"Luis Suarez","team":"Barcelona","pay":2500000},{"name":"Antoine Griezmann","team":"Atletico Madrid","pay":2900000},{"name":"Gareth Bale","team":"Real Madrid","pay":2200000},{"name":"Cristiano Ronaldo","team":"Juventus","pay":4100000}]

const result = topTwoHighestPaidPlayersPerTeam(footballers)

console.log(result)
<script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script>

Upvotes: 1

Related Questions