Marco Stramezzi
Marco Stramezzi

Reputation: 2281

Order a list depending on other list

Given:

struct Object {
    int id;
    ...
};

list<Object> objectList;
list<int> idList;

What is the best way to order objectList depending on order of idList?

Example (pseudo code):

INPUT
    objectList = {o1, o2, o3};
    idList = {2, 3, 1};

ACTION
    sort(objectList, idList);

OUTPUT
    objectList = {o2, o3, o1};

I searched in documentation but I only found methods to order elements comparing among themselves.

Upvotes: 2

Views: 140

Answers (6)

overseas
overseas

Reputation: 1731

Here is another variant, which works in O(n log n). This is asymptotcally optimal.

#include <list>
#include <vector>
#include <algorithm>
#include <iostream>
#include <cassert>

int main() {
  struct O {
    int id;
  };
  std::list<O> object_list{{1}, {2}, {3}, {4}};
  std::list<int> index_list{4, 2, 3, 1};
  assert(object_list.size() == index_list.size());

  // this vector is optional. It is needed if sizeof(O) is quite large.
  std::vector<std::pair<int, O*>> tmp_vector(object_list.size());
  // this is O(n)
  std::transform(begin(object_list), end(object_list), begin(tmp_vector),
                 [](auto& o) { return std::make_pair(o.id, &o); });
  // this is O(n log n)
  std::sort(begin(tmp_vector), end(tmp_vector), 
            [](const auto& o1, const auto& o2) {
            return o1.first < o2.first;
            });
  // at this point, tmp_vector holds pairs in increasing index order. 
  // Note that this may not be a contiguous list.

  std::list<O> tmp_list(object_list.size());
  // this is again O (n log n), because lower_bound is O (n)
  // we then insert the objects into a new list (you may also use some 
  // move semantics here).
  std::transform(begin(index_list), end(index_list), begin(tmp_list),
                 [&tmp_vector](const auto& i) {
                   return *std::lower_bound(begin(tmp_vector), end(tmp_vector),
                                            std::make_pair(i, nullptr),
                                            [](const auto& o1, const auto& o2) {
                                              return o1.first < o2.first;
                                            })->second;
                 });

  // As we just created a new list, we swap the new list with the old one.
  std::swap(object_list, tmp_list);

  for (const auto& o : object_list)
    std::cout << o.id << std::endl;

}

I assumed that O is quite large and not easily movable. Therefore i first create tmp_vector which only contains of pairs. Then I sort this vector.

Afterwards I can simply go through the index_list and find the matching indices using binary search.


Let me elaborate on why a map is not the best solution eventhough you get a quite small piece of code. If you use a map you need to rebalance your tree after each insertion. This doesn't cost asympatotically (because n times rebalancing costs you the same as sorting once), but the constant is way larger. A "constant map" makes not that much sense (except accessing it may be easier).

I then timed the "simple" map-approach against my "not-so-simple" vector-approach. I created a randomly sorted index_list with N entries. And this is what I get (in us):

N         map         vector
1000      90          75
10000     1400        940
100000    24500       15000
1000000   660000      250000   

NOTE: This test shows the worst case as in my case only index_list was randomly sorted, while the object_list (which is inserted into the map in order) is sorted. So rebalancing shows all its effect. If the object_list is kind of random, performance will behave more similar, eventhough performance will always be worse. The vector list will even behave better when the object list is completely random.

So already with 1000 entries the difference is already quite large. So I would strongly vote for a vector-based approach.

Upvotes: 2

A.S.H
A.S.H

Reputation: 29352

objectList.sort([&idList] (const Object& o1, const Object& o2) -> bool
   { return std::find(++std::find(idList.begin(), idList.end(), o1.id),
       idList.end(), o2.id)
       != idList.end();
   });

The idea is to check if we find o1.id before o2.id in the idList. We search o1.id, increment the found position then we search o2.id: if found, that implies o1 < o2.

Test

#include <iostream>
#include <string>
#include <list>
#include <algorithm>

struct Object {
    int id;
    string name;
};

int main()
{
  list<Object> objectList {{1, "one_1"}, {2, "two_1"}, {3, "three_1"}, {2, "two_2"}, {1, "one_2"}, {4, "four_1"}, {3, "Three_2"}, {4, "four_2"}};
  list<int> idList {3, 2, 4, 1};

  objectList.sort([&idList] (const Object& o1, const Object& o2) -> bool
    { return std::find(++std::find(idList.begin(), idList.end(), o1.id), idList.end(), o2.id) != idList.end(); });

  for(const auto& o: objectList) cout << o.id << " " << o.name << "\n";
}

/* OUTPUT: 
3 three_1
3 Three_2
2 two_1
2 two_2
4 four_1
4 four_2
1 one_1
1 one_2
*/

Upvotes: 0

zhm
zhm

Reputation: 3651

You can store the objects in an std::map, with id as key. Then traverse idList, get the object out of map with its id.

std::map<int, Object> objectMap;
for (auto itr = objectList.begin(); itr != objectList.end(); itr++)
{
    objectMap.insert(std::make_pair(itr->id, *itr));
}

std::list<Object> newObjectList;
for (auto itr = idList.begin(); itr != idList.end(); itr++)
{
    // here may fail if your idList contains ids which does not appear in objectList
    newObjectList.push_back(objectMap[*itr]);
}

// now newObjectList is sorted as order in idList

Upvotes: 2

JVApen
JVApen

Reputation: 11317

I would find it strange to end up in this situation as I would use the pointers instead of the ids. Though; there might be usecases for this.

Note that in all examples below, I assume that the ids-list contains all ids exactly ones.

Writing it yourself

The issue you like to solve is creating/sorting a list of objects based on the order of the ids in another list.

The naive way of doing this, is simply writing it yourself:

void sortByIdVector(std::list<Object> &list, const std::list<int> &ids)
{
    auto oldList = std::move(list);
    list = std::list<Object>{};
    for (auto id : ids)
    {
        auto itElement = std::find_if(oldList.begin(), oldList.end(), [id](const Object &obj) { return id == obj.id; });
        list.emplace_back(std::move(*itElement));
        oldList.erase(itElement);
    }
}

If you use a sorted vector as input, you can optimize this code to get the best performance out of it. I'm leaving it up-to you to do so.

Using sort

For this implementation, I'm gonna assume this are std::vector instead of std::list, as this is the better container to request the index of an element. (You can with some more code do the same for list)

size_t getIntendedIndex(const std::vector<int> &ids, const Object &obj)
{
    auto itElement = std::find_if(ids.begin(), ids.end(), [obj](int id) { return id == obj.id; });
    return itElement - ids.begin();
}

void sortByIdVector(std::list<Object> &list, const std::vector<int> &ids)
{
     list.sort([&ids](const Object &lhs, const Object &rhs){ return getIntendedIndex(ids, lhs) < getIntendedIndex(ids, rhs); });
}

Insertion

Another approach, also more suitable for std::vector would be simply inserting the elements at the right place and will be more performant than the std::sort.

void sortByIdVector(std::vector<Object> &list, const std::vector<int> &ids)
{
     auto oldList = std::move(list);
     list = std::vector<Object>{};
     list.resize(oldList.size());
     for (Object &obj : oldList)
     {
         auto &newLocation = list[getIntendedIndex(ids, obj)];
         newLocation = std::move(obj);
     }
}

Upvotes: 0

kebs
kebs

Reputation: 6707

Assuming the data is handled to you externally and you don't have the choice of the containers:

assert( objectList.size() == idList.size() );
std::vector<std::pair<int,Object>> wrapper( idList.size() );
auto idList_it     = std::begin( idList );
auto objectList_it = std::begin( objectList );
for( auto& e: wrapper )
    e = std::make_pair( *idList_it++, *objectList_it++ );
std::sort(
    std::begin(wrapper),
    std::end(wrapper),
    []
    (const std::pair<int,Object>& a, const std::pair<int,Object>& b) -> bool
    { return a.first<b.first; }
);

Then, copy back to original container.

{
   auto objectList_it = std::begin( objectList );
   for( const auto& e: wrapper )
      *objectList_it++ = e;
}

But this solution is not optimal, I'm sure somebody will come with a better solution.

Edit: The default comparison operator for pairs requires that it is defined both for first and second members. Thus the easiest way is to provide a lambda.

Edit2: for some reason, this doesn't build if using a std::list for the wrapper. But it's ok if you use a std::vector (see here).

Upvotes: 1

Christian Hackl
Christian Hackl

Reputation: 27538

std::list has a sort member function you can use with a custom comparison functor.

That custom functor has to look up an object's id in the idList and can then use std::distance to calculate the position of the element in idList. It does so for both objects to be compared and returns true if the first position is smaller than the second.

Here is an example:

#include <iostream>
#include <list>
#include <algorithm>
#include <stdexcept>

struct Object
{
    int id;
};

int main()
{
    Object o1 = { 1 };
    Object o2 = { 2 };
    Object o3 = { 3 };
    std::list<Object> objectList = { o1, o2, o3 };
    std::list<int> const idList = { 2, 3, 1 };

    objectList.sort([&](Object const& first, Object const& second)
    {
        auto const id_find_iter1 = std::find(begin(idList), end(idList), first.id);
        auto const id_find_iter2 = std::find(begin(idList), end(idList), second.id);
        if (id_find_iter1 == end(idList) || id_find_iter2 == end(idList))
        {
            throw std::runtime_error("ID not found");
        }
        auto const pos1 = std::distance(begin(idList), id_find_iter1);
        auto const pos2 = std::distance(begin(idList), id_find_iter2);
        return pos1 < pos2;
    });

    for (auto const& object : objectList)
    {
        std::cout << object.id << '\n';
    }
}

It's probably not terribly efficient, but chances are you will never notice. If it still bothers you, you might want to look for a solution with std::vector, which unlike std::list provides random-access iterators. That turns std::distance from O(n) to O(1).

Upvotes: 0

Related Questions