Reputation:
I have some data which I need to print, for simplicity lets say it is a container (vector) of people which have some parameters. In different parts of my program, I need to print all of them sorted by different parameter. My question are
1.) which container to choose? (personally I chose vector).
2.) What approach is better, sort whole vector every time or it is better to make a copy of that vector and save it sorted? In my solution I sorted the same vector every time, but maybe it is not a correct approach with huge vectors because of speed.
class Person
{
private:
std::string name;
std::string surname;
int age;
public:
Person(std::string name, std::string surname, int age) : name{ name }, surname{ surname }, age{ age } {};
void print() { std::cout << name << " " << surname << " " << age << std::endl; };
static bool sortName(Person const &A, Person const &B) { return A.name < B.name; };
static bool sortSurname(Person const &A, Person const &B) { return A.surname < B.surname; };
static bool sortAge(Person const &A, Person const &B) { return A.age < B.age; };
};
main:
int main()
{
std::vector<Person> persons;
Person person1("John", "Smith", 30);
Person person2("Mark", "Cooper", 28);
Person person3("George", "Orwell", 19);
persons.push_back(person1);
persons.push_back(person2);
persons.push_back(person3);
std::sort(persons.begin(), persons.end(), Person::sortSurname);
for (int i = 0; i < persons.size(); ++i)
{
persons[i].print();
}
// do some other stuff here ... and then ...
std::sort(persons.begin(), persons.end(), Person::sortName);
for (int i = 0; i < persons.size(); ++i)
{
persons[i].print();
}
// do some other stuff here ... and then ...
std::sort(persons.begin(), persons.end(), Person::sortAge);
for (int i = 0; i < persons.size(); ++i)
{
persons[i].print();
}
return 0;
}
Upvotes: 4
Views: 399
Reputation: 35708
I would use 3 std::set
instances of std::shared_ptr<Person>
each sorted by corresponding field of Person
:
int main()
{
std::shared_ptr<Person> person1 = std::make_shared<Person>("John", "Smith", 30);
std::shared_ptr<Person> person2 = std::make_shared<Person>("Mark", "Cooper", 28);
std::shared_ptr<Person> person3 = std::make_shared<Person>("George", "Orwell", 19);
std::set<std::shared_ptr<Person>> persons1([](std::shared_ptr<Person> a, std::shared_ptr<Person> b) {
return a->name < b->name;
});
std::set<std::shared_ptr<Person>> persons2([](std::shared_ptr<Person> a, std::shared_ptr<Person> b) {
return a->surname < b->surname;
});
std::set<std::shared_ptr<Person>> persons3([](std::shared_ptr<Person> a, std::shared_ptr<Person> b) {
return a->age < b->age;
});
persons1.insert(person1);
persons1.insert(person2);
persons1.insert(person3);
persons2.insert(person1);
persons2.insert(person2);
persons2.insert(person3);
persons3.insert(person1);
persons3.insert(person2);
persons3.insert(person3);
return 0;
}
std::shared_ptr
you will not waste memory when storing
objects in several containers.std::set
is already sorted container, so that you have not to sort
it every time when using it, just enumerate elements from begin to
end.Upvotes: 2
Reputation: 28837
Consider replacing your vector of Person, by vector of pointer to Person. With that in place, it is pretty cheap to swap two Persons just by swapping the pointers. Then use the functors defined in your class to put the pointers in the desired sort order, and start printing.
Upvotes: 2
Reputation: 69854
boost::multi_index_container
allows you to define a container of any type with any number of different indexes, or views.
The container keeps the indexes up to date automatically upon insertion and removal.
It's a massive template library which takes a little time to get used to, but the documentation is good, with plenty of examples.
Here's an implementation expressed this way:
#include <iostream>
#include <string>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/mem_fun.hpp>
class Person {
private:
std::string name;
std::string surname;
int age;
public:
Person(std::string name, std::string surname, int age) : name{name}, surname{surname}, age{age} {};
auto get_name() const -> const std::string& { return name; }
auto get_surname() const -> const std::string& { return surname; }
auto get_age() const -> int { return age; }
void print() const { std::cout << name << " " << surname << " " << age << std::endl; };
};
namespace bmi = boost::multi_index;
struct by_name {};
struct by_surname {};
struct by_age;
using PersonTable = boost::multi_index_container<Person,
bmi::indexed_by<
bmi::ordered_non_unique<bmi::tag<by_name>, bmi::const_mem_fun<Person,std::string const&,&Person::get_name>>,
bmi::ordered_non_unique<bmi::tag<by_surname>, bmi::const_mem_fun<Person,std::string const&,&Person::get_surname>>,
bmi::ordered_non_unique<bmi::tag<by_age>, bmi::const_mem_fun<Person,int,&Person::get_age>>
>
>;
int main()
{
PersonTable people;
people.insert(Person("John", "Smith", 30));
people.insert(Person("Mark", "Cooper", 28));
people.insert(Person("George", "Orwell", 19));
std::cout << "by name" << std::endl;
for (auto&& person : people.get<by_name>())
{
person.print();
}
std::cout << "\nby surname" << std::endl;
for (auto&& person : people.get<by_surname>())
{
person.print();
}
std::cout << "\nby age" << std::endl;
for (auto&& person : people.get<by_age>())
{
person.print();
}
}
expected output:
by name
George Orwell 19
John Smith 30
Mark Cooper 28
by surname
Mark Cooper 28
George Orwell 19
John Smith 30
by age
George Orwell 19
Mark Cooper 28
John Smith 30
documentation here: http://www.boost.org/doc/libs/1_64_0/libs/multi_index/doc/index.html
Upvotes: 5
Reputation: 31447
If the vector is small or the elements are cheap to copy you can probably just re-sort it when needed without any problems.
If the elements of the vector are large and expensive to copy you can sort the vector once in one of the ways you need and then create a second vector of std::reference_wrappers and sort that a different way, to create a second "view" of the original vector that does not modify the original nor copy the elements to the second vector.
As for container choice; just go with std::vector
unless you specifically need some special property of one of the other containers.
In any case, benchmark different solutions (with a optimized build) and measure the performance of different solutions before setteling on one.
Upvotes: 1
Reputation: 37468
Instead of sorting vector of objects (which is rather is expensive for complex objects that have many fields) you should build several vectors of indexes to objects stored in main vector and sort them by various criteria.
#include <algorithm>
...
::std::vector< Person > persons;
// add persons...
::std::vector< ::std::size_t > sorted_indexes;
sorted_indexes.reserve(persons.size());
{
::std::size_t index{};
::std::generate
(
sorted_indexes.begin()
, sorted_indexes.end()
, [&index]{return index++;}
);
}
::std::sort
(
sorted_indexes.begin()
, sorted_indexes.end()
, [&persons](::std::size_t const left, ::std::size_t const right)
{
return Person::sortSurname(persons[left], persons[right]);
}
);
for(auto person_index: sorted_indexes)
{
persons[person_index].print();
}
Also sortSurname should take a const reference to avoid copying:
static bool sortSurname(Person const & left, Person const & right) { return left.surname < right.surname; };
Upvotes: 1
Reputation: 2382
Among many it depends primarily on 3 factors -
1. Data Size
2. What kind of performance you are looking at
3. Amount of space (memory) you can trade off for #2
In general std::sort()
performs at an avg at nlogn -
Complexity On average, linearithmic in the distance between first and last: Performs approximately N*log2(N) (where N is this distance) comparisons of elements, and up to that many element swaps (or moves).
Now if your use case involves the sort methods to be invoked too often, it might make sense to pre-sort and save the vectors - In which case you will have considerable performance gain. Now in this design you have to factor in cases like is the collection modifiable? If yes then how often? then you have to consider avg case insertion performance hit.
So in summary it depends
Upvotes: 1
Reputation: 14587
IMO, the approach you are using right now is good to go i.e. sort whenever you need it on runtime. For larger datasets, you need to first evaluate your requirements in terms of memory and processing power. For example, for a very large dataset, you won't be able to sort it in memory. And, there'll be synchronization issues in case you decide towards a multithreading solution. So, you'd be needing some specialized solution like a DBMS where you can query sorted data as you required on runtime. You'll have features like indexes to optimize query time.
Upvotes: 1