user4425844
user4425844

Reputation:

better way of counting unique item

I just find a way to count that number of unique items in a vector. This is my most naïve approach.

std::vector<Items> v;

// some other work
std::vector<Items> unique_Count;
unique_Count.clear();
std::unique_copy(v.begin, v.end(), std::back_inserter(unique_Count);
int uniqueCount = unique_Count.size();

Is this the only with or this there a better way in the standard library?

Upvotes: 14

Views: 30452

Answers (3)

Andreas Hadjigeorgiou
Andreas Hadjigeorgiou

Reputation: 356

The last answer to this post I saw was around 6 years ago. I have been looking for a solution to this as I came across it recently in one project that I have been working on, and I would like to share my solution. I think is a more modern, templated solution, one could use as a black box, or so, for data stored in std::vector containers.

Let's say you have the following code in header file uniques.hpp

#ifndef UNIQUES_HPP
#define UNIQUES_HPP

#include <algorithm>
#include <vector>

template<typename T>
std::vector<T> unique_values( std::vector<T> & input_vec ){

    std::vector<T> uniques( input_vec.size() );
    typename std::vector<T>::iterator it;
    it = std::unique_copy (input_vec.begin(), input_vec.end(), uniques.begin() );

    std::sort( uniques.begin(), it );

    it = std::unique_copy( uniques.begin(), it, uniques.begin() );

    uniques.resize( std::distance(uniques.begin(), it) );

    return uniques;
}

template<typename T>
std::vector<int> count_unique_values( std::vector<T> & input_vec ){

    std::vector<T> uniques = unique_values<T>( input_vec );
    std::vector<int> counts( uniques.size() );

    for(size_t i = 0; i < counts.size(); ++i)
        counts[i] = std::count( input_vec.begin(), input_vec.end(), uniques[i] );

    return counts;
}

#endif

One could find unique values in vectors of integers, floats, or even strings using the first function appearing in the header file above unique_values. On top of that, I have a second function count_unique_values which returns the number of occurrences per unique value.

Below I show three examples for int, float and std::string data stored in vectors.

#include <iostream>
#include <vector>
#include <string>

#include "uniques.hpp"

template<typename T>
std::ostream & operator<<(std::ostream & o, const std::vector<T> & v){
    
    o << "[ ";
    for (size_t i = 0; i < v.size()-1; i++)
        o << v[i] << ", ";

    o << v[v.size()-1];

    std::cout << " ]" << std::endl;
    
    return o;
}

int main(){

// Example with integers
    std::cout << "Example with integers" << std::endl;
    std::cout << "---------------------" << std::endl;
    
    std::vector<int> v_init_int = {1,5,5,1,1,1,3,5,4,3,3,3,0};
    std::cout << "Initial vector: " << v_init_int;
    
    std::vector<int> v_uniques_int = unique_values<int>( v_init_int );
    std::cout << "Vector of unique values: " << v_uniques_int;
    
    std::vector<int> v_counts_int = count_unique_values<int>( v_init_int );
    std::cout << "Vector of unique counts: " << v_counts_int;
    std::cout << "\n\n";


// Example with floats
    std::cout << "Example with floats" << std::endl;
    std::cout << "---------------------" << std::endl;
    
    std::vector<float> v_init_floats = {1.4,5.2,5.2,1.0,1.0,1.0,3.2,5.2,4.0,3.3,3.2,3.3,0.1};
    std::cout << "Initial vector: " << v_init_floats;
    
    std::vector<float> v_uniques_floats = unique_values<float>( v_init_floats );
    std::cout << "Vector of unique values: " << v_uniques_floats;
    
    std::vector<int> v_counts_floats = count_unique_values<float>( v_init_floats );
    std::cout << "Vector of unique counts: " << v_counts_floats;
    std::cout << "\n\n";


// Example with strings
    std::cout << "Example with strings" << std::endl;
    std::cout << "--------------------" << std::endl;
    
    std::vector<std::string> v_init_strings = {"hi","hey","hey","hola","hola","hi","hi","hi","hola","hey","hey","hola","hi"};
    std::cout << "Initial vector: " << v_init_strings;
    
    std::vector<std::string> v_uniques_strings = unique_values<std::string>( v_init_strings );
    std::cout << "Vector of unique values: " << v_uniques_strings;
    
    std::vector<int> v_counts_strings = count_unique_values<std::string>( v_init_strings );
    std::cout << "Vector of unique counts: " << v_counts_strings;

    return 0;
}

The output of the above main.cpp program would be:

Example with integers
---------------------
Initial vector: [ 1, 5, 5, 1, 1, 1, 3, 5, 4, 3, 3, 3, 0 ]
Vector of unique values: [ 0, 1, 3, 4, 5 ]
Vector of unique counts: [ 1, 4, 4, 1, 3 ]


Example with floats
---------------------
Initial vector: [ 1.4, 5.2, 5.2, 1, 1, 1, 3.2, 5.2, 4, 3.3, 3.2, 3.3, 0.1 ]
Vector of unique values: [ 0.1, 1, 1.4, 3.2, 3.3, 4, 5.2 ]
Vector of unique counts: [ 1, 3, 1, 2, 2, 1, 3 ]


Example with strings
--------------------
Initial vector: [ hi, hey, hey, hola, hola, hi, hi, hi, hola, hey, hey, hola, hi ]
Vector of unique values: [ hey, hi, hola ]
Vector of unique counts: [ 4, 5, 4 ]

Upvotes: -1

Yakk - Adam Nevraumont
Yakk - Adam Nevraumont

Reputation: 275830

struct iterator_hash {
  template<class Iterator>
  size_t operator()(Iterator it) const {
    using value_type = typename std::decay< decltype(*it) >::type;
    return std::hash<value_type>{}( *it );
  }
};
struct iterator_element_equals {
  template<class Iterator>
  size_t operator()(Iterator lhs, Iterator rhs) const {
    return *lhs == *rhs;
  }
};
std::vector<Items> v;
std::unordered_set<std::vector<Items>::iterator, iterator_hash, iterator_element_equals> s;
for(auto it = v.begin(); it != v.end(); ++it) {
  s.insert(it); // not *it
}
size_t uniqueCount = s.size();

here I create a hash on vector iterators that hashes and compares on the underlying elements (do not pass it the .end() iterator).

Then I insert the iterators from the set into it, and ask it how big it is.

We could instead use a std::set<Iterator, iterator_less> or something if you prefer.

Upvotes: 3

Jerry Coffin
Jerry Coffin

Reputation: 490603

It can depend on what you mean by "better", but there are certainly ways that are simpler, and other ways that are probably faster.

The really simple way would be to insert the items into a std::set or std::unordered_set. When you've inserted all of them, the size of the set will be the number of unique items.

The probably faster method would be to use std::sort and std::unique to find the unique items "in place" instead of copying them. This is pretty much what std::unique_copy will normally do internally anyway, but doing it in place saves a fair amount on allocation and copying.

std::vector<Items> v;

// populate v with data

std::sort(v.begin(), v.end());
int uniqueCount = std::unique(v.begin(), v.end()) - v.begin();

Upvotes: 18

Related Questions