Reputation: 127
I currently have a vector<int> c
which contains {1,2,2,4,5,5,6}
and I want to remove the duplicated numbers so that c
will have
{1,4,6}
. A lot of solutions on the internet seem to just remove one of the duplicates, but I'm trying to remove all occurrences of the duplicated number.
Assume c
is always sorted.
I currently have
#include <iostream>
#include <vector>
int main() {
vector<int> c{1,2,2,4,5,5,6};
for (int i = 0; i < c.size()-1; i++) {
for (int j=1; j<c.size();j++){
if(c[i] == c[j]){
// delete i and j?
}
}
}
}
I tried to use two for-loops so that I can compare the current element and the next element. This is where my doubt kicked in. I'm not sure if I'm approaching the problem correctly.
Could I get help on how to approach my problem?
Upvotes: 10
Views: 2379
Reputation: 217593
Similarly to std::unique
/std::copy_if
, you might do:
void keep_unique(std::vector<int>& v){
auto it = v.begin();
auto w = v.begin();
while (it != v.end())
{
auto next = std::find_if(it, v.end(), [&](int e){ return e != *it; });
if (std::distance(it, next) == 1) {
if (w != it) {
*w = std::move(*it);
}
++w;
}
it = next;
}
v.erase(w, v.end());
}
Upvotes: 2
Reputation: 21927
Iterating in reverse ensures an . Also, no other data structures need to be allocated.O(N)
operation and does not cause element shifting when erasing because we are only ever erasing the last element in the vector
For every element encountered, we check if the adjacent element is equal, and if so, remove all instances of that element.
Requires the vector to be sorted, or at least grouped by duplicates.
#include <iostream>
#include <vector>
int main()
{
std::vector<int> c {1, 2, 2, 4, 5, 5, 6};
for (int i = c.size() - 1; i > 0;)
{
const int n = c[i];
if (c[i - 1] == n)
{
while (c[i] == n)
{
c.erase(c.begin() + i--);
}
}
else
{
i--;
}
}
//output result
for (auto it : c)
{
std::cout<<it;
}
std::cout << std::endl;
}
Output: 146
An actual O(N) implementation using a sentinel value:
#include <iostream>
#include <vector>
#include <limits>
#include <algorithm>
int main()
{
std::vector<int> c { 1, 2, 2, 4, 5, 5, 6 };
const int sentinel = std::numeric_limits<int>::lowest(); //assumed that no valid member uses this value.
for (int i = 0; i < c.size() - 1;)
{
const int n = c[i];
if (c[i + 1] == n)
{
while (c[i] == n)
{
c[i++] = sentinel;
}
}
else
{
i++;
}
}
c.erase(std::remove(c.begin(),c.end(),sentinel), c.end());
for (auto it : c) std::cout<< it << ' ';
}
Upvotes: 1
Reputation: 490228
Almost any time you find yourself deleting elements from the middle of a vector, it's probably best to sit back and think about whether this is the best way to do the job--chances are pretty good that it isn't.
There are a couple of obvious alternatives to that. One is to copy the items you're going to keep into a temporary vector, then when you're done, swap the temporary vector and the original vector. This works particularly well in a case like you've shown in the question, where you're keeping only a fairly small minority of the input data.
The other is to rearrange the data in your existing vector so all the data you don't want is at the end, and all the data you do want is at the beginning, then resize your vector to eliminate those you don't want.
When I doubt, I tend to go the first route. In theory it's probably a bit less efficient (poorer locality of reference) but I've rarely seen a significant slow-down in real use.
That being the case, my initial take would probably be something on this general order:
#include <vector>
#include <iostream>
#include <iterator>
std::vector<int> remove_all_dupes(std::vector<int> const &input) {
if (input.size() < 2) // zero or one element is automatically unique
return input;
std::vector<int> ret;
// first item is unique if it's different from its successor
if (input[0] != input[1])
ret.push_back(input[0]);
// in the middle, items are unique if they're different from both predecessor and successor
for (std::size_t pos = 1; pos < input.size() - 2; pos++)
if (input[pos] != input[pos-1] && input[pos] != input[pos+1])
ret.push_back(input[pos]);
// last item is unique if it's different from predecessor
if (input[input.size()-1] != input[input.size()-2])
ret.push_back(input[input.size() - 1]);
return ret;
}
int main() {
std::vector<int> c { 1, 2, 2, 4, 5, 5, 6 };
std::vector<int> uniques = remove_all_dupes(c);
std::copy(uniques.begin(), uniques.end(), std::ostream_iterator<int>(std::cout, "\n"));
}
Probably a little longer of code than we'd really prefer, but still simple, straightforward, and efficient.
If you are going to do the job in place, the usual way to do it efficiently (and this applies to filtering in general, not just this particular filter) is to start with a copying phase and follow that by a deletion phase. In the copying phase, you use two pointers: a source and a destination. You start them both at the first element, then advance through the input with the source. If it meets your criteria, you copy it to the destination position, and advance both. If it doesn't meet your criteria, advance only the source.
Then when you're done with that, you resize your vector down to the number of elements you're keeping.
void remove_all_dupes2(std::vector<int> & input) {
if (input.size() < 2) { // 0 or 1 element is automatically unique
return;
}
std::size_t dest = 0;
if (input[0] != input[1])
++dest;
for (std::size_t source = 1; source < input.size() - 2; source++) {
if (input[source] != input[source-1] && input[source] != input[source+1]) {
input[dest++] = input[source];
}
}
if (input[input.size()-1] != input[input.size()-2]) {
input[dest++] = input[input.size() - 1];
}
input.resize(dest);
}
At least in my view, the big thing to keep in mind here is the general pattern. You'll almost certainly run into a lot more situations where you want to filter some inputs to those that fit some criteria, and this basic pattern of tracking source and destination, and copying only those from the source to the destination that fit your criteria works well in a lot of situations, not just this one.
Upvotes: 7
Reputation: 3461
This code is based on the insight that an element is unique in a sorted list if and only if it is different from both elements immediately adjacent to it (except for the starting and ending elements, which are adjacent to one element each). This is true because all identical elements must be adjacent in a sorted array.
void keep_unique(vector <int> &v){
if(v.size() < 2){return;}
vector <int> unique;
// check the first element individually
if(v[0] != v[1]){
unique.push_back(v[0]);
}
for(unsigned int i = 1; i < v.size()-1; ++i){
if(v[i] != v[i-1] && v[i] != v[i+1]){
unique.push_back(v[i]);
}
}
// check the last item individually
if(v[v.size()-1] != v[v.size()-2]){
unique.push_back(v[v.size()-1]);
}
v = unique;
}
Upvotes: 4
Reputation: 19133
Generally one has to be very careful when deleting from containers while iterating over them. C++ STL can do this easily and faster (on average) than using nested loops.
#include <vector>
#include <algorithm>
#include <unordered_set>
int main() {
std::vector<int> c{1,2,2,4,5,5,6};
std::unordered_multiset<int> unique( c.begin(), c.end() );
c.erase(std::remove_if(c.begin(),c.end(),[&](const auto& e){return unique.count(e)>1;}),c.end());
for(auto e: c){
std::cout<<e<<' ';
}
}
//Output: 1 4 6
Alternatively, you could use std::map<int,std::size_t>
and count the occurences this way.
Upvotes: 2
Reputation: 14886
Use std::remove_if to move items occurring multiple times to the rear, then erase them.
#include <iostream>
#include <vector>
#include <algorithm>
int main()
{
std::vector<int> V {1,2,2,4,5,5,6};
auto it = std::remove_if(V.begin(), V.end(), [&](const auto& val)
{
return std::count(V.begin(), V.end(), val) > 1;
});
V.erase(it, V.end());
for (const auto& val : V)
std::cout << val << std::endl;
return 0;
}
Output:
1
4
6
For demo: https://godbolt.org/z/j6fxe1
Upvotes: 1
Reputation: 5311
This can be achieved with the proper use of iterators to avoid runtime errors.
Have a look at the following code:
#include <iostream>
#include <vector>
int main() {
std::vector<int> c{1,2,2,4,5,5,6};
for (auto it = c.begin(); it != c.end(); ){
auto it2 = it;
advance(it2, 1);
bool isDuplicate = false;
for(; it2 != c.end(); ++it2){
if(*it == *it2){
c.erase(it2);
isDuplicate = true;
}
}
if(isDuplicate){
auto it3 = it;
advance(it3, 1);
c.erase(it);
it = it3;
}else{
it++;
}
}
for (auto it = c.begin(); it != c.end(); it++){
std::cout<<*it<<" ";
}
}
Output:
1 4 6
Upvotes: 0