iMataMi

Reputation: 59

Remove duplicate elements from pointer array C

I am trying to take user input and print each word on a separate line, without duplicates. What I have so far takes the input and prints each word on its own line in alphabetical order. What I need now is a way to remove the duplicates from the array, which is char* argv[].

My Input:

./a.out banana apple apple apple zoo cat fork

My output:

apple
apple
apple
banana
cat
fork
zoo

What needs to happen is that apple is printed once instead of three times.

Here is what I have done so far. I have commented out the part of the code where the problem is.

#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[]) {
  int i, j, k, size;
  size = argc -1;
  char *key;
  char* a[argc-1];

  for (i = 2; i < argc; i++) {
    key = argv[i];

    j = i-1;
    while (j >= 1 && strcmp(argv[j], key) > 0) {
      argv[j+1] = argv[j];
      j--;
    }

    argv[j+1] = key;
  }

  //Problem
  //for(i = 2; i < size; i++){
  //    if(argv[i-1] != argv[i])
  //      a[i] = argv[i-1];
  //}

  //for(i=0; i< size; i++)
  //  puts(a[i]);

  for(i=1; i< argc; i++)
    puts(argv[i]);

  return 0;
}

Upvotes: 1

Views: 2696

Answers (2)

Vlad from Moscow

Reputation: 311018

First of all, you could use the standard C function qsort, declared in the header <stdlib.h>.

If you want to output the parameters excluding duplicates then there is no need to remove parameters. You can just output unique parameters.

The program can look like this:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

int cmp(const void *left, const void *right)
{
    return strcmp(*(const char **)left, *(const char **)right);
}

int main( int argc, char * argv[] )
{
    if (argc > 1)
    {
        qsort(argv + 1, argc - 1, sizeof(*argv), cmp);

        for (int i = 1; i < argc; )
        {
            puts(argv[i]);
            while (argv[++i] != NULL && 
                   strcmp(argv[i - 1], argv[i] ) == 0);
        }
    }

    return 0;
}

If you supply these command-line parameters

banana apple apple apple zoo cat fork

then the program output will be the following

apple
banana
cat
fork
zoo

If you really do want to "remove" the duplicated parameters, then argc must hold the correct count for the modified parameter list and argv[argc] must be equal to NULL.

The program can look like this:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

int cmp(const void *left, const void *right)
{
    return strcmp(*(const char **)left, *(const char **)right);
}

int main( int argc, char * argv[] )
{
    if (argc > 1)
    {
        qsort(argv + 1, argc - 1, sizeof(*argv), cmp);

        int n = 1;

        for (int i = 1; i < argc; i++)
        {
            int j = 1;
            while (j < n && strcmp(argv[j], argv[i]) != 0) j++;

            if (j == n)
            {
                if (n != i) argv[n] = argv[i];
                ++n;
            }
        }

        argc = n;
        argv[argc] = NULL;
    }

    for ( int i = 1; i < argc; i++ ) puts(argv[i]);

    return 0;
}

Its output will be the same as shown above.
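
Because the parameters are already sorted, equal strings end up next to each other, so it is enough to compare each element with the last one that was kept. The following is only a sketch of that idea, not part of the program above: dedup_sorted_in_place is a hypothetical helper name, and it assumes the array has room for one extra pointer after the last element, as argv does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: compacts a sorted array of strings in place so that
 * each distinct string appears once. Assumes items has room for count + 1
 * pointers, because the slot after the last kept element is set to NULL.
 * Returns the number of elements kept.
 */
static size_t dedup_sorted_in_place(char *items[], size_t count)
{
    size_t kept = 0;

    assert(items != NULL);

    for (size_t i = 0; i < count; i++) {
        /* Keep the first element and any element that differs from the last kept one. */
        if (kept == 0 || strcmp(items[kept - 1], items[i]) != 0) {
            items[kept++] = items[i];
        }
    }

    items[kept] = NULL;
    return kept;
}
```

After the qsort call it could be used as argc = 1 + (int)dedup_sorted_in_place(argv + 1, (size_t)(argc - 1)); so that argc and argv[argc] stay consistent.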

Upvotes: 2

Bodo Thiesen

Reputation: 2514

In C, a string is not a data type of its own but a convention: an array of char terminated by a null character. It is important to keep that in mind.

Now, if you have two char pointers, then even if the strings they point to are identical, the addresses at which they are stored may not be the same. So, the comparison

if (argv[i-1] != argv[i])

will not check for the strings to be identical, but instead will only compare the addresses. So:

char * a = "Hello world!\n";
char * b = a;

if (a == b) {
    puts("Always true\n");
}

That is because here you don't actually copy the string; you only copy the address where the string is stored. This also leads to the following effect:

char a[20] = "Hello world!\n";
char * b = a;
a[0] = 'X';
puts(b); // Will print "Xello world!\n";

So, what you need is some way to compare the strings pointed to by the two variables. That's what strcmp is for. It returns 0 if the strings are identical, a negative value if the first string sorts before the second, and a positive value if it sorts after. The exact nonzero value is not specified, so test only its sign rather than comparing against -1 or +1.
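
To make the difference concrete, here is a minimal sketch with two hypothetical helpers (same_address and same_text are names made up for illustration): one compares the pointers, the other compares what they point to.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true only when both pointers hold the same address. */
static bool same_address(const char *a, const char *b)
{
    return a == b;
}

/* Hypothetical helper: true when the two strings have the same contents. */
static bool same_text(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

For two separate arrays that both contain "apple", same_address reports false while same_text reports true, which is the situation with repeated command-line arguments.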

So, to check that two neighbouring strings differ, use the test

if (strcmp(argv[i-1], argv[i]) != 0)

and all should work as expected. Keep in mind that your array a doesn't contain copies of the strings; it only holds pointers to the strings in argv.

/edit: Actually, there are some other small bugs, but once this problem is solved, I guess you will be able to fix the rest too.
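
As a rough sketch of how the commented-out loop could look once those details are sorted out: the helper below (copy_unique_sorted is a hypothetical name, not something from the question) starts at the first element, fills the output array from index 0, and compares string contents instead of addresses. It assumes the input is already sorted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copies pointers from a sorted string array into out,
 * skipping neighbours with equal contents. Only the pointers are copied; the
 * strings are still shared with in. out must have room for count pointers.
 * Returns the number of pointers written to out.
 */
static size_t copy_unique_sorted(char *const in[], size_t count, char *out[])
{
    size_t kept = 0;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < count; i++) {
        /* Keep the first element and any element that differs from the last kept one. */
        if (kept == 0 || strcmp(out[kept - 1], in[i]) != 0) {
            out[kept++] = in[i];
        }
    }

    return kept;
}
```

In the question's main this could be called as size_t n = copy_unique_sorted(argv + 1, (size_t)(argc - 1), a); followed by a loop that prints a[0] through a[n-1]. Note that the declaration char* a[argc-1] needs at least one argument to be valid, so the call should be guarded by argc > 1.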

=== recommendation hinted at in the comment ===

In a code like this:

for (i = 2; i < size; i++) {
    if (argv[i-1] != argv[i])
        a[i] = argv[i-1];
}

you can easily get things wrong by adding a line like so:

for (i = 2; i < size; i++) {
    if (argv[i-1] != argv[i])
        printf("argv[%i] and argv[%i] differ\n", i-1, i);
        a[i] = argv[i-1];
}

because you may forget to add the curly braces, so the assignment is no longer controlled by the if and runs on every iteration. So I suggest always adding braces, even for one-liners, like so:

for (i = 2; i < size; i++) {
    if (argv[i-1] != argv[i]) {
        a[i] = argv[i-1];
    }
}

even though they are not strictly necessary here.

Upvotes: 0
