I'm trying to write a C program that removes all occurrences of repeating chars in a string except the last occurrence. For example, if I had the string
char word[]="Hihxiivaeiavigru";
output should be:
printf("%s",word);
hxeavigru
What I have so far:
#include <stdio.h>
#include <string.h>
int main()
{
char word[]="Hihxiiveiaigru";
for (int i=0;i<strlen(word);i++){
if (word[i+1]==word[i]);
memmove(&word[i], &word[i + 1], strlen(word) - i);
}
printf("%s",word);
return 0;
}
I am not sure what I am doing wrong.
Upvotes: 1
Views: 142
Reputation: 12634
I'm trying to write a C program that removes all occurrences of repeating chars in a string except the last occurrence.
Process the string (or word) from its last character toward its first. Seen in that direction, the problem becomes removing every occurrence of a character except the first one encountered. Because the string is processed from the end, the characters that remain are collected at the end of the buffer, so once the whole string has been processed they have to be moved to the start of the string, and only if duplicate characters were actually found. The complexity of this algorithm is O(n).
Implementation:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define INDX(x) (tolower((unsigned char)(x)) - 'a') /* cast: tolower() is undefined for negative char values */
void remove_dups_except_last (char str[]) {
int map[26] = {0}; /* to keep track of a character processed */
size_t len = strlen (str);
char *p = str + len; /* pointer pointing to null character of input string */
size_t i = 0;
for (i = len; i != 0; --i) {
if (map[INDX(str[i - 1])] == 0) {
map[INDX(str[i - 1])] = 1;
*--p = str[i - 1];
}
}
/* copy only if duplicate characters were removed */
if (p != str) {
for (i = 0; *p; ++i) {
str[i] = *p++;
}
str[i] = '\0';
}
}
int main(int argc, char* argv[])
{
if (argc != 2) {
printf ("Invalid number of arguments\n");
return -1;
}
char str[1024] = {0};
/* Assumption: the input string/word will contain characters A-Z and a-z
* only and size of input will not be more than 1023.
*
* Leaving it up to you to check the valid characters in input string/word
*/
strcpy (str, argv[1]);
printf ("Original string : %s\n", str);
remove_dups_except_last (str);
printf ("Removed duplicated characters except the last one, modified string : %s\n", str);
return 0;
}
Testcases output:
# ./a.out Hihxiivaeiavigru
Original string : Hihxiivaeiavigru
Removed duplicated characters except the last one, modified string : hxeavigru
# ./a.out aa
Original string : aa
Removed duplicated characters except the last one, modified string : a
# ./a.out a
Original string : a
Removed duplicated characters except the last one, modified string : a
# ./a.out TtYyuU
Original string : TtYyuU
Removed duplicated characters except the last one, modified string : tyU
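The input check left to the reader above could be sketched roughly as follows. The name is_valid_word and its bufsize parameter are made up for illustration and are not part of the program above. The sketch assumes the default "C" locale, where isalpha() accepts only A-Z and a-z.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (an assumption, not part of the answer's program):
 * returns true only if s fits in a buffer of bufsize bytes, including the
 * terminating '\0', and consists solely of letters. */
bool is_valid_word (const char *s, size_t bufsize) {
    size_t len = strlen (s);
    if (len >= bufsize)                 /* no room left for the '\0' */
        return false;
    for (size_t i = 0; i < len; i++) {
        /* cast to unsigned char: isalpha() is undefined for negative values */
        if (!isalpha ((unsigned char)s[i]))
            return false;
    }
    return true;
}
```

In main this could be called as is_valid_word (argv[1], sizeof str) before the strcpy, printing an error and returning if it fails.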
Upvotes: 1
Reputation: 153338
With short strings, any algorithm will do. OP's attempt is O(n*n), as are the other working answers, including the one from @David C. Rankin that identified OP's shortcomings.
But what if the string were thousands or millions of characters long?
Consider the following algorithm: @paulsm4
Form a `bool` array used[CHAR_MAX - CHAR_MIN + 1] and set each element to false.
i = unique = n - 1;
From the end of the string (n-1 to 0) to the front:
    if (character never seen yet) { // used[] look-up
        array[unique] = array[i];
        unique--;
    }
    Mark used[array[i]] as true (index from CHAR_MIN)
    i--;
Shift the string "to the left" (unique - i) places
Solution is O(n)
Coding goal is too fun to just post a fully coded answer.
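For anyone who wants to compare notes after trying it themselves, here is one possible sketch of the steps above in C. The function name keep_last_occurrences is made up for illustration. As a small substitution, it indexes used[] by the character converted to unsigned char instead of offsetting from CHAR_MIN, and it tracks the first kept position rather than the next free slot. It is case-sensitive.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Sketch only: keep the last occurrence of each character, in place, O(n). */
void keep_last_occurrences (char *s) {
    bool used[UCHAR_MAX + 1] = { false };   /* one flag per possible char value */
    size_t n = strlen (s);
    size_t unique = n;                      /* kept chars live in s[unique..n-1] */

    for (size_t i = n; i-- > 0; ) {         /* walk from the last char to the first */
        unsigned char c = (unsigned char)s[i];
        if (!used[c]) {                     /* not seen yet when scanning from the right */
            used[c] = true;
            s[--unique] = s[i];             /* unique > i here, so no unread char is overwritten */
        }
    }
    /* shift the kept suffix, together with its '\0', to the front */
    memmove (s, s + unique, n - unique + 1);
}
```

Called on a writable buffer holding "Hihxiivaeiavigru", this should leave "Hhxeavigru"; the leading H survives because the comparison is case-sensitive.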
Upvotes: 2
Reputation: 84521
Additional areas where you are currently hurting yourself:
Your for loop must NOT increment the index, e.g. for (int i=0; word[i];). This is because when you memmove() by 1, you have in effect already advanced to the next character. That also means the value to save for last is now i - 1.
There is no need to call strlen() repeatedly in the program. You can simply subtract one from the length each time memmove() is called.
Increment the loop counter only when memmove() is not called.
Additionally, avoid hardcoding strings. You shouldn't have to recompile your code just to test the results of "Hihxiivaeiaigrui" instead of "Hihxiivaeiaigru". You shouldn't have to recompile just to remove all but the last 'a' instead of the 'i'. Either pass the string and character to find as arguments to your program (that's what int argc, char **argv are for), or prompt the user for input.
Putting it all together you could do (presuming word is 1023 characters or less):
#include <stdio.h>
#include <string.h>
#define MAXC 1024
int main (int argc, char **argv) {
char word[MAXC]; /* storage for word */
strcpy (word, argc > 1 ? argv[1] : "Hihxiivaeiaigru"); /* copy to word */
int find = argc > 2 ? *argv[2] : 'i', /* character to find */
last = -1; /* last index where find found */
size_t len = strlen (word); /* only compute strlen once */
printf ("%s (removing all but last %c)\n", word, find);
for (int i=0; word[i];) { /* loop over each char -- do NOT increment */
if (word[i] == find) { /* is this my character to find? */
if (last != -1) { /* if last is set */
/* overwrite last with rest of word */
memmove (&word[last], &word[last + 1], (int)len - last);
last = i - 1; /* last now i - 1 (we just moved it) */
len = len - 1;
}
else { /* last not set */
last = i; /* set it */
i++; /* increment loop counter */
}
}
else /* all other chars */
i++; /* just increment loop counter */
}
puts (word); /* output result -- no need for printf (no conversions) */
}
Example Use/Output
$ ./bin/rm_all_but_last_occurrence
Hihxiivaeiaigru (removing all but last i)
Hhxvaeaigru
What if you want to use "Hihxiivaeiaigrui"? Just pass it as the 1st argument:
$ ./bin/rm_all_but_last_occurrence Hihxiivaeiaigrui
Hihxiivaeiaigrui (removing all but last i)
Hhxvaeagrui
What if you want to use "Hihxiivaeiaigrui" and remove duplicate 'a' characters? Just pass the string to search as the 1st argument and the character to find as the second:
$ ./bin/rm_all_but_last_occurrence Hihxiivaeiaigrui a
Hihxiivaeiaigrui (removing all but last a)
Hihxiiveiaigrui
Nothing removed if only one of the characters:
$ ./bin/rm_all_but_last_occurrence Hihxiivaeiaigrui H
Hihxiivaeiaigrui (removing all but last H)
Hihxiivaeiaigrui
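If you would rather prompt the user for input than pass arguments, a minimal sketch could look like the following. read_word is a hypothetical helper name, not part of the program above; it relies only on fgets() and strcspn().

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp into buf (at most size - 1
 * characters) and strip the trailing newline, if there is one.
 * Returns false on end-of-file or read error. */
bool read_word (char *buf, size_t size, FILE *fp) {
    if (fgets (buf, (int)size, fp) == NULL)
        return false;
    buf[strcspn (buf, "\n")] = '\0';    /* overwrite '\n' (if any) with '\0' */
    return true;
}
```

In the program above, the strcpy() line could then be replaced by something like: if (!read_word (word, MAXC, stdin)) return 1; preceded by a prompt written with fputs().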
Let me know if you have further questions.
Upvotes: 2
Reputation: 8043
You can iterate over each character of your string and copy it to a new string if it is not 'i', or if it is the last occurrence of 'i'.
#include <stdio.h>
#include <string.h>
int main() {
char word[]="Hihxiiveiaigru";
char newword[10000];
char* ptr = strrchr(word, 'i');
int index=0;
int index2=0;
while (index < strlen(word)) {
if (word[index]!='i' || index ==(ptr - word)) {
newword[index2]=word[index];
index2++;
}
index++;
}
newword[index2] = '\0'; /* terminate the copy before printing it */
printf("%s",newword);
return 0;
}
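The same idea can also be done in place, without the second buffer. The sketch below is one way to do it under the assumption that the character to search for is passed in; remove_all_but_last is a made-up name. It also covers the case where the character does not occur at all, in which strrchr() returns NULL.

```c
#include <string.h>

/* Sketch only: remove every occurrence of find from word except the last
 * one, compacting the string in place. */
void remove_all_but_last (char *word, char find) {
    char *last = strrchr (word, find);  /* NULL if find does not occur */
    char *dst = word;                   /* next position to write to */

    if (last == NULL || find == '\0')   /* nothing to remove */
        return;
    for (char *src = word; *src; src++) {
        if (*src != find || src == last)
            *dst++ = *src;              /* keep non-matches and the last match */
    }
    *dst = '\0';                        /* terminate the shortened string */
}
```

With word holding "Hihxiiveiaigru" and find set to 'i', the buffer should end up as "Hhxveaigru".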
Upvotes: 0
Reputation: 201409
I would first write a function to determine if a char ch at a given position p is the last occurrence of ch in a given char *. Like,
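bool isLast(const char *word, char ch, int p) {
p++; /* start looking just after position p */
ch = tolower((unsigned char)ch); /* compare case-insensitively */
while (word[p] != '\0') {
if (tolower((unsigned char)word[p]) == ch) {
return false; /* ch occurs again later, so p is not the last */
}
p++;
}
return true;
}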
Then you can use that to iteratively emit your desired characters like
int main() {
char *word = "Hihxiivaeiavigru";
for (int i = 0; word[i] != '\0'; i++) {
if (isLast(word, word[i], i)) {
putchar(word[i]);
}
}
putchar('\n');
}
And (for completeness) I used
#include <stdio.h>
#include <ctype.h>
#include <stdbool.h>
Outputs (as requested)
hxeavigru
Upvotes: 1