Reputation: 137
I've just started experimenting with C, coming from a Java background, and I'm having trouble removing a section of a string. I have a string (which I found out is an array of chars in C, very cool!), and once a certain condition is met while going through it, I want to delete the rest of the string. For example, if my string were "hello world!" and the condition were a blank space, I would like to delete everything following that blank space and be left with "hello ". My idea was to note the index where the condition is met, create a second array, fill it, and then delete the original, but I'm sure there is a better way of doing this. Any help would be greatly appreciated. Thank you all in advance!
Edit: The idea is that I want to take in user input. The specific case I care about is a single dot "." that has a newline as both the previous and the next character, or a newline as the previous character and the terminating null as the next one. So basically:
if (string[n] == '.')
{
    if ((string[n-1] == '\n' && string[n+1] == '\n') || (string[n-1] == '\n' && string[n+1] == '\0'))
    {
        /* then remove everything past this point */
    }
}
input:
hello world
this is ok.
.
Everything here will be deleted.
Output:
hello world
this is ok.
.
Edit 2: Thank you all for the great advice so far. I am still running into issues with the program, however, so here is the code for the main method so far. It only tests the "delete the rest of the string" part; I have not added user input yet.
//main method
int main(void)
{
char test = "This is a sample text.\
The file will be terminated by a single dot: .\
The program continues processing the lines because the dot (.)\
did not appear at the beginning.\
. even though this line starts with a dot, it is not a single dot.\
The program stops processing lines right here.\
.\
You wont be able to feed any more lines to the program.";
int n =0;
while(test[n] != NULL)
{
if (test[n]=='.')
{
if ((test[n-1]=='\n' && test[n+1]=='\n') || (test[n-1]=='\n' && test[n+1]==NULL))
{
test[n] = '\0';
}
}
n++;
}
printf("%c\n",test);
return 0;
}
The idea is that I will eventually send the string word by word to an insertion-sort linked-list function and sort the words alphabetically, after removing everything after the dot as specified. The problem now is that I am getting compiler errors. If anybody could help me sort them out, I would greatly appreciate it.
Errors in main:
345500375/source.c: In function ‘main’:
345500375/source.c:64:17: warning: initialization makes integer from pointer without a cast [-Wint-conversion]
char test = "This is a sample text.\
^~~~~~~~~~~~~~~~~~~~~~~~
The file will be terminated by a single dot: .\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The program continues processing the lines because the dot (.)\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
did not appear at the beginning.\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
. even though this line starts with a dot, it is not a single dot.\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The program stops processing lines right here.\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\
~~
You wont be able to feed any more lines to the program.";
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
345500375/source.c:73:15: error: subscripted value is neither array nor pointer nor vector
while(test[n] != NULL)
^
345500375/source.c:75:17: error: subscripted value is neither array nor pointer nor vector
if (test[n]=='.')
^
345500375/source.c:77:21: error: subscripted value is neither array nor pointer nor vector
if ((test[n-1]=='\n' && test[n+1]=='\n') || (test[n-1]=='\n' && test[n+1]==NULL))
^
345500375/source.c:77:40: error: subscripted value is neither array nor pointer nor vector
if ((test[n-1]=='\n' && test[n+1]=='\n') || (test[n-1]=='\n' && test[n+1]==NULL))
^
345500375/source.c:77:61: error: subscripted value is neither array nor pointer nor vector
if ((test[n-1]=='\n' && test[n+1]=='\n') || (test[n-1]=='\n' && test[n+1]==NULL))
^
345500375/source.c:77:80: error: subscripted value is neither array nor pointer nor vector
if ((test[n-1]=='\n' && test[n+1]=='\n') || (test[n-1]=='\n' && test[n+1]==NULL))
^
345500375/source.c:79:19: error: subscripted value is neither array nor pointer nor vector
test[n] = '\0';
^
Upvotes: 0
Views: 178
Reputation: 12668
First of all, your compilation errors come from the fact that you have declared a char variable, not a char array. You can declare a char array with char variable[] and initialize it with a string; in that case you get an array of n elements, where n is the string length in characters plus one for the final \0 char. Or you can specify a length between the brackets and initialize it as well; the unused part of the array is then filled with \0 chars, as in char variable[30] = "hello"; /* the five chars of "hello" plus 25 '\0' chars */
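As a rough illustration of the two declaration styles described above, the sketch below reports the sizes the compiler picks. The names struct sizes and declaration_sizes are made up for this example and are not part of the question's code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the three numbers we want to look at. */
struct sizes {
    size_t auto_sized;   /* sizeof an array sized by its initializer */
    size_t fixed_sized;  /* sizeof an array with an explicit length */
    size_t text_length;  /* strlen of the text stored in the fixed array */
};

struct sizes declaration_sizes(void)
{
    char auto_sized[] = "hello";     /* 5 chars plus the final '\0' */
    char fixed_sized[30] = "hello";  /* 5 chars plus 25 '\0' chars */
    struct sizes r;

    r.auto_sized = sizeof auto_sized;
    r.fixed_sized = sizeof fixed_sized;
    r.text_length = strlen(fixed_sized);  /* stops at the first '\0' */
    return r;
}
```

Note that strlen() reports 5 for the 30-element array: the array keeps its size, but string functions only see the part before the first '\0'.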
In Java, Strings are immutable. You can extract a substring from one, but the result is a different instance of the class String. In C, a string is simply an array of chars. For C functions dealing with strings, a string extends until the function encounters a '\0' character, and all processing of the array (whose size stays the same) stops when that '\0' is found. So the simplest way to cut a string at some point is to put a '\0' character there.
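For the "hello world!" example from the question, a minimal sketch of this idea could look as follows. The helper name cut_after_char is hypothetical, and it assumes the string lives in a writable array, not in a string literal.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: keeps everything up to and including the first
 * occurrence of `stop` and cuts the rest by writing a '\0' after it.
 * `s` must point to writable memory. */
void cut_after_char(char *s, char stop)
{
    char *p = strchr(s, stop);

    /* strchr() also matches the terminator when stop == '\0';
     * in that case there is nothing after it to cut. */
    if (p != NULL && *p != '\0')
        p[1] = '\0';
}
```

Called with a buffer declared as char s[] = "hello world!"; and a stop character of ' ', this leaves "hello " in s. No second array is needed, and nothing is freed or resized.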
BTW, don't use a final \ to continue a string on the next line. That style has been superseded by string literal concatenation, which has been part of standard C for decades. With the backslash, the compiler removes both the backslash and the newline from the source, so the continuation line is appended to the string literal as if you had written it directly after the end of the previous line, with no newline character in between. This is, IMHO, not what you want. With the newer syntax you continue a string on the next line by terminating it (with ") and starting it again on the next line (again with "), as below (so this code is equivalent to what you have written):
char test[] = /* now test is a char array, you need the pair of [] brackets */
"This is a sample text."
"The file will be terminated by a single dot: ."
"The program continues processing the lines because the dot (.)"
"did not appear at the beginning."
". even though this line starts with a dot, it is not a single dot."
"The program stops processing lines right here."
"."
"You wont be able to feed any more lines to the program.";
and also equivalent to this one:
char test[] = /* now test is a char array, you need the pair of [] brackets */
"This is a sample text.The file will be terminated by a single dot: .The program continues processing the lines because the dot (.)did not appear at the beginning.. even though this line starts with a dot, it is not a single dot.The program stops processing lines right here..You wont be able to feed any more lines to the program.";
But if you want the newlines to be included in the string literal, then you have to write explicit \n characters in it, as below:
char test[] = /* now test is a char array, you need the pair of [] brackets */
"This is a sample text.\n"
"The file will be terminated by a single dot: .\n"
"The program continues processing the lines because the dot (.)\n"
"did not appear at the beginning.\n"
". even though this line starts with a dot, it is not a single dot.\n"
"The program stops processing lines right here.\n"
".\n"
"You wont be able to feed any more lines to the program.\n";
If you want to end the string at the single dot that is preceded and followed by a \n, you can use the strstr() function (declared in <string.h>) to find the sequence you are looking for and put a '\0' in the appropriate place.
char *p = strstr(test, "\n.\n");
/* p (if found, i.e. not NULL) points to the first \n, so the dot is at p[1].
 * Writing to p[1] is safe: the whole three-char sequence was found inside
 * the string, so p[1] lies within the array. */
if (p) /* this is the same as if (p != NULL) */
    p[1] = '\0'; /* put a \0 in the position of the dot */
printf("The cut text is: %s", test);
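The snippet above only handles a dot that sits between two newlines. A possible extension, sketched here under the assumption that a lone dot on the very first line or at the very end of the string should also count, is shown below. The function names find_single_dot and cut_at_single_dot are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the first '.' that stands
 * alone on its line, or NULL if there is none. */
char *find_single_dot(char *s)
{
    /* A lone dot on the first line has no '\n' in front of it.
     * s[1] always exists here: at worst it is the terminating '\0'. */
    if (s[0] == '.' && (s[1] == '\n' || s[1] == '\0'))
        return s;

    for (char *p = strstr(s, "\n."); p != NULL; p = strstr(p + 1, "\n.")) {
        /* p[2] always exists: at worst it is the terminating '\0'. */
        if (p[2] == '\n' || p[2] == '\0')
            return p + 1;
    }
    return NULL;
}

/* Cuts the string at the lone dot, like the strstr() snippet above. */
void cut_at_single_dot(char *s)
{
    char *dot = find_single_dot(s);

    if (dot != NULL)
        *dot = '\0';
}
```

To keep the dot in the output, as in the expected output shown in the question, one could write the '\0' to dot[1] instead of *dot; that position is always inside the array for the same reason p[2] is.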
Your code has yet another error, and this one is serious: it leads to undefined behavior at run time, which the compiler may not detect and which may go unnoticed until the code has been in use for a long time. C doesn't check array bounds like Java does. If you happen to find a dot character in the first position of the array, n will be 0, and when you evaluate test[n-1] in your if statement you'll be accessing the element before the first one in the array (that is, test[-1]). This would throw an ArrayIndexOutOfBoundsException in Java, but C has no such protection. The problem does not occur at the end of the string, even though you also access the char after the dot: if the dot is the last character of the string, the following char (and there must be one) is the terminating \0, so no out-of-bounds access happens there (contrary to what some other answers have suggested).
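One way to guard the index-based loop from the question against the test[-1] access is sketched below. It assumes that a dot at index 0 counts as being at the start of a line, which the question does not state, and the function name cut_after_lone_dot is hypothetical. It also compares chars against '\0' instead of NULL, since NULL is a pointer constant.

```c
#include <stddef.h>

/* Hypothetical helper: index-based scan, as in the question, that keeps
 * the lone dot and drops everything after it. */
void cut_after_lone_dot(char *test)
{
    for (size_t n = 0; test[n] != '\0'; n++) {
        if (test[n] != '.')
            continue;

        /* Only look at test[n - 1] when n > 0, so test[-1] is never read. */
        int at_line_start = (n == 0) || (test[n - 1] == '\n');
        /* test[n + 1] is safe: at worst it is the terminating '\0'. */
        int at_line_end = (test[n + 1] == '\n') || (test[n + 1] == '\0');

        if (at_line_start && at_line_end) {
            test[n + 1] = '\0';  /* keep the dot, drop the rest */
            return;
        }
    }
}
```

The short-circuit evaluation of || is what makes the n == 0 check effective: the right-hand side is not evaluated when the left-hand side is already true.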
Upvotes: 1
Reputation:
I presume that you are using char * as your "string". In truth, this is not comparable to Java's strings, or to strings in most languages in general.
A number of developers and libraries (and languages, except maybe Pascal) use something closer to a struct in order to store a string, for example:
struct string {
char * pointer;
unsigned short length;
};
This offers a few advantages, namely O(1) time complexity length lookups.
In your case, it would allow you to quickly create substrings or slices from your old string, while not modifying memory at all.
If we were to use that very rudimentary struct that I had provided as an example:
// note: the keyword struct must precede every use of a struct type, which is why structs are often given a typedef alias
struct string userInput = {
.pointer = "words in str.ing",
.length = 16
};
// ...
unsigned short whereDotIs = 13;
struct string result = {
.pointer = userInput.pointer,
.length = whereDotIs
};
At this point, you could still read past the dot (.), but that memory is not a part of the abstract idea of a "string" that we have now established; it is just random memory.
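To actually use such a slice with the standard library, the length has to be passed along explicitly. The sketch below is one possible way to do that with the "%.*s" conversion; the helpers string_prefix and string_format are invented for this example, and the pointer is declared const here (unlike in the struct above) to make it explicit that the underlying chars are never modified.

```c
#include <stdio.h>

/* Same idea as the struct above, with a const pointer. */
struct string {
    const char *pointer;
    unsigned short length;
};

/* Hypothetical helper: a slice holding the first `length` chars of `s`.
 * Nothing is copied and the original memory is not written to. */
struct string string_prefix(struct string s, unsigned short length)
{
    struct string result;

    result.pointer = s.pointer;
    result.length = (length <= s.length) ? length : s.length;
    return result;
}

/* Hypothetical helper: writes the slice into `buf` as an ordinary
 * '\0'-terminated string. "%.*s" takes the maximum number of chars
 * to print as an int argument placed before the pointer. */
int string_format(char *buf, size_t size, struct string s)
{
    return snprintf(buf, size, "%.*s", (int)s.length, s.pointer);
}
```

With a slice of length 13 taken from "words in str.ing", string_format() produces "words in str." while the original chars stay untouched.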
There are, however, cases in which you need to work with null-terminated character pointers. In those cases @Mustafa Quraish's answer will suffice, unless the string is const-qualified (or is a string literal, which must not be modified), in which case you would have to stick to your original solution: copying the array into a new one.
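For the const-qualified case, copying the part you want to keep could look roughly like this. The helper name copy_prefix is hypothetical, and the caller is assumed to release the result with free().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated, '\0'-terminated copy of
 * the first `count` chars of `src` (or of all of `src` if it is shorter).
 * Returns NULL if the allocation fails. The caller must free() the result. */
char *copy_prefix(const char *src, size_t count)
{
    size_t len = strlen(src);
    char *copy;

    if (count > len)
        count = len;

    copy = malloc(count + 1);  /* one extra byte for the '\0' */
    if (copy == NULL)
        return NULL;

    memcpy(copy, src, count);
    copy[count] = '\0';
    return copy;
}
```

For example, copy_prefix("hello world!", 6) yields a fresh buffer holding "hello " and leaves the source string as it was.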
Upvotes: 1
Reputation: 694
This would do the trick (assuming all indices are in bounds):
if (string[n]=='.') {
if (string[n-1]=='\n' && string[n+1]=='\n') {
string[n+1] = '\0';
}
}
Upvotes: 2