Reputation: 33
I'm currently using the following code to scan each word in a text file, put it into a variable, then do some manipulations with it before moving on to the next word. This works fine, but I'm trying to remove all characters that don't fall under A-Z / a-z.
E.g. if "he5llo" was entered, I want the output to be "hello". If I can't modify fscanf to do it, is there a way of doing it to the variable once scanned? Thanks.
while (fscanf(inputFile, "%s", x) == 1)
Upvotes: 3
Views: 4755
Reputation: 1139
luser droog's answer will work, but in my opinion it is more complicated than necessary.
For your simple example you could try this:
while (fscanf(inputFile, "%[A-Za-z]", x) == 1) { // read a run of alpha characters
    fscanf(inputFile, "%*[^A-Za-z]"); // discard the non-alpha characters that follow and continue
}
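One thing to keep in mind with the scanset loop above: each run of letters comes back as its own string, so "he5llo" is read as "he" and then "llo", and input that starts with a non-letter stops the loop at once. Here is a minimal sketch that skips leading non-letters and bounds the read. The names next_letter_run and RUN_MAX are hypothetical, and the A-Z range syntax inside a scanset is implementation-defined, although common C libraries treat it as a range.

```c
#include <stdio.h>

enum { RUN_MAX = 99 };  /* hypothetical limit; buffers need RUN_MAX + 1 bytes */

/* Reads the next run of ASCII letters from `in` into `out`.
   Returns 1 on success, 0 when no more letters are available. */
int next_letter_run(FILE *in, char *out)
{
    /* Skip any non-letters first; this matches nothing if a letter is next. */
    if (fscanf(in, "%*[^A-Za-z]") == EOF) {
        return 0;  /* end of input before any letter */
    }
    /* The width 99 must stay in sync with RUN_MAX. */
    return fscanf(in, "%99[A-Za-z]", out) == 1;
}
```

Calling it in a loop, as in while (next_letter_run(inputFile, x)), visits every run of letters in the file. If the pieces of one word have to be joined back together, filtering the scanned word in place is probably simpler.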
Upvotes: 1
Reputation: 62838
You can pass x to a function like this. First, a simple version for the sake of understanding:
// header needed for isalpha()
#include <ctype.h>
void condense_alpha_str(char *str) {
int source = 0; // index of copy source
int dest = 0; // index of copy destination
// loop until original end of str reached
while (str[source] != '\0') {
if (isalpha((unsigned char)str[source])) { // cast: isalpha() needs a non-negative value
// keep only chars matching isalpha()
str[dest] = str[source];
++dest;
}
++source; // advance source always, whether char was copied or not
}
str[dest] = '\0'; // add new terminating 0 byte, in case string got shorter
}
It goes through the string in place, copying the chars that pass the isalpha() test and skipping (and thus removing) those that do not. To understand the code, it's important to realize that C strings are just char arrays, with a byte of value 0 marking the end of the string. Another important detail is that in C, arrays and pointers behave alike in many (not all!) ways, so a pointer can be indexed just like an array. Also, this simple version rewrites every kept byte in the string, even when the string doesn't actually change.
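Since a pointer can be indexed like an array, the same loop can also be written with two pointers instead of two indexes. This is only a sketch of that equivalent form; the name condense_alpha_ptr is hypothetical.

```c
#include <ctype.h>

/* Same in-place filtering as the indexed version, written with pointers. */
void condense_alpha_ptr(char *str)
{
    char *dest = str;  /* where the next kept character goes */
    for (const char *src = str; *src != '\0'; ++src) {
        /* The cast avoids undefined behaviour for negative char values. */
        if (isalpha((unsigned char)*src)) {
            *dest++ = *src;
        }
    }
    *dest = '\0';  /* terminate the (possibly shorter) string */
}
```

Both forms do the same work; which one reads better is mostly a matter of taste.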
Then a more full-featured version, which takes the filter function as a parameter, only writes to memory if str changes, and returns a pointer to str, like most library string functions do:
char *condense_str(char *str, int (*filter)(int)) {
int source = 0; // index of character to copy
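// optimization: skip initial matching chars
// (the '\0' check keeps a filter that accepts '\0' from running past the end)
while (str[source] && filter((unsigned char)str[source])) {
++source;
}
// source is now index of first non-matching char or end-of-string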
// optimization: only do condense loop if not at end of str yet
if (str[source]) { // '\0' is same as false in C
// start condensing the string from first non-matching char
int dest = source; // index of copy destination
do {
if (filter((unsigned char)str[source])) {
// keep only chars matching given filter function
str[dest] = str[source];
++dest;
}
++source; // advance source always, whether char was copied or not
} while (str[source]);
str[dest] = '\0'; // add terminating 0 byte to match condensed string
}
// follow convention of strcpy, strcat etc, and return the string
return str;
}
Example filter function:
int isNotAlpha(int ch) { // int parameter, so it matches int (*filter)(int)
return !isalpha(ch);
}
Example calls:
char sample[] = "1234abc";
condense_str(sample, isalpha); // use a library function from ctype.h
// note: return value ignored, it's just convenience not needed here
// sample is now "abc"
condense_str(sample, isNotAlpha); // use custom function
// sample is now "", empty
// fscanf code from question, with buffer overrun prevention
char x[100];
while (fscanf(inputFile, "%99s", x) == 1) {
condense_str(x, isalpha); // x modified in-place
...
}
Reference, from the manual for int isalpha(int c):
Checks whether c is an alphabetic letter.
Return Value:
A value different from zero (i.e., true) if indeed c is an alphabetic letter. Zero (i.e., false) otherwise
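Two details of that description matter in practice: the "true" result is any nonzero value, not necessarily 1, and the argument has to be representable as an unsigned char (or be EOF). A small sketch of a wrapper that hides both points; the name is_letter is hypothetical.

```c
#include <ctype.h>

/* Takes a plain char and returns exactly 0 or 1. */
int is_letter(char c)
{
    /* Cast first so that negative char values are never passed to isalpha(). */
    return isalpha((unsigned char)c) != 0;
}
```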
Upvotes: 3
Reputation: 19504
The scanf family of functions won't do this. You'll have to loop over the string and use isalpha to check each character, and "remove" a character with memmove by copying the rest of the string forward over it.
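The loop-and-memmove idea could look roughly like this. It is a sketch, not the only way to write it, and the function name strip_non_alpha is hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Removes every non-letter from str by shifting the tail left with memmove. */
void strip_non_alpha(char *str)
{
    size_t i = 0;
    while (str[i] != '\0') {
        if (isalpha((unsigned char)str[i])) {
            ++i;  /* keep this character and look at the next one */
        } else {
            /* Move the rest of the string, including the '\0', one place left.
               i is not advanced, because a new character now sits at str[i]. */
            memmove(&str[i], &str[i + 1], strlen(&str[i + 1]) + 1);
        }
    }
}
```

Each removal moves the whole remaining tail, so this does more copying than a single-pass source/destination loop, but for word-sized strings the difference is unlikely to matter.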
Maybe scanf can do it after all. Under most circumstances, scanf and friends will push any non-whitespace characters that fail to match back onto the input stream, so the next call sees them again.
This example uses scanf as a regex-like filter on the stream. Using the * conversion modifier means there's no storage destination for the negated pattern; it just gets eaten.
#include <stdio.h>
#include <string.h>
int main(void){
    enum { BUF_SZ = 80 }; // buffer size in one place
    char buf[BUF_SZ] = "";
    char fmtfmt[] = "%%%d[A-Za-z]"; // format string for the format string
    char fmt[sizeof(fmtfmt) + 3]; // storage for the real format string
    char nfmt[] = "%*[^A-Za-z]"; // negated pattern
    char *p = buf; // initialize the pointer
    int room = BUF_SZ - 1; // letters that still fit, leaving space for the '\0'
    sprintf(fmt, fmtfmt, room); // initialize the format string
    //printf("%s",fmt);
    while( scanf(fmt,p) != EOF // scan for format into buffer via pointer
            && scanf(nfmt) != EOF){ // scan for negated format
        p += strlen(p); // adjust pointer
        room = BUF_SZ - 1 - (int)strlen(buf);
        if (room == 0) break; // buffer full; a field width of 0 is not valid
        sprintf(fmt, fmtfmt, room); // adjust format string (re-init)
    }
    printf("%s\n",buf);
    return 0;
}
Upvotes: 0
Reputation: 53
I'm working on a similar project, so you're in good hands! Strip the word down into separate parts.
Blank spaces aren't an issue, since cin reads one word at a time. You can use a check such as
if( !ispunct(x) )
then increase the index by 1 and add the new string to a temporary string holder. You can select characters in a string as in an array, so finding the non-alpha characters and storing the new string is easy.
string x = "hell5o"; // loop through until you find a non-alpha & mark that pos
for( i = 0; i <= pos-1; i++ )
// store the different parts of the string
string tempLeft = ... // make loops up to and after the position of non-alpha character
string tempRight = ...
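The answer above is sketched in C++; in C, which is what the question uses, the "temporary holder" idea can be done without splitting the word into left and right pieces, by copying only the letters into a second buffer. This is a minimal sketch under that assumption; copy_letters and its parameters are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>

/* Copies only the letters of src into dst, which holds dst_size bytes.
   Returns the number of letters written; output is truncated if dst is full. */
size_t copy_letters(char *dst, size_t dst_size, const char *src)
{
    size_t n = 0;
    if (dst_size == 0) {
        return 0;  /* no room even for the terminator */
    }
    for (; *src != '\0' && n + 1 < dst_size; ++src) {
        if (isalpha((unsigned char)*src)) {
            dst[n++] = *src;  /* keep letters, skip everything else */
        }
    }
    dst[n] = '\0';
    return n;
}
```

Unlike the in-place versions, this leaves the original word untouched, which is useful if both forms are needed later.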
Upvotes: 0
Reputation: 2515
You can use the isalpha() function to check each character in the string.
Upvotes: 0