Reputation: 500
I have some user input following this format:
Playa Raco#path#5#39.244|-0.257#0-23
The # here acts as a separator, and the | is also a separator for the latitude and longitude. I would like to extract this information. Note that the strings could have spaces.
I tried using the %[^\n]%*c format specifier with scanf and adding # and | to it, but it doesn't work because it matches the whole line.
I would like to keep this as simple as possible. I know that I could do this by reading each char, but I'm curious to see best practices and to check whether there is a scanf or similar alternative for this.
Upvotes: 1
Views: 84
Reputation: 84559
As mentioned in the comments, there are many ways you can parse the information from the string. You can walk a pair of pointers down the string, testing each character and taking the appropriate action. You can use strtok(), but note that strtok() modifies the original string, so it cannot be used on a string literal. You can use sscanf() to parse the values from the string. Or you can use any combination of strcspn(), strspn(), strchr(), etc. and then manually copy each field between a start and end pointer.
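To make the manual alternative concrete, here is a minimal sketch of the strcspn() approach. The helper name next_field and its signature are hypothetical, chosen only for illustration; it copies one field into a caller-supplied buffer and leaves the source string untouched, so it also works on a string literal.

```c
#include <string.h>

/* next_field is a hypothetical helper, not a standard function.
 * Copy the next field of *p (everything before the first character
 * found in delims) into dst, which holds dstsz bytes.
 * Returns 1 on success, 0 if the field does not fit in dst.
 * On success *p is advanced past the field and one delimiter, if any. */
static int next_field (const char **p, const char *delims,
                       char *dst, size_t dstsz)
{
    size_t len = strcspn (*p, delims);  /* field length before delimiter */

    if (len >= dstsz)                   /* no room for field plus '\0' */
        return 0;

    memcpy (dst, *p, len);              /* copy field, source is unchanged */
    dst[len] = '\0';

    *p += len;                          /* now at delimiter or end of string */
    if (**p != '\0')
        (*p)++;                         /* step over the delimiter */

    return 1;
}
```

Calling it repeatedly with "#" as the delimiter set (and "|" for the latitude) yields each field as a string. The numeric fields would still need converting, for example with strtol() and strtod(), which is the extra work sscanf() saves you.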
However, your question also states "I would like to keep this as simple as possible..." and that points directly to sscanf(). You simply need to validate the return and you are done. For example, you could do:
#include <stdio.h>

#define MAXC 16     /* adjust as necessary */

int main (void) {

    const char *str = "Playa Raco#path#5#39.244|-0.257#0-23";
    char name[MAXC], path[MAXC], last[MAXC];
    int num;
    double lat, lon;

    /* six conversions expected; field widths keep each string within MAXC */
    if (sscanf (str, "%15[^#]#%15[^#]#%d#%lf|%lf#%15[^\n]",
                name, path, &num, &lat, &lon, last) == 6) {
        printf ("name : %s\npath : %s\nnum : %d\n"
                "lat : %f\nlon : %f\nlast : %s\n",
                name, path, num, lat, lon, last);
    }
    else
        fputs ("error: parsing values from str.\n", stderr);
}
(note: the %[..] conversion does not consume leading whitespace, so if there is a possibility of leading whitespace or a space following '#' before a string conversion, include a space in the format string, e.g. " %15[^#]# %15[^#]#%d#%lf|%lf# %15[^\n]")
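The effect of that space is easiest to see side by side. The sketch below uses two hypothetical helpers, parse_raw and parse_skip_ws, which differ only in whether a space precedes each %15[^#]; both expect buffers of at least 16 characters.

```c
#include <stdio.h>

/* Hypothetical helpers: parse two '#'-separated fields from s into a and b,
 * each of which must hold at least 16 characters.
 * Both return the number of successful conversions (2 on success). */

/* no space before %15[^#]: leading whitespace is stored in the field */
static int parse_raw (const char *s, char *a, char *b)
{
    return sscanf (s, "%15[^#]#%15[^#]", a, b);
}

/* space before %15[^#]: leading whitespace is skipped first */
static int parse_skip_ws (const char *s, char *a, char *b)
{
    return sscanf (s, " %15[^#]# %15[^#]", a, b);
}
```

With the input "  Playa Raco# path", parse_raw stores "  Playa Raco" and " path", while parse_skip_ws stores "Playa Raco" and "path". Note that the space only handles leading whitespace; whitespace before a '#' remains part of the field.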
Each string portion of the input to be split is declared as a 16-character array. Looking at the format string, you will note that the read of each string is limited to 15 characters (plus the nul-terminating character) to ensure you do not attempt to store more characters than your arrays can hold (that would invoke Undefined Behavior). Since six conversions are requested, you validate the conversion by ensuring the return is 6.
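To show why checking the return matters, here is a small sketch that wraps the same format string in a function. The struct record and the parse_record name are hypothetical and used only for illustration; the function simply passes the sscanf() return through so the caller can compare it with 6.

```c
#include <stdio.h>

#define MAXC 16     /* must stay in sync with the %15 field widths */

/* hypothetical container for the six parsed fields */
struct record {
    char name[MAXC], path[MAXC], last[MAXC];
    int num;
    double lat, lon;
};

/* parse_record is a hypothetical helper.
 * Returns the number of successful conversions; 6 means a complete parse.
 * Any smaller value means parsing stopped early and the remaining
 * members of *r were not written. */
static int parse_record (const char *s, struct record *r)
{
    return sscanf (s, "%15[^#]#%15[^#]#%d#%lf|%lf#%15[^\n]",
                   r->name, r->path, &r->num, &r->lat, &r->lon, r->last);
}
```

A well-formed line returns 6. If the third field is not a number ("Playa Raco#path#five#..."), the return is 2, and if the coordinates are separated by ',' instead of '|', the return is 4. A name longer than 15 characters returns 1: the width limit stops the read, and the next character then fails to match the literal '#', so the overlong field is reported as a parse failure rather than silently overflowing the array.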
Example Use/Output
Taking this approach, the output of the program above would be:
./bin/parse_sscanf
name : Playa Raco
path : path
num : 5
lat : 39.244000
lon : -0.257000
last : 0-23
No one way is necessarily "better" than another so long as you validate the conversions and protect the array bounds for any character arrays filled. However, as far as simple as possible goes, it's hard to beat sscanf() here -- and it doesn't modify your original string, so it is safe to use with string literals.
Upvotes: 5