H00man

Reputation: 387

Problem with matching empty string with scanf in a special format

I have a CUSTOM_PROMPT_REGX scanf format string with special conditions.

It is supposed to capture 10 text fields that follow one another, separated alternately by | and #. Any field can be empty, in which case there are no characters between the separators, as in "..|#...".

My code is:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>


#define CUSTOM_PROMPT_REGX  "@%39[^|]|%39[^#]#%39[^|]|%39[^#]#%39[^|]|%39[^#]#%39[^|]|%39[^#]#%39[^|]|%39[^@]@"
static unsigned char lines[5][2][40];

int main(void)
{
    memset(lines, 0, sizeof(lines));
    int j = sscanf("@1.SALAM|818BF4F2A8#2.BINGO|828BF8F0F7FE93#3.GOOGLE|838BF1F0F8F0#|#5.WINE|858BF6FE90F8@", CUSTOM_PROMPT_REGX,
           lines[0][0], lines[0][1], lines[1][0], lines[1][1],
           lines[2][0], lines[2][1], lines[3][0], lines[3][1], lines[4][0], lines[4][1]);

    printf("%d\n[%s <=> %s]\n[%s <=> %s]\n[%s <=> %s]\n[%s <=> %s]\n[%s <=> %s]\n", j,
           lines[0][0], lines[0][1], lines[1][0], lines[1][1], lines[2][0], lines[2][1],
           lines[3][0], lines[3][1], lines[4][0], lines[4][1]);
    return 0;
}

The result is:

6
[1.SALAM <=> 818BF4F2A8]
[2.BINGO <=> 828BF8F0F7FE93]
[3.GOOGLE <=> 838BF1F0F8F0]
[ <=> ]
[ <=> ]

It should be :

8
[1.SALAM <=> 818BF4F2A8]
[2.BINGO <=> 828BF8F0F7FE93]
[3.GOOGLE <=> 838BF1F0F8F0]
[ <=> ]
[5.WINE <=> 858BF6FE90F8]

Is there something that I can add to CUSTOM_PROMPT_REGX to solve my problem?

Upvotes: 1

Views: 388

Answers (1)

chux

Reputation: 154325

Each of them can be empty so there is no characters between ...
... is there some thing that i can add to CUSTOM_PROMPT_REGX to solve my problem?

No. A %[...] directive must match at least 1 character from its scan set. When it matches nothing, the conversion fails and the entire sscanf() stops there, leaving the remaining arguments unassigned.
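To make the failure mode concrete, here is a minimal sketch with three '|'-separated fields instead of ten. The helper name count_converted() and the buffer sizes are assumptions made for illustration; they are not part of the question's code.

```c
#include <stdio.h>

/* Hypothetical helper: tries to read three '|'-separated fields with a
 * single sscanf() call and returns sscanf()'s result, i.e. the number of
 * fields that were actually converted. */
int count_converted(const char *input)
{
    char a[10] = "", b[10] = "", c[10] = "";

    /* Each %9[^|] needs at least one non-'|' character to succeed.
     * An empty field is a matching failure, and scanning stops there. */
    return sscanf(input, "%9[^|]|%9[^|]|%9[^|]", a, b, c);
}
```

With this helper, "x|y|z" converts 3 fields, "x||z" converts only 1 because the empty second field ends the scan, and "|y|z" converts 0. The same thing happens in the question at the empty pair "#|#", which is why sscanf() returns 6.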

Alternatives:

  1. Scan using one %[...] directive at a time. It is easy enough to write a loop that does this.

  2. Use a non sscanf() approach. Research strtok(), strspn(), strcspn().

  3. Scan the lead separator character into the string as well, and later use the string starting from index 1. In OP's case, the separator that starts a field always differs from the one that ends it, so this is a possible approach.

  4. Scan into 5 groups, each with "%79[^#]#", and then sub-divide each group. Research strchr(buf80, '|').
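Alternative 3 keeps everything in a single sscanf() call, so it may be the closest to what OP asked for. The following is only a sketch under stated assumptions: the helper name parse_prompt(), the 41-byte raw buffers and the strictness checks (nothing may follow the final '@') are illustrative choices, not part of OP's code.

```c
#include <stdio.h>
#include <string.h>

#define FIELDS 10

/* Hypothetical helper: parses "@a|b#c|d#e|f#g|h#i|j@" into out[0..9].
 * Any field may be empty. Returns 0 on success, -1 on malformed input. */
int parse_prompt(const char *input, char out[FIELDS][40])
{
    /* Lead separator expected in front of each of the 10 fields. */
    const char *lead = "@|#|#|#|#|";
    /* 1 lead separator + up to 39 field characters + '\0'. */
    char raw[FIELDS][41];
    int end = 0;

    /* Every directive starts on a separator that is NOT in its own
     * excluded set, so it always matches at least one character. */
    int n = sscanf(input,
                   "%40[^|]%40[^#]%40[^|]%40[^#]%40[^|]"
                   "%40[^#]%40[^|]%40[^#]%40[^|]%40[^@]@%n",
                   raw[0], raw[1], raw[2], raw[3], raw[4],
                   raw[5], raw[6], raw[7], raw[8], raw[9], &end);

    /* end stays 0 if the closing '@' was missing. */
    if (n != FIELDS || end == 0 || input[end] != '\0') {
        return -1;
    }
    for (int i = 0; i < FIELDS; i++) {
        /* An over-long field shows up here as a wrong lead character. */
        if (raw[i][0] != lead[i]) {
            return -1;
        }
        strcpy(out[i], raw[i] + 1);   /* drop the lead separator */
    }
    return 0;
}
```

Called on OP's input, this fills out[6] and out[7] with empty strings and out[8] with "5.WINE". The price is one extra byte per scanned field and a copy step to strip the lead separator.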


Tip

Complex sscanf() formats are easier to code, review and maintain when they are built with string literal concatenation.

#define VB_FMT "%39[^|]|"
#define LB_FMT "%39[^#]#"
#define AT_FMT "%39[^@]@"
#define CUSTOM_PROMPT_REGX  "@" \
    VB_FMT LB_FMT VB_FMT LB_FMT VB_FMT LB_FMT VB_FMT LB_FMT VB_FMT AT_FMT

Sample code performing 1 sscanf() "%[]" specifier at a time.

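#include <stdio.h>
#include <stdlib.h>

int main(void) {
  // Each format consumes the lead separator, records the offset after it
  // (n1), scans one field, then records the offset after the field (n2).
  #define ATVB_FMT "@%n%39[^|]%n"
  #define VBLB_FMT "|%n%39[^#]%n"
  #define LBVB_FMT "#%n%39[^|]%n"
  #define VBAT_FMT "|%n%39[^@]%n"
  #define N 10

  const char *fmt[N] = {ATVB_FMT, VBLB_FMT, LBVB_FMT, VBLB_FMT, LBVB_FMT,
      VBLB_FMT, LBVB_FMT, VBLB_FMT, LBVB_FMT, VBAT_FMT};

  char lines[N][40];

  const char *buf =
      "@1.SALAM|818BF4F2A8#2.BINGO|828BF8F0F7FE93#3.GOOGLE|838BF1F0F8F0#|#5.WINE|858BF6FE90F8@";
  const char *s = buf;

  for (int i = 0; i < N; i++) {
    int n1 = 0;
    int n2 = 0;
    sscanf(s, fmt[i], &n1, lines[i], &n2);
    if (n1 == 0) {
      // The expected lead separator was not found.
      fprintf(stderr, "Failed to find separator %d\n", i);
      return EXIT_FAILURE;
    }
    if (n2 == 0) {
      // Empty field: %[ matched nothing, so skip only the separator.
      lines[i][0] = '\0';
      s += n1;
    } else {
      s += n2;
    }
  }

  // Only the closing '@' may remain. Checking it here, rather than in the
  // last format, lets the last field be empty too.
  if (s[0] != '@' || s[1] != '\0') {
    fprintf(stderr, "Failed end %d\n", N);
    return EXIT_FAILURE;
  }

  for (int i = 0; i < N; i++) {
    printf("<%s>\n", lines[i]);
  }
  return 0;
}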

Output

<1.SALAM>
<818BF4F2A8>
<2.BINGO>
<828BF8F0F7FE93>
<3.GOOGLE>
<838BF1F0F8F0>
<>
<>
<5.WINE>
<858BF6FE90F8>

Upvotes: 3
