MrSSS16

Reputation: 473

C parse comments from textfile

I am trying to implement a very simple parser that reads a file and executes some commands, similar to a bash script but much simpler. I am having trouble figuring out how to tokenise the contents of a file, given that comments are denoted by #. Here is an example of how a source file might look:

# Script to move my files across
# Author: Person Name 

# First delete files
 "rm -rf ~/code/bin/debug";
 "rm -rf ~/.bin/tmp"; #deleting temp to prevent data corruption
# Dump file contents
 "cat ~/code/rel.php > log.txt";

Here is my code so far. Note that I am essentially using this little project as a means of becoming more comfortable and familiar with C, so pardon any obvious flaws in the code. I would appreciate the feedback.

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

// New line.
#define NL '\n'
// Quotes.
#define QT '"'
// Ignore comment.
#define IGN '#'

int main(int argc, char *argv[]) {
  if (argc != 2) {
    show_help();
    return 0;
  }
  FILE *fptr = fopen(argv[1], "r");
  char *buff;
  size_t n = 0;
  int readlock = 0;
  int qread = 0;
  int c; // int rather than char, so the comparison with EOF is reliable

  if (fptr == NULL){
     printf("Error: invalid file provided %s for reading", argv[1]);
     exit(1);
  }

 fseek(fptr, 0, SEEK_END);
 long f_size = ftell(fptr);
 fseek(fptr, 0, SEEK_SET);
 // One extra zeroed byte keeps the buffer NUL-terminated for printf("%s").
 buff = calloc(1, f_size + 1);

 // Read file contents.
 // Stripping naked whitespace and comments.
 // qread is when in quotation mode. Everything is stored even '#' until EOL or EOF.
 while ((c = fgetc(fptr)) != EOF) {
    switch(c) {
        case IGN :
            if (qread == 0) {
                readlock = 1;
            }
            else {
                buff[n++] = c;
            }
            break;
        case NL :
            readlock = 0;
            qread = 0;
            break;
        case QT :
            if (readlock == 0) {
                // Toggle quote mode: the opening quote turns it on, the closing quote turns it off.
                qread = !qread;
                buff[n++] = c;
            }
            break;
        default :
            if (readlock == 0 && (qread == 1 || !isspace(c))) {
                buff[n++] = c;
            }
            break;
    }
 }
fclose(fptr);
printf("Buffer contains %s \n", buff);
free(buff);

return 0;

}

The above solution works, but my question is: is there a better way to achieve the desired outcome? At the moment I don't actually "tokenize" anything. Does the current implementation lack the logic needed to create tokens from the characters?

Upvotes: 0

Views: 719

Answers (1)

c-smile

Reputation: 27460

It is way easier to read your file by whole lines:

char line[1024];

while (fgets(line, sizeof line, fptr) != NULL) // stops at EOF or on a read error
{
  if(line[0] == '#') // comment 
    continue; // skip it

  //... handle command in line here 
}
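
The loop above only skips lines that start with '#'. To also drop a trailing comment and pull out the quoted command, something along these lines might work. This is only a rough sketch: it assumes each command sits in double quotes on a single line, as in the sample script, and the names extract_command and print_commands are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the question): copies the first
 * double-quoted command found on `line` into `out` (at most
 * out_size - 1 characters, always NUL-terminated) and returns true.
 * Returns false for blank lines, comment-only lines, and lines whose
 * quote is never closed. A '#' inside the quotes is kept. */
bool extract_command(const char *line, char *out, size_t out_size)
{
    if (out_size == 0)
        return false;

    const char *p = line;

    /* Walk to the opening quote; a '#' seen first starts a comment. */
    while (*p != '\0' && *p != '"') {
        if (*p == '#')
            return false;
        p++;
    }
    if (*p != '"')
        return false;           /* blank line, or no command at all */
    p++;                        /* step past the opening quote */

    const char *end = strchr(p, '"');
    if (end == NULL)
        return false;           /* unterminated quote */

    size_t len = (size_t)(end - p);
    if (len >= out_size)
        len = out_size - 1;     /* truncate rather than overflow */
    memcpy(out, p, len);
    out[len] = '\0';
    return true;
}

/* Hypothetical driver: reads `fp` line by line and prints each command. */
void print_commands(FILE *fp)
{
    char line[1024];
    char cmd[1024];

    while (fgets(line, sizeof line, fp) != NULL) {
        if (extract_command(line, cmd, sizeof cmd))
            printf("command: %s\n", cmd);
    }
}
```

Calling print_commands(fptr) right after the fopen check would then give one command string per script line, which is a natural place to start building real tokens. Lines longer than the 1024-byte buffer are split by fgets and are not handled here.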

Upvotes: 2
