Reputation: 473
So I am trying to implement a very trivial parser for reading a file and executing some commands. I guess it is very similar to bash scripts, but much simpler. I am having trouble figuring out how to tokenise the contents of a file, given that it can contain comments denoted by #. To give you an example of how a source file might look:
# Script to move my files across
# Author: Person Name
# First delete files
"rm -rf ~/code/bin/debug";
"rm -rf ~/.bin/tmp"; #deleting temp to prevent data corruption
# Dump file contents
"cat ~/code/rel.php > log.txt";
So far here is my code. Note that I am essentially using this little project as a means of becoming more comfortable and familiar with C, so pardon any obvious flaws in the code. I would appreciate the feedback.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
// New line.
#define NL '\n'
// Quotes.
#define QT '"'
// Ignore comment.
#define IGN '#'
int main(int argc, char *argv[]) {
if (argc != 2) {
show_help();
return 0;
}
FILE *fptr = fopen(argv[1], "r");
char *buff;
size_t n = 0;
int readlock = 0;
int qread = 0;
int c; // int rather than char, so the comparison with EOF is reliable
if (fptr == NULL){
printf("Error: invalid file provided %s for reading", argv[1]);
exit(1);
}
fseek(fptr, 0, SEEK_END);
long f_size = ftell(fptr);
fseek(fptr, 0, SEEK_SET);
buff = calloc(1, f_size + 1); // +1 so the buffer is always NUL-terminated for printf
// Read file contents.
// Stripping naked whitespace and comments.
// qread is when in quotation mode. Everything is stored even '#' until EOL or EOF.
while ((c = fgetc(fptr)) != EOF) {
switch(c) {
case IGN :
if (qread == 0) {
readlock = 1;
}
else {
buff[n++] = c;
}
break;
case NL :
readlock = 0;
qread = 0;
break;
case QT :
if (readlock == 0) {
// Toggle quote mode: an opening quote enters it, a closing quote leaves it,
// so a '#' after the closing quote starts a comment again.
qread = !qread;
buff[n++] = c;
}
break;
default :
if ((qread == 1 && readlock == 0) || (readlock == 0 && !isspace(c))) {
buff[n++] = c;
}
break;
}
}
fclose(fptr);
printf("Buffer contains %s \n", buff);
free(buff);
return 0;
}
So the above solution works, but my question is: is there a better way to achieve the desired outcome? At the moment I don't actually "tokenize" anything. Does the current implementation lack the logic needed to create tokens from the characters?
Upvotes: 0
Views: 719
Reputation: 27460
It is way easier to read your file by whole lines:
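char line[1024];
// fgets returns NULL at end of file or on error, so it can drive the loop directly
// (testing feof before reading only reports EOF after a read has already failed).
while (fgets(line, sizeof line, fptr))
{
if (line[0] == '#') // comment
continue; // skip it
//... handle command in line here
}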
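The loop above only skips lines that begin with '#'. For the "handle command in line here" step, here is a minimal sketch of one way to pull the command out of a line while ignoring a trailing comment. It assumes each line holds at most one double-quoted command and that quotes are never escaped; `extract_command` is a hypothetical helper name, not something from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the first double-quoted command in `line`
   into `out`, without the quotes. A '#' outside quotes starts a comment
   and ends the scan; a '#' inside quotes is kept as part of the command.
   Returns 1 if a complete quoted command was found, 0 otherwise. */
int extract_command(const char *line, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL && out_size > 0);
    size_t n = 0;
    int in_quotes = 0;

    out[0] = '\0';
    for (const char *p = line; *p != '\0' && *p != '\n'; p++) {
        if (*p == '"') {
            if (in_quotes) {        /* closing quote: command is complete */
                out[n] = '\0';
                return 1;
            }
            in_quotes = 1;          /* opening quote */
        } else if (in_quotes) {
            if (n + 1 >= out_size)  /* command does not fit in out */
                break;
            out[n++] = *p;
        } else if (*p == '#') {
            break;                  /* comment outside quotes: ignore the rest */
        }
    }
    out[0] = '\0';                  /* no complete command on this line */
    return 0;
}
```

Inside the `fgets` loop you could then call something like `if (extract_command(line, cmd, sizeof cmd)) { /* run cmd */ }` with `char cmd[1024];` declared beforehand. Lines that are blank, comment-only, or contain an unterminated quote simply return 0.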
Upvotes: 2