Reputation: 151
I am new to regular expressions in C, and I am trying to check whether a given filename is under a folder using the regex.h library. This is what I have tried:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <regex.h>
/* Returns 0 if test matches regex_str, REG_NOMATCH otherwise. */
int checkregex(const char *regex_str, const char *test) {
    regex_t regex;
    printf("regex_str: %s\n\n", regex_str);
    int reti = regcomp(&regex, regex_str, REG_EXTENDED | REG_ICASE);
    if (reti) {
        fprintf(stderr, "Could not compile regex\n");
        exit(1);
    }
    /* REG_EXTENDED and REG_ICASE are regcomp() flags; regexec() only takes REG_NOTBOL/REG_NOTEOL. */
    reti = regexec(&regex, test, 0, NULL, 0);
    regfree(&regex);
    return reti;
}
int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <path>\n", argv[0]);
        return 1;
    }
    const char *safepath = "/home";
    size_t spl = strlen(safepath);
    char *fn = argv[1];
    int noDoubleDots = checkregex("[^..\\/]", fn);
    int allowedChars = checkregex("^[[:alnum:]\\/._ -]*$", fn);
    int backslashWithSpace = checkregex(".*(\\ ).*", fn);
    puts("noDoubleDots");
    puts((noDoubleDots == 0 ? "Match\n" : "No Match\n"));
    puts("allowedChars");
    puts((allowedChars == 0 ? "Match\n" : "No Match\n"));
    puts("backslashWithSpace");
    puts((backslashWithSpace == 0 ? "Match\n" : "No Match\n"));
    return 0;
}
My first attempt, noDoubleDots, was simply to reject any path that includes .. (I couldn't even manage that). But then I tested and saw that file and folder names can contain .., like folder..name/. So I wanted to exclude only the paths containing /.. or ../. But if a folder is named something like folder .. and it contains another folder named folder2/, then the path will be folder\ ../folder2, and excluding ../ would give the wrong result.
In the code, allowedChars works fine. I think that if I also checked whether the file name contains exactly .., \ .. or \ ([:alnum:])*, the path validation would be complete. But my regular expression doesn't seem to work. For example, backslashWithSpace matches both asd / and asd\ /.
How can I check and make sure that the given path is under a folder using regular expressions? Thanks in advance.
Upvotes: 1
Views: 796
Reputation: 26757
POSIX offers a handy function for this, realpath():
realpath() expands all symbolic links and resolves references to /./, /../ and extra '/' characters in the null-terminated string named by path to produce a canonicalized absolute pathname. The resulting pathname is stored as a null-terminated string, up to a maximum of PATH_MAX bytes, in the buffer pointed to by resolved_path. The resulting path will have no symbolic link, /./ or /../ components.
If you can use it, I think it will fit your needs; if not, you could perhaps adapt its source code.
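As a rough sketch of how realpath() could replace the regex checks: canonicalize both the safe folder and the candidate path, then compare prefixes. The helper name is_under_dir is made up for illustration, and passing NULL as the second argument (so that realpath() allocates the buffer) assumes a POSIX.1-2008 system. Note that realpath() fails for paths that do not exist, so this only works for files that are already on disk.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* asks the headers to declare realpath() */
#endif
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any library).
 * Returns 1 if path resolves to base or to something inside base,
 *         0 if it resolves to somewhere else,
 *        -1 if either path cannot be resolved (e.g. it does not exist).
 */
int is_under_dir(const char *base, const char *path)
{
    char *rbase = realpath(base, NULL);   /* malloc'd canonical path or NULL */
    char *rpath = realpath(path, NULL);
    int result = -1;

    if (rbase != NULL && rpath != NULL) {
        size_t n = strlen(rbase);
        if (strcmp(rbase, "/") == 0) {
            /* Every absolute path is under the root directory. */
            result = 1;
        } else {
            /*
             * The prefix must end at a component boundary, so that
             * "/home2/x" is not treated as being under "/home".
             */
            result = strncmp(rpath, rbase, n) == 0 &&
                     (rpath[n] == '/' || rpath[n] == '\0');
        }
    }

    free(rbase);   /* free(NULL) is a no-op */
    free(rpath);
    return result;
}
```

With the question's variables, the call would be something like is_under_dir(safepath, fn) == 1. Because both sides are canonicalized first, names such as folder..name or a folder literally called "folder .." need no special handling; only real .. components and symbolic links are resolved.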
Upvotes: 2