Reputation: 61
printf "2015-03-02|/home/user/.ssh/config\n2015-03-02|/home/user/Desktop/temp328\n" | awk -F\| 'if ( -f $2 ) { print $2}'
or
printf "2015-03-02|/home/user/.ssh/config\n2015-03-02|/home/user/Desktop/temp328\n" | awk -F\| '{if (system("test -f" $2)) print $2}'
/home/user/.ssh/config - exists
/home/user/Desktop/temp328 - removed
I want to print only the files that exist, but these commands are not working.
Upvotes: 6
Views: 13214
Reputation: 31
GNU awk has a loadable extension written in C, "filefuncs", which exposes filesystem data about files, directories, sockets, etc. I suppose the quickest way to get information about a file is to use this internal function rather than an external call.
#!/usr/bin/gawk -f
@load "filefuncs"
function exist(file){
return stat(file, null)
}
BEGIN{
print exist("/etc/passwd")}
stat() returns 0 if the file exists and -1 otherwise.
'null' is just an arbitrary name for the array that receives the file data (the second argument is required).
If you don't want to define a wrapper function:
#!/usr/bin/gawk -f
@load "filefuncs"
BEGIN{print stat("/etc/passwd", null)}
Upvotes: 0
Reputation: 2805
I'm re-posting my answer from another thread here, since it seems relevant to checking for a file. I'm mostly adding the generic case of how system() can be leveraged to do strange things.
In fact, under certain circumstances, you can indeed leverage system() to get the output you want directly, without having to format a command, run it through getline, store the result temporarily, reset RS (if you had set it to "^$" beforehand), and close the command before returning the output:
-rw-r--r-- 1 501 20 77079 Jul 26 13:07 ./selectWoo.full.min.js.txt
valid file :: exist_and_non_empty
non-existent file :: cannot locate
32297 gprintf '\033c\033[3J'; echo; ls -lFGnd "./selectWoo.full.min.js"*;
mawk2 'function filetest(fn) {
gsub(/\047/,"&\134\047&",fn); # in case single-qt in filename
# keep the whole expression on the return line: a newline right after return ends the statement
return system(" exit \140 [ -r \047"(fn)"\047 ] \140 ") ? "cannot locate" : "exist_and_non_empty"
} BEGIN {
ORS = "\n\n";
fn_pfx="./selectWoo.full.min.js";
print "\nvalid file :: " filetest(fn_pfx ".txt");
print "non-existent file :: " filetest(fn_pfx ".txt_fake")
}' ;
history 1 ; echo
I'm only making it more verbose here for illustrative purposes. Instead of returning whether the system() call was successful or not, we directly set the exit code to be that of the file test.
If you want to simplify the return to be boolean, then make it
return ! system(…)
You can also perform other tasks, as long as the outputs are non-negative integers (assume the value is reduced to exit_code % 256 before it is returned) and you're comfortable interpreting that output. A quick example (\047 is a single quote ', \045 is a percent sign %, and \140 is a grave accent [ ` ]):
mawk2 'BEGIN { a = "0123456789ABCDEF";
print system(" exit \140 printf \047\045s\047 \047"(a)"\047 | wc -c \140 "); }'
which properly prints out "16" for measuring length of string.
I'm fully aware this is a horrible way of using system( ) and POSIX exit codes.
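To make the % 256 caveat concrete, here is a minimal sketch. The shell function name wrapped_status is made up for illustration, and plain awk is assumed in place of mawk2. POSIX leaves exit values above 255 unspecified, so the second call shows what common shells do (keep only the low 8 bits), not a guarantee:

```shell
# wrapped_status: hypothetical helper; asks awk to run `exit N` through
# system() and prints the status that awk gets back.
wrapped_status() {
    awk -v n="$1" 'BEGIN { print system("exit " n) }'
}

wrapped_status 16    # small values come back unchanged
wrapped_status 300   # too large for one byte, so the value that comes back is not 300
```

So the string-length trick only works for strings shorter than 256 characters.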
Upvotes: 0
Reputation: 153
You can easily do this with BASH and feed/pipe the results to AWK.
% ls
file_list file1 file3
% cat file_list
file1
file2
file3
file4
% cat file_list | bash -c 'while read file ; do [ -f "$file" ] || echo "No file: $file"; done'
No file: file2
No file: file4
Upvotes: 0
Reputation: 5061
This isn't really my own answer, but it hasn't been documented here yet. "The GNU Awk User's Guide" gives this method:
# readable.awk --- library file to skip over unreadable files
BEGIN {
for (i = 1; i < ARGC; i++) {
if (ARGV[i] ~ /^[[:alpha:]_][[:alnum:]_]*=.*/ \
|| ARGV[i] == "-" || ARGV[i] == "/dev/stdin")
continue # assignment or standard input
else if ((getline junk < ARGV[i]) < 0) # unreadable
delete ARGV[i]
else
close(ARGV[i])
}
}
The snippet as a whole processes the command line. The useful bit for this question is the else if branch:
else if ((getline junk < ARGV[i]) < 0) # unreadable
delete ARGV[i]
That is basically a read of one line from the file named in ARGV[i]; when it fails, the array element is deleted. Either the file does not exist or it is unreadable, and either way you can't use it. Everything happens in the same awk process, with no exec of a shell.
I needed this today, so I wrote the following small function:
## file_exist
# * ref: [12.3.3 Checking for Readable Data Files](http://langevin.univ-tln.fr/cours/COMPIL/tps/awk.html#File-Checking)
# o [The GNU Awk User's Guide](http://langevin.univ-tln.fr/cours/COMPIL/tps/awk.html)
#
function file_exist( file_path, _rslt, _junk )
{
_rslt = (0==1); # false
if( (getline _junk < file_path) >= 0 ) ## readable (0 means an existing but empty file)
{
_rslt = (1==1);
close( file_path );
}
return _rslt;
}
Note:
Upvotes: 0
Reputation: 37404
With GNU awk you can use stat(), included with the filefuncs extension:
$ ls -l
-rw-r--r-- 1 james james 4 Oct 3 12:48 foo
-rw------- 1 root root 0 Oct 3 12:48 bar
Awk:
$ awk -v file=foo '
@load "filefuncs"
BEGIN {
ret=stat(file,fdata)
printf "ret: %d\nsize: %d\n",ret,fdata["size"]
}'
Output for -v file=foo:
ret: 0
size: 4
for bar:
ret: 0
size: 0
and for nonexistent baz:
ret: -1
size: 0
Upvotes: 5
Reputation: 46836
It's easy to check for the existence of a readable file in awk, without having to resort to spawning something with system(). Just try to read from the file.
From awk's man page (on my system anyway):
In all cases, getline returns 1 for a successful input, 0 for end of file, and -1 for an error.
So. Some example code.
#!/usr/bin/awk -f
function file_exists(file) {
n=(getline _ < file);
if (n > 0) {
print "Found: " file;
return 1;
} else if (n == 0) {
print "Empty: " file;
return 1;
} else {
print "Error: " file;
return 0;
}
}
BEGIN {
file_exists(ARGV[1]);
}
Gives me these results:
$ touch /tmp/empty
$ touch /tmp/noperm ; chmod 000 /tmp/noperm
$ ./check.awk /etc/passwd
Found: /etc/passwd
$ ./check.awk /nonexistent
Error: /nonexistent
$ ./check.awk /tmp/empty
Empty: /tmp/empty
$ ./check.awk /tmp/noperm
Error: /tmp/noperm
Using your sample data:
$ fmt="2015-03-02|/home/user/.ssh/config\n2015-03-02|/home/user/Desktop/temp328\n"
$ printf "$fmt" | cut -d\| -f2 | xargs -n 1 ./check.awk
Error: /home/user/.ssh/config
Error: /home/user/Desktop/temp328
For more general use, you could shorten this function to something like:
function file_exists(file) {
if ((getline _ < file) >= 0) { return 1; }
}
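The same getline test can also be applied to the question's pipe-delimited input inside a single awk process, instead of starting one awk per file through xargs. A minimal sketch, assuming the file name is the second field; the names existing_files and is_readable are made up for illustration, and close() is added so that a long input does not run out of open file descriptors:

```shell
# existing_files: hypothetical helper; reads "date|path" lines on stdin and
# prints the paths that awk can open for reading.
existing_files() {
    awk -F'|' '
    function is_readable(path,    junk, rc) {
        rc = (getline junk < path)   # 1: read a line, 0: empty file, -1: cannot open
        if (rc >= 0) close(path)     # release the descriptor before the next record
        return (rc >= 0)
    }
    # Skip empty names: some awks treat an empty redirection target as a fatal error.
    $2 != "" && is_readable($2) { print $2 }
    '
}

printf '2015-03-02|/home/user/.ssh/config\n2015-03-02|/home/user/Desktop/temp328\n' | existing_files
```

As with the function above, this reports readability rather than bare existence: a file you have no permission to read is treated as missing.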
Upvotes: 1
Reputation: 189357
The second attempt was fairly close; you need a space after the test -f.
base$ echo '2015|/etc/mtab
> 2015|/etc/ntab' | awk -F\| '{ if (system("test -f " $2)) print $2}'
/etc/ntab
You probably want to invert to use if (system(...)==0) to get the semantics you expected. Also, somewhat more elegantly, Awk wants a condition outside the braces, so you can avoid the explicit if.
awk -F\| 'system("test -f " $2)==0 { print $2 }'
Agree with commenters that using Awk for this is borderline nuts.
If, as indicated in comments, you need to work with completely arbitrary file names, you can add code to quote any shell specials:
awk -F\| 'system ("test -f " gensub(/[^\/A-Za-z0-9]/, "\\\\&", "g", $2))==0 {
print $2 }' # caveat: gensub() is gawk only
... but your overall solution does not cope with file names containing a newline character or a pipe character (since you are using those as record and field separators, respectively) so again, abandoning Awk and starting over with a different approach may be the sane way forward.
(The character class in the substitution is incomplete; there are various punctuation characters etc which could be added, and I may be missing something significant; but on quick examination, the superfluous backslashes should be harmless. If you don't have Gawk, see here and/or, again, consider abandoning this approach.)
while IFS='|' read -r stuff filename; do
test -f "$filename" && echo "$filename"
done <<':'
2015|/etc/mtab
2016|/etc/ntab
2017|/path/to/file with whitespace in name
2018|/path/to/file\with[funny"characters*in(file'name|even pipes, you see?
:
(Still no way to have a newline, but everything else should be fine.)
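If you do stay with Awk and don't have gensub(), the quoting step can be sketched portably by wrapping the name in single quotes, the same idea as the \047 trick in the mawk2 answer. This is a sketch under assumptions, not a hardened implementation: files_that_exist and shquote are made-up names, and names containing a newline or a pipe are still out of reach for the reasons given above.

```shell
# files_that_exist: hypothetical helper; prints field 2 of each "date|path"
# line when `test -f` succeeds for that path.
files_that_exist() {
    awk -F'|' '
    # Wrap s in single quotes for sh; each embedded single quote becomes
    # quote, backslash, quote, quote (\047 is the single quote character).
    function shquote(s) {
        gsub(/\047/, "\047\\\047\047", s)
        return "\047" s "\047"
    }
    system("test -f " shquote($2)) == 0 { print $2 }
    '
}

printf '2015|/etc/mtab\n2016|/etc/ntab\n' | files_that_exist
```

Inside single quotes the shell treats every character literally, so only the single quote itself needs special handling, which avoids maintaining a list of shell specials.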
Upvotes: 7