Reputation: 29
I have hundreds of CSV files stored in a Unix/Linux directory. Their names adhere to the format MMYYYY_foo.csv
. For example,
072019_foo.csv
122018_foo.csv
I'm trying to compile and convert these individually to XML using a Perl script. The command takes the form ./script.pl MMYYYY_foo
, so the following commands would need to be executed for the above example:
./script.pl 072019_foo
./script.pl 122018_foo
Rather than executing the Perl script for each file individually, I am trying to loop through the files in the shell, passing each name to the Perl script. After tediously researching SO among other sources, I came up with the following ...
find . -type f -name '*.csv' -exec perl script.pl $('-printf "%f\n"') {} \;
However, this does not work; it just outputs multiple files named ".xml". Evidently the file name (minus path and extension) is not being passed to the script correctly in the code example above. I've tried multiple variations of ...
$(-printf "%f\n")
And I know therein lies my problem. In many instances I'm just getting multiple ".xml" files. I feel I'm on the cusp of a solution; it's just that I don't understand how to build the command line that find runs after -exec. Any help with the solution would be appreciated.
Upvotes: 2
Views: 635
Reputation: 6808
OP's sample find
command indicates that each and every csv file in the directory needs to be processed.
It is assumed that no recursion into the directory structure is required.
The power of the bash shell can be used for this purpose, with the file extension stripped off before the name is passed to the script:
for f in *.csv
do
    ./script.pl "${f%.*}"
done
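Here ${f%.*} is bash parameter expansion that removes the shortest suffix matching ".*". A quick illustration at an interactive prompt:

f=072019_foo.csv
echo "${f%.*}"     # prints: 072019_foo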
If this task will be repeated on a regular basis, the loop above can be stored as a shell script, or a Perl wrapper script can be created:
#!/usr/bin/env perl
use strict;
use warnings;
my $re = qr/(\d{6}_foo)\.csv/;    # capture the MMYYYY_foo stem
for ( glob('./*.csv') ) {
    system('./script.pl', $1) if /$re/;
}
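Assuming the wrapper is saved under a name of your choosing, say run_all.pl (a hypothetical name), it can be made executable and run from the same directory:

chmod +x run_all.pl
./run_all.pl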
The natural behavior of the find
command is to recurse into the directory structure. OP should indicate in the post whether recursion is desirable or not.
Suggestion: familiarize yourself with 3.5.3 Shell Parameter Expansion and How To Use Bash Parameter Substitution Like A Pro.
Upvotes: 1
Reputation: 386501
That command executes a file named -printf "%f\n" before doing anything else, which obviously fails noisily.
I think you were going for something like
find . -type f -name '*.csv' -printf '%f\0' | xargs -r0 ./script.pl
But that has two problems:
- The names passed to the script still include the .csv extension, which the script doesn't expect.
- It searches subdirectories recursively (which find does by default). You've confirmed in the comments that you don't need to do a recursive search.
As such, the following is the solution you seek:
find . -maxdepth 1 -name '*.csv' -printf '%f\0' |
perl -0lpe's/\.[^.]*\z//' |
xargs -r0 ./script.pl
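If you want to preview the names that will be handed to ./script.pl before running anything, one sketch is to substitute echo for the script in the final stage:

find . -maxdepth 1 -name '*.csv' -printf '%f\0' |
perl -0lpe's/\.[^.]*\z//' |
xargs -r0 echo ./script.pl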
or just
perl -0le'print s/\.[^.]*\z//r for @ARGV' -- *.csv |
xargs -r0 ./script.pl
or just
perl -e'system("./script.pl", s/\.[^.]*\z//r) for @ARGV' -- *.csv
or just
perl -e'system("./script.pl", s/\.[^.]*\z//r) for glob("*.csv")'
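Similarly, a sketch of the last variant that only prints the commands it would run, for verification before doing it for real:

perl -e'print "./script.pl ", s/\.[^.]*\z//r, "\n" for glob("*.csv")'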
The first and last one will handle very long lists of files better than the other two, because they enumerate the files themselves (via find and glob) rather than relying on the shell to expand *.csv onto a single command line, which can run into the system's argument-length limit.
Upvotes: 1
Reputation: 207758
You can get them all done very simply and in parallel with GNU Parallel like this:
parallel --dry-run perl script.pl {.} ::: *csv
Sample Output
perl script.pl 072019_foo
perl script.pl 122018_foo
If that looks correct, back up your files and run it again without the --dry-run to do it for real.
You can add a progress bar with parallel --bar ...
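For instance, a sketch combining it with the command above:

parallel --bar perl script.pl {.} ::: *csv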
Upvotes: 1