Reputation: 3791
I want to copy files in a directory which contain all the lines of an inputFile. Here is an example:
inputFile
Line3
Line1
LineX
Line4
LineB
file1
Line1
Line2
LineX
LineB
file2
Line100
Line10
LineB
Line4
LineX
Line3
Line1
Line4
Line1
The script is expected to copy only file2 to a destination directory since all lines of the inputFile are found in file2 but not in file1.
I could compare each individual file with inputFile, as discussed partly here, and copy the files manually whenever the script produces no output. That is:
awk 'NR==FNR{a[$0];next}!($0 in a)' file1 inputFile
Line3
Line4
awk 'NR==FNR{a[$0];next}!($0 in a)' file2 inputFile
The first command prints Line3 and Line4, so there is no need to copy file1. The second command, with file2 in place of file1, prints nothing, which indicates that all lines of inputFile are found in file2, so I would then run cp file2 ../distDir/.
Doing this by hand is time-consuming, so I hope there is a way to do it in a for loop. I am not particular about awk; any bash scripting tool can be used.
Thank you,
Upvotes: 1
Views: 574
Reputation: 133518
Could you please try the following and let me know if it helps.
I have written "echo cp " val " destination_path" inside system(), so for now the script only prints the command it would run (e.g. cp file2 destination_path). Once you are happy with the echo output, remove the echo and replace destination_path with the actual destination.
awk 'function check(array,val,count){
  if(length(array)==count){
    system("echo cp " val " destination_path")
  }
}
FNR==NR{
  a[$0]
  next
}
val!=FILENAME{
  check(a,val,count)
}
FNR==1{
  val=FILENAME
  count=0
  delete b
}
($0 in a) && !b[$0]++{
  count++
}
END{
  check(a,val,count)
}
' Input_file file1 file2
Will add explanation shortly too.
EDIT1: Per the OP, the files to be compared against Input_file could have any name, so I changed the code to use find.
# Input_file itself is excluded, otherwise it would match all of its own lines.
find . -type f ! -name Input_file -exec awk 'function check(array,val,count){
  if(length(array)==count){
    system("echo cp " val " destination_path")
  }
}
FNR==NR{
  a[$0]
  next
}
val!=FILENAME{
  check(a,val,count)
}
FNR==1{
  val=FILENAME
  count=0
  delete b
}
($0 in a) && !b[$0]++{
  count++
}
END{
  check(a,val,count)
}
' Input_file {} +
Explanation: the same code with comments on each step.
find . -type f -iname "file*" -exec awk 'function check(array,val,count){ ##find passes every regular file whose name matches file* to awk via -exec. The awk code starts here by defining a function named check that takes array, val and count.
  if(length(array)==count){ ##If the number of elements in array equals count, the file contained every line of Input_file.
    system("echo cp " val " destination_path") ##system() runs a shell command from awk. The echo only prints the cp command for checking; remove it once the output looks right.
  }
}
FNR==NR{ ##FNR==NR is TRUE only while the first file, Input_file, is being read.
  a[$0] ##Create an entry in array a indexed by the current line.
  next ##next skips the remaining rules for this line.
}
val!=FILENAME{ ##TRUE on the first line of each new file, while val still holds the previous file name.
  check(a,val,count) ##Call check for the previous file.
}
FNR==1{ ##TRUE on the first line of every file.
  val=FILENAME ##Store the current file name in val.
  count=0 ##Reset count for the new file.
  delete b ##Clear array b.
}
($0 in a) && !b[$0]++{ ##TRUE when the current line is in array a and has not been seen before in this file.
  count++ ##Increment count by 1.
}
END{ ##END block, run after the last file has been read.
  check(a,val,count) ##Call check for the last file.
}
' Input_file {} + ##Input_file is read first, followed by the files supplied by find.
PS: I wrote and tested this in GNU awk.
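A possible variation, offered only as a sketch and not as part of the tested code above: instead of building the cp command inside system(), awk can print the names of the matching files and let the shell do the copying, which sidesteps quoting problems with unusual file names. The function name copy_matching and its argument order are hypothetical, and the sketch assumes file names without newlines and a non-empty input file.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer above):
#   copy_matching INPUT DEST FILE...
# Copies every FILE that contains all distinct lines of INPUT into DEST.
# Assumes file names contain no newlines and INPUT is not empty.
copy_matching() {
    local input=$1 dest=$2
    shift 2
    [ -s "$input" ] || return 1
    awk '
        FNR == NR {                      # first file: collect wanted lines
            if (!($0 in want)) { want[$0]; need++ }
            next
        }
        FNR == 1 {                       # a new file starts
            if (prev != "" && found == need) print prev
            prev = FILENAME; found = 0
            split("", seen)              # portable way to clear an array
        }
        ($0 in want) && !seen[$0]++ { found++ }
        END { if (prev != "" && found == need) print prev }
    ' "$input" "$@" |
    while IFS= read -r f; do
        cp -- "$f" "$dest"/
    done
}
```

With the files from the question, copy_matching Input_file destination_path file1 file2 would be expected to copy only file2.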
Upvotes: 0
Reputation: 92854
bash (with comm + wc commands) solution:
#!/bin/bash
n=$(sort -u inputFile | wc -l)    # number of distinct lines in inputFile
for f in /yourdir/file*
do
    # count the distinct lines common to inputFile and "$f"
    if [[ $n -eq $(comm -12 <(sort -u inputFile) <(sort -u "$f") | wc -l) ]]
    then
        cp "$f" "/dest/${f##*/}"
    fi
done
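To see why comparing line counts works, the comm call from the loop above can be tried on its own. This is a small sketch using throwaway files in a temporary directory; the names a.txt and b.txt are made up for illustration.

```shell
#!/bin/bash
# Illustration only: a.txt and b.txt are hypothetical sample files.
tmp=$(mktemp -d)
printf '%s\n' Line3 Line1 LineB > "$tmp/a.txt"
printf '%s\n' LineB Line7 Line1 > "$tmp/b.txt"

# comm needs sorted input; -12 suppresses the lines unique to either
# file, leaving only the lines present in both.
common=$(comm -12 <(sort "$tmp/a.txt") <(sort "$tmp/b.txt"))
printf '%s\n' "$common"    # prints Line1 and LineB, one per line

rm -r "$tmp"
```

Here a.txt has three lines but only two of them are shared, so the counts differ and b.txt would not be copied.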
comm -12 FILE1 FILE2 - output only lines that appear in both files
Upvotes: 1
Reputation: 1888
Assuming that the base file is inputFile and the destination directory is ../distDir/, you may run a Bash script like the following. It loops over all the files, compares each one against the base file, and copies it if required.
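#!/bin/bash
inputFile="./inputFile"
targetDir="../distDir/"
for file in *; do
    # Skip directories, empty files and the input file itself
    [ -f "$file" ] && [ -s "$file" ] || continue
    [ "$file" -ef "$inputFile" ] && continue
    # Lines of inputFile that are missing from $file
    dif=$(awk 'NR==FNR{a[$0];next}!($0 in a)' "$file" "$inputFile")
    if [ -z "$dif" ]; then
        # File contains all lines, copy
        cp "$file" "$targetDir"
    fi
done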
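As a possible refinement, sketched here under assumptions rather than taken from the script above, the awk check can report its result through the exit status instead of captured output. That lets awk stop at the first missing line and makes the test usable directly in an if or &&. The function name contains_all is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if FILE contains every line of INPUT.
contains_all() {
    local file=$1 input=$2
    # An empty FILE would make NR==FNR true for INPUT as well, so reject it early.
    [ -s "$file" ] || return 1
    awk 'NR==FNR { a[$0]; next }      # remember every line of FILE
         !($0 in a) { exit 1 }        # first missing line: fail immediately
        ' "$file" "$input"
}

# Possible use inside the loop (paths as assumed in the script above):
#   contains_all "$file" ./inputFile && cp "$file" ../distDir/
```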
Upvotes: 2