Reputation: 241
I have a collection of files that all have a specific sequence in them. The files are named sequentially, and I want to copy over the first instance of each file that has a unique sequence.
For example,
1.txt Content: 1[Block]Alpha[/Block]1
2.txt Content: 2[Block]Beta[/Block]2
3.txt Content: 3[Block]Charlie[/Block]3
4.txt Content: 4[Block]Alpha[/Block]4
I want the output to be
Alpha.txt Content: 1[Block]Alpha[/Block]1
Beta.txt Content: 2[Block]Beta[/Block]2
Charlie.txt Content: 3[Block]Charlie[/Block]3
4.txt is missing, as it has 'Alpha' in it which a previous file already matched on.
Currently, I have the following:
ls | sort -r | xargs grep -oE -m 1 '\[Block\].{0,40}\[/Block\]'
#which returns:
1.txt:[Block]Alpha[/Block]
2.txt:[Block]Beta[/Block]
3.txt:[Block]Charlie[/Block]
4.txt:[Block]Alpha[/Block]
I want to take the filename to the left of the ':' and copy that file to a new name built from what is to the right of it: either the whole match (including the Block tags) plus .txt, or just Alpha.txt (for example).
cp has the -n flag for no overwriting, so as long as I process the files in sequence I should have no issue there, but I am a bit lost on how to continue.
Upvotes: 2
Views: 670
Reputation: 5975
Here is a solution that uses one awk process to do the search and extract both the filenames and the text between the Block tags. At the first occurrence in each file, it checks whether the matched text has been used already; if not, it prints the filename and the new name, and in either case it moves on to the next file. The output is piped to xargs -n2 with the cp command.
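#!/bin/bash
# For each file, act on the first line that contains a [Block]...[/Block] pair.
awk '/\[Block\].*\[\/Block\]/ {
    # Strip everything up to and including [Block], and from [/Block] onward.
    gsub(/^.*\[Block\]/,""); gsub(/\[\/Block\].*$/,"")
    # Print "source target" only the first time this text is seen.
    if (!a[$0]++) print FILENAME, $0 ".txt"
    nextfile
}' *.txt | xargs -n2 echo cp -n --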
Note: remove echo after you are done with testing.
Testing with your sample files:
> sh test.sh
cp -n -- 1.txt Alpha.txt
cp -n -- 2.txt Beta.txt
cp -n -- 3.txt Charlie.txt
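If your file names may contain spaces, xargs -n2 will split them in the wrong places. Below is a minimal sketch of the same idea as a plain shell loop, under some assumptions: copy_unique_blocks is a hypothetical helper name, the sources are the .txt files of one directory, the copies go to a separate destination directory, and the block text contains no '/' or '|'. It tracks the names it has already seen in a variable and uses sort -n so that 2.txt is processed before 10.txt.

```shell
#!/bin/sh
# copy_unique_blocks is a hypothetical helper name, not part of any tool.
# Usage: copy_unique_blocks SRC_DIR DEST_DIR
copy_unique_blocks() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    # List the .txt files in numeric order (so 2.txt precedes 10.txt).
    ls "$src_dir" | grep '\.txt$' | sort -n | {
        seen='|'
        while IFS= read -r file; do
            # Text between the tags on the first matching line of the file.
            name=$(sed -n 's|.*\[Block\]\(.*\)\[/Block\].*|\1|p' "$src_dir/$file" | head -n 1)
            [ -n "$name" ] || continue
            case $seen in
                *"|$name|"*) continue ;;   # an earlier file already had this block
            esac
            seen="$seen$name|"
            cp -- "$src_dir/$file" "$dest_dir/$name.txt"
        done
    }
}
```

You would call it as, for example, copy_unique_blocks . ./out from the directory that holds the numbered files.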
Upvotes: 1
Reputation: 122
In your case, you want to rename the files in a directory after a pattern matched in their content, and remove any file that duplicates an earlier one?
I tested this in the directory /tmp/test, which contains 4 files (1.txt, 2.txt, 3.txt, 4.txt), and wrote a shell script to do it.
The shell script is below:
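#!/bin/bash
cd /tmp/test || exit 1
files=$(ls)
for i in $files; do
    # Drop the "Block" tags, then keep the first remaining run of letters.
    pattern=$(sed "s/Block//g" "$i" | grep -o "[A-Za-z][A-Za-z]*" | head -n 1)
    # grep -w prints the list when the pattern is already in it.
    if ! echo "$pattern_list" | grep -w "$pattern"; then
        echo "Rename $i to ${pattern}.txt"
        mv "$i" "${pattern}.txt"
        pattern_list+="$pattern "
    else
        rm "$i"
    fi
done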
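Brief explanation: for each file, the script extracts the word between the Block tags. If that word has not been seen yet, the file is renamed to that word plus .txt; otherwise the file is deleted.
The result is as below: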
bash /tmp/myscript.sh
Rename 1.txt to Alpha.txt
Rename 2.txt to Beta.txt
Rename 3.txt to Charlie.txt
Alpha Beta Charlie
ls
Alpha.txt Beta.txt Charlie.txt
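If you would rather keep the originals and copy, as the question describes, here is a rough sketch that works on the file:match lines that grep -o prints. It is only an illustration under assumptions: copy_from_grep_output is a hypothetical helper name, the file names contain no ':', and the destination directory starts out empty (an existing Name.txt there counts as "already seen"). It splits each line at the first ':' with parameter expansion and checks for the target itself instead of relying on cp -n.

```shell
#!/bin/sh
# copy_from_grep_output is a hypothetical helper name.
# Reads lines of the form "file:[Block]Name[/Block]" on stdin and copies
# each file to DEST_DIR/Name.txt unless that target already exists.
copy_from_grep_output() {
    dest_dir=$1
    mkdir -p "$dest_dir" || return 1
    while IFS= read -r line; do
        file=${line%%:*}            # everything before the first ':'
        match=${line#*:}            # everything after the first ':'
        name=${match#"[Block]"}     # drop the opening tag
        name=${name%"[/Block]"}     # drop the closing tag
        if [ ! -e "$dest_dir/$name.txt" ]; then
            cp -- "$file" "$dest_dir/$name.txt"
        fi
    done
}
```

It could be fed with something like grep -H -oE -m 1 '\[Block\].{0,40}\[/Block\]' *.txt | copy_from_grep_output out. Note that *.txt expands in lexical order, so 10.txt would come before 2.txt.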
Upvotes: 0