Jordan Garner

Reputation: 147

Bash for loop to pull text from a file recursively

I am having trouble writing a Bash for loop script that can extract the contents of a specific file that is common to many child directories under a parent directory.

Directory structure:

/Parent/child/grand_child/great_grand_child/file

There are many child, grand_child, and great_grand_child folders.

I want my script to do the following (in pseudo-code):

For EVERY grand_child folder, in EVERY child folder:

  1. Search through ONLY ONE great_grand_child folder
  2. Find the file named 0001.txt
  3. Print the text in row 10 of 0001.txt to an output file
  4. In the next column of the output file, print the full directory path to the file that the text was extracted from

My script so far:

for i in /Parent/**; do
    if [ -d "$i" ]; then
        echo "$i"
    fi
done

Can I have some help designing this script? So far this gives me the path to each grand_child folder, but I don't know how to isolate just one great_grand_child folder, and then ask for text in row 10 of the 0001.txt file inside the great_grand_child folder.

Upvotes: 0

Views: 128

Answers (2)

Ewan Mellor

Reputation: 6847

# For every grandchild directory like Parent/Child/Grandchild
for grandchild in Parent/*/*
do
   # Look for a file like $grandchild/Greatgrandchild/0001.txt
   for file in "$grandchild/"*/0001.txt
   do
     # If there is no such file, just skip this Grandchild directory.
     if [ ! -f "$file" ]
     then
       echo "Skipping $grandchild, no 0001.txt files" >&2
       continue
     fi

     # Otherwise print the 10th line and the file that it came from.
     awk 'FNR == 10 { print $0, FILENAME }' "$file"

     # Don't look at any more 0001.txt files in this Grandchild directory,
     # we only care about one of them.
     break
   done
done
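
To get the two-column output file the question asks for, the same loop could be wrapped in a function along these lines. This is only a sketch: the function name extract_line10, the tab separator, and taking the parent directory and output file as arguments are my assumptions, not part of the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract_line10 PARENT OUTPUT_FILE
# Writes "line 10 <TAB> path" for one 0001.txt per grandchild directory.
extract_line10() {
    local parent=$1 output=$2 grandchild file
    # The trailing slash restricts the glob to directories.
    for grandchild in "$parent"/*/*/; do
        for file in "$grandchild"*/0001.txt; do
            # An unmatched glob stays literal, so check the file really exists.
            [ -f "$file" ] || continue
            # Column 1: line 10 of the file. Column 2: path to the file.
            awk -v OFS='\t' 'FNR == 10 { print $0, FILENAME; exit }' "$file"
            # One great-grandchild per grandchild is enough.
            break
        done
    done > "$output"
}
```

Called as extract_line10 /Parent output.tsv, it prints full paths as long as the parent directory is given as an absolute path. A file with fewer than 10 lines produces no row.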

Upvotes: 1

Jonathan Leffler

Reputation: 754090

Provided that the names are sane (no spaces or other awkward characters), I'd probably go with:

find /Parent -name '0001.txt' |
sort -t / -k2,2 -k3,3 -k4,4 -u |
xargs awk 'FNR == 10 { print $0, FILENAME }' > output.file

Find the files named 0001.txt under /Parent. Sort the list so that there is just one entry per /Parent/Child/Grandchild; because every path starts with /, field 1 is empty and the Parent, Child, and Grandchild components are fields 2 to 4. Run awk as often as necessary, printing line 10 of each file along with the file name. Capture the output in output.file.
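
If the names might contain spaces after all, a NUL-delimited variant of the same pipeline is one option. The sketch below assumes GNU find, sort, and xargs (for -print0, -z, -0, and -r). The function name line10_per_grandchild and the tab-separated output are my own choices. It changes into the parent directory first so that the sort keys do not depend on how deep the parent sits in the file system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: line10_per_grandchild PARENT > OUTPUT_FILE
# Assumes GNU find, sort and xargs (-print0, -z, -0, -r).
# The body runs in a subshell, so the cd does not affect the caller.
line10_per_grandchild() (
    cd "$1" || exit 1
    # Paths look like ./child/grand_child/great_grand_child/0001.txt, so with
    # "/" as the separator field 2 is the child and field 3 the grand_child.
    find . -mindepth 4 -maxdepth 4 -type f -name '0001.txt' -print0 |
    sort -z -t / -k2,2 -k3,3 -u |
    xargs -0 -r awk -v root="$PWD" -v OFS='\t' \
        'FNR == 10 { print $0, root substr(FILENAME, 2) }'
)
```

For example, line10_per_grandchild /Parent > output.file. Which great-grandchild is kept for each grandchild depends on the order in which find lists them, so it is effectively arbitrary. Newlines in names would still break the one-row-per-file output.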

Upvotes: 1
