Insert name here

Reputation: 98

Find all directories containing a file that contains a keyword in Linux

In my hierarchy of directories I have many text files called STATUS.txt. These text files each contain one keyword such as COMPLETE, WAITING, FUTURE or OPEN. I wish to execute a shell command of the following form:

./mycommand OPEN

which will list all the directories that contain a file called STATUS.txt, where this file contains the text "OPEN".

In future I will want to extend this script so that the directories returned are sorted. Sorting will be determined by a numeric value stored in the file PRIORITY.txt, which lives in the same directory as STATUS.txt. However, this can wait until my competence level improves. For the time being I am happy to list the directories in any order.


I have searched Stack Overflow for the following, but to no avail:

I have tried the following shell commands:

This helps me identify all the directories that contain STATUS.txt

$ find ./ -name STATUS.txt

This reads STATUS.txt for every directory that contains it

$ find ./ -name STATUS.txt | xargs -I{} cat {}

This doesn't return any text; I was hoping it would return the name of each directory

$ find . -type d | while read d; do if [ -f STATUS.txt ]; then echo "${d}"; fi; done

Upvotes: 5

Views: 4412

Answers (6)

questionto42

Reputation: 9610

Taking up the accepted answer, it does not output a sorted and unique directory list. At the end of the "find" command, add:

| sort -u

or:

| sort | uniq

to get the unique list of the directories.
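
A minimal sketch of the complete pipeline, assuming the command being extended is the find-based one from another answer on this page (the accepted answer is not identified here). The function name `list_status_dirs` is hypothetical. Since each directory holds at most one STATUS.txt, the main visible effect of `sort -u` in this case is the ordering.

```shell
#!/bin/sh
# Hypothetical helper: print, sorted and without duplicates, every directory
# whose STATUS.txt contains the fixed string passed as the first argument.
list_status_dirs() {
    find . -type f -name 'STATUS.txt' \
        -exec grep -qF -- "$1" {} \; \
        -exec dirname {} \; | sort -u
}

# Use the first script argument as the keyword, defaulting to OPEN.
list_status_dirs "${1:-OPEN}"
```

Saved as, say, mycommand, this could be run as ./mycommand OPEN from the top of the directory hierarchy.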

Credits go to Get unique list of all directories which contain a file whose name contains a string.

Upvotes: 0

Wan B.

Reputation: 18845

Maybe you can try this:

grep -rl "OPEN" . --include='STATUS.txt' | sed 's/STATUS\.txt$//'

where grep -r means recursive, -l means list only the names of matching files, and '.' is the starting directory. Piping the result to sed removes the file name from the end of each path, leaving the directory.

You can then wrap this in a bash script file where you can pass in keywords such as 'OPEN', 'FUTURE' as an argument.

#!/bin/bash
grep -rl "$1" . --include='STATUS.txt' | sed 's/STATUS\.txt$//'

Upvotes: 3

Reinstate Monica Please

Reputation: 11613

Try something like this

find -type f -name "STATUS.txt" -exec grep -q "OPEN" {} \; -exec dirname {} \;

or in a script

#!/bin/bash
(($#==1)) || { echo "Usage: $0 <pattern>" && exit 1; }
find -type f -name "STATUS.txt" -exec grep -q "$1" {} \; -exec dirname {} \;

Upvotes: 1

Sylvain Leroux

Reputation: 52040

... or the other way around:

find . -name "STATUS.txt" -exec grep -lF "OPEN" \{} +

If you want to wrap that in a script, a good starting point might be:

#!/bin/sh

[ $# -ne 1 ] && echo "One argument required" >&2 && exit 2
find . -name "STATUS.txt" -exec grep -lF "$1" \{} +

As pointed out by @BroSlow, if you are looking for directories containing the matching STATUS.txt files, this might be more what you are looking for:

fgrep --include='STATUS.txt' -rl 'OPEN' | xargs -L 1 dirname 

Or better

fgrep --include='STATUS.txt' -rl 'OPEN' |
           sed -e 's|^[^/]*$|./&|' -e 's|/[^/]*$||'
#              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#            simulate `xargs -L 1 dirname` using `sed`
#      (no trailing `/`; returns `.` for path without dir part)
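
To see what the two sed expressions do, here is a small check that feeds a few made-up paths through them. The function name `strip_last_component` and the sample paths are only for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around the two sed expressions used above:
#   1st: prefix "./" to any path that has no slash at all
#   2nd: remove the last "/component" from the path
strip_last_component() {
    sed -e 's|^[^/]*$|./&|' -e 's|/[^/]*$||'
}

printf '%s\n' 'STATUS.txt' 'a/STATUS.txt' 'a/b/STATUS.txt' | strip_last_component
# prints:
# .
# a
# a/b
```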

Upvotes: 3

lsdr

Reputation: 1235

You could use grep and awk instead of find:

grep -r OPEN * | awk '{split($1, path, ":"); print path[1]}' | xargs -I{} dirname {}

The above grep will recursively print every line containing "OPEN" inside your directory structure, prefixed with the path of the file it was found in. The result will be something like:

dir_1/subdir_1/STATUS.txt:OPEN
dir_2/subdir_2/STATUS.txt:OPEN
dir_2/subdir_3/STATUS.txt:OPEN

Then the awk script will split this output at the colon and print the first part of it (the file path).

dir_1/subdir_1/STATUS.txt
dir_2/subdir_2/STATUS.txt
dir_2/subdir_3/STATUS.txt

The dirname will then return only the directory path, not the file name, which I suppose is what you want.

I'd consider using Perl or Python if you want to evolve this further, though, as it might get messier if you want to add priorities and sorting.

Upvotes: 0

wojciii

Reputation: 4323

IMHO you should write a Python script which:

  • Examines your directory structure and finds all files named STATUS.txt.
  • For each found file:
    • reads the file and executes mycommand depending on what the file contains.

If you want to extend the script later with sorting, you can find all the interesting files first, save them to a list, sort the list and execute the commands on the sorted list.
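
The same find, collect, sort steps can also be sketched in shell rather than Python, to match the other commands on this page. This is only a rough sketch under some assumptions: PRIORITY.txt holds a single number on its first line, lower numbers should come first, directories without a readable PRIORITY.txt are skipped, and no path contains a tab or newline. The function name `list_by_priority` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list directories whose STATUS.txt contains "$1",
# ordered by the number stored in the sibling PRIORITY.txt (lowest first).
list_by_priority() {
    # Step 1: find every STATUS.txt that contains the keyword.
    find . -type f -name 'STATUS.txt' -exec grep -lF -- "$1" {} + |
    while IFS= read -r status_file; do
        dir=$(dirname "$status_file")
        # Assumption: directories without a readable PRIORITY.txt are skipped.
        [ -r "$dir/PRIORITY.txt" ] || continue
        priority=$(head -n 1 "$dir/PRIORITY.txt")
        # Step 2: collect "priority<TAB>directory" pairs.
        printf '%s\t%s\n' "$priority" "$dir"
    done |
    # Step 3: sort numerically on the priority, then drop that column.
    sort -n -k1,1 | cut -f2-
}
```

After defining the function (for example by sourcing the file), calling list_by_priority OPEN from the top of the hierarchy prints the matching directories in priority order.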

Hint: http://pythonadventures.wordpress.com/2011/03/26/traversing-a-directory-recursively/

Upvotes: -1
