Reputation: 12036
I have some directories with the following structure:
DAY1/ # Files under this directory should have DAY1 in the name.
|-- Date
| |-- dir1 # Something wrong here, there are files with DAY2 and files with DAY1.
| |-- dir2
| |-- dir3
| |-- dir4
DAY2/ # Files under this directory should all have DAY2 in the name.
|-- Date
| |-- dir1
| |-- dir2 # Something wrong here, there are files with DAY2, and files with DAY1.
| |-- dir3
| |-- dir4
In each dir there are hundreds of thousands of files with names containing DAY, for example 0.0000.DAY1.01927492. Files with DAY1 in the name should only appear under the parent directory DAY1.
Something went wrong when copying files around, so I now have a mix of files with DAY1 and DAY2 in some of the dir directories.
I wrote a script to find folders that contain mixed files, so I can then look at them more closely. My script is the following:
for directory in */; do
    if ls "$directory" | grep -q DAY2; then
        if ls "$directory" | grep -q DAY1; then
            echo "mixed files in $directory"
        fi
    fi
done
The problem here is that I'm going through all the files twice, which doesn't make sense considering that one pass should be enough.
What would be a more efficient way to achieve what I want?
Upvotes: 1
Views: 61
Reputation: 183554
Given that the difference between going through them once and going through them twice is just a factor-of-two difference, changing to an approach that goes through them only once might actually not be a win, since the new approach might easily take twice as long per file.
So you'll definitely want to experiment; it's not necessarily something that you can confidently reason about.
However, I will say that in addition to going through the files twice, the ls version also sorts the files, which probably has a more-than-linear cost (unless it's doing some kind of bucket sort). Eliminating that, by writing ls --sort=none instead of just ls, will actually improve your algorithmic complexity, and is almost certain to give a tangible improvement.
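For illustration, here is the asker's loop with the sort disabled. This is a sketch assuming GNU ls (--sort=none is a GNU long option); the small demo tree it builds is my own addition so the script can be run as-is:

```shell
#!/bin/bash
# Build a tiny throwaway example tree so this sketch is self-contained.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir dir1 dir2
touch dir1/0.0000.DAY1.01 dir1/0.0001.DAY2.02   # dir1 is mixed
touch dir2/0.0000.DAY1.03                        # dir2 is clean

# Same two-grep structure as the question, but ls no longer sorts,
# so each listing is linear in the number of files.
result=$(
    for directory in */; do
        if ls --sort=none "$directory" | grep -q DAY2; then
            if ls --sort=none "$directory" | grep -q DAY1; then
                echo "mixed files in $directory"
            fi
        fi
    done
)
echo "$result"
```

With GNU ls, -U is an equivalent short spelling of --sort=none.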
But FWIW, here's a version that only goes through the files once, that you can try:
for directory in */; do
    find "$directory" -maxdepth 1 \( -name '*DAY1*' -or -name '*DAY2*' \) -print0 \
      | { saw_day1=
          saw_day2=
          while IFS= read -r -d '' file; do
              if [[ "$file" == *DAY1* ]]; then
                  saw_day1=1
              fi
              if [[ "$file" == *DAY2* ]]; then
                  saw_day2=1
              fi
              if [[ "$saw_day1" ]] && [[ "$saw_day2" ]]; then
                  echo "mixed files in $directory"
                  break
              fi
          done
        }
done
Upvotes: 1
Reputation: 42117
If I understand you correctly, you need to recursively find the files under the DAY1 directory that have DAY2 in their names, and similarly, for the DAY2 directory, the files that have DAY1 in their names.
If so, for the DAY1 directory:
find DAY1/ -type f -name '*DAY2*'
This will get you the files under the DAY1 directory that have DAY2 in their names. Similarly, for the DAY2 directory:
find DAY2/ -type f -name '*DAY1*'
Both are recursive operations.
To get the directory names only:
find DAY1/ -type f -name '*DAY2*' -exec dirname {} +
Note that the current working directory ($PWD) will be shown as "." in the output.
To get unique directory names, pipe the output to sort -u:
find DAY1/ -type f -name '*DAY2*' -exec dirname {} + | sort -u
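Putting both checks together, a minimal sketch; the loop over the two day names and the demo tree it creates are my additions, while the find | sort -u pipeline is exactly the one above:

```shell
#!/bin/bash
# Build a small demo tree: one misplaced file under DAY1, one correct file under DAY2.
tmp=$(mktemp -d)
mkdir -p "$tmp/DAY1/Date/dir1" "$tmp/DAY2/Date/dir2"
touch "$tmp/DAY1/Date/dir1/0.0000.DAY2.01"   # DAY2 file under DAY1: misplaced
touch "$tmp/DAY2/Date/dir2/0.0000.DAY2.02"   # DAY2 file under DAY2: correct
cd "$tmp" || exit 1

# For each top-level day directory, list the subdirectories holding
# files tagged with the *other* day.
result=$(
    for day in DAY1 DAY2; do
        other=$([ "$day" = DAY1 ] && echo DAY2 || echo DAY1)
        find "$day/" -type f -name "*$other*" -exec dirname {} + | sort -u
    done
)
echo "$result"
```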
Upvotes: 2