sim

Reputation: 488

How to check whether the files in a folder and its subfolders contain a particular string

import os
match_str = ['20210624']
not_match_str =  ['20210625']
for root, dirs, files in os.walk(path):
    for name in files:
        if name.endswith((".txt")):
             ## search files with match_str `20210624`  and not_match_str `20210625`

Can I do this using os.walk?

Upvotes: 6

Views: 641

Answers (4)

shdxiang

Reputation: 56

You can get the file names with several simple shell commands:

find . -name "*.txt" | xargs grep -l "20210624" | xargs grep -L "20210625"

Upvotes: 1

Red

Reputation: 27547

You can set the recursive keyword argument of the glob.glob() function to True so that the ** pattern matches files in the folder and in all of its subfolders.

from glob import glob

path = 'C:\\Users\\User\\Desktop'
for file in glob(path + '\\**\\*.txt', recursive=True):
    with open(file) as f:
        text = f.read()
        if '20210624'  in text and '20210625' not in text:
            print(file)

If you want to print only the file names rather than their full paths:

from glob import glob

path = 'C:\\Users\\User\\Desktop'
for file in glob(path + '\\**\\*.txt', recursive=True):
    with open(file) as f:
        text = f.read()
        if '20210624'  in text and '20210625' not in text:
            print(file.split('\\')[-1])

In order to use the os.walk() method, you can use the str.endswith() method (as you have done in your post) like so:

import os

for path, _, files in os.walk('C:\\Users\\User\\Desktop'):
    for file in files:
        if file.endswith('.txt'):
            with open(os.path.join(path, file)) as f:
                text = f.read()
                if '20210624'  in text and '20210625' not in text:
                    print(file)

And to search within a maximum level of subdirectories:

import os

levels = 2
root = 'C:\\Users\\User\\Desktop'
total = root.count('\\') + levels

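for path, dirs, files in os.walk(root):
    if path.count('\\') >= total:
        # Deepest allowed level reached: clear dirs in place so os.walk() does not
        # descend further, but keep walking sibling folders (break would stop the whole walk)
        dirs[:] = []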
    for file in files:
        if file.endswith('.txt'):
            print(os.path.join(path, file))

Upvotes: 6

PCM

Reputation: 3011

Continue from here -

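if name.endswith(".txt"):
    # root and name come from the os.walk() loop in the question
    with open(os.path.join(root, name)) as f:
        text = f.read()  # read once; a second f.read() would return an empty string
    if match_str[0] in text:
        print(name)  # the string is present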

If you have more than one entry in match_str, you can loop over the list and check each string against the text. Similarly, you can use the not in operator to check for the strings in not_match_str.
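
A minimal sketch of that idea, assuming both lists may hold several strings: a file qualifies when it contains every entry of match_str and none of not_match_str. The helper names file_matches and find_matching_files are made up for this illustration, and reading as UTF-8 with errors="ignore" is an assumption about the files.

```python
import os


def file_matches(text, match_str, not_match_str):
    """Return True if text contains every string in match_str and none in not_match_str."""
    return all(s in text for s in match_str) and not any(s in text for s in not_match_str)


def find_matching_files(root, match_str, not_match_str, suffix=".txt"):
    """Walk root recursively and return the full paths of matching files (hypothetical helper)."""
    found = []
    for folder, _, files in os.walk(root):
        for name in files:
            if not name.endswith(suffix):
                continue
            full_path = os.path.join(folder, name)
            # errors="ignore" is an assumption: undecodable bytes are skipped instead of raising
            with open(full_path, encoding="utf-8", errors="ignore") as f:
                if file_matches(f.read(), match_str, not_match_str):
                    found.append(full_path)
    return found
```

It could then be called as find_matching_files(path, match_str, not_match_str), with path, match_str and not_match_str defined as in the question.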

Upvotes: 1

crissal

Reputation: 2647

You can achieve this with pathlib and glob.

import pathlib
path = pathlib.Path(path)
maybe_valids = list(path.glob("*20210624*.txt"))
valids = [elem for elem in maybe_valids if "20210625" not in elem.name]
print(valids)

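The maybe_valids list holds every path whose file name contains "20210624" and ends with .txt, while valids keeps only those whose name does not contain "20210625". Note that this checks the file names, not the file contents, and that path.glob() with this pattern only looks in the top-level folder (path.rglob() also covers subfolders).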
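
If the string has to be found inside the files rather than in their names, a similar pathlib sketch could read each file instead. The helper name matching_paths and the utf-8 / errors="ignore" settings are assumptions for illustration only.

```python
from pathlib import Path


def matching_paths(root, wanted="20210624", unwanted="20210625"):
    """Return .txt paths under root whose contents include wanted but not unwanted."""
    result = []
    # rglob() walks root and every subfolder
    for p in Path(root).rglob("*.txt"):
        if not p.is_file():
            continue
        # errors="ignore" is an assumption: skip bytes that are not valid UTF-8
        text = p.read_text(encoding="utf-8", errors="ignore")
        if wanted in text and unwanted not in text:
            result.append(p)
    return result
```

Calling matching_paths(path) returns a list of Path objects; use p.name on each entry if only the file name is needed.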

Upvotes: 1
