mr.zog

Reputation: 540

How do I perform a comparison using a generator object ( pathlib.iterdir ) in Python3?

I have a ton of files in a directory named calls. All of these files contain in their filenames their creation date, ex: 20181022_151012_kK029150d6.xml

I need to find all the files whose creation date is >= 180 days old. I'm using pathlib to collect the file names and can print the file names. I want to do something like this:

calls = Path('/Users/muh/Python/calls')
for fyle in calls.iterdir():
    datetime.strptime(fyle[:8], "%Y%m%d")

but I get "'PosixPath' object is not subscriptable"

I need to compare the YYYYMMDD in each filename to the current YYYYMMDD, is all.

Upvotes: 4

Views: 690

Answers (1)

hygull

Reputation: 8740

As @juanpa.arrivillaga suggested, use fyle.name[:8]. A Path object cannot be sliced, but its name attribute is a plain string and can.

Suggestion: Whenever you are stuck on this kind of problem, inspect the object to see which attributes and methods it defines, as follows.

>>> contents = calls.iterdir()
>>> 
>>> content = next(contents)
>>> 
>>> content
PosixPath('/Users/hygull/Projects/django1.9.x-docs/Sfw/file_handling/calls/20181022_151012_kK029150d6.xml')
>>> 
>>> dir(content)
['__bytes__', '__class__', '__delattr__', '__div__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__rdiv__', '__reduce__', '__reduce_ex__', '__repr__', '__rtruediv__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', '__truediv__', '_accessor', '_cached_cparts', '_cparts', '_drv', '_flavour', '_format_parsed_parts', '_from_parsed_parts', '_from_parts', '_hash', '_init', '_make_child', '_make_child_relpath', '_opener', '_parse_args', '_parts', '_pparts', '_raw_open', '_root', '_str', 'absolute', 'anchor', 'as_posix', 'as_uri', 'chmod', 'cwd', 'drive', 'exists', 'glob', 'group', 'is_absolute', 'is_block_device', 'is_char_device', 'is_dir', 'is_fifo', 'is_file', 'is_reserved', 'is_socket', 'is_symlink', 'iterdir', 'joinpath', 'lchmod', 'lstat', 'match', 'mkdir', 'name', 'open', 'owner', 'parent', 'parents', 'parts', 'relative_to', 'rename', 'replace', 'resolve', 'rglob', 'rmdir', 'root', 'stat', 'stem', 'suffix', 'suffixes', 'symlink_to', 'touch', 'unlink', 'with_name', 'with_suffix']
>>> 

In the above list, you will find entries like [..., 'mkdir', 'name', 'open', 'owner', 'parent', ...], and 'name' is one of them. You can then evaluate fyle.name and type(fyle.name) to check whether it is a string or something else.
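As a minimal sketch of that check, the snippet below builds a Path by hand instead of taking it from iterdir(), so it runs without the directory existing. The path is the example file from the question; nothing is read from disk.

```python
from pathlib import Path

# Hypothetical stand-in for one item yielded by calls.iterdir()
fyle = Path("/Users/muh/Python/calls/20181022_151012_kK029150d6.xml")

name = fyle.name       # final path component
print(type(name))      # <class 'str'>
print(name[:8])        # 20181022
```

Because name is a str, slicing works on it even though it fails on the Path itself.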

Solution:

You can do it like this:

from pathlib import Path
from datetime import datetime

calls = Path("/Users/muh/Python/calls")

for fyle in calls.iterdir():
    # fyle is a Path object; fyle.name is the file name as a plain str
    date = datetime.strptime(fyle.name[:8], "%Y%m%d")

    # Write logic here 
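One possible way to fill in that logic for the 180-day check from the question is sketched below. The function name files_at_least_days_old and its parameters are made up for illustration, and it assumes that files whose names do not start with a valid YYYYMMDD should simply be skipped.

```python
from datetime import datetime, timedelta
from pathlib import Path


def files_at_least_days_old(directory, days=180, today=None):
    """Return files whose YYYYMMDD name prefix is at least `days` days before `today`."""
    today = today or datetime.now()
    cutoff = today - timedelta(days=days)
    old_files = []
    for fyle in Path(directory).iterdir():
        try:
            date = datetime.strptime(fyle.name[:8], "%Y%m%d")
        except ValueError:
            # Assumption: names without a leading YYYYMMDD are ignored
            continue
        if date <= cutoff:
            old_files.append(fyle)
    return sorted(old_files)
```

Calling files_at_least_days_old("/Users/muh/Python/calls") would then give the list of Path objects to process. The today parameter is only there so the cutoff can be pinned to a fixed date when testing.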

Detailed:

In the code below, I've stored the details in a dictionary so that you can see each intermediate value the code produces.

In my case, the path to calls directory is /Users/hygull/Projects/django1.9.x-docs/Sfw/file_handling/calls.

I have stored each intermediate value to help you work through the problem. I did not introduce new variables other than d and details, and I reused your variable fyle several times for different purposes. (That is acceptable in a short script where the variable has no further use, but meaningful variable names are better in larger applications.)

date is the actual datetime object that you can use for manipulation to achieve your final goal.

from pathlib import Path
from datetime import datetime

calls = Path("/Users/hygull/Projects/django1.9.x-docs/Sfw/file_handling/calls")
details = {}

i = 1
for fyle in calls.iterdir():
    d = {}
    d["pathlib"] = fyle

    fyle = str(fyle)
    d["fullpath"] = fyle

    # fyle is a plain str at this point and has no .name, so take the
    # file name from the Path object saved above
    fyle = d["pathlib"].name
    d["file_name"] = fyle

    date = datetime.strptime(fyle[:8], "%Y%m%d")
    d["date"] = date

    # Write your business logic here

    details["file" + str(i)] = d
    i += 1

print(details)

Output

{'file2': {'date': datetime.datetime(2018, 10, 25, 0, 0), 'file_name': '20181025_151013_kK029150d7.xml', 'fullpath': '/Users/hygull/Projects/django1.9.x-docs/Sfw/file_handling/calls/20181025_151013_kK029150d7.xml', 'pathlib': PosixPath('/Users/hygull/Projects/django1.9.x-docs/Sfw/file_handling/calls/20181025_151013_kK029150d7.xml')}, 'file1': {'date': datetime.datetime(2018, 10, 22, 0, 0), 'file_name': '20181022_151012_kK029150d6.xml', 'fullpath': '/Users/hygull/Projects/django1.9.x-docs/Sfw/file_handling/calls/20181022_151012_kK029150d6.xml', 'pathlib': PosixPath('/Users/hygull/Projects/django1.9.x-docs/Sfw/file_handling/calls/20181022_151012_kK029150d6.xml')}}

Upvotes: 1
