Ioannis K
Ioannis K

Reputation: 95

python os.path.dir example

I was watching an online course in Udacity about Deep learning and the concept was a simple classification for a not_Mnist dataset.Everything was explained way too good but i am a bit confused with parts of the code given.I would appreciate if you have the time to give me a hand ! For example we have an 'notMNIST_large.tar.gz' file. So at first we remove .tar.gz and the root is root = notMNIST_large. After that we check if there is already a directory with this name.If not we extract the subfolders from 'notMNIST_large.tar.gz' file and this is where i get a bit confused...

 num_classes = 10
 np.random.seed(133)

def maybe_extract(filename, force=False):
  root = os.path.splitext(os.path.splitext(filename)[0])[0]  # remove .tar.gz
  if os.path.isdir(root) and not force:
    # You may override by setting force=True.
    print('%s already present - Skipping extraction of %s.' % (root, filename))
  else:
    print('Extracting data for %s. This may take a while. Please wait.' % root)
    tar = tarfile.open(filename)
    sys.stdout.flush()
    tar.extractall(data_root)
    tar.close()
  data_folders = [
    os.path.join(root, d) for d in sorted(os.listdir(root))
    if os.path.isdir(os.path.join(root, d))]
  if len(data_folders) != num_classes:
    raise Exception(
      'Expected %d folders, one per class. Found %d instead.' % (
        num_classes, len(data_folders)))
  print(data_folders)
  return data_folders

train_folders = maybe_extract(train_filename)
test_folders = maybe_extract(test_filename)

So i would like if possible an explanation for this part

data_folders = [
    os.path.join(root, d) for d in sorted(os.listdir(root))
    if os.path.isdir(os.path.join(root, d))]
  if len(data_folders) != num_classes:
    raise Exception(
      'Expected %d folders, one per class. Found %d instead.' % (
        num_classes, len(data_folders))) 

Upvotes: 1

Views: 662

Answers (1)

tripleee
tripleee

Reputation: 189497

It collects a list of subdirectories and checks that there are the number it expected.

data_folders = [thing(d) for d in something() if predicate(d)]

is a list comprehension which loops over the result of something() and collects those items for which predicate is True. It applies thing() to those entries and collects the resulting list in data_folders.

Here, something is the listing of the files in the current directory, and predicate checks that the item is a directory (and not, for example, a regular file); thing is os.path.join(root,d) i.e. we add back the root directory in front of the extracted entries.

So, in this case, the code checks that the number of subdirectories is the same as the number of classes (presumably each subdirectory contains a class).

Upvotes: 2

Related Questions