Reputation: 11
import os
import shutil
from collections import Counter as cnt

curr_path = os.getcwd()
ls = lambda path: [x for x in os.listdir(path) if x[0] != '.']
raw_files = [x for x in ls(curr_path) if '_' not in x]
filenames = []
folders = []

for f in raw_files:  # sweeping out the '.', only having filenames
    try:
        i = f.index('.')
        f_new = f[0:i]
        filenames.append(f_new)
    except ValueError:
        filenames.append(f)

fname_freq = cnt(filenames)
for fname, freq in fname_freq.items():
    if freq > 1:
        folders.append(fname)

for fldr in folders:
    print(fldr+'\n')
    try:
        os.makedirs(fldr+'_')
        # adding a '_' to make the true folder name, i.e. foldername_ instead of foldername
    except OSError:
        pass

for f in raw_files:
    print("File being analyzed is: {fn} \n".format(fn=f))
    for fldr in folders:
        print("Folder to move stuff to is {fold}\n".format(fold=fldr))
        print("File {f} being checked . . . ".format(f=f))
        if fldr in f and f[-1] != '_':
            print("\t Moving file {fn} \n".format(fn=f))
            shutil.move(f, fldr+'_')
The above is a program, FileGrouper.py, which looks inside a directory for multiple files that share a name (e.g. random.java, random.class). If there are any, they are moved to a directory named [insert shared name]_ to organize them.
This code refuses to work for one specific set of files.
Note: I replaced my actual username and the timestamps with placeholders.
-rwxr-xr-x 1 UserName staff 8480 Month 20 Time:Time another_ptr_func
-rw-r--r--@ 1 UserName staff 324 Month 20 Time:Time another_ptr_func.c
-rwxr-xr-x 1 UserName staff 8572 Month 20 Time:Time arrays1
-rw-r--r-- 1 UserName staff 321 Month 20 Time:Time arrays1.c
-rw-r--r--@ 1 UserName staff 2058 Month 20 Time:Time file_grouper.py
-rwxr-xr-x 1 UserName staff 8432 Month 20 Time:Time forloop
-rw-r--r-- 1 UserName staff 119 Month 20 Time:Time forloop.c
-rwxr-xr-x 1 UserName staff 4248 Month 20 Time:Time gen_a
-rw-r--r-- 1 UserName staff 53 Month 20 Time:Time gen_a.c
-rwxr-xr-x 1 UserName staff 8432 Month 20 Time:Time hello
-rw-r--r-- 1 UserName staff 65 Month 20 Time:Time hello.c
-rwxr-xr-x 1 UserName staff 8432 Month 20 Time:Time hello2
-rw-r--r-- 1 UserName staff 140 Month 20 Time:Time hello2.c
-rw-r--r--@ 1 UserName staff 343 Month 20 Time:Time pointer.c
-rwxr-xr-x 1 UserName staff 8472 Month 20 Time:Time pointer_func
-rw-r--r-- 1 UserName staff 352 Month 20 Time:Time pointer_func.c
-rwxr-xr-x@ 1 UserName staff 19340 Month 20 Time:Time pointer_hex
-rwxr-xr-x 1 UserName staff 8432 Month 20 Time:Time switch
-rw-r--r--@ 1 UserName staff 219 Month 20 Time:Time switch.c
-rwxr-xr-x 1 UserName staff 8480 Month 20 Time:Time void_another_ptr_func
-rw-r--r-- 1 UserName staff 335 Month 20 Time:Time void_another_ptr_func.c
The above is the specific set of files that this code refuses to work for. I have tested the same code on the following directory, which contains dummy files of zero size (created with touch):
-rw-r--r-- 1 UserName staff 1183 Month 20 Time:Time README.md
-rw-r--r--@ 1 UserName staff 2067 Month 20 Time:Time file_grouper.py
-rw-r--r-- 1 UserName staff 0 Month 20 Time:Time random1
-rw-r--r-- 1 UserName staff 0 Month 20 Time:Time random1.txt
-rw-r--r-- 1 UserName staff 0 Month 20 Time:Time random2
-rw-r--r-- 1 UserName staff 0 Month 20 Time:Time random2.txt
-rw-r--r-- 1 UserName staff 0 Month 20 Time:Time random3
-rw-r--r-- 1 UserName staff 0 Month 20 Time:Time random3.txt
-rw-r--r-- 1 UserName staff 0 Month 20 Time:Time random4
-rw-r--r-- 1 UserName staff 0 Month 20 Time:Time random4.txt
-rw-r--r-- 1 UserName staff 0 Month 20 Time:Time random5.txt
The program worked on these files and produced the following:
.
├── README.md
├── file_grouper.py
├── random1_
│   ├── random1
│   └── random1.txt
├── random2_
│   ├── random2
│   └── random2.txt
├── random3_
│   ├── random3
│   └── random3.txt
├── random4_
│   ├── random4
│   └── random4.txt
└── random5.txt
As you can see, the program correctly ignored the lone random5.txt, since no other file shares its name. The files that did share a name with another file were each grouped into a folder named [insert shared name]_.
Upvotes: 1
Views: 305
Reputation: 5698
If by "refusing to work" you mean certain files are skipped, then that could be because of this line:
raw_files = [x for x in ls(curr_path) if '_' not in x]
Every file with an underscore in its name is excluded from raw_files, so it is never considered for grouping.
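To make that concrete, here is a minimal sketch that applies the same filter to a few names taken from the listing in the question. The list `names` is a hand-picked sample rather than the full directory, and `kept` and `skipped` are illustrative variable names only. The last line goes slightly beyond the point above: the later `fldr in f` check is a substring test, so it may be worth checking whether `hello2` is also matched by the `hello` folder.

```python
# Hand-picked sample of names from the question's listing
# (illustrative only, not the full directory).
names = ["another_ptr_func", "another_ptr_func.c", "gen_a", "gen_a.c",
         "hello", "hello.c", "hello2", "hello2.c", "pointer.c"]

# Same filter as in the question: anything containing '_' is dropped.
kept = [n for n in names if '_' not in n]
skipped = [n for n in names if '_' in n]

print("kept:   ", kept)
print("skipped:", skipped)

# `fldr in f` is a substring test, so the folder name 'hello'
# also matches the file name 'hello2'.
print('hello' in 'hello2')
```

If that substring match is what happens in your run, hello2 would be moved into one of hello_ or hello2_ first, and the next attempt to move it would fail because the file is no longer in the current directory.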
Another issue is that a file system will not allow a file and a directory with the same name in the same location, which you already handle by appending an underscore.
I'd strongly suggest using the pathlib library; it makes your code more maintainable and readable:
from pathlib import Path

def file_grouper():
    # Path to group
    path = Path('.')
    # List all files
    ls = path.glob('*')
    # Map all stems (file names without extension) to their file names
    names = {}
    for x in ls:
        if x.is_file():
            if x.stem not in names:
                names[x.stem] = []
            names[x.stem].append(x)
    # Create and move
    for stem, values in names.items():
        if len(values) > 1:
            (path / (stem + '_')).mkdir(exist_ok=True)
            for value in values:
                value.rename(path / (stem + '_') / value.name)

file_grouper()
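If you want to try this without touching your real source directory, a sketch along the following lines may help. It is a variation on the function above, not a replacement: `group_by_stem` and its `root` parameter are hypothetical names introduced here, and the file names are a small sample from the question created as empty files in a temporary directory.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def group_by_stem(root):
    """Move files that share a stem into '<stem>_' under root (hypothetical helper)."""
    root = Path(root)
    by_stem = defaultdict(list)
    # Collect the listing up front so moving files does not disturb iteration.
    for entry in list(root.iterdir()):
        if entry.is_file():
            by_stem[entry.stem].append(entry)
    for stem, files in by_stem.items():
        if len(files) > 1:
            target = root / (stem + '_')
            target.mkdir(exist_ok=True)
            for f in files:
                f.rename(target / f.name)


# Run on throwaway files instead of a real directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["hello", "hello.c", "hello2", "hello2.c",
                 "gen_a", "gen_a.c", "pointer.c"]:
        (Path(tmp) / name).touch()
    group_by_stem(tmp)
    # Record the resulting layout as POSIX-style relative paths.
    result = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob('*'))

print(result)
```

Because files are grouped by exact stem rather than by substring, hello2 ends up in hello2_ and not in hello_, and names containing underscores such as gen_a are grouped too. Note that Path.stem strips only the last suffix, so archive.tar.gz has the stem archive.tar, whereas the code in the question cuts at the first dot.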
Upvotes: 1