Reputation: 411
I have a CSV file with various columns.
Two of the columns are of interest to me.
Data -
file_Id name
0 1 Distrustful
1 4 Ashamed
2 5 Depressed
3 6 Worried
4 7 Convinced
[767 rows x 2 columns]
The file_Id values are actually the names of files without their extensions (i.e. 1.wav, 4.wav, and so on), all saved in a particular folder.
I want a Python script to run over the dataframe, take each file_Id, add the extension, and then save the file in a directory named after the value in the name column, creating the directory if it doesn't exist.
Example -
1.wav is saved in Distrustful
4.wav in Ashamed and so on
My Attempt -
import os
import pandas as pd
df = pd.read_csv('C:\\Data.csv')
df1 = df.sort_values(['name', 'file_Id'])
df1 = df1.drop(columns=['Arousal', 'Valence', 'closest_pairs', 'min_distance'])
print (df1)
Result -
song_Id
name
Ambitious 28
Ashamed 45
Attentive 1
Bored 1
Confident 5
Convinced 85
.. ... ...
I actually have no clue now about how I should proceed. os.path.splitext
was my first guess, but it's not useful.
Upvotes: 0
Views: 355
Reputation: 1
import pandas
import shutil
import os
file = pandas.read_csv('trial.csv') #This is just an example
id = file.iloc[0:, 0]
folder = file.iloc[0:, 1]
real_path = os.getcwd() #Gets the current working directory
for i in range(len(folder)): #Since both id and folder have the same number of entries.
    old_id = str(id[i]) #i.e, '0'
    old_id_path = real_path + '\\' + old_id #i.e, 'C:\Windows\0'
    new_id = str(id[i]) + '.wav' #i.e, '0.wav'
    new_id_path = real_path + '\\' + new_id #i.e, 'C:\Windows\0.wav'
    new_folder = real_path + '\\' + folder[i] #i.e, 'C:\Windows\noise'
    destination_file = new_folder + '\\' + new_id #i.e, 'C:\Windows\noise\0.wav'
    if os.path.exists(destination_file) or not (os.path.exists(old_id_path) or os.path.exists(new_id_path)):
        #if the file is already in destination or doesn't exist at all (with or without extension)
        continue #Skip this loop
    if not os.path.exists(new_id_path): #i.e, if 'C:\Windows\0.wav' doesn't exist yet
        shutil.move(old_id_path, new_id_path) #i.e, renames the 0 to 0.wav
    os.makedirs(new_folder, exist_ok=True) #creates the new folder if needed, i.e, 'C:\Windows\noise'
    shutil.move(new_id_path, destination_file) #moves the file to the destination
    #i.e, 0.wav to noise, 'C:\Windows\noise\0.wav'
Upvotes: 0
Reputation: 3001
You can use pathlib.Path to create Pythonic path objects, which make it easy to create directories, add extensions, etc. Then use the shutil
module to copy the files efficiently.
from pathlib import Path
from shutil import copy2
# Path with files
source = Path("source/path")
# df is the dataframe from the question
for _, row in df.iterrows():
    # Convert to string and then create the file name with the .wav extension
    filename = Path(str(row["file_Id"])).with_suffix(".wav")
    # Target folder path
    target = Path(row["name"])
    # Create target folder
    target.mkdir(exist_ok=True)
    # Copy file
    copy2(source / filename, target / filename)
There are of course a few ways in which you could make this more efficient. One is probably to get all unique target directories and create those before iterating over the dataframe rows to copy the files.
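A minimal sketch of that idea: create each distinct target folder once, then copy the files in a second pass. The function name `copy_into_named_folders` and its `dest_root` and `suffix` parameters are made up for illustration. It assumes the `file_Id` and `name` columns from the question and that the IDs are integers (or strings), so that `1` becomes `1.wav`.

```python
from pathlib import Path
from shutil import copy2


def copy_into_named_folders(df, source, dest_root, suffix=".wav"):
    """Copy each <file_Id><suffix> from source into dest_root/<name>/.

    Hypothetical helper; df is expected to have "file_Id" and "name" columns.
    """
    source, dest_root = Path(source), Path(dest_root)

    # First pass: create every distinct target folder exactly once.
    # set() also works on a pandas column, so no per-row mkdir is needed.
    for name in set(df["name"]):
        (dest_root / str(name)).mkdir(parents=True, exist_ok=True)

    # Second pass: copy the files, skipping IDs with no matching file.
    copied = []
    for file_id, name in zip(df["file_Id"], df["name"]):
        filename = f"{file_id}{suffix}"
        src = source / filename
        if not src.exists():
            continue
        copied.append(Path(copy2(src, dest_root / str(name) / filename)))
    return copied
```

With the dataframe from the question this could be called as `copy_into_named_folders(df, "source/path", ".")`. It returns the list of copied paths, which makes it easy to see which IDs had no file in the source folder.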
Upvotes: 1