Astrian_72954

Reputation: 411

Saving Files in Folder using information in CSV File

I have a CSV file with several columns.

Two of those columns are of interest to me.

Data -

     file_Id            name
0          1     Distrustful
1          4         Ashamed
2          5       Depressed
3          6         Worried
4          7       Convinced

[767 rows x 2 columns]

The file_Id column holds the names of the files without their extensions (the files themselves are 1.wav, 4.wav, and so on), all saved in one folder.

I want a Python script that iterates over the dataframe, takes each file_Id, adds the extension, and saves that file in a directory named after the value in the name column, creating the directory if it doesn't exist.

Example -

1.wav is saved in Distrustful

4.wav in Ashamed and so on

My Attempt -

import os
import pandas as pd

df = pd.read_csv('C:\\Data.csv')
df1 = df.sort_values(['name', 'file_Id'])
df1 = df1.drop(columns=['Arousal', 'Valence', 'closest_pairs', 'min_distance'])
print (df1)

Result -

               song_Id
name                  
Ambitious           28
Ashamed             45
Attentive            1
Bored                1
Confident            5
Convinced           85
..       ...        ...

I now have no clue how to proceed. os.path.splitext was my first guess, but it's not useful here.

Upvotes: 0

Views: 355

Answers (2)

eaverine

Reputation: 1

import os
import shutil

import pandas

file = pandas.read_csv('trial.csv') #This is just an example

ids = file.iloc[:, 0] #First column: the file ids
folders = file.iloc[:, 1] #Second column: the folder names

real_path = os.getcwd() #Gets the current working directory

for i in range(len(folders)): #Since both ids and folders have the same number of rows
    old_id = str(ids.iloc[i]) #i.e, '0'
    old_id_path = os.path.join(real_path, old_id) #i.e, 'C:\Windows\0'

    new_id = old_id + '.wav' #i.e, '0.wav'
    new_id_path = os.path.join(real_path, new_id) #i.e, 'C:\Windows\0.wav'

    new_folder = os.path.join(real_path, str(folders.iloc[i])) #i.e, 'C:\Windows\noise'

    destination_file = os.path.join(new_folder, new_id) #i.e, 'C:\Windows\noise\0.wav'

    if os.path.exists(destination_file):
        continue #The file is already in its destination, skip this row

    if not os.path.exists(new_id_path): #i.e, if 'C:\Windows\0.wav' doesn't exist yet
        if not os.path.exists(old_id_path):
            continue #Neither '0' nor '0.wav' exists, skip this row
        shutil.move(old_id_path, new_id_path) #i.e, renames 0 to 0.wav

    os.makedirs(new_folder, exist_ok=True) #Creates the folder if needed, i.e, 'C:\Windows\noise'

    shutil.move(new_id_path, destination_file) #Moves the file to the destination
    #i.e, 0.wav to noise, 'C:\Windows\noise\0.wav'

Upvotes: 0

You can use pathlib.Path to create Pythonic path objects, which make it easy to create directories, add extensions, and so on. Then use the shutil module to copy the files.

from pathlib import Path
from shutil import copy2

import pandas as pd

# The CSV from the question
df = pd.read_csv("C:\\Data.csv")

# Path with files
source = Path("source/path")

for _, row in df.iterrows():
    # Convert the id to a string and add the .wav extension used in the question
    filename = Path(str(row["file_Id"])).with_suffix(".wav")

    # Target folder path
    target = Path(row["name"])

    # Create target folder (no error if it already exists)
    target.mkdir(exist_ok=True)

    # Copy file
    copy2(source / filename, target / filename)

There are of course a few ways in which you could make this more efficient. For example, you could collect all unique target directories and create them once, before iterating over the dataframe rows to copy the files.
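A minimal sketch of that idea, assuming the ids and folder names are passed in as pairs. The function name copy_by_label, its parameters, and the returned list of missing files are my own choices for illustration and not part of the code above.

```python
from pathlib import Path
from shutil import copy2


def copy_by_label(pairs, source, dest_root, suffix=".wav"):
    """Copy <file_id><suffix> from source into dest_root/<name>/ for each pair."""
    source, dest_root = Path(source), Path(dest_root)
    # Materialise the pairs as strings so they can be walked twice.
    pairs = [(str(file_id), str(name)) for file_id, name in pairs]

    # Create each distinct target folder once, before the copy loop.
    for name in {name for _, name in pairs}:
        (dest_root / name).mkdir(parents=True, exist_ok=True)

    missing = []
    for file_id, name in pairs:
        filename = file_id + suffix
        src = source / filename
        if not src.is_file():
            missing.append(filename)  # collect instead of raising mid-run
            continue
        copy2(src, dest_root / name / filename)
    return missing
```

With the dataframe from the question, the call would look something like copy_by_label(zip(df["file_Id"], df["name"]), "source/path", "."). Iterating over the two columns with zip also avoids building a Series per row the way iterrows() does. Returning the missing file names rather than raising is just one option; it lets the run finish and leaves you a list to check afterwards.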

Upvotes: 1
