Adriano Ribeiro

Reputation: 89

Search and copy files listed in a dataframe

Hi, I'm working on a simple script that copies files from one directory to another based on a DataFrame that contains a list of invoices.

Is there any way to do this as a partial match? I want all the files whose names contain "F11000", "G13000", and so on, continuing the loop until there is no more data in the DataFrame.

I tried to figure it out myself, and I'm pretty sure changing the "x" in the copy function will do the trick, but I can't see how.

import pandas as pd
import os
import glob
import shutil

data = {'Invoice':['F11000','G13000','H14000']}

df = pd.DataFrame(data,columns=['Invoice'])  # column name must match the key in data


path = 'D:/Pyfilesearch'
dest = 'D:/Dest'

def find(name,path):
    for root,dirs,files in os.walk(path):
        if name in files:
            return os.path.join(root,name)


def copy():
    for x in df['Invoice']:
        shutil.copy(find(x,path),dest)



copy()

Upvotes: 2

Views: 807

Answers (1)

Trenton McKinney

Reputation: 62403

Using pathlib

from pathlib import Path
import pandas as pd
import shutil

# convert paths to pathlib objects
path = Path('D:/Pyfilesearch')
dest = Path('D:/Dest')

# find files and copy
for v in df.Invoice.unique():  # iterate through unique column values
    files = list(path.rglob(f'*{v}*'))  # create a list of files for a value
    files = [f for f in files if f.is_file()]  # if not using file extension, verify item is a file
    for f in files:  # iterate through and copy files
        print(f)
        shutil.copy(f, dest)
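
For comparison, the find function in the question only returns exact matches, because `name in files` tests whether the list of file names contains an entry equal to `name`. A minimal sketch of a substring-matching variant that stays with os.walk is shown below; the name find_all is hypothetical, and it returns every match rather than only the first.

```python
import os


def find_all(fragment, path):
    """Return the full path of every file under `path` whose name contains `fragment`."""
    matches = []
    for root, dirs, files in os.walk(path):
        for filename in files:
            # substring test on each file name, instead of `fragment in files`,
            # which only succeeds when a file is named exactly `fragment`
            if fragment in filename:
                matches.append(os.path.join(root, filename))
    return matches
```

Each returned path could then be passed to shutil.copy in a loop, much like the pathlib version above.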

Copy to subdirectories for each value

path = Path('D:/Pyfilesearch')

for v in df.Invoice.unique():
    dest = Path('D:/Dest')
    files = list(path.rglob(f'*{v}*'))
    files = [f for f in files if f.is_file()]
    dest = dest / v  # create path with value
    if not dest.exists():  # check if directory exists
        dest.mkdir(parents=True)  # if not, create directory
    for f in files:
        shutil.copy(f, dest)
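
If this needs to be reused, the loop above could be wrapped in a function along these lines. This is only a sketch: the name copy_matching and its return value are assumptions, not part of the answer. It also swaps the glob pattern for a plain substring test on the file name, so values containing glob special characters such as `[` or `*` would be matched literally.

```python
from pathlib import Path
import shutil


def copy_matching(values, src, dest):
    """Copy files under `src` whose names contain a value into dest/<value>.

    Returns a dict mapping each unique value to the list of copied paths.
    """
    src, dest = Path(src), Path(dest)
    # list the source files once, before anything is copied
    candidates = [f for f in src.rglob('*') if f.is_file()]
    copied = {}
    for v in dict.fromkeys(values):  # unique values, original order kept
        target = dest / str(v)
        target.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        matches = [f for f in candidates if str(v) in f.name]
        # shutil.copy returns the path of the newly created file
        copied[v] = [Path(shutil.copy(f, target)) for f in matches]
    return copied
```

With the question's data this would be called as copy_matching(df['Invoice'], 'D:/Pyfilesearch', 'D:/Dest').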

Upvotes: 4
