Eren Han

Reputation: 321

Read specific CSV files in a folder?

I'm trying to extract some zip files and, after extraction, read only specific CSV files from the extracted folder. There is a pattern, but I couldn't work out where to put it in the code: I'm looking for the files whose names are purely numeric. Here is my code.

import glob
import pandas as pd
import zipfile
from os import listdir
from os.path import isfile, join

path = r'C:\Users\han\Desktop\Selenium'
all_files = glob.glob(path + "/*.zip")

li = []

for filename in all_files:
    with zipfile.ZipFile(filename, 'r') as zip_ref:
        zip_ref.extractall(r'C:\Users\han\Desktop\Selenium')

detail = [f for f in listdir(r'C:\Users\han\Desktop\Selenium\detail') if isfile(join(r'C:\Users\han\Desktop\Selenium\detail', f))]

After this point, my detail list looks like this:

    ['119218.csv',
     '119218_coaching.csv',
     '119218_coaching_comment.csv',
     '119218_emp_monitor.csv',
     '119218_monitor_work_time.csv',
     '119218_reponse_text.csv',
     '119218_response.csv',
     '119219.csv',
]

What I want is to read only the purely numeric ones, 119218.csv and 119219.csv, and of course pd.concat them, because they are tables with the same shape.

Thanks in advance

Upvotes: 0

Views: 413

Answers (2)

ThePyGuy

Reputation: 18476

From your file list, you can filter for the file names in which every character is a digit, apart from the .csv extension. There are numerous ways to do so; one is to split each name on .csv and check that all characters in the first item are digits.

files=['119218.csv',
     '119218_coaching.csv',
     '119218_coaching_comment.csv',
     '119218_emp_monitor.csv',
     '119218_monitor_work_time.csv',
     '119218_reponse_text.csv',
     '119218_response.csv',
     '119219.csv',
]
files = [eachFile for eachFile in files if all(c.isdigit() for c in eachFile.split('.csv')[0])]

OUTPUT:

['119218.csv', '119219.csv']
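
As an alternative sketch of the same filter, the extension can be separated with pathlib instead of split. The helper name numeric_csv_names is made up for this illustration and is not part of the original answer; note that str.isdigit() also accepts some non-ASCII digit characters, which is assumed not to matter for file names like these.

```python
from pathlib import PurePath


def numeric_csv_names(names):
    """Keep names such as '119218.csv' whose stem consists only of digits.

    Hypothetical helper for illustration; it only inspects the strings
    and never touches the file system.
    """
    kept = []
    for name in names:
        p = PurePath(name)
        # suffix is '.csv' for '119218.csv'; stem is the part before it
        if p.suffix.lower() == ".csv" and p.stem.isdigit():
            kept.append(name)
    return kept


files = [
    "119218.csv",
    "119218_coaching.csv",
    "119218_coaching_comment.csv",
    "119218_emp_monitor.csv",
    "119218_monitor_work_time.csv",
    "119218_reponse_text.csv",
    "119218_response.csv",
    "119219.csv",
]
print(numeric_csv_names(files))
```

Unlike the split-based check, this rejects a bare ".csv" name, because pathlib treats it as a hidden file with no suffix.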

Upvotes: 2

Anand Narayanan

Reputation: 83

You just have to modify this one line:

detail = [f for f in listdir(r'C:\Users\han\Desktop\Selenium\detail') if re.fullmatch(r"[0-9]+\.csv", f) and isfile(join(r'C:\Users\han\Desktop\Selenium\detail', f))]

Don't forget to import re
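
For completeness, here is a rough sketch of how the regex filter could feed the pd.concat step the question asks about. The function names numeric_csv_paths and concat_numeric_csvs are invented for this example, and it assumes every matching file has a header row and the same columns. It uses re.fullmatch with [0-9]+ so that the whole name must be digits followed by .csv.

```python
import re
from pathlib import Path

# Whole file name must be one or more digits followed by ".csv"
NUMERIC_CSV = re.compile(r"[0-9]+\.csv")


def numeric_csv_paths(folder):
    """Return sorted paths of regular files in `folder` named like 119218.csv."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and NUMERIC_CSV.fullmatch(p.name)
    )


def concat_numeric_csvs(folder):
    """Read every purely numeric CSV in `folder` into one DataFrame.

    Assumes the files share the same columns. pd.concat raises
    ValueError if no file matches, so an empty folder is not handled here.
    """
    import pandas as pd  # imported here so the filter above works without pandas

    frames = [pd.read_csv(p) for p in numeric_csv_paths(folder)]
    return pd.concat(frames, ignore_index=True)
```

It would be called with the extracted folder, for example concat_numeric_csvs(r'C:\Users\han\Desktop\Selenium\detail'). ignore_index=True renumbers the rows so the combined table does not repeat each file's own 0..n index.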

Upvotes: 1
