Reputation: 2904
I want to read files from one directory.
The directory contains:
ABC1.csv
ABC1_1.csv
ABC1_2.csv
ABC11.csv
ABC11_1.csv
ABC11_3.csv
ABC11_2.csv
ABC13_4.csv
ABC13_1.csv
ABC17_6.csv
ABC17_2.csv
ABC17_4.csv
ABC17_8.csv
When running the script, I want to pass a command-line argument that selects which files to read, depending on some conditions.
I wrote a script for this, but I'm running into an issue.
Program:
from glob import glob
import os
import sys
file_pattern = ''
files_list = list()
arguments = {'ABC', 'PQR', 'XYZ'}
if len(sys.argv[1:2]) == 1:
    file_pattern = str(sys.argv[1:2])
else:
    print 'run as <python test.py ABC>'
    sys.exit(1)
if file_pattern in arguments:
    print '<Provide Name with some Number>'
    sys.exit(1)
file_pattern = file_pattern.replace('[','').replace(']','').replace('\'','')
if file_pattern.startswith('ABC',0,3):
    files_list = glob(os.path.join('<directory name>', str(file_pattern)+'_*.csv'))
else:
    print 'No Such File --> ' + str(file_pattern)+ '\t <Provide appropriate Name>'
    sys.exit(1)
if files_list:
    for a_file in sorted(files_list):
        print a_file
        #process file
else:
    print 'No Such File --> ' + str(file_pattern)+ '\t <Provide appropriate Name>'
    sys.exit(1)
This code works, but it doesn't satisfy my second condition. When the user passes ABC1 as the argument, i.e. python test.py ABC1, it returns ABC1_1.csv and ABC1_2.csv but not ABC1.csv.
How can I satisfy this second condition without breaking any of the others?
Upvotes: 2
Views: 4205
Reputation: 415
You might want to add a simple check for the additional "special" case, something like this:
if file_pattern.startswith('ABC',0,3):
    csv_path = os.path.join('.', str(file_pattern))
    files_list = glob(csv_path + '_*.csv')
    # Just check the special case that's not included in the glob above
    csv_path = csv_path + '.csv'
    if os.path.isfile(csv_path):
        files_list.append(csv_path)
else:
    print 'No Such File --> ' + str(file_pattern)+ '\t <Provide appropriate Name>'
    sys.exit(1)
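The same idea can be sketched as a self-contained Python 3 function that is easy to try outside the script. The function name `find_csv_files` and its `directory` parameter are hypothetical, introduced only for illustration; the standard-library calls are `glob.glob`, `glob.escape`, `os.path.join`, and `os.path.isfile`.

```python
import glob
import os


def find_csv_files(directory, name):
    """Return <name>.csv (if present) plus every <name>_*.csv in directory.

    Hypothetical helper, not part of the original script.
    """
    base = os.path.join(directory, name)
    # Escape the base so characters such as '[' in a path are treated
    # literally instead of as glob wildcards.
    matches = glob.glob(glob.escape(base) + '_*.csv')
    # The pattern above requires an underscore, so check the bare name too.
    exact = base + '.csv'
    if os.path.isfile(exact):
        matches.append(exact)
    return sorted(matches)
```

Sorting at the end places the bare file before the underscore variants, because '.' sorts before '_'.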
Upvotes: 0
Reputation: 2904
I tried several scenarios and finally arrived at a solution that satisfies all my conditions. First, I check whether the file named by the user's input exists in the specified directory. If it does, I glob all files with the same name followed by an underscore (_) and then append the matching file to the same list.
If the file named by the user's input is not present in the specified directory, I glob only the files whose names contain the underscore (_) into the list. Finally, I iterate through the list to get the result.
Program:
from glob import glob
import os
import sys
file_pattern = ''
files_list = list()
arguments = {'ABC', 'PQR', 'XYZ'}
# check whether the user provided an argument
if len(sys.argv[1:2]) == 1:
    file_pattern = str(sys.argv[1:2])
else:
    print 'run as < python test.py <LineName> >'
    sys.exit(1)
# strip the brackets and quotes left over from str() of the list slice
file_pattern = file_pattern.replace('[','').replace(']','').replace('\'','')
# check whether a line number was provided
if file_pattern in arguments:
    print '<Provide LineName with some Number>'
    sys.exit(1)
flag = True
# list all files in the specified directory
files = os.listdir('<directory name>')
for file_name in files:
    if str(file_name) == str(file_pattern)+'.csv':
        files_list = glob(os.path.join('<directory name>', str(file_pattern)+'_*.csv'))
        # append the exact match to the result list as well
        files_list.append(os.path.join('<directory name>', file_name))
        flag = False
# if the exact file is not present, look only for file names with (_)
if flag:
    files_list = glob(os.path.join('<directory name>', str(file_pattern)+'_*.csv'))
# check whether the list contains any items
if files_list:
    for a_file in sorted(files_list):
        print a_file
else:
    print 'No Such File --> ' + str(file_pattern)+ '\t <Provide appropriate Name1>'
    sys.exit(1)
Suppose the directory contains the files ABC1.csv, ABC1_1.csv, ABC1_2.csv, ABC11.csv, ABC11_1.csv, ABC11_3.csv, and ABC11_2.csv.
Output scenarios:
#if input is ABC1
.\\ABC1.csv
.\\ABC1_1.csv
.\\ABC1_2.csv
#if input is ABC11
.\\ABC11.csv
.\\ABC11_1.csv
.\\ABC11_2.csv
.\\ABC11_3.csv
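The argument handling in the program above can probably be simplified: `sys.argv[1]` is already a string, so there is no need to convert a list slice with `str()` and strip the brackets and quotes afterwards. The following is a minimal Python 3 sketch of that step only. The name `parse_line_name` is hypothetical, and raising `ValueError` instead of printing and exiting is an assumption made so the function is easy to test.

```python
# Bare prefixes that are rejected unless a number follows (from the script above).
BARE_NAMES = {'ABC', 'PQR', 'XYZ'}


def parse_line_name(argv):
    """Return the line name from argv, or raise ValueError with a usage hint.

    Hypothetical helper; argv is expected to look like sys.argv.
    """
    # As in the original script, extra arguments after the first are ignored.
    if len(argv) < 2:
        raise ValueError('run as < python test.py <LineName> >')
    name = argv[1]  # already a str, so no brackets or quotes to strip
    if name in BARE_NAMES:
        raise ValueError('<Provide LineName with some Number>')
    return name
```

In a script this could be called as `parse_line_name(sys.argv)` inside a `try` block, printing the message and calling `sys.exit(1)` when `ValueError` is raised.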
Upvotes: 1
Reputation: 490
I have a solution. It's not perfect; whether it works depends on which other files are in the folder:
from glob import glob
import os
file_pattern = 'ABC1'
files_list = glob(os.path.join('<directory name>', str(file_pattern)+'[!0-9]*'))
# output: ABC1.csv, ABC1_1.csv, ABC1_2.csv
file_pattern = 'ABC11'
files_list = glob(os.path.join('<directory name>', str(file_pattern)+'[!0-9]*'))
# output: ['.\\ABC11.csv', '.\\ABC11_1.csv', '.\\ABC11_2.csv', '.\\ABC11_3.csv']
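I had the same problem as Jesper. The issue is that the pattern '_*.csv' requires a literal underscore right after the name, so ABC1.csv can never match it; the * itself matches any run of characters, including none.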
By selecting any file that doesn't have a digit after the file pattern, we avoid the 1-11 issue.
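The difference between the two patterns can be checked without touching the file system by using `fnmatch.filter`, which applies the same shell-style wildcard rules that `glob` uses for file names. This is only an illustrative sketch in Python 3: the list of names is copied from the question, and `ABC1.txt` is a hypothetical extra file added to show the caveat about other files in the folder.

```python
import fnmatch

names = [
    'ABC1.csv', 'ABC1_1.csv', 'ABC1_2.csv',
    'ABC11.csv', 'ABC11_1.csv',
    'ABC1.txt',  # hypothetical extra file, not in the original listing
]

# '_*.csv' demands a literal underscore, so the bare 'ABC1.csv' is skipped.
underscore_only = fnmatch.filter(names, 'ABC1_*.csv')

# '[!0-9]' consumes exactly one non-digit character ('.' or '_' here),
# which keeps 'ABC11...' out but lets non-CSV files such as 'ABC1.txt' in.
non_digit = fnmatch.filter(names, 'ABC1[!0-9]*')

# One way to drop the non-CSV matches afterwards.
non_digit_csv = [n for n in non_digit if n.endswith('.csv')]

print(underscore_only)  # ['ABC1_1.csv', 'ABC1_2.csv']
print(non_digit)        # ['ABC1.csv', 'ABC1_1.csv', 'ABC1_2.csv', 'ABC1.txt']
print(non_digit_csv)    # ['ABC1.csv', 'ABC1_1.csv', 'ABC1_2.csv']
```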
Upvotes: 0