Reputation: 1014
The filename can be any of the examples below:
abc_dec_2020_06_23.csv
efg_edd_20200623.csv
abc_20200623121935.csv
I need to extract only the prefix, excluding the numeric (date) part:
abc_dec_
efg_edd_
abc_
I am trying to archive the previous file present in the SFTP location.
Below is the code I have so far:
fileName = self.s3_key.split('/')[-1]
sftp_client.rename(self.sftp_path + fileName, archive_path + fileName)
with sftp_client.open(self.sftp_path + fileName, 'wb') as f:
    s3_client.download_fileobj(self.s3_bucket, self.s3_key, f)
Upvotes: 0
Views: 313
Reputation: 420
With a regular expression:
r"^[a-z_]+"
Example:
import re
regex_comp = re.compile(r"^[a-z_]+")
match_str = regex_comp.match("abc_20200623121935.csv")
print(match_str.group())
Result:
abc_
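One caveat with this pattern: `match()` returns `None` when the name does not start with a lowercase letter or underscore, so calling `.group()` directly would raise an `AttributeError`. A minimal guarded sketch, where the helper name `leading_prefix` is my own and not part of any library:

```python
import re

PREFIX_RE = re.compile(r"^[a-z_]+")

def leading_prefix(name):
    """Return the leading run of lowercase letters/underscores, or None."""
    m = PREFIX_RE.match(name)
    return m.group() if m else None

# "2020_report.csv" is a made-up name that starts with a digit.
for name in ["abc_dec_2020_06_23.csv", "efg_edd_20200623.csv", "2020_report.csv"]:
    print(name, "->", leading_prefix(name))
```

Names that begin with an uppercase letter or a digit come back as `None` here instead of crashing.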
If the prefix itself can contain digits, match the date suffix at the end of the name instead:
import re
filenames = ["efg_12_edd_20200623.csv", "abc_dec_2020_06_23.csv",
"efg_edd_20200623.csv", "a1b2c11_20200623121935.csv"]
regex1 = re.compile(r"[0-9]{4}_[0-9]{2}_[0-9]{2}\.csv$")
regex2 = re.compile(r"[0-9]{8,14}\.csv$")
filename = ""
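for filename_full in filenames:
    # Try the YYYY_MM_DD form first, then the compact 8-14 digit form.
    test = regex1.search(filename_full)
    if test is None:
        test = regex2.search(filename_full)
    if test is not None:
        # Keep everything before the start of the date suffix.
        filename = filename_full[:test.span()[0]]
        print(filename)
    else:
        print(filename_full, ": No match")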
Result:
efg_12_edd_
abc_dec_
efg_edd_
a1b2c11_
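The two patterns can probably be folded into one alternation, which removes the second search. This is only a sketch under the same assumptions about the date formats (`YYYY_MM_DD` or 8 to 14 digits, always followed by `.csv`); the function name `strip_date_suffix` is hypothetical:

```python
import re

# Date suffix: either YYYY_MM_DD or a run of 8 to 14 digits, then ".csv".
DATE_SUFFIX_RE = re.compile(
    r"(?:[0-9]{4}_[0-9]{2}_[0-9]{2}|[0-9]{8,14})\.csv$"
)

def strip_date_suffix(name):
    """Return the part of name before the date suffix, or None if absent."""
    m = DATE_SUFFIX_RE.search(name)
    return name[:m.start()] if m else None

for name in ["efg_12_edd_20200623.csv", "abc_dec_2020_06_23.csv",
             "a1b2c11_20200623121935.csv", "notes.csv"]:
    print(name, "->", strip_date_suffix(name))
```

Returning `None` for names without a recognizable date lets the caller decide whether to skip the file or raise.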
Upvotes: 2
Reputation: 6483
You could try this:
file='abc_dec_2020_06_23.csv'
cleanfile=''
for let in file:
    # Stop at the first digit; everything before it is the prefix.
    if let.isdigit():
        break
    else:
        cleanfile+=let
print(cleanfile)
Output:
abc_dec_
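The same "stop at the first digit" idea can be written more compactly with `itertools.takewhile` from the standard library. A small sketch (the function name is my own):

```python
from itertools import takewhile

def prefix_before_first_digit(name):
    """Collect characters up to, but not including, the first digit."""
    return "".join(takewhile(lambda ch: not ch.isdigit(), name))

print(prefix_before_first_digit("abc_dec_2020_06_23.csv"))
print(prefix_before_first_digit("abc_12_dec_2020_06_23.csv"))
```

Like the loop, this cuts at the first digit anywhere in the name, so the second call returns only `abc_`.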
And if the prefix itself contains digits, you can try this, which works backwards from the last underscore-separated chunk:
x = 'abc_12_dec_2020_06_23.csv'
parts = x.split('_')
i = len(parts) - 1                      # index of the last chunk
last = parts[i].replace('.csv', '')     # last chunk without the extension
if len(last) < 8 and len(parts[i-1]) > 2:       # e.g. 202006_23.csv
    newval = '_'.join(parts[:i-1]) + '_'
elif len(last) < 8 and len(parts[i-1]) == 2:    # e.g. 2020_06_23.csv
    newval = '_'.join(parts[:i-2]) + '_'
elif len(last) == 4:                            # e.g. 2020_0623.csv
    newval = '_'.join(parts[:i-1]) + '_'
else:                                           # e.g. 20200623.csv
    newval = '_'.join(parts[:i]) + '_'
print(newval)
Output:
abc_12_dec_
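To tie this back to the archiving goal in the question: once a prefix can be extracted, the previous file is presumably whichever existing file shares the new file's prefix. Here is a rough sketch that swaps in a regular expression for the split logic above. It assumes the directory listing is already available as a plain list of names (however your SFTP client provides it), and `prefix_of` and `previous_files` are hypothetical helper names:

```python
import re

# Assumed date formats: YYYY_MM_DD or 8 to 14 digits, followed by ".csv".
DATE_SUFFIX_RE = re.compile(
    r"(?:[0-9]{4}_[0-9]{2}_[0-9]{2}|[0-9]{8,14})\.csv$"
)

def prefix_of(name):
    """Return the name with its date suffix removed, or None if no date."""
    m = DATE_SUFFIX_RE.search(name)
    return name[:m.start()] if m else None

def previous_files(listing, new_name):
    """Names in listing that share new_name's prefix (excluding new_name)."""
    target = prefix_of(new_name)
    if target is None:
        return []
    return [n for n in listing if n != new_name and prefix_of(n) == target]

# Made-up listing for illustration only.
listing = ["abc_20200622101010.csv", "abc_dec_2020_06_22.csv",
           "efg_edd_20200622.csv"]
print(previous_files(listing, "abc_20200623121935.csv"))
```

Each returned name could then be moved to the archive path before the new file is written. Comparing extracted prefixes rather than using `startswith` keeps `abc_` from also matching `abc_dec_` files.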
Upvotes: 1