user12076139
user12076139

Reputation:

how to split the filename using python

I am using python to create xml file using element and subelement process. I have a list of zip files in my folder listed below:

Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip
Retirement_participant-plan_info_resetcache_secretmanager_rev1_2021_03_09.zip
Retirement_participant-plan_info_v1_mypru_plankeys_rev1_2021_03_09.zip
Retirement_participant-plan_info_resetcache_param_value_rev1_2021_03_09.zip
Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip

I want to split those zip files and get the name like this:

Retirement_participant-plan_info_v1_getPlankeys
Retirement_participant-plan_info_resetcache_secretmanager
Retirement_participant-plan_info_v1_mypru_plankeys
Retirement_participant-plan_info_resetcache_param_value
Retirement_participant-plan_info_resetcache_param_v1_balances

PS: I want to remove _rev1_2021_03_09.zip while creating a name from the zip file.

here is my python code. It works with Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip but its not working if i have too big names for a zip file for eg Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip

    Proxies = SubElement(proxy, 'Proxies')
    path = "./"
    for f in os.listdir(path):
        if '.zip' in f:
            Proxy = SubElement(Proxies, 'Proxy')
            name = SubElement(Proxy, 'name')
            fileName = SubElement(Proxy, 'fileName')
            a = f.split('_')
            name.text = '_'.join(a[:3])
            fileName.text = str(f) 

Upvotes: 2

Views: 154

Answers (2)

qcoumes
qcoumes

Reputation: 517

If the rev and date is always the same (2021_03_09), just replace them with the empty string:

filenames = [f.replace("_rev1_2021_03_09.zip", "") for f in os.listdir(path)]

Upvotes: 0

Sayandip Dutta
Sayandip Dutta

Reputation: 15872

You can str.split by rev1_

>>> filenames

['Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip',
 'Retirement_participant-plan_info_resetcache_secretmanager_rev1_2021_03_09.zip',
 'Retirement_participant-plan_info_v1_mypru_plankeys_rev1_2021_03_09.zip',
 'Retirement_participant-plan_info_resetcache_param_value_rev1_2021_03_09.zip',
 'Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip']

>>> names = [fname.split('_rev1_')[0] for fname in filenames]

>>> names

['Retirement_participant-plan_info_v1_getPlankeys',
 'Retirement_participant-plan_info_resetcache_secretmanager',
 'Retirement_participant-plan_info_v1_mypru_plankeys',
 'Retirement_participant-plan_info_resetcache_param_value',
 'Retirement_participant-plan_info_resetcache_param_v1_balances']

Same can be achieved with str.rsplit by limiting the maxsplit to 4:

>>> names = [fname.rsplit('_', 4)[0] for fname in filenames]
>>> names
['Retirement_participant-plan_info_v1_getPlankeys',
 'Retirement_participant-plan_info_resetcache_secretmanager',
 'Retirement_participant-plan_info_v1_mypru_plankeys',
 'Retirement_participant-plan_info_resetcache_param_value',
 'Retirement_participant-plan_info_resetcache_param_v1_balances']

Upvotes: 1

Related Questions