user2023

Reputation: 478

How to split a string on underscores, but conditionally

I have a string whose parts are joined by underscores (_), and I want to split it in a specific way to get my desired output.

Below is the string:

>>> a
'cDOT_stv3027_esx_vdi01_07-24-2021_02.00.00.0443'

>>> type(a)
<type 'str'>

A simple rsplit("_", 2) splits it into three list values, as shown below. Counting from the end, these are the time, the date, and one combined string, 'cDOT_stv3027_esx_vdi01', which I want to split further into two parts: 'cDOT' and 'stv3027_esx_vdi01'.

>>> a.rsplit("_",2)
['cDOT_stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443']

I tried the following on the first element, but then I lose the rest of the values.

>>> a.rsplit("_",2)[0].split("_",1)
['cDOT', 'stv3027_esx_vdi01']

My desired output should be like below:

['cDOT', 'stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443']

Upvotes: 1

Views: 346

Answers (6)

ettanany

Reputation: 19806

In one line using split(), rsplit() and partition():

a.split('_')[:1] + a.partition('_')[2].rsplit('_', 2)
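
To make the one-liner above easier to follow, here is a sketch that evaluates it piece by piece. The intermediate names head, rest, tail and result are only illustrative and do not appear in the original answer.

```python
a = 'cDOT_stv3027_esx_vdi01_07-24-2021_02.00.00.0443'

# First underscore-separated part, kept as a one-element list.
head = a.split('_')[:1]        # ['cDOT']

# partition() returns (before, separator, after); index 2 is everything
# after the first underscore.
rest = a.partition('_')[2]     # 'stv3027_esx_vdi01_07-24-2021_02.00.00.0443'

# Split at most twice, starting from the right, to peel off date and time.
tail = rest.rsplit('_', 2)     # ['stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443']

result = head + tail
print(result)
```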

Upvotes: 1

ThePyGuy

Reputation: 18416

Using regex:

>>> import re
>>> re.findall(r'(.*?)_(.*?)_(\d+.*?)_(.*)', a)[0]
('cDOT', 'stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443')

Understanding the pattern:

(.*?)_(.*?)_(\d+.*?)_(.*)

(.*?)_ : Matches the substring before the first underscore.

(.*?)_(\d+.*?)_ : Matches up to the first underscore that is followed by
                  at least one digit. It yields two substrings: the one before
                  that underscore, and the one after it up to the next
                  underscore.

(.*) : Captures the remaining part of the string.
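
As a sketch of how this pattern behaves, the snippet below wraps it in a small helper. The helper name split_with_regex and the second sample string are assumptions made for illustration only. The second call shows a limitation: the pattern treats the first underscore followed by a digit as the start of the date, so a middle part that begins with a digit shifts the split.

```python
import re

# Same pattern as in the answer, written as a raw string and compiled once.
PATTERN = re.compile(r'(.*?)_(.*?)_(\d+.*?)_(.*)')

def split_with_regex(s):
    """Return the four captured parts as a list, or None if nothing matches."""
    m = PATTERN.match(s)
    return list(m.groups()) if m else None

print(split_with_regex('cDOT_stv3027_esx_vdi01_07-24-2021_02.00.00.0443'))
# ['cDOT', 'stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443']

# Hypothetical input: the middle part '01x' starts with a digit, so it is
# captured where the date was expected.
print(split_with_regex('cDOT_stv_01x_vdi_07-24-2021_02.00'))
# ['cDOT', 'stv', '01x', 'vdi_07-24-2021_02.00']
```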

Upvotes: 1

Wiktor Stribiżew

Reputation: 626826

You can use

a = 'cDOT_stv3027_esx_vdi01_07-24-2021_02.00.00.0443'
prefix, *mid, date, time = a.split('_')
print(prefix, '_'.join(mid), date, time)

See the online Python demo.

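This way, any number of underscore-separated parts between the prefix and the date is collected into mid. Note that starred unpacking requires Python 3, while the <type 'str'> output in the question suggests Python 2.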
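
If the result is needed as a list, the same unpacking can be wrapped in a function. This is a minimal sketch: the function name split_name and the length check are assumptions added here, not part of the answer. The check is there because with only three parts the unpacking would still succeed and silently produce an empty middle string.

```python
def split_name(s):
    """Split into [prefix, middle, date, time] using starred unpacking."""
    parts = s.split('_')
    if len(parts) < 4:
        # Assumed policy: reject inputs that have no middle section.
        raise ValueError('expected at least 4 underscore-separated parts')
    prefix, *mid, date, time = parts   # starred target: Python 3 syntax
    return [prefix, '_'.join(mid), date, time]

print(split_name('cDOT_stv3027_esx_vdi01_07-24-2021_02.00.00.0443'))
# ['cDOT', 'stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443']
```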

Upvotes: 4

S.B

Reputation: 16486

Why not just do it in two steps?

string = 'cDOT_stv3027_esx_vdi01_07-24-2021_02.00.00.0443'

splited_line = string.rsplit('_', 2)
print(splited_line[0].split('_', 1) + splited_line[1:])

Output:

['cDOT', 'stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443']

Upvotes: 1

BlueBuffalo73

Reputation: 131

Assuming the joined middle section always consists of the same three parts (indices 1 to 3):

splits = a.split('_')
[splits[0]] + ['_'.join(splits[1:4])] + splits[4:]

>>> [splits[0]] + ['_'.join(splits[1:4])] + splits[4:]
['cDOT', 'stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443']
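
If the number of middle parts can vary, one possible variation is to slice relative to the end of the list instead of using fixed indices. This is a sketch under the assumption that the last two parts are always the date and the time. The second string, b, is a made-up example.

```python
a = 'cDOT_stv3027_esx_vdi01_07-24-2021_02.00.00.0443'
splits = a.split('_')

# splits[1:-2] is everything between the first part and the last two parts.
result = [splits[0], '_'.join(splits[1:-2])] + splits[-2:]
print(result)
# ['cDOT', 'stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443']

# Hypothetical input with only two middle parts.
b = 'cDOT_stv3027_esx_07-24-2021_02.00.00.0443'
parts_b = b.split('_')
result_b = [parts_b[0], '_'.join(parts_b[1:-2])] + parts_b[-2:]
print(result_b)
# ['cDOT', 'stv3027_esx', '07-24-2021', '02.00.00.0443']
```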

Upvotes: 2

tituszban

Reputation: 5152

It seems to me that you don't really want to split the data, you want to extract the relevant parts. For that, I'd recommend using regex.

import re

m = re.match(r"^(.*)_(.*_.*_.*)_(.*)_(.*)$", a)

# Your results:
[m.group(1), m.group(2), m.group(3), m.group(4)]

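This way you capture everything before the first underscore in group 1, the next three underscore-separated sections in group 2, the date in group 3, and the final part in group 4. The pattern needs five underscores in total, so the groups line up this way only when the string contains exactly five; with more, the greedy group 1 would extend past the first underscore.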

Thus the result will look like this:

['cDOT', 'stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443']
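
As a variation on the pattern above (not the pattern from this answer), the sketch below uses [^_]+ for the parts that must not contain an underscore and named groups for readability. The group names prefix, name, date and time, and the helper extract_parts, are assumptions chosen for illustration. This version accepts any number of underscore-separated sections in the middle.

```python
import re

# prefix, date and time may not contain underscores; name takes the rest.
PATTERN = re.compile(
    r'^(?P<prefix>[^_]+)_(?P<name>.+)_(?P<date>[^_]+)_(?P<time>[^_]+)$'
)

def extract_parts(s):
    """Return [prefix, name, date, time], or None if the string does not match."""
    m = PATTERN.match(s)
    if m is None:
        return None
    return list(m.group('prefix', 'name', 'date', 'time'))

print(extract_parts('cDOT_stv3027_esx_vdi01_07-24-2021_02.00.00.0443'))
# ['cDOT', 'stv3027_esx_vdi01', '07-24-2021', '02.00.00.0443']
```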

Upvotes: 1
