Reputation: 171
How do I split the elements of this list into groups based on the string before the dot, without hard-coding each prefix?
lst = ['ds_a.cola','ds_a.colb','ds_b.cola','ds_b.colb']
Since there are two variants of 'ds', I want two lists:
lst_dsa = ['ds_a.cola','ds_a.colb']
lst_dsb = ['ds_b.cola','ds_b.colb']
My old code was:
lst_dsa = []
lst_dsb = []
for item in lst:
    if "ds_a" in item:
        lst_dsa.append(item)
    else:
        lst_dsb.append(item)
But I can't use this, since there might be more than two variants (ds_c, ds_d, and so on). How do I achieve this in Python?
Upvotes: 3
Views: 875
Reputation: 4243
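Use a regular expression with two repeating groups: one matching ds_a, a period, and one or more word characters, and one matching ds_b, a period, and one or more word characters. Ignore the empty match and use a defaultdict of sets to collect the values for each prefix. Note that this pattern names the prefixes explicitly.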
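import re
from collections import defaultdict
lst = ['ds_a.cola','ds_a.colb','ds_b.cola','ds_b.colb']
# a run of ds_a items followed by a run of ds_b items
pattern = r"(?:\bds_a\.\w+\b\s*)*(?:\bds_b\.\w+\b\s*)*"
string = " ".join(lst)
groups = re.findall(pattern, string)
result = defaultdict(set)  # renamed so the built-in dict is not shadowed
for group in groups:
    # split() yields nothing for the empty match findall returns at the end
    for item in group.split():
        key, *value = item.split('.')
        result[key].add(value[0])
print(result)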
Output:
defaultdict(<class 'set'>, {'ds_a': {'cola', 'colb'}, 'ds_b': {'cola', 'colb'}})
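If the prefixes are not known in advance, the same idea can be written with a pattern that captures whatever comes before the first dot instead of naming ds_a and ds_b. A minimal sketch, assuming every item has the form prefix.column; the names `split_pattern` and `columns_by_prefix` are made up for illustration:

```python
import re
from collections import defaultdict

lst = ['ds_a.cola', 'ds_a.colb', 'ds_b.cola', 'ds_b.colb']

# Group 1: everything before the first dot (the prefix).
# Group 2: everything after it (the column name).
split_pattern = re.compile(r"([^.]+)\.(.+)")

columns_by_prefix = defaultdict(set)
for item in lst:
    match = split_pattern.fullmatch(item)
    if match is None:
        continue  # skip items that do not look like "prefix.column"
    prefix, column = match.groups()
    columns_by_prefix[prefix].add(column)

print(dict(columns_by_prefix))
```

This keeps the set-of-columns result shape from above, and a ds_c or ds_d item would get its own key without touching the pattern.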
Upvotes: 0
Reputation: 177
Try this:
d = dict()
for item in lst:
    key = item.split(".")[0]  # text before the first dot
    if key not in d:
        d[key] = list()
    d[key].append(item)
Upvotes: 0
Reputation: 23815
Use a dict to hold the data, keyed by the part before the dot:
from collections import defaultdict
lst = ['ds_a.cola','ds_a.colb','ds_b.cola','ds_b.colb','ds_x.cola','ds_x.colb']
data = defaultdict(list)
for entry in lst:
    a,_ = entry.split('.')
    data[a].append(entry)
print(data)
Output:
defaultdict(<class 'list'>, {'ds_a': ['ds_a.cola', 'ds_a.colb'], 'ds_b': ['ds_b.cola', 'ds_b.colb'], 'ds_x': ['ds_x.cola', 'ds_x.colb']})
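One thing to watch: `a,_ = entry.split('.')` raises a ValueError if an entry contains more than one dot, or none at all. If that can happen in your data, `str.partition` is one way around it. A sketch, where `group_by_prefix` is a hypothetical helper name and the last two entries are invented test data:

```python
from collections import defaultdict

def group_by_prefix(items, sep="."):
    """Group strings by the text before the first `sep`."""
    groups = defaultdict(list)
    for entry in items:
        # partition always returns a 3-tuple, so the unpacking cannot fail
        prefix, _, _ = entry.partition(sep)
        groups[prefix].append(entry)
    return dict(groups)

data = group_by_prefix(['ds_a.cola', 'ds_b.cola', 'ds_a.col.b', 'nodot'])
print(data)
```

If you want plain lists rather than a dict, `list(data.values())` gives one list per prefix.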
Upvotes: 3
Reputation: 1545
You can map them:
from collections import defaultdict
lst = ['ds_a.cola','ds_a.colb','ds_b.cola','ds_b.colb']
ds_dict = defaultdict(list)
for item in lst:
    key, value = item.split(".")
    ds_dict[key].append(value)
print(dict(ds_dict))
Output:
{'ds_a': ['cola', 'colb'], 'ds_b': ['cola', 'colb']}
Upvotes: 0
Reputation: 71580
Try itertools.groupby:
>>> from itertools import groupby
>>> [list(v) for _, v in groupby(lst, key=lambda x: x[x.find('_') + 1])]
[['ds_a.cola', 'ds_a.colb'], ['ds_b.cola', 'ds_b.colb']]
>>>
Upvotes: 2
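Two caveats with the groupby snippet above: groupby only merges consecutive items, so the input has to be sorted (or already grouped) by the key, and the key shown there looks at a single character after the underscore. A sketch that sorts first and keys on the whole prefix before the dot, assuming the input may arrive unsorted (the `prefix` helper and the shuffled list are illustrative):

```python
from itertools import groupby

lst = ['ds_b.cola', 'ds_a.cola', 'ds_b.colb', 'ds_a.colb']

def prefix(item):
    # everything before the first dot, e.g. 'ds_a'
    return item.split('.', 1)[0]

# sorted() is stable, so items keep their original relative order within a group
grouped = [list(group) for _, group in groupby(sorted(lst, key=prefix), key=prefix)]
print(grouped)
```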