Reputation: 11
This is similar to splitting a list of strings into a list of lists of strings, except that I want each sublist to also contain a copy of the original string it came from. The goal is to parse the elements out of a filename while retaining the filename itself, so that once I match a sublist by its words, the filename is readily available and I can do something with it.
For example,
stringList = ["wordA1_wordA2_wordA3","wordB1_wordB2_wordB3"]
becomes
splitList = [["wordA1_wordA2_wordA3","wordA1","wordA2","wordA3"],
["wordB1_wordB2_wordB3","wordB1","wordB2","wordB3"]]
I'm trying to do it in a single statement, as a list comprehension.
The closest I've gotten is:
splitList = [[item,item.split('_')] for item in stringList]
which yields:
splitList = [["wordA1_wordA2_wordA3",["wordA1","wordA2","wordA3"]],
["wordB1_wordB2_wordB3",["wordB1","wordB2","wordB3"]]]
I could work with this, but is there a more elegant suggestion that I could learn from?
I've tried
splitList = [item.split('_') + item for item in stringList]
which raises a TypeError, because a str cannot be concatenated to a list.
And
splitList = [item.split('_').append(item) for item in stringList]
which creates a list of None values.
Upvotes: 1
Views: 2584
Reputation: 922
You can unpack the split list with *:
splitList = [[item, *item.split('_')] for item in stringList]
which gives the desired result:
splitList = [["wordA1_wordA2_wordA3","wordA1","wordA2","wordA3"],
["wordB1_wordB2_wordB3","wordB1","wordB2","wordB3"]]
You can also do something like:
splitList = [[item] + item.split('_') for item in stringList]
to deal with the concatenation of string and list. [item] simply creates a single-element list containing item, which is then concatenated with the split list.
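As a quick check, the two forms above can be run side by side. This is only a minimal sketch: the names string_list, unpacked, concatenated and matches are illustrative and not from the question, and the * form inside a list display assumes Python 3.5 or later.

```python
string_list = ["wordA1_wordA2_wordA3", "wordB1_wordB2_wordB3"]

# Unpacking form: * spreads the split parts into the new list (Python 3.5+).
unpacked = [[item, *item.split("_")] for item in string_list]

# Concatenation form: wrap the string in a one-element list, then add lists.
concatenated = [[item] + item.split("_") for item in string_list]

# Both forms build the same nested list.
print(unpacked == concatenated)  # True

# Match on a word (elements 1 onward), then read the filename from element 0.
matches = [row[0] for row in unpacked if "wordB2" in row[1:]]
print(matches)  # ['wordB1_wordB2_wordB3']
```

Keeping the filename at index 0 means the match can look only at row[1:], so the full filename never matches as a word by accident.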
Upvotes: 2
Reputation: 13106
The reason [item.split('_').append(item) ...] produces a list of None values is that list.append modifies the list in place and returns None rather than the modified list.
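The behaviour is easy to see outside the comprehension. In this small sketch the variable names parts and result are illustrative only:

```python
filename = "wordA1_wordA2_wordA3"
parts = filename.split("_")

# append mutates parts in place; its return value is always None.
result = parts.append(filename)

print(result)  # None
print(parts)   # ['wordA1', 'wordA2', 'wordA3', 'wordA1_wordA2_wordA3']
```

Note that even if the return value were usable, append would put the filename at the end of the list rather than at the front, which differs from the layout asked for in the question.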
It might be a bit more advantageous to use a dict here rather than a list of lists, since the filename can be your key and the individual components can be your value:
stringList = ["wordA1_wordA2_wordA3","wordB1_wordB2_wordB3"]
string_dict = {filename: filename.split("_") for filename in stringList}
# {'wordA1_wordA2_wordA3': ['wordA1', 'wordA2', 'wordA3'], 'wordB1_wordB2_wordB3': ['wordB1', 'wordB2', 'wordB3']}
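With the dict, matching on a word and recovering the filename might look like the sketch below. The helper files_containing is hypothetical, introduced only for illustration, and it assumes that an exact match on one component is what is wanted.

```python
stringList = ["wordA1_wordA2_wordA3", "wordB1_wordB2_wordB3"]
string_dict = {filename: filename.split("_") for filename in stringList}


def files_containing(word, table):
    """Return every filename whose split components include word (hypothetical helper)."""
    return [name for name, parts in table.items() if word in parts]


print(files_containing("wordB2", string_dict))  # ['wordB1_wordB2_wordB3']
print(files_containing("missing", string_dict))  # []
```

One thing to keep in mind with this design is that dict keys are unique, so duplicate filenames in stringList would collapse into a single entry.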
However, if you need a list:
processed_list = [[filename, *filename.split("_")] for filename in stringList]
# [['wordA1_wordA2_wordA3', 'wordA1', 'wordA2', 'wordA3'], ['wordB1_wordB2_wordB3', 'wordB1', 'wordB2', 'wordB3']]
Here [filename, *filename.split("_")] uses * to unpack the list returned by str.split into the enclosing list.
Upvotes: 1