Nisa

Reputation: 227

How to replace multiple items in a 2D list?

I need a list that keeps only the words from each string, separated by commas, with the numeric weights removed. I don't know how to do this in Python.

Here is my sample input:

[(0, '0.897*"allah" + 0.120*"indeed" + 0.117*"lord" + 0.110*"said" + 0.101*"people" + 0.093*"upon" + 0.083*"shall" + 0.082*"unto" + 0.072*"believe" + 0.070*"earth"'), (1, '0.495*"lord" + 0.398*"said" + -0.377*"allah" + 0.253*"shall" + 0.241*"people" + 0.236*"unto" + 0.196*"indeed" + 0.131*"upon" + 0.118*"come" + 0.109*"thou"'), (2, '-0.682*"lord" + 0.497*"shall" + 0.349*"unto" + 0.125*"thou" + 0.125*"thee" + -0.098*"indeed" + 0.092*"come" + -0.092*"said" + 0.092*"people" + 0.080*"truth"')]

My expected output is:

   [(0, "allah" ,"indeed" ,"lord" ,"said" ,"people" ,"upon" ,"shall","unto" ,"believe" ,"earth"'), (1, '"lord" ,"said" ,"allah" ,"shall" ,"people" ,"unto" ,"indeed" ,"upon" ,"come","thou"'), (2, '"lord" ,"shall" ,"unto" ,"thou" ,"thee" ,"indeed" ,"come","said" ,"people" ,"truth"')]

Upvotes: 0

Views: 78

Answers (2)

Aaditya Ura

Reputation: 12669

You can try a regular expression:

One line solution:

import re
pattern = r'[a-z]+'

string_1 = [(0,'0.897*"allah" + 0.120*"indeed" + 0.117*"lord" + 0.110*"said" + 0.101*"people" + 0.093*"upon" + 0.083*"shall" + 0.082*"unto" + 0.072*"believe" + 0.070*"earth"')]
print([k if isinstance(k, int) else [m.group() for m in re.finditer(pattern, k)] for i in string_1 for k in i])

output:

[0, ['allah', 'indeed', 'lord', 'said', 'people', 'upon', 'shall', 'unto', 'believe', 'earth']]

Detailed solution:

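final_list = []
for i in string_1:          # each (index, topic_string) tuple
    for k in i:             # each element of the tuple
        if isinstance(k, int):
            final_list.append(k)
        else:
            # k is the topic string; words are appended one by one,
            # so final_list ends up flat
            for m in re.finditer(pattern, k):
                final_list.append(m.group())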

print(final_list)

regex explanation:

**[a-z]**
Matches a single character in the range a to z (lowercase letters only).
**+ Quantifier**
Matches the preceding token between one and unlimited times, as many times as possible, giving back as needed (greedy).
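
As a small illustration of how that pattern behaves on one of these strings, here is a minimal sketch. It assumes the words are always lowercase ASCII; the shortened `sample` string is made up for the example and only mimics the format in the question:

```python
import re

# Pattern from the answer: one or more lowercase ASCII letters.
pattern = r'[a-z]+'

# Shortened sample in the same format as the question's data.
sample = '0.897*"allah" + 0.120*"indeed" + -0.377*"lord"'

# Digits, signs, dots, and quotes are skipped because they are not in a-z.
words = re.findall(pattern, sample)
print(words)  # ['allah', 'indeed', 'lord']
```

Note that the pattern matches any run of lowercase letters, not only quoted words, so a weight printed as `1e-05` would contribute a stray `'e'`, and words with uppercase letters or digits would be split or dropped.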

Edited answer, as per your request:

import re
pattern = r'[a-z]+'

string_1 = [(0, '0.897*"allah" + 0.120*"indeed" + 0.117*"lord" + 0.110*"said" + 0.101*"people" + 0.093*"upon" + 0.083*"shall" + 0.082*"unto" + 0.072*"believe" + 0.070*"earth"'), (1, '0.495*"lord" + 0.398*"said" + -0.377*"allah" + 0.253*"shall" + 0.241*"people" + 0.236*"unto" + 0.196*"indeed" + 0.131*"upon" + 0.118*"come" + 0.109*"thou"'), (2, '-0.682*"lord" + 0.497*"shall" + 0.349*"unto" + 0.125*"thou" + 0.125*"thee" + -0.098*"indeed" + 0.092*"come" + -0.092*"said" + 0.092*"people" + 0.080*"truth"')]
print([k if isinstance(k, int) else [m.group() for m in re.finditer(pattern, k)] for i in string_1 for k in i])

output:

[0, ['allah', 'indeed', 'lord', 'said', 'people', 'upon', 'shall', 'unto', 'believe', 'earth'], 1, ['lord', 'said', 'allah', 'shall', 'people', 'unto', 'indeed', 'upon', 'come', 'thou'], 2, ['lord', 'shall', 'unto', 'thou', 'thee', 'indeed', 'come', 'said', 'people', 'truth']]

If you want a result that keeps each index grouped with its words, you can try:

print([[k if isinstance(k, int) else tuple(m.group() for m in re.finditer(pattern, k)) for k in i] for i in string_1])

output:

[[0, ('allah', 'indeed', 'lord', 'said', 'people', 'upon', 'shall', 'unto', 'believe', 'earth')], [1, ('lord', 'said', 'allah', 'shall', 'people', 'unto', 'indeed', 'upon', 'come', 'thou')], [2, ('lord', 'shall', 'unto', 'thou', 'thee', 'indeed', 'come', 'said', 'people', 'truth')]]

Upvotes: 2

Hai Vu

Reputation: 40723

The key to the transformation is to pick out the words within the double quotes. For that, I would use a regular expression. My solution then looks like this:

from pprint import pprint
import re

def transform(t):
    return (t[0],) + tuple(re.findall(r'"(\w+)"', t[1]))

inlist = [
    (0, '0.897*"allah" + 0.120*"indeed" + 0.117*"lord" + 0.110*"said" + 0.101*"people" + 0.093*"upon" + 0.083*"shall" + 0.082*"unto" + 0.072*"believe" + 0.070*"earth"'),
    (1, '0.495*"lord" + 0.398*"said" + -0.377*"allah" + 0.253*"shall" + 0.241*"people" + 0.236*"unto" + 0.196*"indeed" + 0.131*"upon" + 0.118*"come" + 0.109*"thou"'),
    (2, '-0.682*"lord" + 0.497*"shall" + 0.349*"unto" + 0.125*"thou" + 0.125*"thee" + -0.098*"indeed" + 0.092*"come" + -0.092*"said" + 0.092*"people" + 0.080*"truth"'),
]

# In Python 3, map() returns a lazy iterator, so wrap it in list() before printing.
outlist = list(map(transform, inlist))
pprint(outlist)

Output:

[(0, 'allah', 'indeed', 'lord', 'said', 'people', 'upon', 'shall', 'unto', 'believe', 'earth'),
 (1, 'lord', 'said', 'allah', 'shall', 'people', 'unto', 'indeed', 'upon', 'come', 'thou'),
 (2, 'lord', 'shall', 'unto', 'thou', 'thee', 'indeed', 'come', 'said', 'people', 'truth')]
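
If the goal is closer to the layout shown in the question, where each tuple holds the index and a single comma-separated string, the same regular expression can be combined with `str.join`. This is only a sketch: `to_csv_pair` is a hypothetical helper name, the short `sample` list is made up, and the exact quoting the question wants is an assumption:

```python
import re

def to_csv_pair(t):
    # t is an (index, topic_string) tuple, as in the question.
    words = re.findall(r'"(\w+)"', t[1])
    # Join the bare words with commas into one string.
    return (t[0], ','.join(words))

sample = [
    (0, '0.897*"allah" + 0.120*"indeed"'),
    (1, '0.495*"lord" + -0.377*"allah"'),
]
result = [to_csv_pair(t) for t in sample]
print(result)  # [(0, 'allah,indeed'), (1, 'lord,allah')]
```

Keeping the words as separate tuple items, as in `transform` above, is usually easier to work with afterwards; joining them is only useful if a single string is really required.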

Upvotes: 0
