TheGreatPeanut

Reputation: 13

Splitting list of strings based on a character in each string ( Python )

So I have a list of strings that looks like this:

my_list = ['389.3K', '2M' , '1.9M' , '6.9M' , '4.3M' , '251.5K' , '3.6M']

This is how the list is generated and made readable ("video" is a list of Selenium WebElements):

my_list = [x.text for x in video]
video.extend(my_list)
my_list = [i for i in my_list if i and 'ago' not in i]
my_list = [w.replace("Views", "") for w in my_list]

What I want to do is SPLIT this list into two other lists based on ONE specific character in each element, like so:

k_list = ['389.3K' , '251.5K']

m_list = ['2M' , '1.9M' , '6.9M' , '4.3M' , '3.6M']

My end goal is to have only the numbers in the elements as floats and multiply each element by the appropriate amount (K = *1000 and M = *1000000), like:

my_new_list = ['389,300' , '2,000,000' , '1,900,000' , '6,900,000' , '4,300,000', '251,500' , '3,600,000']

I'm new to Python (coding in general tbh) so please excuse any spaghetti code or bad thought process.

This is what I tried:

k_val = "K"
m_val = "M"

if any(k_val in s for s in my_list):
    my_list = [w.replace("K", "") for w in my_list]
    my_list = [float(i) for i in my_list]
    my_list = [elem * 1000 for elem in my_list]
elif any(m_val in x for x in my_list):
    my_list = [w.replace("M", "") for w in my_list]
    my_list = [float(i) for i in my_list]
    my_list = [elem * 1000000 for elem in my_list]

I get :

ValueError: could not convert string to float: '2M '

Upvotes: 1

Views: 1431

Answers (6)

Ronald

Reputation: 2882

A bit more explicit, and checking the input as well:

k_val = 'K'
m_val = 'M'
my_list = ['389.3K', '2M' , '1.9M' , '6.9M' , '4.3M' , '251.5K' , '3.6M']
kilos = []
megas = []
entire = []

for val in my_list:
    if val[-1] == k_val:
        fval = float(val[:-1]) * 1000
        kilos.append(fval)
    elif val[-1] == m_val:
        fval = float(val[:-1]) * 1000000
        megas.append(fval)
    else:
        print("detected invalid value: " + val)
        continue
    entire.append(fval)

print(str(kilos))
print(str(megas))
print(str(entire))

I like the approach of Rakesh and the others. But especially if one is new to programming, I like to be a little more verbose. Code golf is nice, but tends to be harder to understand.

Upvotes: 0

Satish Michael

Reputation: 2015

Here is another approach, probably not the best.

my_list = ['389.3K', '2M' , '1.9M' , '6.9M' , '4.3M' , '251.5K' , '3.6M']

k_list = [float(i[:-1])*1000 for i in my_list if i.endswith('K')]
m_list = [float(i[:-1])*1000000 for i in my_list if i.endswith('M')]


k_list_strings = [f'{num:,}' for num in k_list]
m_list_strings = [f'{num:,}' for num in m_list]

Output (k_list and m_list):

[389300.0, 251500.0]
[2000000.0, 1900000.0, 6900000.0, 4300000.0, 3600000.0]

Output (k_list_strings and m_list_strings):

['389,300.0', '251,500.0']
['2,000,000.0', '1,900,000.0', '6,900,000.0', '4,300,000.0', '3,600,000.0']

Upvotes: 0

Daweo

Reputation: 36620

As your input data already represent float values, I suggest harnessing scientific notation combined with float in the following way:

my_list = ['389.3K', '2M' , '1.9M' , '6.9M' , '4.3M' , '251.5K' , '3.6M']
e_list = [i.replace('K','e3').replace('M','e6') for i in my_list]
values = [float(i) for i in e_list]
my_new_list = [f'{int(i):,}' for i in values]
print(my_new_list)

Output:

['389,300', '2,000,000', '1,900,000', '6,900,000', '4,300,000', '251,500', '3,600,000']

Upvotes: 0

superb rain

Reputation: 5521

If you actually want floats (you say you do, but then show strings):

>>> [float(s.replace('K', 'e3').replace('M', 'e6')) for s in my_list]
[389300.0, 2000000.0, 1900000.0, 6900000.0, 4300000.0, 251500.0, 3600000.0]
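
A small check, assuming the scraped strings can carry stray whitespace like the '2M ' in the question's traceback: float() ignores leading and trailing whitespace, so the replace-based conversion should still work without an explicit strip(). The helper name to_number is made up for illustration.

```python
def to_number(s):
    # 'K' -> 'e3' and 'M' -> 'e6' turn the text into scientific notation,
    # which float() parses directly (surrounding whitespace is ignored).
    return float(s.replace('K', 'e3').replace('M', 'e6'))

# Sample data with deliberate stray spaces (an assumption about the scraped text)
my_list = ['389.3K', '2M ', ' 1.9M']
values = [to_number(s) for s in my_list]
print(values)
```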

Upvotes: 1

Rakesh

Reputation: 82785

This is one approach

Ex:

my_list = ['389.3K', '2M' , '1.9M' , '6.9M' , '4.3M' , '251.5K' , '3.6M']
data = {"K": 1000, "M": 1000000}
result = [float(i[:-1])*data.get(i[-1], 0) for i in my_list]
print(result)

If the strings can have more characters at the end, then use a regex:

import re
import locale    #https://stackoverflow.com/a/5180615/532312

locale.setlocale(locale.LC_ALL, '')

my_list = ['389.3K', '2M' , '1.9M' , '6.9M' , '4.3M' , '251.5K' , '3.6M']
data = {"K": 1000, "M": 1000000}

result = []
for i in my_list:
    m = re.match(r"(\d+\.?\d*)([A-Z])", i)
    if m:
        value, key = m.groups()
        result.append(locale.currency(float(value) * data.get(key, 0), symbol=False, grouping=True))
print(result)

Output:

['389,300.00', '2,000,000.00', '1,900,000.00', '6,900,000.00', '4,300,000.00', '251,500.00', '3,600,000.00']
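
One caveat with the locale version: the grouping it prints depends on the machine's locale settings, and locale.currency raises a ValueError under the plain 'C' locale. A rough locale-independent sketch of the same idea, using the format spec ',.2f' instead (the names SUFFIX_MULTIPLIER and format_views are made up for illustration, and the leading \s* in the pattern is an added assumption to tolerate stray spaces):

```python
import re

# Hypothetical lookup table: suffix letter -> multiplier
SUFFIX_MULTIPLIER = {"K": 1_000, "M": 1_000_000}

def format_views(text):
    """Return e.g. '389,300.00' for '389.3K', or None if the text does not match."""
    m = re.match(r"\s*(\d+\.?\d*)([A-Z])", text)
    if not m:
        return None
    value, key = m.groups()
    if key not in SUFFIX_MULTIPLIER:
        return None  # unknown suffix, skip rather than guess
    # ',' adds thousands separators, '.2f' keeps two decimals
    return f"{float(value) * SUFFIX_MULTIPLIER[key]:,.2f}"

my_list = ['389.3K', '2M ', '1.9M', '3 hours ago']
result = [s for s in (format_views(i) for i in my_list) if s is not None]
print(result)
```

Entries that do not look like a number followed by a known suffix are dropped instead of being multiplied by 0.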

Upvotes: 4

stijndcl

Reputation: 5638

You are getting this error, because in your first if you are only replacing the K's in case there's a K anywhere in the list, but that doesn't remove the M's (same goes for K's in the second if).

That way, you're trying to cast "2M" to a float because you only replaced the K's, but the M-terms still have their M's (and vice-versa). You should first create those two lists you mentioned, where you split them based on K and M, and then iterate through the K-list in the first if (and the M-list in the second if).

Creating those two separate lists can be done like this:

k_list = [val for val in my_list if k_val in val]
m_list = [val for val in my_list if m_val in val]
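
To round that off, here is a minimal sketch of the conversion step once the two lists exist. It reuses k_val and m_val from the question; k_values and m_values are names made up for this example.

```python
k_val = "K"
m_val = "M"
my_list = ['389.3K', '2M', '1.9M', '6.9M', '4.3M', '251.5K', '3.6M']

# Split first, so each list only contains one kind of suffix
k_list = [val for val in my_list if k_val in val]
m_list = [val for val in my_list if m_val in val]

# Now it is safe to strip the suffix and convert, one list at a time
k_values = [float(val.replace(k_val, "")) * 1000 for val in k_list]
m_values = [float(val.replace(m_val, "")) * 1000000 for val in m_list]

print(k_values)
print(m_values)
```

Note that this loses the original order of the elements, so if you need one combined list in the original order, a single loop that checks the suffix per element is probably simpler.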

Upvotes: 0
