Kellan Gott

Reputation: 1

Python 3 int and string conversions

I have a couple of questions. To explain what is going on in this code: I am taking the subscriber count of a YouTube channel and trying to convert it to an int so that it can be multiplied, divided, etc.

Is there a way to express something like ". followed by any three characters" in the .replace method? Some YouTube channels show something like "3.04M" subscribers, and when I extract that string from the HTML I want to be able to turn it into an int. In the first "if" statement I am trying to say: if the sub count has a decimal point followed by 3 characters (i.e. two digits and the letter), remove the decimal point and replace the letter with the corresponding number of zeros, according to the placement of the decimal point. If there are NOT 3 characters after it, I want to go to the first "else", which lowers the value of the letter by a factor of 10 rather than 100 because of the decimal placement. Lastly, if there is no decimal point, I simply want to convert the letter into the regular number of zeros.

I should probably point out that I am extremely new to Python, with only about 3 days of working with it. My prior experience was about 10 hours of Java that I have all but forgotten.

Thanks for any help that could be offered!

subC = self.driver.find_element_by_xpath('/html/body/ytd-app/div/ytd-page-manager/ytd-browse/div[3]/ytd-c4-tabbed-header-renderer/app-header-layout/div/app-header/div[2]/div[2]/div/div[1]/div/div[1]/yt-formatted-string')
print('subscriber count is: ' + str(subC.text))

if ".XXX" in subC.text:
    subC.text.replace('k' , '0')
    subC.text.replace('M' , '0000')
    subC.tect.replace('B' , '0000000')
else:
    if "." in subC.text:
        subC.text.replace('k' , '00')
        subC.text.replace('M' , '00000')
        subC.text.replace('B' , '00000000')
        subC.text.replace('.' , '')
    else:
        subC.text.replace('k' , '000')
        subC.text.replace('M' , '000000')
        subC.text.replace('B' , '000000000')

(realSub, other) = subC.text.split(maxsplit=1)

print(int(realSub))

Upvotes: 0

Views: 105

Answers (3)

Juan C

Reputation: 6132

Using regex and dictionaries you can achieve what you're looking for:

import re
d = {'M': 1000000, 'k': 1000, 'B': 1000000000}
subC = ['3.04M', '5M', '3.4k']
for sub in subC:
    if re.search('([a-zA-Z])', sub):
        match = re.search('([a-zA-Z])', sub).group(1)  # Get the M
        subC2 = float(sub.replace(match, ''))  # Remove the M and turn it into a float
        sub_number = int(subC2 * d.get(match))  # Use dictionary to convert it to millions
    else:
        sub_number = int(sub)  # No letter: the string is already a plain number
    print(sub_number)

Maybe I missed one of your cases; please let me know if that happened or if you didn't understand something. This works only if your string contains nothing but the sub count. If that's not the case, some modifications might be needed.

Output:

3040000
5000000
3400
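
If the extracted text carries extra words, such as "3.04M subscribers", one possible modification is sketched below. The helper name parse_sub_count and the sample strings are made up for illustration, and the pattern assumes the suffix follows the number directly, as in the question's examples. round() is used instead of int() so that a float product landing just below a whole number is not truncated.

```python
import re

# Multipliers for the suffixes mentioned in the question.
MULTIPLIERS = {'k': 1_000, 'M': 1_000_000, 'B': 1_000_000_000}

def parse_sub_count(text):
    """Hypothetical helper: turn '3.04M subscribers' into 3040000."""
    # Digits, an optional decimal part, then an optional k/M/B suffix.
    match = re.search(r'(\d+(?:\.\d+)?)([kMB])?', text)
    if match is None:
        raise ValueError(f'no subscriber count found in {text!r}')
    number, suffix = match.groups()
    # suffix is None when there is no letter, so the multiplier defaults to 1.
    return round(float(number) * MULTIPLIERS.get(suffix, 1))

for sample in ['3.04M subscribers', '5M', '3.4k subscribers', '812 subscribers']:
    print(parse_sub_count(sample))
```

Anything after the number and its suffix is ignored, so the split step from the question is no longer needed in this variant.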

Upvotes: 1

Tobias Nöthlich

Reputation: 191

You can use regular expressions to do that. If I understood correctly, the numbers can come in these formats (with k, M or B):

  • 3.04M
  • 3.4M
  • 3M

To match the ".XXX" format of the first case, you can use:

import re

text = subC.text  # work on a plain string; assumes it holds only the count, e.g. "3.04M"
if re.search(r'\.[0-9][0-9].', text):
    text = text.replace('.', '')
    text = text.replace('k', '0')
    text = text.replace('M', '0000')
    text = text.replace('B', '0000000')
else:
    if "." in text:
        text = text.replace('k', '00')
        text = text.replace('M', '00000')
        text = text.replace('B', '00000000')
        text = text.replace('.', '')
    else:
        text = text.replace('k', '000')
        text = text.replace('M', '000000')
        text = text.replace('B', '000000000')
subC = int(text)

Notice that you need to explicitly assign the result of replace back to a variable: strings are immutable, so replace returns a new string and leaves the original unchanged.
As a little extra, the regular expression works as follows:

  • "\." matches the .
  • "[0-9]" matches any single digit from 0 to 9
  • "." matches any character
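
As a quick check of the two points above, here is a small sketch that runs the pattern against the three formats and shows that replace only has an effect when its result is kept. The variable names pattern and count are chosen for illustration only.

```python
import re

pattern = r'\.[0-9][0-9].'  # a dot, two digits, then any one character

for sample in ['3.04M', '3.4M', '3M']:
    # Only '3.04M' has a dot followed by two digits and one more character.
    print(sample, bool(re.search(pattern, sample)))

# str.replace returns a new string; the original is left unchanged.
count = '3.04M'
count.replace('M', '0000')          # result discarded
print(count)                        # still '3.04M'
count = count.replace('M', '0000')  # result kept
print(count)                        # now '3.040000'
```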

Upvotes: 0

yogansh sharma

Reputation: 11

Try this

realsub = subC.text.casefold()  # casefold returns a new string, so keep the result
if realsub[-1].isalpha():
    last = realsub[-1]
    num = 1000 if last=='k' else 1000000 if last=='m' else 1000000000
    realsub = int(float(realsub[:-1])*num)
print(realsub)

casefold returns a lowercase copy of the string. If the last character is a letter, the number is multiplied by the corresponding integer num.
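
The same last-character idea can be wrapped in a function. This is only a sketch: the name to_int and the SUFFIXES table are assumptions made for illustration, and it swaps float for decimal.Decimal so that the multiplication is not affected by binary floating-point rounding. It expects the string to contain only the count.

```python
from decimal import Decimal

# Lowercase suffixes, because the input is casefolded first.
SUFFIXES = {'k': 10**3, 'm': 10**6, 'b': 10**9}

def to_int(count):
    """Hypothetical helper: '3.04M' -> 3040000, '812' -> 812."""
    count = count.strip().casefold()  # casefold returns a new string
    multiplier = 1
    if count and count[-1].isalpha():
        multiplier = SUFFIXES[count[-1]]  # raises KeyError for an unknown suffix
        count = count[:-1]
    # Decimal parses the digits exactly, so '3.04' * 10**6 is exactly 3040000.
    return int(Decimal(count) * multiplier)

for sample in ['3.04M', '3.4k', '2B', '812']:
    print(sample, to_int(sample))
```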

Upvotes: 1
