Reputation: 605
I have a column of numerical values that I want to convert to words. I tried num2words,
but it didn't work for me because the output did not follow the Indian numbering format.
I want the words expressed in Crores, Lakhs, Thousands, Hundreds, etc.
For example:
10000000 - One Crore
100000 - One Lakh
1000 - One Thousand
Input Data
Total_value
253897
587619.10
15786
354783.36
Expected Output
Value_words
Two Lakhs Fifty Three Thousand Eight Hundred Ninety Seven Rupees
Five Lakhs Eighty Seven Thousand Six Hundred Nineteen Rupees Ten Paise
Fifteen Thousand Seven Hundred Eighty Six Rupees
Three Lakhs Fifty Four Thousand Seven Hundred Eighty Three Rupees Thirty Six Paise
The script I have been using so far:
import decimal
def num2words(num):
    num = decimal.Decimal(num)
    decimal_part = num - int(num)
    num = int(num)
    if decimal_part:
        return num2words(num) + " point " + (" ".join(num2words(i) for i in str(decimal_part)[2:]))
    under_20 = ['Zero', 'One', 'Two', 'Three', 'Four', 'Five', 'Six', 'Seven', 'Eight', 'Nine', 'Ten', 'Eleven', 'Twelve', 'Thirteen', 'Fourteen', 'Fifteen', 'Sixteen', 'Seventeen', 'Eighteen', 'Nineteen']
    tens = ['Twenty', 'Thirty', 'Forty', 'Fifty', 'Sixty', 'Seventy', 'Eighty', 'Ninety']
    above_100 = {100: 'Hundred', 1000: 'Thousand', 100000: 'Lakhs', 10000000: 'Crores'}
    if num < 20:
        return under_20[num]
    if num < 100:
        return tens[num // 10 - 2] + ('' if num % 10 == 0 else ' ' + under_20[num % 10])
    # find the appropriate pivot - 'Million' in 3,603,550, or 'Thousand' in 603,550
    pivot = max([key for key in above_100.keys() if key <= num])
    return num2words(num // pivot) + ' ' + above_100[pivot] + ('' if num % pivot == 0 else ' ' + num2words(num % pivot))
df['Value_words'] = num2words(decimal.Decimal(df['Total_value']))
With static values the function works, but the output is not in the expected format.
Please suggest a fix.
Upvotes: 1
Views: 1145
Reputation: 14949
You can use the num2words module -
# pip install num2words
from num2words import num2words
df['Total_value'] = df.Total_value.apply(num2words, lang='en_IN')  # change lang if required
df['Total_value'] = df['Total_value'].str.replace(',', '').str.title()
Output -
0 Two Lakh Fifty-Three Thousand Eight Hundred And Ninety-Seven
1 Five Lakh Eighty-Seven Thousand Six Hundred And Nineteen Point One
2 Fifteen Thousand Seven Hundred And Eighty-Six
3 Three Lakh Fifty-Four Thousand Seven Hundred And Eighty-Three Point Three Six
Function to handle Rupees/Paise separately -
# pip install num2words
from num2words import num2words
df.Total_value = df.Total_value.fillna(0).astype(float)
def word(x):
    # x is the amount as a string, e.g. '354783.36'
    rupees, paise = x.split('.')
    rupees_word = num2words(rupees, lang='en_IN') + ' Rupees'
    if int(paise) > 0:
        paise_word = ' and ' + num2words(paise, lang='en_IN') + ' Paise'
        result = rupees_word + paise_word
    else:
        result = rupees_word
    # drop the commas num2words inserts and capitalise each word
    result = result.replace(',', '').title()
    return result
df['Total_value'] = df.Total_value.astype(str).apply(lambda x: word(x))
Output -
Two Lakh Fifty-Three Thousand Eight Hundred And Ninety-Seven Rupees
Five Lakh Eighty-Seven Thousand Six Hundred And Nineteen Rupees And One Paise
Fifteen Thousand Seven Hundred And Eighty-Six Rupees
Three Lakh Fifty-Four Thousand Seven Hundred And Eighty-Three Rupees And Thirty-Six Paise
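Note that in the output above, 587619.10 comes out as "One Paise" rather than "Ten Paise": astype(str) turns the float into '587619.1', so the text after the point is '1'. A minimal sketch of one way around this, assuming non-negative amounts with at most two decimal places, is to format the value with exactly two decimals before splitting. The helper name split_rupees_paise is hypothetical and is not part of num2words.

```python
def split_rupees_paise(value):
    """Split a non-negative amount into whole rupees and paise (0-99)."""
    # Format with exactly two decimals so 587619.1 becomes '587619.10'
    # and the paise part keeps its trailing zero.
    rupees, paise = f"{float(value):.2f}".split(".")
    return int(rupees), int(paise)


print(split_rupees_paise(587619.10))
print(split_rupees_paise(253897))
```

The two integers can then be passed to num2words(..., lang='en_IN') separately, in the same way word() above handles the two string parts.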
Upvotes: 3
Reputation: 418
Not the most elegant solution, but try adding this at the beginning of the function:
numLis = str(num).split('.')
ans = ''
if len(numLis) == 2:
    # the value has a paise part
    # assuming there are never more than 2 digits after the point (at most x.99)
    ans = ' ' + numLis[1] + ' paise'
# if len(numLis) == 1 it is a whole number with no paise, so ans stays empty
num = int(numLis[0])
# rest of the code
# lastly, append ans to your result
If anything is unclear, please ask.
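To make the idea concrete, here is a rough standard-library sketch that combines the pivot approach from the question with a rupees/paise split and reproduces the format asked for. It assumes non-negative amounts with at most two decimal places; the names int_to_indian_words and amount_in_words are made up for illustration, and the singular/plural labels ("One Lakh", "Two Lakhs") are my reading of the expected output rather than a fixed standard.

```python
from decimal import Decimal, ROUND_HALF_UP

UNDER_20 = ['Zero', 'One', 'Two', 'Three', 'Four', 'Five', 'Six', 'Seven',
            'Eight', 'Nine', 'Ten', 'Eleven', 'Twelve', 'Thirteen', 'Fourteen',
            'Fifteen', 'Sixteen', 'Seventeen', 'Eighteen', 'Nineteen']
TENS = ['Twenty', 'Thirty', 'Forty', 'Fifty', 'Sixty', 'Seventy', 'Eighty', 'Ninety']
# Indian grouping, largest first: (pivot, singular label, plural label)
UNITS = [(10000000, 'Crore', 'Crores'), (100000, 'Lakh', 'Lakhs'),
         (1000, 'Thousand', 'Thousand'), (100, 'Hundred', 'Hundred')]


def int_to_indian_words(n):
    """Spell a non-negative integer using Crore/Lakh/Thousand/Hundred."""
    if n < 20:
        return UNDER_20[n]
    if n < 100:
        tail = '' if n % 10 == 0 else ' ' + UNDER_20[n % 10]
        return TENS[n // 10 - 2] + tail
    # n >= 100, so the first pivot that fits is always found
    for pivot, singular, plural in UNITS:
        if n >= pivot:
            count, rest = divmod(n, pivot)
            label = singular if count == 1 else plural
            words = int_to_indian_words(count) + ' ' + label
            return words if rest == 0 else words + ' ' + int_to_indian_words(rest)


def amount_in_words(value):
    """Return e.g. 'Fifteen Thousand ... Rupees [.. Paise]' for an amount."""
    # Go through str() so a float like 587619.1 becomes exactly 587619.10
    amount = Decimal(str(value)).quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
    rupees = int(amount)
    paise = int((amount - rupees) * 100)
    words = int_to_indian_words(rupees) + ' Rupees'
    if paise:
        words += ' ' + int_to_indian_words(paise) + ' Paise'
    return words


for sample in (253897, 587619.10, 15786, 354783.36):
    print(amount_in_words(sample))
```

On a DataFrame this would be applied row by row, for example df['Value_words'] = df['Total_value'].apply(amount_in_words), rather than by passing the whole column to the function as in the question.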
Upvotes: 1