vahideh

Reputation: 157

How to extract numbers from a text file and multiply them together?

I have a text file that contains 800 words, each followed by a number. (Each word and its number are on their own line, so the file has 800 lines.) I have to find the numbers and then multiply them together. Because multiplying that many small floats underflows to zero, I have to use logarithms to prevent the underflow, but I don't know how. This is the formula: c_NB = argmax_c [log P(c) + log P(x | c)]

This code doesn't print anything.

output = []

with open('c:/python34/probEjtema.txt', encoding="utf-8") as f:
    w, h  = map(int, f.readline().split())
    tmp = []
    for i, line in enumerate(f):
        if i == h:
            break
        tmp.append(map(int, line.split()[:w]))
   output.append(tmp)
   print(output) 

The file's language is Persian.

A snippet of the file:

فعالان 0.0019398642095053346
محترم 0.03200775945683802
اعتباري 0.002909796314258002
مجموع 0.0038797284190106693
حل 0.016488845780795344
مشابه 0.004849660523763337
مشاوران 0.027158098933074686
مواد 0.005819592628516004
معادل 0.002909796314258002
ولي 0.005819592628516004
ميزان 0.026188166828322017
دبير 0.0019398642095053346
دعوت 0.007759456838021339
اميد 0.002909796314258002

Upvotes: 2

Views: 9307

Answers (3)

wolfsgang

Reputation: 894

This builds an output list with all the numbers. result holds the final product, and eres holds the sum of the base-10 logarithms.

import math

output = []
result = 1
eres = 0
with open('c:/python34/probEjtema.txt', encoding="utf-8") as f:
    for line in f:
        number = float(line.split()[1])  # second field on the line is the number
        output.append(number)
        result *= number                 # plain product (may underflow to 0.0)
        eres += math.log10(number)       # running sum of base-10 logs
print(output)
print(result)
print(eres)
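
To see why the plain product is not enough here, a minimal sketch with made-up values (800 copies of 0.002, not the real file contents): the product underflows to 0.0, while the sum of logs stays finite. The names `probs`, `product`, and `log_sum` are only illustrative.

```python
import math

# Hypothetical stand-in for the 800 probabilities in the file.
probs = [0.002] * 800

# Naive product: each step shrinks the value until it underflows to 0.0.
product = 1.0
for p in probs:
    product *= p

# Sum of natural logs: log(a * b) == log(a) + log(b), so nothing underflows.
log_sum = sum(math.log(p) for p in probs)

print(product)  # 0.0
print(log_sum)  # a large negative, but finite, number
```

The base of the logarithm (natural or base 10) does not matter as long as the same base is used for every term being compared.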

Upvotes: 1

arx5

Reputation: 336

You can use regular expressions to find the first number in each line, e.g.

import re

output = []
with open('c:/python34/probEjtema.txt', encoding="utf-8") as f:
    for line in f:
        match = re.search(r'\d+\.?\d*', line)
        if match:
            output.append(float(match.group()))

print(output)

re.search(r'\d+\.?\d*', line) looks for the first number (an integer or a float with a .) in each line.

Here is a nice online regex tester: https://regex101.com/ (for debugging / testing).

/Edit: changed regex to \d+\.?\d* to catch integers and float numbers.
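
A note on the dot in a pattern like this: an unescaped `.` matches any character, not just a decimal point, so it is safer to write `\.`. A small sketch of the difference, using a made-up input line that is not from the real file:

```python
import re

line = "word 12x5"  # hypothetical line, chosen to expose the difference

# Unescaped dot: "." matches any character, so the "x" is swallowed.
loose = re.search(r'\d+.?\d*', line).group()

# Escaped dot: only a literal "." may follow the integer part.
strict = re.search(r'\d+\.?\d*', line).group()

print(loose)   # 12x5
print(strict)  # 12
```

With the loose pattern, `float(match.group())` would raise a `ValueError` on such a line. For lines shaped like the ones in the question both patterns find the same number.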

Upvotes: 2

Doom8890

Reputation: 455

If I understood you correctly, you could do something along the lines of:

result = 1
with open('c:/python34/probEjtema.txt', encoding="utf-8") as f:
    for line in f:
        word, number = line.split()  # use line.split("\t") if the numbers are separated by tabs
        result = result * float(number)

print(result)
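
Since the question asks about logarithms, the same loop can add logs instead of multiplying. This is a rough sketch, assuming every useful line is "word number" separated by whitespace; `log_product` is a hypothetical helper name, and it accepts any iterable of lines so it can be tried without the file.

```python
import math

def log_product(lines):
    """Return the sum of natural logs of the last field on each line."""
    total = 0.0
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        # math.log raises ValueError for 0, so zero probabilities need separate handling.
        total += math.log(float(parts[-1]))
    return total

# Two lines taken from the snippet in the question.
sample = [
    "فعالان 0.0019398642095053346",
    "محترم 0.03200775945683802",
]
print(log_product(sample))
```

With the real file, the open file object can be passed directly, e.g. `log_product(f)` inside the `with` block. Because the logarithm is monotonic, the class with the largest log score is also the class with the largest product, so the argmax in the formula is unchanged.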

Upvotes: 1
