zachi

Reputation: 537

Manipulate a large file in Python

I have a simple 2 GB file with 17 million rows of inventory data that looks like the attached image. I am trying to take the amount column, which for some reason is currently a string, and multiply it by the quantity column. After that I want to create another column holding the average for each item and each month, and then build graphs in Python or Tableau.

I am using Python and pandas. My problem is that I cannot convert the amount column to int or float. I tried writing a function that loops over the data, takes each value in the amount field, and converts it to a float, but because of the size of the file this takes a long time and I am not sure it will succeed. I am looking for the simplest way to do this.

Upvotes: 0

Views: 158

Answers (2)

Saisiva A

Reputation: 615

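In cases like this, avoid loading the whole file into memory. The example below reads the data lazily, one row at a time, using yield (it assumes a CSV file with a header row that includes an amount column):

import csv

def getAmount():
    # Open read-only and stream rows; only the current row is held in memory.
    with open('filename', newline='') as fp:
        # DictReader yields one dict per row, keyed by the header line.
        for data in csv.DictReader(fp):
            # float() accepts both integer and decimal strings.
            yield float(data['amount'])


for amt in getAmount():
    print(amt)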
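
A streaming approach can also compute the per-item, per-month average the question asks for without ever holding the file in memory. The following is only a sketch under assumptions: the column names `item`, `date`, and `quantity`, the ISO date format, and the helper name `monthly_item_averages` are all hypothetical, since the question does not show the actual layout.

```python
import csv
import io
from collections import defaultdict

def monthly_item_averages(rows):
    """Average amount * quantity per (item, month) from an iterable of row dicts."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        try:
            value = float(row["amount"]) * float(row["quantity"])
        except ValueError:
            continue  # skip rows whose amount or quantity is not numeric
        # Assumption: dates are ISO strings ("YYYY-MM-DD"), so the first
        # seven characters identify the month.
        key = (row["item"], row["date"][:7])
        totals[key] += value
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}

# Tiny in-memory sample standing in for the real 2 GB file.
sample = io.StringIO(
    "item,date,amount,quantity\n"
    "A,2020-01-05,2.5,2\n"
    "A,2020-01-20,1.5,2\n"
    "B,2020-02-01,unknown,1\n"
    "B,2020-02-03,4,3\n"
)
averages = monthly_item_averages(csv.DictReader(sample))
```

For the real file, `csv.DictReader(open('filename', newline=''))` could be passed in place of the sample. Only one running total and one count per (item, month) pair are kept, so memory use depends on the number of distinct pairs rather than the number of rows.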

Upvotes: 1

Joran Beasley

Reputation: 113930

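df['amount'] = pd.to_numeric(df['amount'], errors="coerce")

This converts every value to int or float; anything that cannot be converted becomes NaN. Note that to_numeric is a top-level pandas function (with pandas imported as pd), not a Series method, and it returns a new Series, so the result has to be assigned back to the column.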
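
For a 2 GB file, the same conversion can be applied chunk by chunk so that only part of the data is in memory at once. The sketch below is one possible way to do it, not a definitive solution: the column names `item`, `date`, and `quantity`, the ISO date format, and the guess that thousands separators are what keep `amount` a string are all assumptions, and `monthly_item_averages` is a hypothetical helper name.

```python
import io
import pandas as pd

def monthly_item_averages(source, chunksize=1_000_000):
    """Mean of amount * quantity per (item, month), computed chunk by chunk."""
    parts = []
    for chunk in pd.read_csv(source, chunksize=chunksize, dtype={"amount": str}):
        # Assumption: stray thousands separators are why amount is a string.
        cleaned = chunk["amount"].str.replace(",", "", regex=False)
        amount = pd.to_numeric(cleaned, errors="coerce")  # bad values -> NaN
        chunk["total"] = amount * chunk["quantity"]
        # Assumption: dates are ISO strings, so the first 7 chars give the month.
        chunk["month"] = chunk["date"].str[:7]
        # Keep per-chunk sums and counts so chunks can be merged exactly.
        parts.append(chunk.groupby(["item", "month"])["total"].agg(["sum", "count"]))
    combined = pd.concat(parts).groupby(level=["item", "month"]).sum()
    return combined["sum"] / combined["count"]

# Tiny in-memory sample standing in for the real file.
sample = io.StringIO(
    'item,date,amount,quantity\n'
    'A,2020-01-05,"1,000.5",2\n'
    'A,2020-01-20,999.5,2\n'
    'B,2020-02-01,unknown,1\n'
    'B,2020-02-03,4,3\n'
)
result = monthly_item_averages(sample, chunksize=3)
```

Sums and counts are accumulated separately because averaging the per-chunk averages would give the wrong answer when a group is split across chunks. The resulting Series is small enough to plot directly or export for Tableau.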

Upvotes: 2
