Reputation:
I have an Excel file with 5,000 rows and 17,000 columns. Is there any way to split this file using Python / pandas? Right now, trying to read the Excel file raises a MemoryError. If I could somehow read the file, I could reduce the columns with:
myFile.drop(myFile.filter(regex=r'(x|y)').columns, axis=1)
Can someone help me with how to do that?
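For reference, a drop call along those lines needs axis=1 (or the columns= keyword) so that column names, not row labels, are removed. A minimal sketch on a toy DataFrame standing in for the real 17,000-column sheet:

```python
import pandas as pd

# Toy frame standing in for the real wide sheet
df = pd.DataFrame({"x1": [1], "y1": [2], "z1": [3]})

# filter() selects columns whose names match the regex;
# passing those names to drop() with axis=1 removes them
reduced = df.drop(df.filter(regex=r'(x|y)').columns, axis=1)
print(list(reduced.columns))  # only 'z1' survives
```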
Upvotes: 2
Views: 105
Reputation: 150
In pandas, you need to set the low_memory=False parameter, and you should specify data types for your CSV columns. For example:

import pandas as pd

df = pd.read_csv("YOURFILENAME.csv",
                 delimiter='|',
                 error_bad_lines=False,
                 index_col=False,
                 dtype='unicode',  # read everything as strings, or per column:
                 # dtype={"user_id": int, "username": "string"},
                 low_memory=False)
The best practice is to specify the dtype for each individual column, but that may not be practical when you have this many columns. As a fallback, you can probe each column with try/except and iterate through the values: try int8, then int64, and fall back to string if neither fits.
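The try/except idea above could be sketched like this, assuming a small sample of each column is enough to guess its type (the column names and the guess_dtype helper here are made up for illustration):

```python
import pandas as pd
import numpy as np

def guess_dtype(series):
    """Try progressively wider numeric types; fall back to string."""
    for typ in (np.int8, np.int64, np.float64):
        try:
            series.astype(typ)
            return typ
        except (ValueError, TypeError, OverflowError):
            continue
    return "string"

# Sample rows standing in for the first chunk of the real file
sample = pd.DataFrame({"user_id": ["1", "2"], "username": ["a", "b"]})
dtypes = {col: guess_dtype(sample[col]) for col in sample.columns}
# dtypes can then be passed to pd.read_csv(..., dtype=dtypes)
```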
Edit: the dtype='unicode' option also applies when using read_excel.
Upvotes: 0