Reputation:
I have an Excel file with 5,000 rows and 17,000 columns. Is there any way to split this file using Python / pandas? Right now, trying to read the Excel file raises a MemoryError. If I could somehow read the file, I could reduce the columns with:
myFile.drop(myFile.filter(regex=r'(x|y)').columns, axis=1)
Can someone help me with how to do that?
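For reference, a drop call along those lines needs axis=1 (or the columns= keyword) so that column names, not row labels, are removed. A minimal sketch on a toy DataFrame standing in for the real 17,000-column sheet:

```python
import pandas as pd

# Toy frame standing in for the real wide sheet
df = pd.DataFrame({"x1": [1], "y1": [2], "z1": [3]})

# filter() selects columns whose names match the regex;
# passing those names to drop() with axis=1 removes them
reduced = df.drop(df.filter(regex=r'(x|y)').columns, axis=1)
print(list(reduced.columns))  # only 'z1' survives
```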
Upvotes: 2
Views: 105
Reputation: 150
In pandas, you need to set the low_memory=False parameter, and you should specify data types for your CSV columns. For example:

import pandas as pd

df = pd.read_csv("YOURFILENAME.csv",
                 delimiter='|',
                 error_bad_lines=False,
                 index_col=False,
                 dtype='unicode',  # read everything as strings, or per column:
                 # dtype={"user_id": int, "username": "string"},
                 low_memory=False)
The best practice is to specify the dtype for each individual column, but that may not be practical when you have this many columns. As a fallback, you can probe each column with try/except and iterate through the values: try int8, then int64, and fall back to string if neither fits.
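The try/except idea above could be sketched like this, assuming a small sample of each column is enough to guess its type (the column names and the guess_dtype helper here are made up for illustration):

```python
import pandas as pd
import numpy as np

def guess_dtype(series):
    """Try progressively wider numeric types; fall back to string."""
    for typ in (np.int8, np.int64, np.float64):
        try:
            series.astype(typ)
            return typ
        except (ValueError, TypeError, OverflowError):
            continue
    return "string"

# Sample rows standing in for the first chunk of the real file
sample = pd.DataFrame({"user_id": ["1", "2"], "username": ["a", "b"]})
dtypes = {col: guess_dtype(sample[col]) for col in sample.columns}
# dtypes can then be passed to pd.read_csv(..., dtype=dtypes)
```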
Edit: the dtype='unicode' option also applies when using read_excel.
Upvotes: 0