Reputation: 95
I have a big CSV file (dataset) with size 443,00 KB. The photo shows a sample of the file. I want to save certain rows into another dataframe. I tried this way, but it is taking so much time:
import pandas as pd

df = pd.DataFrame()
for chunk in pd.read_csv("UsersVle.csv", chunksize=10):
    for i, row in chunk.iterrows():
        if((row['module']=='D3') & (row['presentation']=='13B')):
            df.append(row)
Searching for a solution, I found something about chunksize and tried it this way, but there was an error: TypeError: Cannot perform 'rand_' with a dtyped [object] array and scalar of type [bool]
import itertools as IT
chunksize = 10 ** 3
chunks = pd.read_csv('UsersVle.csv', chunksize=chunksize)
chunks = IT.takewhile(lambda chunk: (chunk['module']=='D3' & chunk['presentation']=='13B'), chunks)
df = pd.concat(chunks)
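I suspect the TypeError comes from operator precedence, since & binds more tightly than ==, so each comparison probably needs its own parentheses, e.g. (chunk['module']=='D3') & (chunk['presentation']=='13B'); I also suspect takewhile is not the right tool here, since it stops at the first chunk that does not match instead of filtering all chunks.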
I need an efficient way to read from this big file and save the rows meeting the condition into another dataframe. I would appreciate your help. PS: I tried dask, but it seems it did not read the file; when I used df.head(), there were no rows returned!
Upvotes: 0
Views: 408
Reputation: 1934
Not very scientific, but here is something to give an idea of how to adjust the chunk size:
Using moviedataset/rating.csv:
import pandas as pd
from timeit import default_timer as timer  # timer() below is timeit's default_timer

print('Without chunksize')
start = timer()
df = pd.read_csv('ml-latest/ratings.csv')
df2 = df[df["rating"] == 5.0]
print(timer() - start)

for cs in range(4, 10):
    print('Chunk size', cs, 10 ** cs)
    start = timer()
    rdr = pd.read_csv('ml-latest/ratings.csv', chunksize=10 ** cs)
    df2 = pd.concat([chunk[chunk['rating'] == 5.0] for chunk in rdr])
    print(timer() - start)
Output:
Without chunksize
5.055990324995946
Chunk size 4 10000
8.80516574899957
Chunk size 5 100000
5.21452364900324
Chunk size 6 1000000
4.814042658996186
Chunk size 7 10000000
4.8958623920043465
Chunk size 8 100000000
5.152557591005461
Chunk size 9 1000000000
5.076704847000656
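Applied to the file from the question, the same per-chunk filter would look roughly like this (a sketch assuming the column names and values from the question, with a chunk size near the sweet spot above):

import pandas as pd

chunksize = 10 ** 6  # roughly the fastest value in the timings above
reader = pd.read_csv('UsersVle.csv', chunksize=chunksize)
# Filter each chunk with a vectorised boolean mask (note the parentheses
# around each comparison), then concatenate only the matching rows.
df = pd.concat(
    [chunk[(chunk['module'] == 'D3') & (chunk['presentation'] == '13B')]
     for chunk in reader]
)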
Upvotes: 1