Neo

Reputation: 776

Load only first two columns from CSV file using Pandas

I have a df of the following format:

       0          A           84         13.0          69.0   ...  45
       1          B           76         77.0          127.0  ...  55
       2          C           28         69.0          16.0   ...  66
       3          D           28         28.0          31.0   ...  44

The shape is 160,000 x 20,000.

I can't load the entire DataFrame into memory, and I only need the first two columns anyway. How should I go about this? Note that I don't have any column names to pass to usecols.

Upvotes: 2

Views: 1159

Answers (1)

Andrej Kesely

Reputation: 195508

Try:

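import csv

import pandas as pd

data = []
with open("your_data.csv", "r", newline="") as f_in:
    # configure the reader here, for example delimiter or quotechar
    csvreader = csv.reader(f_in)

    # skip the header row, if there is one; remove this line if the
    # file has no header, otherwise the first data row is dropped
    next(csvreader)

    # keep the first two fields of each row and discard the rest
    for col1, col2, *_ in csvreader:
        data.append([col1, col2])

df = pd.DataFrame(data, columns=["col1", "col2"])
print(df)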
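
As a possible alternative, usecols in pd.read_csv also accepts integer positions, so column names are not required. The sketch below assumes the file has no header row and is comma-separated; the in-memory sample is a hypothetical stand-in for the real file, so replace it with the path to your CSV.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: no header row, several columns.
sample = io.StringIO(
    "0,A,84,13.0,69.0\n"
    "1,B,76,77.0,127.0\n"
    "2,C,28,69.0,16.0\n"
)

# header=None: treat the first line as data rather than column names.
# usecols=[0, 1]: keep only the first two columns, selected by position.
df = pd.read_csv(sample, header=None, usecols=[0, 1])
print(df)
```

With header=None the resulting columns are labelled 0 and 1; pass names=["col1", "col2"] as well if you want readable labels.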

Upvotes: 2
