Reputation: 776
I have a df of the following format:
0 A 84 13.0 69.0 ... 45
1 B 76 77.0 127.0 ... 55
2 C 28 69.0 16.0 ... 66
3 D 28 28.0 31.0 ... 44
The shape is 160,000 x 20,000.
I can't load the entire dataframe into memory, and I only need the first two columns anyway. How should I go about this? Note that the file has no column names that I could pass to usecols.
Upvotes: 2
Views: 1159
Reputation: 195508
Try:
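import csv

import pandas as pd

data = []
with open("your_data.csv", "r", newline="") as f_in:
    # configure the reader here, for example delimiter or quotechar
    csvreader = csv.reader(f_in)

    # skip the header row, if any (remove this line if the file has none)
    next(csvreader)

    # keep only the first two fields of each row and discard the rest
    for col1, col2, *_ in csvreader:
        data.append([col1, col2])

# note: csv.reader yields strings, so both columns hold str values
df = pd.DataFrame(data, columns=["col1", "col2"])
print(df)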
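If you would rather stay in pandas, usecols also accepts integer positions, so the missing column names should not be a blocker. Below is a minimal sketch, assuming a comma-delimited file with no header row. The helper name read_first_two_columns and the small in-memory sample are made up for illustration and are not part of the original data:

```python
import io

import pandas as pd


def read_first_two_columns(source, sep=","):
    """Read only the first two columns of a header-less delimited file.

    `source` may be a file path or a file-like object (hypothetical helper).
    """
    # header=None: treat the first line as data, not as column names
    # usecols=[0, 1]: select columns by position, so no names are required
    df = pd.read_csv(source, sep=sep, header=None, usecols=[0, 1])
    df.columns = ["col1", "col2"]
    return df


# Tiny stand-in for the real file (assumed layout: comma-separated, no header)
sample = io.StringIO(
    "A,84,13.0,69.0,45\n"
    "B,76,77.0,127.0,55\n"
    "C,28,69.0,16.0,66\n"
    "D,28,28.0,31.0,44\n"
)

df = read_first_two_columns(sample)
print(df)
```

For the real file you would pass its path instead of the StringIO object. Unlike the csv.reader loop, this lets pandas infer the dtypes of the two columns, though pandas still has to scan every line of the file.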
Upvotes: 2