Nuance

Reputation: 101

Creating a new CSV file from an existing one by removing the last 100k rows using pandas

I'm trying to create a new CSV file from an existing one. My original CSV file has 300,000 records, of which I want only the first 200,000. I'm using the pandas package in Python, as I'm currently working on a machine learning project. I've tried:

import pandas as pd

df = pd.read_csv('sample_submission.csv')
df = df.head(2000002)
df.to_csv('solution.csv')

as well as

import pandas as pd

df = pd.read_csv('sample_submission.csv')
df = df[:2000002]
df.to_csv('solution.csv')

Neither attempt worked. What should I do to achieve this?

Upvotes: 1

Views: 121

Answers (1)

jezrael

Reputation: 863031

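I think you need the skipfooter parameter to omit the last N rows. It is not supported by the default C parser, so pass engine='python' as well. For a 300,000-row file, dropping the last 100,000 rows leaves the first 200,000:

df = pd.read_csv('sample_submission.csv', skipfooter=100000, engine='python')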

If you want to read only the first N rows, use the nrows parameter of read_csv (here N is 200,000, not 2000002, which is larger than the whole file):

df = pd.read_csv('sample_submission.csv', nrows=200000)
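To check both options on something small, here is a minimal sketch that stands in a hypothetical 30-row file for the 300,000-row one and keeps the first 20 rows either way. The temporary directory and the `id`/`value` columns are made up for illustration, and `index=False` is added on the assumption that you don't want pandas to write its row index as an extra column in solution.csv.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the 300,000-row file: 30 rows, two made-up columns.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "sample_submission.csv")
dst = os.path.join(tmp_dir, "solution.csv")
pd.DataFrame({"id": range(30), "value": range(100, 130)}).to_csv(src, index=False)

# Option 1: read only the first 20 data rows.
first_rows = pd.read_csv(src, nrows=20)

# Option 2: skip the last 10 data rows; skipfooter requires the python engine.
without_footer = pd.read_csv(src, skipfooter=10, engine="python")

# Write the result; index=False avoids adding an unnamed index column.
first_rows.to_csv(dst, index=False)

# Read the output back to confirm its shape.
result = pd.read_csv(dst)
```

Both options should give the same rows here. With the real file, nrows is probably the cheaper one, since the python engine needed for skipfooter is slower than the default parser.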

Upvotes: 2
