Creating a new CSV file from an existing one by removing the last 100k rows using pandas

Question

I'm trying to create a new CSV file from an existing one. My original CSV file has 300 000 records, of which I want the first 200 000. I'm using the pandas package in Python, as I'm currently working on a machine learning project. I've tried:

import pandas as pd

df = pd.read_csv('sample_submission.csv')
df = df.head(2000002)
df.to_csv('solution.csv')

as well as

import pandas as pd

df = pd.read_csv('sample_submission.csv')
df = df[:2000002]
df.to_csv('solution.csv')

Neither attempt worked. What should I do to achieve my aim?

jezrael · Accepted Answer

I think you need the skipfooter parameter of read_csv to omit the last N rows (here the last 100 000):

df = pd.read_csv('sample_submission.csv', skipfooter=100000, engine='python')
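One caveat: skipfooter is not supported by the default C parser, so it is worth passing engine='python' explicitly. A minimal sketch to check the behaviour on a tiny in-memory CSV (the column names and the 5-row data are made up for illustration, not taken from your file):

```python
import io
import pandas as pd

# Hypothetical 5-row CSV standing in for sample_submission.csv
csv_text = "id,value\n1,10\n2,20\n3,30\n4,40\n5,50"

# skipfooter drops the last N lines of the file; it requires the Python engine
df_trimmed = pd.read_csv(io.StringIO(csv_text), skipfooter=2, engine="python")

print(len(df_trimmed))  # 3 rows remain
```

The Python engine is slower than the C engine, so for a large file the nrows approach is probably the better choice when you know how many rows you want to keep.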

If you want to read only the first N rows (here the first 200 000), use the nrows parameter of read_csv:

df = pd.read_csv('sample_submission.csv', nrows=200000)
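Putting it together, here is a small end-to-end sketch of reading the first N rows and writing them to a new file. The file contents, the column names and the temporary directory are assumptions for the demo; with your data you would use your own paths and the row count you need. Note that to_csv writes the row index as an extra unnamed column by default, so index=False is passed to keep the output in the same shape as the input.

```python
import os
import tempfile
import pandas as pd

# Hypothetical small file standing in for sample_submission.csv
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "sample_submission.csv")
dst = os.path.join(tmp_dir, "solution.csv")
pd.DataFrame({"id": range(1, 7), "value": range(10, 70, 10)}).to_csv(src, index=False)

# Keep only the first 4 data rows (the header line is not counted)
df = pd.read_csv(src, nrows=4)

# index=False stops pandas from writing the row index as an extra column
df.to_csv(dst, index=False)

result = pd.read_csv(dst)
print(result.shape)  # (4, 2)
```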
