textfile with line break into dataframe pandas

Question

I have a huge text file that looks like:

desired dataframe:

date          gender
1999-06-13      1
1992-12-09      1

and so on..

I've tried reading the lines in Python, but it gives me an "IOPub data rate exceeded" error.

If I cannot convert it straight to a df, it is fine to read it line by line into a list and then into a df.

jezrael · Accepted Answer

For me, using the lineterminator and names parameters works:

df = pd.read_csv('text.txt', lineterminator='\\', names=['date','gender'])

Then strip the leading n that is left at the start of each row after splitting on the backslash, and parse the column to datetimes:

df['date'] = pd.to_datetime(df['date'].str.lstrip('n'))
print(df)
         date  gender
0  1999-06-13       1
1  1992-12-09       1
2  1994-04-14       1
3  1990-05-06       1
4  1991-05-21       1
5  1988-11-24       0
6  1976-07-30       1
7  1971-12-06       1
8  1989-03-03       1
9  1978-01-27       0
10 1986-02-07       0
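
An alternative sketch, not part of the answer above: if the records really are separated by a literal backslash followed by n (an assumption about the file, since the sample is not shown), the text can be normalized to real newlines first and then parsed as ordinary CSV. The `raw` sample string below is hypothetical and stands in for the file contents.

```python
import io

import pandas as pd

# Hypothetical sample: records separated by a literal backslash + "n",
# not by real newline characters (an assumption about the input file).
raw = r"1999-06-13,1\n1992-12-09,1\n1994-04-14,1"

# Replace each literal backslash-n pair with a real newline character.
fixed = raw.replace("\\n", "\n")

# Parse as ordinary CSV; parse_dates converts the date column to datetimes.
df = pd.read_csv(
    io.StringIO(fixed),
    names=["date", "gender"],
    parse_dates=["date"],
)
print(df)
```

For a real file, `raw` would come from reading 'text.txt' instead of the inline string. This loads the whole text into memory once, so for very large files the lineterminator approach may be the better fit.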
