Reputation: 4928
I have a huge text file that looks like:
19990613,1\n19921209,1\n19940414,1\n19900506,1\n19910521,1\n19881124,0\n19760730,1\n19711206,1\n19890303,1\n19780127,0\n19860207
desired dataframe:
date gender
1999-06-13 1
1992-12-09 1
and so on..
I've tried reading the lines in Python, but it gives me "IOPub data rate exceeded".
If I cannot convert it straight to a df, it is fine to read it line by line into a list and then into a df.
Upvotes: 2
Views: 282
Reputation: 862481
For me, the lineterminator and names parameters work:
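import pandas as pd

# the file contains a literal backslash followed by n, so split records on the backslash
df = pd.read_csv('text.txt', lineterminator='\\', names=['date','gender'])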
Then strip the leading n left over in each date and parse to datetimes:
df['date'] = pd.to_datetime(df['date'].str.lstrip('n'))
print(df)
         date  gender
0  1999-06-13       1
1  1992-12-09       1
2  1994-04-14       1
3  1990-05-06       1
4  1991-05-21       1
5  1988-11-24       0
6  1976-07-30       1
7  1971-12-06       1
8  1989-03-03       1
9  1978-01-27       0
10 1986-02-07       0
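If you would rather not rely on lineterminator, an alternative is to turn the literal backslash-n pairs into real newlines first and let read_csv parse the result. This is only a rough sketch: it assumes the file really contains the two characters \ and n between records, that every record has both fields, and that the whole text fits in memory. The helper name parse_literal_newlines and the short sample string are made up for illustration.

```python
import io

import pandas as pd


def parse_literal_newlines(text):
    """Build a DataFrame from 'date,gender' records separated by a literal backslash + n."""
    # Replace the two-character sequence backslash-n with a real newline.
    normalized = text.replace("\\n", "\n")
    # Read the date column as a string so it can be parsed with an explicit format.
    df = pd.read_csv(
        io.StringIO(normalized),
        names=["date", "gender"],
        dtype={"date": str},
    )
    # Dates are in YYYYMMDD form, e.g. 19990613.
    df["date"] = pd.to_datetime(df["date"], format="%Y%m%d")
    return df


# Hypothetical sample; a real file would be read with open('text.txt').read().
sample = "19990613,1\\n19921209,1\\n19881124,0"
df = parse_literal_newlines(sample)
print(df)
```

Assigning the result to a variable and printing only df.head() in a notebook should also avoid dumping the whole file to the output cell, which is usually what triggers the IOPub message.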
Upvotes: 2