Kalai

Reputation: 13

Python Data Frame - nested comma in csv file

The data file looks like the sample below. How can I read it into a pandas DataFrame?

'''
 [[2020,1,22],0,0,0], 
 [[2020,1,23],0,0,0], 
 [[2020,1,24],0,0,0], 
 [[2020,1,25],0,0,0], 
 [[2020,1,26],0,0,0], 
 [[2020,1,27],0,0,0], 

'''

Upvotes: 1

Views: 108

Answers (1)

tdy

Reputation: 41327

Read the data as a single column of strings:

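import pandas as pd

# the lines contain no internal whitespace, so read_fwf infers a single column
df = pd.read_fwf('data.txt', header=None)

# or, on older pandas versions, read as csv with sep='\n'
# (recent versions may reject '\n' as a separator)
# df = pd.read_csv('data.txt', sep='\n', header=None)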

Parse the list-looking strings into actual lists with ast.literal_eval and expand them into columns with apply(pd.Series):

from ast import literal_eval
df = df[0].str.strip(', ').apply(literal_eval).apply(pd.Series)
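
To see what the strip-and-parse step does to a single row, here is a minimal sketch using only the standard library. The sample string is copied from the question; the variable names are just for illustration.

```python
from ast import literal_eval

# One raw line as it appears in the file (leading space, trailing ", ").
raw = " [[2020,1,22],0,0,0], "

# Strip the surrounding spaces and commas so only the bracketed list remains.
cleaned = raw.strip(", ")

# literal_eval safely parses the Python-literal text into nested lists.
row = literal_eval(cleaned)

# The first element is the [year, month, day] list; the rest are the values.
date_parts, *counts = row
print(date_parts)  # [2020, 1, 22]
print(counts)      # [0, 0, 0]
```

The pandas one-liner above applies the same two operations to every row of the column.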

Convert the date lists to real datetimes:

# apply runs the lambda once per row (newer pandas warns about element-wise functions in agg)
df[0] = df[0].apply(lambda x: pd.to_datetime('-'.join(map(str, x))))

Output:

           0  1  2  3
0 2020-01-22  0  0  0
1 2020-01-23  0  0  0
2 2020-01-24  0  0  0
3 2020-01-25  0  0  0
4 2020-01-26  0  0  0
5 2020-01-27  0  0  0
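
As an alternative sketch (not the method above), the lines can be parsed in plain Python and the dates assembled in one vectorized `pd.to_datetime` call, which accepts a frame with `year`, `month`, and `day` columns. The inline `sample` stands in for `data.txt`, and the names `rows`, `parts`, and `values` are illustrative assumptions.

```python
import io
from ast import literal_eval

import pandas as pd

# Inline stand-in for data.txt, copied from the question.
sample = io.StringIO(
    " [[2020,1,22],0,0,0], \n"
    " [[2020,1,23],0,0,0], \n"
    " [[2020,1,24],0,0,0], \n"
)

# Parse each non-blank line into [[year, month, day], a, b, c].
rows = [literal_eval(line.strip(", \n")) for line in sample if line.strip()]

# Split each row into its date list and its remaining values.
parts = pd.DataFrame([r[0] for r in rows], columns=["year", "month", "day"])
values = pd.DataFrame([r[1:] for r in rows], columns=[1, 2, 3])

# to_datetime assembles datetimes from the year/month/day columns.
df = values.copy()
df.insert(0, 0, pd.to_datetime(parts))

print(df)
```

With a real file, replacing `sample` with `open('data.txt')` should behave the same way, assuming every non-blank line follows the format shown in the question.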

Upvotes: 1
