Reputation: 51
timestamp speeds
2014-12-04 12:30:10 104,105,105,106,106,106,99,90
2014-12-04 12:32:19 86,86,87,88,88,89,90,92,93,95,97,100,102,104,1...
2014-12-04 12:32:58 110,110,110,110,110,110,110,110,110,110,110,10..
DatetimeIndex: 24 entries, 2014-12-04 12:30:10 to 2014-12-04 12:29:13
Data columns (total 1 columns):
speeds 24 non-null object
I want to transform the DataFrame into something like this:
timestamp speeds
2014-12-04 12:30:10 104
2014-12-04 12:30:11 105
2014-12-04 12:30:12 105
....
2014-12-04 12:32:17 90
2014-12-04 12:32:18 88 (resample and fill the timestamp and the mean speed value)
2014-12-04 12:32:19 86
2014-12-04 12:32:20 86
2014-12-04 12:32:21 87
Is there a simple function that does this, or do I have to iterate row by row and parse the field myself?
Upvotes: 5
Views: 9230
Reputation: 81
You may find this link helpful.
Snippets from the linked article:
# Explode/Split column into multiple rows
new_df = pd.DataFrame(df.City.str.split('|').tolist(), index=df.EmployeeId).stack()
new_df = new_df.reset_index([0, 'EmployeeId'])
new_df.columns = ['EmployeeId', 'City']
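The same split-and-expand idea can probably be adapted to the `speeds` column from the question. This is only a sketch: it assumes `speeds` holds comma-separated strings, that the readings are one second apart starting at the row's timestamp, and that your pandas has `DataFrame.explode` (0.25 or later). The name `long_df` is made up for the example.

```python
import pandas as pd

# First two rows of the sample data from the thread.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2014-12-04 12:30:10", "2014-12-04 12:32:19"]),
    "speeds": ["104,105,105,106,106,106,99,90",
               "86,86,87,88,88,89,90,92,93,95,97,100,102,104"],
})

# Split each string into a list, then give every list item its own row.
long_df = df.assign(speeds=df["speeds"].str.split(",")).explode("speeds")

# Position of each reading inside its original row: 0, 1, 2, ...
offset = long_df.groupby(level=0).cumcount()

# Assumption: one reading per second, counted from the row's timestamp.
long_df["timestamp"] = long_df["timestamp"] + pd.to_timedelta(offset, unit="s")
long_df["speeds"] = long_df["speeds"].astype(int)
long_df = long_df.reset_index(drop=True)

print(long_df)
```

`explode` keeps the original row label on every new row, which is what lets `groupby(level=0).cumcount()` recover the per-row position.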
Upvotes: 1
Reputation: 117400
I'm not sure about the resampling (it's hard to tell from your example what you want to do). The rest is possible with pandas, though this is probably not the most elegant way:
>>> from datetime import timedelta
>>> df2 = df.apply(lambda x: pd.Series(x['speeds']),axis=1)
>>> df2['timestamp'] = df['timestamp']
>>> df2
0 1 2 3 4 5 6 7 8 9 10 11 12 13 timestamp
0 104 105 105 106 106 106 99 90 NaN NaN NaN NaN NaN NaN 2014-12-04 12:30:10
1 6 86 87 88 88 89 90 92 93 95 97 100 102 104 2014-12-04 12:32:19
>>>
>>> df2 = df2.set_index('timestamp').stack().reset_index()
>>> df2['timestamp'] = df2.apply(lambda x: x['timestamp'] + timedelta(seconds=x['level_1']), axis=1)
>>> del df2['level_1']
>>> df2
timestamp 0
0 2014-12-04 12:30:10 104
1 2014-12-04 12:30:11 105
2 2014-12-04 12:30:12 105
3 2014-12-04 12:30:13 106
4 2014-12-04 12:30:14 106
5 2014-12-04 12:30:15 106
6 2014-12-04 12:30:16 99
7 2014-12-04 12:30:17 90
8 2014-12-04 12:32:19 6
9 2014-12-04 12:32:20 86
10 2014-12-04 12:32:21 87
11 2014-12-04 12:32:22 88
12 2014-12-04 12:32:23 88
13 2014-12-04 12:32:24 89
14 2014-12-04 12:32:25 90
15 2014-12-04 12:32:26 92
16 2014-12-04 12:32:27 93
17 2014-12-04 12:32:28 95
18 2014-12-04 12:32:29 97
19 2014-12-04 12:32:30 100
20 2014-12-04 12:32:31 102
21 2014-12-04 12:32:32 104
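If the "resample and fill" part means that every missing second should get its own row, with the gap filled from the neighboring values, then something like the following might work. It is a sketch under that assumption only: the linear fill is a guess (the question does not say which fill rule is wanted), the sample values are shortened, and the names `long_df` and `per_second` are made up.

```python
import pandas as pd

# Hypothetical long-format data: two bursts of per-second readings with a gap.
long_df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2014-12-04 12:30:16", "2014-12-04 12:30:17",
        "2014-12-04 12:30:21", "2014-12-04 12:30:22",
    ]),
    "speed": [99, 90, 86, 86],
})

per_second = (
    long_df.set_index("timestamp")["speed"]
    .resample("1s").mean()  # one bucket per second; empty seconds become NaN
    .interpolate()          # assumption: fill the NaN gap linearly
)

print(per_second)
```

Taking the mean per bucket also averages readings that happen to share the same second, which may be what "the mean speed value" in the question refers to.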
Upvotes: 1
Reputation: 238259
I'm not sure about pandas, but you could do it in pure Python. I don't know what you mean by "(resample and fill the timestamp and the mean speed value)", though. Leaving that part out, you could do it as follows:
from datetime import datetime, timedelta

in_s = ["2014-12-04 12:30:10 104,105,105,106,106,106,99,90",
        "2014-12-04 12:32:19 86,86,87,88,88,89,90,92,93,95,97,100,102,104",
        "2014-12-04 12:32:58 110,110,110,110,110,110,110,110,110,110,110"]

for row in in_s:
    # Each row is "<date> <time> <comma-separated speeds>".
    date_str, time_str, entries_str = row.split()
    # Parse date and time together so the date also advances past midnight.
    a_time = datetime.strptime(date_str + " " + time_str, "%Y-%m-%d %H:%M:%S")
    for e in entries_str.split(','):
        print(a_time, e)
        # Consecutive readings are assumed to be one second apart.
        a_time = a_time + timedelta(seconds=1)
This results in:
2014-12-04 12:30:10 104
2014-12-04 12:30:11 105
2014-12-04 12:30:12 105
2014-12-04 12:30:13 106
2014-12-04 12:30:14 106
2014-12-04 12:30:15 106
2014-12-04 12:30:16 99
2014-12-04 12:30:17 90
2014-12-04 12:32:19 86
2014-12-04 12:32:20 86
2014-12-04 12:32:21 87
2014-12-04 12:32:22 88
2014-12-04 12:32:23 88
2014-12-04 12:32:24 89
2014-12-04 12:32:25 90
2014-12-04 12:32:26 92
2014-12-04 12:32:27 93
2014-12-04 12:32:28 95
2014-12-04 12:32:29 97
2014-12-04 12:32:30 100
2014-12-04 12:32:31 102
2014-12-04 12:32:32 104
2014-12-04 12:32:58 110
2014-12-04 12:32:59 110
2014-12-04 12:33:00 110
2014-12-04 12:33:01 110
2014-12-04 12:33:02 110
2014-12-04 12:33:03 110
2014-12-04 12:33:04 110
2014-12-04 12:33:05 110
2014-12-04 12:33:06 110
2014-12-04 12:33:07 110
2014-12-04 12:33:08 110
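If you need the expanded rows as data rather than printed text (for example to build a DataFrame from them afterwards), the same loop can be wrapped in a generator. This is just a sketch: `expand_rows` and its `step` parameter are names I made up, and it still assumes one reading per second.

```python
from datetime import datetime, timedelta

def expand_rows(rows, step=timedelta(seconds=1)):
    """Yield one (timestamp, speed) pair per reading in each input row."""
    for row in rows:
        date_str, time_str, entries_str = row.split()
        start = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")
        for i, entry in enumerate(entries_str.split(",")):
            # The i-th reading is i steps after the row's starting timestamp.
            yield start + i * step, int(entry)

records = list(expand_rows(["2014-12-04 12:30:10 104,105,105"]))
for ts, speed in records:
    print(ts, speed)
```

The list of tuples can then be handed to whatever you use next, e.g. `pd.DataFrame(records, columns=["timestamp", "speeds"])`.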
Upvotes: 0