Reputation: 51

pandas: how to generate multiple rows from one row

timestamp            speeds

2014-12-04 12:30:10  104,105,105,106,106,106,99,90
2014-12-04 12:32:19  86,86,87,88,88,89,90,92,93,95,97,100,102,104,1...
2014-12-04 12:32:58  110,110,110,110,110,110,110,110,110,110,110,10..

DatetimeIndex: 24 entries, 2014-12-04 12:30:10 to 2014-12-04 12:29:13
Data columns (total 1 columns):
speeds    24 non-null object

I want to transform the DataFrame into something like this:

timestamp            speeds

2014-12-04 12:30:10  104
2014-12-04 12:30:11  105
2014-12-04 12:30:12  105
....
2014-12-04 12:32:17  90
2014-12-04 12:32:18  88 (resample and fill the timestamp and the mean speed value)
2014-12-04 12:32:19  86
2014-12-04 12:32:20  86
2014-12-04 12:32:21  87

Is there a simple function that does this, or do I have to iterate row by row and parse the field?

Upvotes: 5

Answers (3)

sch001

Reputation: 81

You may find this link helpful.

Snippet from the linked article:

# Explode/Split column into multiple rows
new_df = pd.DataFrame(df.City.str.split('|').tolist(), index=df.EmployeeId).stack()
new_df = new_df.reset_index([0, 'EmployeeId'])
new_df.columns = ['EmployeeId', 'City']
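
The same split-into-rows idea can be sketched against the question's layout. This is only a minimal sketch under assumptions: the sample frame below is made up, timestamps are assumed to be unique per row, and `Series.explode` requires pandas 0.25 or newer.

```python
import pandas as pd

# Hypothetical sample mirroring the question's layout: one row per burst,
# with the per-second speeds stored as a comma-separated string.
df = pd.DataFrame(
    {"speeds": ["104,105,105", "86,86,87,88"]},
    index=pd.to_datetime(["2014-12-04 12:30:10", "2014-12-04 12:32:19"]),
)
df.index.name = "timestamp"

# Split each string into a list, then give every list element its own row.
long_df = (
    df["speeds"].str.split(",")
    .explode()
    .astype(int)
    .reset_index()
)

# The position of each value within its original row is its offset in seconds
# (this relies on the starting timestamps being unique).
offset = long_df.groupby("timestamp").cumcount()
long_df["timestamp"] = long_df["timestamp"] + pd.to_timedelta(offset, unit="s")
```

`long_df` then has one row per second, with `timestamp` and `speeds` columns.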

Upvotes: 1

roman

Reputation: 117400

I'm not sure about the resampling part (it's hard to tell from your example what you want). The rest is possible with pandas, though this is probably not the most elegant way:

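>>> import pandas as pd
>>> from datetime import timedelta
>>> # assumes df has a 'timestamp' column and df['speeds'] holds lists of numbers
>>> df2 = df.apply(lambda x: pd.Series(x['speeds']), axis=1)
>>> df2['timestamp'] = df['timestamp']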
>>> df2
     0    1    2    3    4    5   6   7   8   9  10   11   12   13           timestamp
0  104  105  105  106  106  106  99  90 NaN NaN NaN  NaN  NaN  NaN 2014-12-04 12:30:10
1    6   86   87   88   88   89  90  92  93  95  97  100  102  104 2014-12-04 12:32:19
>>>
>>> df2 = df2.set_index('timestamp').stack().reset_index()
>>> df2['timestamp'] = df2.apply(lambda x: x['timestamp'] + timedelta(seconds=int(x['level_1'])), axis=1)
>>> del df2['level_1']
>>> df2
             timestamp    0
0  2014-12-04 12:30:10  104
1  2014-12-04 12:30:11  105
2  2014-12-04 12:30:12  105
3  2014-12-04 12:30:13  106
4  2014-12-04 12:30:14  106
5  2014-12-04 12:30:15  106
6  2014-12-04 12:30:16   99
7  2014-12-04 12:30:17   90
8  2014-12-04 12:32:19    6
9  2014-12-04 12:32:20   86
10 2014-12-04 12:32:21   87
11 2014-12-04 12:32:22   88
12 2014-12-04 12:32:23   88
13 2014-12-04 12:32:24   89
14 2014-12-04 12:32:25   90
15 2014-12-04 12:32:26   92
16 2014-12-04 12:32:27   93
17 2014-12-04 12:32:28   95
18 2014-12-04 12:32:29   97
19 2014-12-04 12:32:30  100
20 2014-12-04 12:32:31  102
21 2014-12-04 12:32:32  104
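
The row-wise `apply` calls above can probably be avoided. The following is a hedged sketch of the same stack-then-offset idea using vectorized operations; the input frame is hypothetical, with `timestamp` as a regular column and `speeds` already holding lists.

```python
import pandas as pd

# Hypothetical input: `timestamp` is a regular column, `speeds` holds lists.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2014-12-04 12:30:10", "2014-12-04 12:32:19"]),
    "speeds": [[104, 105, 105], [86, 86, 87, 88]],
})

# One column per list position; shorter rows are padded with NaN.
wide = pd.DataFrame(df["speeds"].tolist(), index=df["timestamp"])

# stack() moves the list position into the index; dropna() removes the NaN
# padding (older pandas versions already drop it inside stack()).
long_df = wide.stack().dropna().rename("speeds").reset_index()

# The list position (column `level_1`) is the offset in seconds.
long_df["timestamp"] += pd.to_timedelta(long_df["level_1"], unit="s")
long_df = long_df.drop(columns="level_1")
long_df["speeds"] = long_df["speeds"].astype(int)
```

Adding a whole timedelta column at once replaces the per-row `timedelta(...)` call, which tends to matter once the frame grows beyond a few thousand rows.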

Upvotes: 1

Marcin

Reputation: 238259

I'm not sure about pandas, but you could do it in pure Python. Though I don't know what you mean by "(resample and fill the timestamp and the mean speed value)". Without that part, you could do it as follows:

from datetime import datetime, timedelta

in_s = ["2014-12-04 12:30:10  104,105,105,106,106,106,99,90",
        "2014-12-04 12:32:19  86,86,87,88,88,89,90,92,93,95,97,100,102,104",
        "2014-12-04 12:32:58  110,110,110,110,110,110,110,110,110,110,110"]

for row in in_s:
    date_str, time_str, entries_str = row.split()
    # Parse the full timestamp so the date rolls over correctly past midnight.
    a_time = datetime.strptime(date_str + " " + time_str, "%Y-%m-%d %H:%M:%S")
    for e in entries_str.split(','):
        print(a_time.strftime("%Y-%m-%d %H:%M:%S"), e)
        a_time = a_time + timedelta(seconds=1)

This results in:

2014-12-04 12:30:10 104
2014-12-04 12:30:11 105
2014-12-04 12:30:12 105
2014-12-04 12:30:13 106
2014-12-04 12:30:14 106
2014-12-04 12:30:15 106
2014-12-04 12:30:16 99
2014-12-04 12:30:17 90
2014-12-04 12:32:19 86
2014-12-04 12:32:20 86
2014-12-04 12:32:21 87
2014-12-04 12:32:22 88
2014-12-04 12:32:23 88
2014-12-04 12:32:24 89
2014-12-04 12:32:25 90
2014-12-04 12:32:26 92
2014-12-04 12:32:27 93
2014-12-04 12:32:28 95
2014-12-04 12:32:29 97
2014-12-04 12:32:30 100
2014-12-04 12:32:31 102
2014-12-04 12:32:32 104
2014-12-04 12:32:58 110
2014-12-04 12:32:59 110
2014-12-04 12:33:00 110
2014-12-04 12:33:01 110
2014-12-04 12:33:02 110
2014-12-04 12:33:03 110
2014-12-04 12:33:04 110
2014-12-04 12:33:05 110
2014-12-04 12:33:06 110
2014-12-04 12:33:07 110
2014-12-04 12:33:08 110
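
As for the "resample and fill" part, one possible reading is that missing seconds between bursts should get a value interpolated from their neighbours (88 between 90 and 86 in the question). A minimal sketch under that assumption, collecting the expanded rows into a pandas Series and resampling it; the two input rows are made up to reproduce the gap at 12:32:18:

```python
from datetime import datetime, timedelta

import pandas as pd

# Two short, hypothetical bursts with a one-second gap between them.
in_s = ["2014-12-04 12:32:16  99,90",
        "2014-12-04 12:32:19  86,86,87"]

records = []
for row in in_s:
    date_str, time_str, entries_str = row.split()
    start = datetime.strptime(date_str + " " + time_str, "%Y-%m-%d %H:%M:%S")
    # Each entry is one second later than the previous one.
    for offset, e in enumerate(entries_str.split(",")):
        records.append((start + timedelta(seconds=offset), int(e)))

s = (
    pd.DataFrame(records, columns=["timestamp", "speeds"])
    .set_index("timestamp")["speeds"]
)

# Put the data on a regular 1-second grid (empty seconds become NaN),
# then fill the gaps by linear interpolation between neighbours.
filled = s.resample("1s").mean().interpolate()
```

If overlapping bursts produce several values for the same second, `mean()` averages them; whether that matches the intended "mean speed value" is a guess.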

Upvotes: 0
