Reputation: 1736
I have a dataframe with a column with values like the following:
1 1 1 1 2 2 2 2 3 3 3 3 etc
I would like to change the values to
1.0 1.25 1.5 1.75 2.0 2.25 2.5 2.75 3.0 3.25 3.5 3.75 etc
The initial integer values are always monotonically increasing, but may have gaps. They will always be repeated exactly 4 times.
I have implemented this via a for loop, but this takes a long time to run on a large data set. I'm looking for a more efficient way.
for i in range(len(df) // 4):
    for j in range(4):
        df.timestamp.iloc[i * 4 + j] += j / samples_per_sec
Upvotes: 2
Reputation: 394179
You could do it this way:
In [47]:
l=[1, 1, 1, 1, 2, 2, 2, 2 ,3 ,3 ,3, 3]
df = pd.DataFrame({'values':l})
df['values'] = df['values'] + (0.25 * (df.index.values % 4 ))
df
Out[47]:
values
0 1.00
1 1.25
2 1.50
3 1.75
4 2.00
5 2.25
6 2.50
7 2.75
8 3.00
9 3.25
10 3.50
11 3.75
So, assuming that every value present is always repeated exactly 4 times, as you've stated, and that the dataframe has the default integer index (0, 1, 2, ...), the above should work.
Using another dataset with gaps:
In [48]:
l=[1, 1, 1, 1, 2, 2, 2, 2 ,4 ,4 ,4, 4,7,7,7,7]
df = pd.DataFrame({'values':l})
df['values'] = df['values'] + (0.25 * (df.index.values % 4 ))
df
Out[48]:
values
0 1.00
1 1.25
2 1.50
3 1.75
4 2.00
5 2.25
6 2.50
7 2.75
8 4.00
9 4.25
10 4.50
11 4.75
12 7.00
13 7.25
14 7.50
15 7.75
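Note that `df.index.values % 4` only gives the position within each group of 4 when the index is the default 0..n-1. If the index has been changed (for example after filtering or slicing), one option is to use the row position instead. A minimal sketch, where the shifted index and `samples_per_sec = 4` are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

samples_per_sec = 4  # assumed, matches the "repeated exactly 4 times" rule

# Hypothetical data whose index does not start at 0
df = pd.DataFrame({'values': [1, 1, 1, 1, 2, 2, 2, 2, 4, 4, 4, 4]},
                  index=range(10, 22))

# Row position (not index label) modulo the group size gives 0, 1, 2, 3, 0, ...
offset = (np.arange(len(df)) % samples_per_sec) / samples_per_sec
df['values'] = df['values'] + offset
```

Here `df.index.values % 4` would start at 2 rather than 0, so the positional version is the safer choice when the index is not guaranteed.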
Upvotes: 2
Reputation: 153
You could do something like this:
df.timestamp += [(j % samples_per_sec)*1. / samples_per_sec for j in range(len(df))]
Note: I'm assuming that samples_per_sec = 4.
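The list comprehension above still builds the offsets one row at a time in Python. The same offsets could probably be built in one go with NumPy. A rough sketch, assuming the column is called `timestamp` as in the question, that `samples_per_sec = 4`, and that `len(df)` is a multiple of `samples_per_sec` (the sample data is made up):

```python
import numpy as np
import pandas as pd

samples_per_sec = 4  # assumed
df = pd.DataFrame({'timestamp': [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]})

# Repeat the pattern 0, 1, 2, 3 once per group of rows, then scale to fractions
n_groups = len(df) // samples_per_sec
offsets = np.tile(np.arange(samples_per_sec), n_groups) / samples_per_sec

df['timestamp'] = df['timestamp'] + offsets
```

`np.tile` raises no error if the length is not an exact multiple, but the addition would then fail on a length mismatch, which at least makes a violated assumption visible.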
Upvotes: 1