Reputation: 715
I have a series of data indexed by time values (a float) and I want to take chunks of the series and plot them on top of each other. For example, let's say I have stock prices taken about every 10 minutes over a period of 20 weeks, and I want to see the weekly pattern by plotting 20 lines of stock prices. My X axis is then one week, and I have 20 lines (one for the prices during each week).
Updated
The index is not a uniformly spaced value and it is a floating point. It is something like:
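import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

t = np.arange(1000) * (12e-9 / 1000.0)   # 1000 nominal sample times over 12 ns
noise = np.random.randn(1000) / 1e12
cn = noise.cumsum()                       # accumulated timing jitter
t_noise = t + cn                          # non-uniform float time index
y = np.sin(2 * np.pi * 36e7 * t_noise) + noise
df = pd.DataFrame(y, index=t_noise, columns=["A"])
df.plot(marker='.')
plt.axis([0, 0.2e-8, 0, 1])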
So the index is not uniformly spaced. I'm dealing with voltage vs. time data from a simulator. I would like to know how to create a time window, T, split df into chunks that are each T long, and plot them on top of each other. If the data were 20*T long, I would have 20 lines in the same plot.
Sorry for the confusion; I used the stock analogy thinking it might help.
Upvotes: 1
Views: 3886
Reputation: 49788
Assuming a pandas.TimeSeries object (in current pandas, a Series with a DatetimeIndex) as the starting point, you can group elements by ISO week number and ISO weekday with datetime.date.isocalendar(). The following statement, which ignores ISO year, aggregates the last sample of each day.
In [95]: daily = ts.groupby(lambda x: x.isocalendar()[1:]).agg(lambda s: s.iloc[-1])
In [96]: daily
Out[96]:
key_0
(1, 1) 63
(1, 2) 91
(1, 3) 73
...
(20, 5) 82
(20, 6) 53
(20, 7) 63
Length: 140
There may be a cleaner way to perform the next step, but the goal is to change the index from an array of tuples to a MultiIndex object.
In [97]: daily.index = pandas.MultiIndex.from_tuples(daily.index, names=['W', 'D'])
In [98]: daily
Out[98]:
W   D
1   1    63
    2    91
    3    73
    4    88
    5    84
    6    95
    7    72
...
20  1    81
    2    53
    3    78
    4    64
    5    82
    6    53
    7    63
Length: 140
The final step is to "unstack" weekday from the MultiIndex, creating columns for each weekday, and replace the weekday numbers with an abbreviation, to improve readability.
In [102]: dofw = "Mon Tue Wed Thu Fri Sat Sun".split()
In [103]: grid = daily.unstack('D').rename(columns=lambda x: dofw[x-1])
In [104]: grid
Out[104]:
    Mon  Tue  Wed  Thu  Fri  Sat  Sun
W
1    63   91   73   88   84   95   72
2    66   77   96   72   56   80   66
...
19   56   69   89   69   96   73   80
20   81   53   78   64   82   53   63
To create a line plot for each week, transpose the dataframe so that the columns are week numbers and the rows are weekdays, then call plot. (This step can be avoided by unstacking week number, instead of weekday, in the previous step.)
grid.T.plot()
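For reference, the steps above can be collected into one self-contained sketch. It is not the original session: the data is synthetic (a sample every 6 hours for 20 weeks starting Monday 2012-01-02), the function name weekly_grid is made up for illustration, and it assumes pandas 1.1 or later, where DatetimeIndex.isocalendar() is available. Like the statement above, it ignores the ISO year, so it only makes sense for data that stays within one year.

```python
import numpy as np
import pandas as pd


def weekly_grid(ts):
    """Return a weeks-by-weekdays table holding the last sample of each day.

    Hypothetical helper: expects a Series with a DatetimeIndex that stays
    within a single ISO year (the year is ignored, as in the answer).
    """
    iso = ts.index.isocalendar()              # columns: year, week, day
    week = iso["week"].to_numpy(dtype=int)
    day = iso["day"].to_numpy(dtype=int)      # 1 = Monday ... 7 = Sunday
    daily = ts.groupby([week, day]).last()    # last sample of each day
    daily.index.names = ["W", "D"]
    dofw = "Mon Tue Wed Thu Fri Sat Sun".split()
    return daily.unstack("D").rename(columns=lambda d: dofw[d - 1])


# Synthetic stand-in for the real data: 4 samples per day for 20 weeks.
rng = np.random.default_rng(0)
index = pd.date_range("2012-01-02", periods=20 * 7 * 4,
                      freq=pd.Timedelta(hours=6))
ts = pd.Series(rng.integers(50, 100, size=len(index)), index=index)

grid = weekly_grid(ts)
print(grid.shape)
```

Calling grid.T.plot() on the result then draws one line per week, as described above.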
Upvotes: 4
Reputation: 687
Let me try to answer this. Basically, I pad (reindex) the data to complete business weeks, take it five days at a time, and drop the rows that are missing because of a holiday or trading suspension.
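>>> from datetime import datetime, timedelta
>>> import pandas as pd
>>> from pandas_datareader.data import DataReader  # formerly pandas.io.data; the 'yahoo' source may no longer be available
>>> coke = DataReader('KO', 'yahoo', start=datetime(2012,1,1))
>>> startd = coke.index[0] - timedelta(coke.index[0].isoweekday()-1)  # Monday of the first week
>>> rng = pd.bdate_range(startd, periods=90)  # 18 full business weeks (replaces the old DateRange)
>>> padded = coke.reindex(rng)  # holidays and suspensions become NaN rows
>>> chunk = []
>>> for i in range(18):
...     chunk.append(padded[i*5:(i+1)*5].dropna())
...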
Then you can loop over chunk to plot each week's data.
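The same chunk-and-overlay idea should carry over to the float-indexed data in the question's update, where slicing five rows at a time is not possible because the samples are unevenly spaced. The sketch below is only one way to do it, under stated assumptions: the names split_into_windows and overlay are made up for illustration, the data is synthetic, and the window length T is chosen by the caller. Each sample is assigned to window floor(t / T), and each chunk's index is shifted back by k*T so that all chunks share the range [0, T).

```python
import numpy as np
import pandas as pd


def split_into_windows(df, T):
    """Split a float-indexed frame into consecutive windows of length T.

    Hypothetical helper. Each returned chunk has its index shifted to the
    start of its own window, so every chunk spans [0, T).
    """
    window = np.floor(df.index.to_numpy() / T).astype(int)
    chunks = []
    for k, part in df.groupby(window):
        shifted = part.copy()
        shifted.index = part.index - k * T    # time relative to window start
        chunks.append(shifted)
    return chunks


def overlay(chunks, column="A"):
    """Draw every chunk on the same axes, one line per window."""
    import matplotlib.pyplot as plt  # imported lazily; only needed to plot
    fig, ax = plt.subplots()
    for k, c in enumerate(chunks):
        ax.plot(c.index, c[column], marker=".", label=f"window {k}")
    ax.set_xlabel("time within window")
    return ax


# Synthetic stand-in: 2000 samples covering 20 windows of length T = 1.0.
t = np.arange(2000) * 0.01 + 0.005
df = pd.DataFrame(np.sin(2 * np.pi * t), index=t, columns=["A"])

chunks = split_into_windows(df, T=1.0)
print(len(chunks))
```

Calling overlay(chunks) and then plt.show() draws the 20 windows on top of each other. Samples that fall almost exactly on a window boundary can land on either side because of floating-point rounding, which is usually harmless for plotting.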
Upvotes: 0