For my experiment, I have three time-series datasets with different characteristics. Each has the following format, where the first column is the timestamp and the second column is the value.
0.086206438,10
0.086425551,12
0.089227066,20
0.089262508,24
0.089744425,30
0.090036815,40
0.090054172,28
0.090377569,28
0.090514071,28
0.090762872,28
0.090912691,27
For reproducibility, I have shared the three time-series data I am using here.
In column 2, I want to compare each row's value with the value of the previous row. If the current value is greater, I move on to the next row. If the current value is smaller than the previous row's value, I want to divide the current (smaller) value by the previous (larger) value. For example, in the sample above, the value in the seventh row (28) is smaller than the value in the sixth row (40), so the result is 28/40 = 0.7.
Here is my sample code.
import numpy as np
import pandas as pd
import csv
import scipy.stats
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm
from statsmodels.graphics.tsaplots import plot_acf, acf

protocols = {}
types = {"data1": "data1.csv", "data2": "data2.csv", "data3": "data3.csv"}
for protname, fname in types.items():
    col_time = []
    col_window = []
    with open(fname, mode='r', encoding='utf-8-sig') as f:
        reader = csv.reader(f, delimiter=",")
        for i in reader:
            col_time.append(float(i[0]))
            col_window.append(int(i[1]))
    col_time, col_window = np.array(col_time), np.array(col_window)
    diff_time = np.diff(col_time)
    diff_window = np.diff(col_window)
    diff_time = diff_time[diff_window > 0]
    diff_window = diff_window[diff_window > 0]  # To keep only the increased values
    protocols[protname] = {
        "col_time": col_time,
        "col_window": col_window,
        "diff_time": diff_time,
        "diff_window": diff_window,
    }

# Plot the quotient values
rt = np.exp(np.diff(np.log(col_window)))
for protname, fname in types.items():
    col_time, col_window = protocols[protname]["col_time"], protocols[protname]["col_window"]
    rt = np.exp(np.diff(np.log(col_window)))
    plt.plot(np.diff(col_time), rt, ".", markersize=4, label=protname, alpha=0.1)
    plt.ylim(0, 1.0001)
    plt.xlim(0, 0.003)
    plt.title(protname)
    plt.xlabel("time")
    plt.ylabel("difference")
    plt.legend()
    plt.show()
This gives me the following plots
However, when I do this
rt = np.exp(np.diff(np.log(col_window)))
This divides every row by the previous row, which is not what I want. As explained in the example above, I want to divide the current value of column 2 by the previous value of column 2 ONLY if the current value is smaller than the previous one. Finally, I want to plot the quotient against the timestamp difference (computed from col_time in my code above). How can I fix this?
Upvotes: 2
Views: 2519
Reputation: 3419
Unless you specifically need the csv module, I would recommend using the numpy function loadtxt to load your files, that is
col_time,col_window = np.loadtxt(fname,delimiter=',').T
This single line takes care of the first 8 lines of your for loop. Note that the transpose operation (.T) is necessary to convert the original data shape (N rows by 2 columns) into a 2 row by N column shape that is unpacked into col_time and col_window. Also note that loadtxt automatically loads the data into numpy.array objects (as floats by default, so col_window holds floats rather than ints).
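To see what the transpose does, here is a minimal sketch that feeds the first three sample rows from the question to loadtxt. The in-memory io.StringIO buffer is only a stand-in for the real CSV files, which are assumed to have the same two-column layout.

```python
import io
import numpy as np

# First rows of the sample data from the question, held in memory
# so the example runs without any file on disk.
sample = io.StringIO(
    "0.086206438,10\n"
    "0.086425551,12\n"
    "0.089227066,20\n"
)

raw = np.loadtxt(sample, delimiter=",")
print(raw.shape)  # (3, 2): N rows by 2 columns

# Transposing gives 2 rows by N columns, which unpack into two 1-D arrays.
col_time, col_window = raw.T
print(col_window)  # [10. 12. 20.]
```

Both arrays come back as floats, which is convenient here because the quotient is a float anyway.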
As for your actual question, I would use slicing and masking:
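trailing_window = col_window[:-1]  # "past" values at a given index
leading_window = col_window[1:]  # "current" values at a given index
decreasing_mask = leading_window < trailing_window
quotient = leading_window[decreasing_mask] / trailing_window[decreasing_mask]
# The mask has N-1 entries, so it must index an array of the same length.
# np.diff(col_time) gives the timestamp differences asked for in the question.
quotient_times = np.diff(col_time)[decreasing_mask]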
Then quotient_times may be plotted against quotient.
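As a sanity check, the slicing-and-masking idea can be run on the eleven sample rows from the question. This is only a sketch: the x values are taken as np.diff(col_time), which is one reading of "timestamp difference" in the question, and the name time_steps is introduced here purely for illustration.

```python
import numpy as np

# The eleven sample rows from the question.
col_time = np.array([0.086206438, 0.086425551, 0.089227066, 0.089262508,
                     0.089744425, 0.090036815, 0.090054172, 0.090377569,
                     0.090514071, 0.090762872, 0.090912691])
col_window = np.array([10, 12, 20, 24, 30, 40, 28, 28, 28, 28, 27], dtype=float)

trailing_window = col_window[:-1]  # previous row's value
leading_window = col_window[1:]    # current row's value
decreasing_mask = leading_window < trailing_window

# Only the two drops survive: 40 -> 28 and 28 -> 27.
quotient = leading_window[decreasing_mask] / trailing_window[decreasing_mask]

# Assumption: x axis is the time elapsed since the previous row.
time_steps = np.diff(col_time)[decreasing_mask]

print(quotient)    # 28/40 and 27/28
print(time_steps)
```

The first quotient is 28/40 = 0.7, matching the example in the question. Equal consecutive values (28, 28) are skipped because the comparison is strict.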
An alternative would be to use the numpy function where to grab the indices where the mask is True:
trailing_window = col_window[:-1]  # "past" values at a given index
leading_window = col_window[1:]  # "current" values at a given index
decreasing_inds = np.where(leading_window < trailing_window)[0]
quotient = leading_window[decreasing_inds] / trailing_window[decreasing_inds]
# Index the timestamp differences so each quotient is paired with its own time step
quotient_times = np.diff(col_time)[decreasing_inds]
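The two variants select the same elements. A short sketch on the sample values from the question illustrates this; the names by_mask and by_inds are introduced here only for the comparison.

```python
import numpy as np

col_window = np.array([10, 12, 20, 24, 30, 40, 28, 28, 28, 28, 27], dtype=float)
trailing_window = col_window[:-1]
leading_window = col_window[1:]

decreasing_mask = leading_window < trailing_window
decreasing_inds = np.where(decreasing_mask)[0]
print(decreasing_inds)  # [5 9]

# Boolean masking and integer indexing pick out the same pairs.
by_mask = leading_window[decreasing_mask] / trailing_window[decreasing_mask]
by_inds = leading_window[decreasing_inds] / trailing_window[decreasing_inds]
print(np.array_equal(by_mask, by_inds))  # True
```

Note that an index i here refers to the pair (row i, row i+1) of the original data, so index 5 is the drop from the sixth row (40) to the seventh row (28).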
Keep in mind that all of the above code still runs inside the first for loop, but rt is now computed inside the loop as quotient. Thus, after computing quotient_times, plot as follows (also inside the first loop):
# Next line opens a new figure window and then clears it
plt.figure(); plt.clf()
# Plotting call updated to use the names from this answer
plt.plot(quotient_times, quotient, '.', ms=4, label=protname, alpha=0.1)
plt.ylim(0, 1.0001)
plt.xlim(0, 0.003)
plt.title(protname)
plt.xlabel("time")
plt.ylabel("quotient")
plt.legend()
# You may not need this `plt.show()` line
plt.show()
# To save the figure, one option would be the following:
# plt.savefig(protname+'.png')
Note that you may need to take the plt.show() line out of the loop.
Putting it together for you,
import numpy as np
import matplotlib.pyplot as plt

protocols = {}
types = {"data1": "data1.csv", "data2": "data2.csv", "data3": "data3.csv"}
for protname, fname in types.items():
    col_time, col_window = np.loadtxt(fname, delimiter=',').T
    trailing_window = col_window[:-1]  # "past" values at a given index
    leading_window = col_window[1:]  # "current" values at a given index
    decreasing_inds = np.where(leading_window < trailing_window)[0]
    quotient = leading_window[decreasing_inds] / trailing_window[decreasing_inds]
    # Timestamp differences, paired with each quotient
    quotient_times = np.diff(col_time)[decreasing_inds]
    # Still save the values in case computation needs to happen later
    # in the script
    protocols[protname] = {
        "col_time": col_time,
        "col_window": col_window,
        "quotient_times": quotient_times,
        "quotient": quotient,
    }
    # Next line opens a new figure window and then clears it
    plt.figure(); plt.clf()
    plt.plot(quotient_times, quotient, ".", markersize=4, label=protname, alpha=0.1)
    plt.ylim(0, 1.0001)
    plt.xlim(0, 0.003)
    plt.title(protname)
    plt.xlabel("time")
    plt.ylabel("quotient")
    plt.legend()
    # To save the figure, one option would be the following:
    # plt.savefig(protname+'.png')
# This may still be unnecessary, especially if called as a script
# (just save the plots to `png`).
plt.show()
Upvotes: 2