Reputation: 5
I want to calculate the dominant frequency and the secondary dominant frequency from X, Y, Z accelerometer data stored in flat CSV files (a million rows or more).
I'm trying to use scipy, although I'm aware of numpy; either would do. I've converted my X, Y, Z to an SMV (signal magnitude vector) and want to apply the Fourier transform to it, then get the frequencies using fftfreq. The bit that defeats me is the n and timestep arguments. I know my sample rate in hertz and the size of the rolling window I want to look at (10 rows of data), but I'm not sure how to apply them to the script below:
#The three-dimension data collected (X,Y,Z) were transformed into a
#single-dimensional Signal Magnitude Vector SMV (aka The Resultant)
#SMV = sqrt(X2 + Y2 + Z2)
X2 = X['X']*X['X']
Y2 = X['Y']*X['Y']
Z2 = X['Z']*X['Z']
#print X['X'].head(2) #Confirmed worked
#print X2.head(2) #Confirmed worked
combine = [X2,Y2,Z2, Y]
parent = pd.concat(combine, axis=1)
parent['ADD'] = X2 + Y2 + Z2 #Sum X2,Y2,Z2 only (excludes the class column Y)
sqr = np.sqrt(parent['ADD']) #Square Root of Sum Above
sqr.name = 'SMV'
combine2 = [sqr, Y] #Reduce Dataset to SMV and Class
parent2 = pd.concat(combine2, axis=1)
print parent2.head(4)
"************************* Begin Fourier ****************************"
from scipy import fftpack
X = fftpack.fft(sqr)
f_s = 80 #80 Hertz
samp = 1024 #samples per segment divided by 12.8 secs signal length
n = X.size
timestep = 10
freqs = fftpack.fftfreq(n, d=timestep)
Upvotes: 0
Views: 1701
Reputation: 910
First you need to load your data into a numpy array (sorry, I didn't quite follow your approach):
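import csv
import datetime
import numpy

def load_data():
    csvlist = []
    times = []
    with open('freq.csv') as f:
        csvfile = csv.reader(f, delimiter=',')
        for row in csvfile:
            # First column is the timestamp; the rest are the X, Y, Z readings
            timestamp = datetime.datetime.strptime(row[0], "%Y-%m-%d %H:%M:%S.%f")
            times.append(timestamp)
            csvlist.append(row[1:])
    # Sample spacing in seconds (fftfreq needs a number, not a timedelta)
    timestep = (times[1] - times[0]).total_seconds()
    csvarr = numpy.array(csvlist, dtype=numpy.float32)
    return timestep, csvarr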
There may well be a better way to do this. Then you need to calculate the magnitudes:
rms = numpy.sqrt(numpy.sum(data**2, axis=1))
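As one possible alternative to the loader above, the loading and magnitude steps could be sketched with pandas. This is only a sketch under assumptions: the column names (time, X, Y, Z), the timestamp format, and the three inline sample rows are hypothetical stand-ins for the real CSV file.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the CSV file; column names are assumptions.
sample = io.StringIO(
    "time,X,Y,Z\n"
    "2016-01-01 00:00:00.0000,3.0,4.0,0.0\n"
    "2016-01-01 00:00:00.0125,1.0,2.0,2.0\n"
    "2016-01-01 00:00:00.0250,0.0,0.0,5.0\n"
)

df = pd.read_csv(sample, parse_dates=["time"])

# Sample spacing in seconds, taken from the first two timestamps.
timestep = (df["time"].iloc[1] - df["time"].iloc[0]).total_seconds()

# Magnitude of each (X, Y, Z) row: sqrt(X^2 + Y^2 + Z^2).
smv = np.sqrt((df[["X", "Y", "Z"]] ** 2).sum(axis=1)).to_numpy()
```

With a real file you would pass its path to read_csv instead of the StringIO object.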
And then the Fourier analysis:
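from scipy import fftpack

def fourier(timestep, data):
    # Keep only the first half of the spectrum (non-negative frequencies)
    N = len(data)//2
    freq = fftpack.fftfreq(len(data), d=timestep)[:N]
    fft = fftpack.fft(data)[:N]
    amp = numpy.abs(fft)/N
    # Indices of the amplitudes, largest first
    order = numpy.argsort(amp)[::-1]
    return freq[order]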
This returns an array of frequencies sorted in decreasing order of importance (amplitude).
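To tie this back to the n and timestep arguments in the question: n is the number of samples being transformed, and d is the spacing between samples in seconds, which is 1 divided by the sample rate (1/80 for 80 Hz). The sketch below picks out the two strongest frequencies. It uses numpy.fft.rfft and rfftfreq rather than scipy.fftpack, and it subtracts the mean first, because a magnitude signal is always positive and its 0 Hz term would otherwise tend to rank first. The function name dominant_frequencies and the synthetic test signal are illustrative assumptions, not part of the original code.

```python
import numpy as np

def dominant_frequencies(signal, f_s, k=2):
    """Return the k strongest frequencies (in Hz) of a real-valued signal."""
    signal = np.asarray(signal, dtype=float)
    # Remove the mean so the 0 Hz (DC) term does not dominate the ranking.
    signal = signal - signal.mean()
    # d is the sample spacing in seconds, i.e. 1 / sample rate.
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / f_s)
    amps = np.abs(np.fft.rfft(signal))
    # Indices of the amplitudes, largest first.
    order = np.argsort(amps)[::-1]
    return freqs[order[:k]]

# Synthetic signal (an assumption, not the asker's data):
# a constant offset plus a strong 5 Hz tone and a weaker 12.5 Hz tone.
f_s = 80
t = np.arange(1024) / f_s
sig = 1.0 + 2.0 * np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12.5 * t)

top = dominant_frequencies(sig, f_s)  # dominant first, then secondary
```

Note that the frequency resolution is f_s / n. A rolling window of 10 rows at 80 Hz gives bins 8 Hz apart (0, 8, 16, ... 40 Hz), so a longer window is needed to separate nearby frequencies.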
Upvotes: 2