Filtering Gaussian Peaks With Known Frequency

Question

Let us suppose I have some gaussian data, at some known frequency from one another, sitting on some low frequency noise data. Each gaussian has a probability given by a Poisson distribution, so all the peaks have different heights.

I would like to design a filter to extract the peaks (or, subtract the noise).

I have tried to implement a Butterworth filter to remove the low frequency noise.

My implementation at the moment seems to produce a negative signal, which is not what I expect:

The signal is below 0, which is not what I expect;
My peaks do not appear to sit on a flat line. I think this is a result of my peaks being different amplitudes.
There appears to be some sort of 'ghost peaks' to the left and right of the peaks I am interested in.

Being new to Butterworth filters, I am unsure of what exactly it is I am doing incorrectly.

Can someone elucidate me to my mistake?

My implementation is as follows:

from scipy.stats import norm
from scipy.stats import skewnorm
from scipy.stats import poisson
from astropy.stats import freedman_bin_width
from scipy.interpolate import interp1d

Delta_X = 5
gaus_width_0 = 0.5
gaus_width_1 = 0.02
mu=2



######
# DEFINE HIGH PASS BUTTERWORTH FILTER 
######

def butter_highpass(cutoff, fs, order=5):
   nyq = 0.5 * fs
   normal_cutoff = cutoff / nyq
   b, a = signal.butter(order, normal_cutoff, btype='high', analog=False)
   return b, a

def butter_highpass_filter(data, cutoff, fs, order=5):
   b, a = butter_highpass(cutoff, fs, order=order)
   y = signal.filtfilt(b, a, data)
   return y


######
# GENERATE NOISE - GAUSSIANS SEPARATED BY Delta_X, WITH SOME      SMEARING. EACH PEAK HAS A POISSON PROBABILITY
######

data = []
for i in range(10):
  mean = float(i)*Delta_X
  std = gaus_width_0 + gaus_width_1*i
  gaus = norm.rvs(mean, std, size=int(20000*poisson.pmf(i, mu)))
  data.append(gaus)

######
# GENERATE NOISE - A SKEWED NORMAL DISTRIBUTION
######

noise = 4*Delta_X*skewnorm.rvs(5, size=100000)
data.append(noise)

data = np.concatenate(data)


######
# GENERATE A HISTOGRAM
####

bw, bins = freedman_bin_width(data, return_bins=True)
counts, _ = np.histogram(data, bins=bins)
bin_centres =  (bins[:-1] + bins[1:])/2.
nbins = len(bin_centres)
bin_numbers = np.arange(0, nbins)

######
# INTERPOLATE BETWEEN BIN UNITS AND MEASURED UNITS TO FIND THE     FREQUENCY, IN BINS
####


f_inv = interp1d(bin_centres, bin_numbers)
freq_bins = f_inv(Delta_X) - f_inv(0)


######
# APPLY THE BUTTERWORTH FILTER
####

filtered_counts = butter_highpass_filter(counts, freq_bins, nbins)

plt.figure(figsize=(10,10))
plt.xticks(fontsize=20)
plt.yticks(fontsize=20)
plt.xlabel("x", fontsize=20)
plt.plot(bin_centres, counts, lw=3, label="Original Histogram")
plt.plot(bin_centres, filtered_counts, lw=3, label="Original     Histogram")

plt.grid(which="both")
plt.show()

I get the following result:

Filtering Gaussian Peaks With Known Frequency

Answers (0)

Related Questions