SLE

Reputation: 85

Trying to upsample Pandas to have data for every minute

I have a database with a mix of 1-minute, 5-minute, and hourly data points. My goal is to get 10-minute averages, but when I plot the result there are missing data points, and the written CSV file goes from having a value in every row to having one only every hour.

The output looks like

    2005-03-01 17:00:00,3.25
    2005-03-01 17:10:00,-5.75
    2005-03-01 17:20:00,-6.0
    2005-03-01 17:30:00,
    2005-03-01 17:40:00,
    2005-03-01 17:50:00,
    2005-03-01 18:00:00,2.3
    2005-03-01 18:10:00,
    2005-03-01 18:20:00,
    2005-03-01 18:30:00,
    2005-03-01 18:40:00,
    2005-03-01 18:50:00,
    2005-03-01 19:00:00,2.8

The original input, looks like:

    01-mar-05 17:10,   1.6,  7.9, 0.0214, 1.3536, 0.0214, 1.6726, 1.00,30.567
    01-mar-05 17:15, -13.1,  7.9, 0.0214, 1.3540, 0.0214, 1.6729, 1.00,30.550
    01-mar-05 17:20,   3.2,  7.9, 0.0214, 1.3542, 0.0214, 1.6731, 1.00,30.554
    01-mar-05 17:25, -15.2,  7.9, 0.0214, 1.3544, 0.0214, 1.6731, 1.00,30.534
    01-mar-05 18:00,   2.3,  8.0, 0.0214, 1.8276, 0.0214, 1.6932, 1.00, 0.034
    01-mar-05 19:00,   2.8,  8.0, 0.0214, 1.8312, 0.0214, 1.6973, 1.00, 0.081
    01-mar-05 20:00,   6.8,  8.0, 0.0214, 1.8313, 0.0214, 1.6993, 1.00,  .192

The code that I used was:

    names= ['Date','Conc','Flow','SZ','SB','RZ','RB','Fraction','Attenuation']
    df = pd.read_csv('Output13.csv', index_col=0, names=names, parse_dates=True)
    df1 = df[['Conc']].resample('10min').mean()

And I tried

    df = df.resample('1min', fill_method='bfill')

thinking that that would fill in all the data points in the original file... but it didn't work.

Any suggestions? Thanks!

Upvotes: 1

Views: 609

Answers (2)

Diziet Asahi

Reputation: 40697

Your data is too sparse to give a point every 10 minutes once the readings drop to one per hour. To fill the resulting gaps, you either have to reuse the data you have (with ffill or bfill) or interpolate the missing values.

    import matplotlib.pyplot as plt

    # Compare the raw series with several ways of filling the 10-minute grid
    df['Conc'].plot(label='original')
    df['Conc'].resample('10min').ffill().plot(label='ffill')
    df['Conc'].resample('10min').bfill().plot(label='bfill')
    df['Conc'].resample('10min').mean().interpolate(method='linear').plot(label='linear interpolation')
    # method='cubic' requires SciPy to be installed
    df['Conc'].resample('10min').mean().interpolate(method='cubic').plot(label='cubic interpolation')
    plt.legend(loc=4)
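
As a minimal, self-contained sketch of the same idea without the plots, the snippet below rebuilds the `Conc` column from the rows shown in the question (only the first two columns are copied, and the inline `raw` string stands in for `Output13.csv`), then compares the plain 10-minute mean with a forward fill and a time-weighted interpolation. The variable names are mine, not the asker's.

```python
import io

import pandas as pd

# Sample rows copied from the question (timestamp and Conc only).
raw = """01-mar-05 17:10,1.6
01-mar-05 17:15,-13.1
01-mar-05 17:20,3.2
01-mar-05 17:25,-15.2
01-mar-05 18:00,2.3
01-mar-05 19:00,2.8
01-mar-05 20:00,6.8
"""

df = pd.read_csv(io.StringIO(raw), names=["Date", "Conc"])
df["Date"] = pd.to_datetime(df["Date"], format="%d-%b-%y %H:%M")
conc = df.set_index("Date")["Conc"]

# Plain 10-minute mean: bins that contain no reading come out as NaN.
means = conc.resample("10min").mean()

# Option 1: carry the last observed reading forward onto the 10-minute grid.
filled = conc.resample("10min").ffill()

# Option 2: interpolate between the non-empty bin means, weighted by time.
interpolated = means.interpolate(method="time")

print(pd.DataFrame({"mean": means, "ffill": filled, "interp": interpolated}))
```

The 17:30 to 17:50 bins are empty after `mean()` because no reading falls inside them. Which fill is appropriate depends on whether the quantity is better treated as holding steady between readings or as varying smoothly.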

Upvotes: 3

MaxU - stand with Ukraine

Reputation: 210842

Is this what you want?

In [57]: df.resample('T').ffill()
Out[57]:
                     Conc  Flow      SZ      SB      RZ      RB  Fraction  Attenuation
Date
2005-03-01 17:10:00   1.6   7.9  0.0214  1.3536  0.0214  1.6726       1.0       30.567
2005-03-01 17:11:00   1.6   7.9  0.0214  1.3536  0.0214  1.6726       1.0       30.567
2005-03-01 17:12:00   1.6   7.9  0.0214  1.3536  0.0214  1.6726       1.0       30.567
2005-03-01 17:13:00   1.6   7.9  0.0214  1.3536  0.0214  1.6726       1.0       30.567
2005-03-01 17:14:00   1.6   7.9  0.0214  1.3536  0.0214  1.6726       1.0       30.567
2005-03-01 17:15:00 -13.1   7.9  0.0214  1.3540  0.0214  1.6729       1.0       30.550
2005-03-01 17:16:00 -13.1   7.9  0.0214  1.3540  0.0214  1.6729       1.0       30.550
2005-03-01 17:17:00 -13.1   7.9  0.0214  1.3540  0.0214  1.6729       1.0       30.550
2005-03-01 17:18:00 -13.1   7.9  0.0214  1.3540  0.0214  1.6729       1.0       30.550
2005-03-01 17:19:00 -13.1   7.9  0.0214  1.3540  0.0214  1.6729       1.0       30.550
2005-03-01 17:20:00   3.2   7.9  0.0214  1.3542  0.0214  1.6731       1.0       30.554
2005-03-01 17:21:00   3.2   7.9  0.0214  1.3542  0.0214  1.6731       1.0       30.554
2005-03-01 17:22:00   3.2   7.9  0.0214  1.3542  0.0214  1.6731       1.0       30.554
2005-03-01 17:23:00   3.2   7.9  0.0214  1.3542  0.0214  1.6731       1.0       30.554
2005-03-01 17:24:00   3.2   7.9  0.0214  1.3542  0.0214  1.6731       1.0       30.554
2005-03-01 17:25:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:26:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:27:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:28:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:29:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:30:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:31:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:32:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:33:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:34:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:35:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:36:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:37:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:38:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
2005-03-01 17:39:00 -15.2   7.9  0.0214  1.3544  0.0214  1.6731       1.0       30.534
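
If the end goal is the 10-minute average from the question, one possible approach (an assumption about what is wanted, not something the asker confirmed) is to forward-fill onto a 1-minute grid first and then average in 10-minute bins. The sketch below uses the sample rows from the question, with an inline string standing in for the CSV file, and the `'1min'` alias, since newer pandas versions deprecate `'T'`.

```python
import io

import pandas as pd

# Sample rows copied from the question (timestamp and Conc only).
raw = """01-mar-05 17:10,1.6
01-mar-05 17:15,-13.1
01-mar-05 17:20,3.2
01-mar-05 17:25,-15.2
01-mar-05 18:00,2.3
01-mar-05 19:00,2.8
01-mar-05 20:00,6.8
"""

df = pd.read_csv(io.StringIO(raw), names=["Date", "Conc"])
df["Date"] = pd.to_datetime(df["Date"], format="%d-%b-%y %H:%M")
conc = df.set_index("Date")["Conc"]

# Step 1: upsample to one row per minute, repeating the last known reading.
per_minute = conc.resample("1min").ffill()

# Step 2: average the 1-minute series in 10-minute bins; no bin is empty now.
ten_min = per_minute.resample("10min").mean()

print(ten_min)
```

With this approach each reading is weighted by how long it stayed the most recent value, so the hourly stretch simply repeats the last reading in every 10-minute bin.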

Upvotes: 1
