Klamsi

Reputation: 73

Python: define individual binning

I am trying to define my own binning and calculate the mean value of some other columns of my dataframe over these bins. Unfortunately, it only works with integer inputs, as you can see below. Here, "step_size" defines the width of one bin, and I would like to use float values such as 0.109, which corresponds to 0.109 seconds. Do you have any idea how I can do this? I think the problem is in the definition of "create_bins", but I cannot fix it. The goal is to get this: [(0, 0.109), (0.109, 0.218), (0.218, 0.327), ...]

Greets

# =============================================================================
# Define parameters
# =============================================================================
seconds_min = 0 
seconds_max = 9000 
step_size = 1 
bin_number = int((seconds_max-seconds_min)/step_size)


# =============================================================================
# Define function to create your own individual binning

# lower_bound defines the lowest value of the binning interval
# width defines the width of the binning interval
# quantity defines the number of bins
# =============================================================================
def create_bins(lower_bound, width, quantity):
    bins = []
    for low in range(lower_bound, 
                      lower_bound + quantity * width + 1, width):
        bins.append((low, low+width))
    return bins


# =============================================================================
# Create binning list
# =============================================================================
bin_list = create_bins(lower_bound=seconds_min,
                    width=step_size,
                    quantity=bin_number)

print(bin_list)

Upvotes: 0

Views: 146

Answers (3)

warped

Reputation: 9481

Without numpy:

max_bin = 100
min_bin = 0
step_size = 0.109

number_of_bins = int(1 + ((max_bin - min_bin) / step_size))  # +1 to cover the whole interval

bins = []
for a in range(number_of_bins):
    # scale the integer index by the float step to get the bin edges
    bins.append((min_bin + a * step_size, min_bin + (a + 1) * step_size))
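The same index-based idea can be wrapped in the `create_bins` signature from the question. This is only a sketch: the `ndigits` parameter is an assumption added here to round away floating-point noise in the edges, and it is not part of the original code.

```python
def create_bins(lower_bound, width, quantity, ndigits=9):
    """Return `quantity` consecutive (low, high) tuples of size `width`."""
    # Compute every edge as lower_bound + i * width instead of adding
    # width repeatedly, so rounding errors do not accumulate.
    # ndigits is a hypothetical knob for rounding the edges.
    edges = [round(lower_bound + i * width, ndigits)
             for i in range(quantity + 1)]
    # Pair each edge with the next one: (e0, e1), (e1, e2), ...
    return list(zip(edges, edges[1:]))


bins = create_bins(lower_bound=0, width=0.109, quantity=3)
print(bins)  # [(0.0, 0.109), (0.109, 0.218), (0.218, 0.327)]
```

Because adjacent bins share the same edge value, there are no gaps or overlaps between them.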

Upvotes: 0

Roy2012

Reputation: 12503

Here's a simple way to do it using zip and numpy's arange. I've put the upper limit at 5, but you can, of course, choose other numbers.

import numpy as np

list(zip(np.arange(0, 5, .109), np.arange(.109, 5, .109)))

The result is:

[(0.0, 0.109),
 (0.109, 0.218),
 (0.218, 0.327),
 (0.327, 0.436),
 (0.436, 0.545),
 (0.545, 0.654),
 (0.654, 0.763),
 ... 
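Once the bin edges exist, the mean per bin that the question asks about could be computed roughly as follows. This is a minimal standard-library sketch, not the asker's code: `mean_per_bin` and the sample data are made up for illustration, and bins are treated as half-open intervals [low, high). For an actual dataframe, `pandas.cut` combined with `groupby` is probably the more usual route.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean


def mean_per_bin(times, values, edges):
    """Average `values` grouped by the bin [edges[i], edges[i+1]) of `times`."""
    grouped = defaultdict(list)
    for t, v in zip(times, values):
        i = bisect_right(edges, t) - 1  # index of the last edge <= t
        if 0 <= i < len(edges) - 1:     # skip samples outside the binned range
            grouped[i].append(v)
    return {(edges[i], edges[i + 1]): mean(vs)
            for i, vs in sorted(grouped.items())}


# Hypothetical sample data: timestamps in seconds and one measured column.
edges = [0.0, 0.109, 0.218, 0.327]
times = [0.01, 0.05, 0.12, 0.30]
values = [1.0, 3.0, 10.0, 7.0]

result = mean_per_bin(times, values, edges)
print(result)
# {(0.0, 0.109): 2.0, (0.109, 0.218): 10.0, (0.218, 0.327): 7.0}
```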

Upvotes: 1

mabergerx

Reputation: 1213

The problem is that Python's built-in range function only accepts integer arguments, so it cannot step by a float such as 0.109.

You can use the numeric_range function from the more_itertools package instead:

from more_itertools import numeric_range

seconds_min = 0
seconds_max = 9
step_size = 0.109
bin_number = int((seconds_max - seconds_min) / step_size)


def create_bins(lower_bound, width, quantity):
    bins = []
    # numeric_range works like range but accepts float arguments.
    # The stop value is exclusive; the "+ 1" from the integer version is
    # dropped because it would add bins beyond seconds_max.
    for low in numeric_range(lower_bound,
                             lower_bound + quantity * width, width):
        bins.append((low, low + width))
    return bins


bin_list = create_bins(lower_bound=seconds_min,
                       width=step_size,
                       quantity=bin_number)

print(bin_list)
# [(0.0, 0.109), (0.109, 0.218), (0.218, 0.327) ... ] (up to float rounding)
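The limitation of the built-in range mentioned above can be checked directly. The short sketch below triggers the error and then shows one way around it without extra packages, by looping over integer indices and scaling them; the variable names are illustrative only.

```python
step_size = 0.109

# The built-in range only accepts integers, so a float step raises TypeError.
try:
    range(0, 9, step_size)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Workaround without extra packages: iterate over integer bin indices
# and scale each index by the float step.
lows = [i * step_size for i in range(3)]
print(lows)
```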

Upvotes: 2
