Omi

Reputation: 79

How to bin data from multiple columns at the same time using pandas/Python?

I am working with a data frame that has 92 columns and 200000 rows. I want to bin and count data from each of these columns and put it in a new data frame for further plotting/analysis.

I'm using

bins = [-800, -70, -60, -50, -40, -30, -20, -5, 0]
df['Depth.1'].value_counts(bins=bins, sort=False)

which successfully bins data but only for one column at a time. Is it possible to do this for multiple columns in a data frame and put it into a new data frame?

Thanks

Upvotes: 1

Views: 1634

Answers (1)

Ben.T

Reputation: 29635

You can use apply to perform the same operation on each column. Try:

new_df = df.apply(lambda x: x.value_counts(bins=bins, sort=False))

Here is an example where not every column should be binned:

import pandas as pd

# sample data: two numeric columns and one string column
df = pd.DataFrame({'a':[3,6,2,7,3],
                   'b':[2,1,5,8,9],
                   'c':list('abcde')})

If you apply the method above to this whole data frame, you'll get an error because column 'c' holds strings. Instead, define a list of the columns to bin and apply it to just those:

list_cols = ['a','b'] #only the numerical columns
new_df = df[list_cols].apply(lambda x: x.value_counts(bins=[0,2,5,10], sort=False))
print(new_df)
               a  b
(-0.001, 2.0]  1  2
(2.0, 5.0]     2  1
(5.0, 10.0]    2  2
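
If listing the numeric columns by hand is impractical (the question mentions 92 columns), one option is to let pandas pick them with select_dtypes. This is a minimal sketch that reuses the sample data above and assumes every numeric column should be binned with the same edges. The names numeric_df and col are only illustrative:

```python
import pandas as pd

# Same sample data as above: 'c' is a string column and must be skipped.
df = pd.DataFrame({'a': [3, 6, 2, 7, 3],
                   'b': [2, 1, 5, 8, 9],
                   'c': list('abcde')})
bins = [0, 2, 5, 10]

# Keep only the numeric columns instead of naming them one by one.
numeric_df = df.select_dtypes(include='number')

# Count the values of each remaining column per bin, in bin order.
new_df = numeric_df.apply(lambda col: col.value_counts(bins=bins, sort=False))
print(new_df)
```

Note that values falling outside the outermost bin edges are not counted, so the bins should cover the full range of the data.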

Upvotes: 2
