Reputation: 6615
I'm new to Python, coming from R, and I'm looking for a more efficient way to do this. I want a data frame of the cyl values and their counts, ideally without having to rename the columns afterwards.
What happens is that 'cyl' is the index if I don't use the to_frame().reset_index() piece of code. When I do use reset_index(), the index becomes a column called 'index', which really holds the cyl values, while the second column, 'cyl', really holds the frequency counts.
import pandas as pd
new_df = pd.value_counts(mtcars.cyl).to_frame().reset_index()
new_df.columns = ['cyl', 'frequency']
Upvotes: 1
Views: 2959
Reputation: 862661
I think you can omit to_frame():
new_df = pd.value_counts(mtcars.cyl).reset_index()
new_df.columns = ['cyl', 'frequency']
Sample:
mtcars = pd.DataFrame({'cyl':[1, 2, 2, 4, 4]})
print (mtcars)
   cyl
0    1
1    2
2    2
3    4
4    4
new_df = pd.value_counts(mtcars.cyl).reset_index()
new_df.columns = ['cyl', 'frequency']
print (new_df)
   cyl  frequency
0    4          2
1    2          2
2    1          1
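If the goal is to skip the separate column-renaming step entirely, one possible sketch is to name the index with rename_axis() and name the counts column through the name argument of reset_index(). It reuses the toy mtcars frame from the sample above (not the real dataset) and calls the Series method value_counts() rather than the top-level pd.value_counts, which, as far as I know, newer pandas releases flag as deprecated. The order of tied counts may vary between pandas versions, so no output is shown.

```python
import pandas as pd

# Toy data reused from the sample above; the real mtcars frame is not shown here.
mtcars = pd.DataFrame({'cyl': [1, 2, 2, 4, 4]})

# Count each cyl value, name the index 'cyl', then turn the index into a
# column while naming the counts column 'frequency' in the same step.
new_df = (
    mtcars['cyl']
    .value_counts()
    .rename_axis('cyl')
    .reset_index(name='frequency')
)
print(new_df)
```

This should give the columns 'cyl' and 'frequency' directly, without assigning to new_df.columns afterwards.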
Upvotes: 1