Reputation: 1637
My desired output is the following:
   count  tally
1      2     //
2      3    ///
3      5  /////
4      3    ///
5      2     //
My code:
import pandas as pd
my_list = [1,1,2,2,2,3,3,3,3,3,4,4,4,5,5]
my_series = pd.Series(my_list)
values_counted = pd.Series(my_series.value_counts(),name='count')
# other calculated columns left out for SO simplicity
df = pd.concat([values_counted], axis=1).sort_index()
df['tally'] = values_counted * '/'
With the code above I get the following error:
masked_arith_op
result[mask] = op(xrav[mask], y)
numpy.core._exceptions.UFuncTypeError: ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U21'), dtype('<U21')) -> dtype('<U21')
In searching for solutions I found one on SO that said to try:
values_counted * float('/')
But that did not work.
In 'normal' Python, outside of DataFrames, the following code works:
10 * '/'
and returns
//////////
How can I achieve the same functionality in a DataFrame?
Upvotes: 1
Views: 832
Reputation: 18306
You can group the series by itself and then aggregate:
new_df = my_series.groupby(my_series).agg(**{"count": "size",
                                             "tally": lambda s: "/" * s.size})
to get
>>> new_df
   count  tally
1      2     //
2      3    ///
3      5  /////
4      3    ///
5      2     //
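For what it's worth, the `**{...}` unpacking is just named aggregation, so the plain keyword form should behave the same. A minimal, self-contained sketch, assuming pandas 0.25 or newer; `tally_by_group` and its `mark` parameter are names made up here for illustration, not part of pandas:

```python
import pandas as pd

def tally_by_group(values, mark="/"):
    """Count each distinct value and draw one `mark` per occurrence."""
    s = pd.Series(values)
    # Grouping the series by itself gives one group per distinct value,
    # sorted by key; named aggregation then builds both columns at once.
    return s.groupby(s).agg(
        count="size",
        tally=lambda g: mark * g.size,
    )

new_df = tally_by_group([1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5])
print(new_df)
```

The lambda receives each group as a Series, so `g.size` is an ordinary Python int and `mark * g.size` is plain string repetition, the same thing as `10 * '/'` in the question.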
Upvotes: 1
Reputation: 862761
Use a lambda function to repeat the character for each count; your solution can also be simplified:
my_list = [1,1,2,2,2,3,3,3,3,3,4,4,4,5,5]
df1 = pd.Series(my_list).value_counts().to_frame('count').sort_index()
df1['tally'] = df1['count'].apply(lambda x: x * '/')
print(df1)
   count  tally
1      2     //
2      3    ///
3      5  /////
4      3    ///
5      2     //
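If you would rather avoid `apply`, another option is `Series.str.repeat`, which accepts a sequence of repeat counts. This is only a sketch of that alternative; the intermediate `marks` series and the `df2` name are my own, not from the question:

```python
import pandas as pd

my_list = [1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5]
df2 = pd.Series(my_list).value_counts().to_frame("count").sort_index()

# One "/" per row, on the same index as the counts.
marks = pd.Series("/", index=df2.index)
# With a list of ints, str.repeat repeats each string by the matching count.
df2["tally"] = marks.str.repeat(df2["count"].tolist())
print(df2)
```

Either way the repetition is done on strings element by element, which is what `values_counted * '/'` cannot do on an integer Series.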
Upvotes: 1