Reputation: 1946
I have a DataFrame that looks like:
import pandas as pd
df = pd.DataFrame(columns=['date', 'type', 'version'],
                  data=[
                      ['2017-07-01', 'critical::issue::A', 'version1'],
                      ['2017-07-01', 'critical::issue::A', 'version2'],
                      ['2017-07-01', 'hardware::issue::B', 'version1'],
                  ])
I'm printing the number of rows for each unique value of 'type' using the following:
sub_cat = ['critical::',
           'hardware::',
           'software::'
           ]
for cat in sub_cat:
    x = df[df.type.str.startswith(cat)]
    count = x.groupby('type').size()
    if len(count) > 0:
        print(count)
    else:
        print(cat, '0')
Results are correct but the output is sloppy:
type
critical::issue::A 2
dtype: int64
type
hardware::issue::B 1
dtype: int64
software:: 0
I'd like to format the output to make it more readable, like the following example:
type
critical::issue::A 2
hardware::issue::B 1
software:: 0
Any suggestions?
Upvotes: 0
Views: 2153
Reputation: 18916
An alternative solution, if you just change:
print(count)
To:
print(count.to_string(header=False))
You get:
critical::issue::A 2
hardware::issue::B 1
software:: 0
So maybe add a print("type") before the loop and you are there?
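Putting that suggestion together with the loop from the question, a minimal sketch could look like the following. The helper name format_counts is made up for illustration and is not part of pandas; it collects the lines in a list so the result is easy to check before printing.

```python
import pandas as pd

df = pd.DataFrame(columns=['date', 'type', 'version'],
                  data=[
                      ['2017-07-01', 'critical::issue::A', 'version1'],
                      ['2017-07-01', 'critical::issue::A', 'version2'],
                      ['2017-07-01', 'hardware::issue::B', 'version1'],
                  ])
sub_cat = ['critical::', 'hardware::', 'software::']


def format_counts(df, sub_cat):
    """Return the report as a list of lines (hypothetical helper)."""
    lines = ['type']  # emit the header once instead of once per category
    for cat in sub_cat:
        x = df[df['type'].str.startswith(cat)]
        count = x.groupby('type').size()
        if len(count) > 0:
            # header=False drops the index name ("type"); to_string()
            # also leaves out the trailing "dtype: int64" line by default
            lines.append(count.to_string(header=False))
        else:
            lines.append('{} 0'.format(cat))
    return lines


lines = format_counts(df, sub_cat)
print('\n'.join(lines))
```

Note that each to_string() call pads its own output, so the counts of different categories are not guaranteed to line up in one column.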
Upvotes: 1
Reputation: 210842
Consider this Pandas approach:
In [79]: res = df.groupby('type').size()
In [80]: res
Out[80]:
type
critical::issue::A 2
hardware::issue::B 1
dtype: int64
In [81]: s = pd.Series(sub_cat)
In [82]: idx = s[~s.isin(df.type.str.extract(r'(\w+::)', expand=False).unique())].values
In [83]: res = pd.concat([res, pd.Series([0] * len(idx), index=idx)])
In [84]: res
Out[84]:
critical::issue::A 2
hardware::issue::B 1
software:: 0
dtype: int64
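The same steps can be wrapped into a standalone script. This is only a sketch: the function name counts_with_missing is hypothetical, and it uses pd.concat because Series.append is not available in pandas 2.0 and later.

```python
import pandas as pd


def counts_with_missing(df, sub_cat):
    """Count rows per 'type' and add a 0 entry for each unused prefix."""
    res = df.groupby('type').size()
    # leading prefixes (e.g. 'critical::') that actually occur in the data
    present = set(df['type'].str.extract(r'(\w+::)', expand=False).dropna())
    missing = [cat for cat in sub_cat if cat not in present]
    zeros = pd.Series(0, index=missing, dtype=res.dtype)
    return pd.concat([res, zeros])


df = pd.DataFrame(columns=['date', 'type', 'version'],
                  data=[
                      ['2017-07-01', 'critical::issue::A', 'version1'],
                      ['2017-07-01', 'critical::issue::A', 'version2'],
                      ['2017-07-01', 'hardware::issue::B', 'version1'],
                  ])
sub_cat = ['critical::', 'hardware::', 'software::']

res = counts_with_missing(df, sub_cat)
print(res.to_string())
```

Because the whole result is a single Series, to_string() aligns all counts in one column.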
Upvotes: 0
Reputation: 178
Here is your code with suggested changes:
import pandas as pd
df = pd.DataFrame(columns=['date', 'type', 'version'],
                  data=[
                      ['2017-07-01', 'critical::issue::A', 'version1'],
                      ['2017-07-01', 'critical::issue::A', 'version2'],
                      ['2017-07-02', 'critical::issue::B', 'version3'],
                      ['2017-07-01', 'hardware::issue::B', 'version1'],
                  ])
sub_cat = ['critical::',
           'hardware::',
           'software::']
print("type")
for cat in sub_cat:
    x = df[df.type.str.startswith(cat)]
    count = x.groupby('type').size()
    # 'count' is a Series object indexed by the 'type' strings
    for i in range(len(count)):
        # use .iloc for positional access; the index holds labels, not integers
        print("{}\t{}".format(count.index[i], count.iloc[i]))
    if len(count) == 0:
        print("{}\t{}".format(cat, 0))
It produces:
type
critical::issue::A 2
critical::issue::B 1
hardware::issue::B 1
software:: 0
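Tab-separated columns can drift apart when the labels differ a lot in length. One possible way around that, sketched here in plain Python with a made-up helper name align_rows, is to pad every label to the width of the longest one:

```python
def align_rows(rows, header='type'):
    """Left-justify labels to a common width (hypothetical helper)."""
    # width of the longest label, but never narrower than the header
    width = max((len(label) for label, _ in rows), default=0)
    width = max(width, len(header))
    lines = [header]
    for label, n in rows:
        # pad the label with spaces, then two spaces, then the count
        lines.append(f'{label:<{width}}  {n}')
    return lines


rows = [('critical::issue::A', 2),
        ('critical::issue::B', 1),
        ('hardware::issue::B', 1),
        ('software::', 0)]
aligned = align_rows(rows)
print('\n'.join(aligned))
```

The (label, count) pairs here are typed in by hand from the output above; in the loop they would come from count.index and the matching values.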
Upvotes: 0
Reputation: 582
You could loop over the entries of the count Series returned by the
groupby and print them one by one:
for cat in sub_cat:
    x = df[df.type.str.startswith(cat)]
    count = x.groupby('type').size()
    if len(count) > 0:
        # items() yields (index label, value) pairs
        for ind, row in count.items():
            print(ind, row)
    else:
        print(cat, '0')
Output is as follows:
critical::issue::A 2
hardware::issue::B 1
software:: 0
Upvotes: 0