sariii

Reputation: 2150

Show text descriptions on the x axis rather than numbers using pandas and matplotlib

I have written code to show my data set as a bar chart. I read the data from a .csv file like this:

names = ["Clinic Number","Question Text","Answer Text","Answer Date","Class"]
data = pd.read_csv('ADLCI.csv', names = names)

Then I group the rows and plot the result:

grouped = data.groupby(['Question Text','Answer Text']).size().reset_index(name='counts')

import matplotlib.pyplot as plt
plt.figure()

grouped.plot(kind='bar', title ="Functional Status Count", figsize=(15, 10), legend=True, fontsize=12)
plt.show()

This is the resulting data frame, which I want to show as a bar chart:

                         Question Text Answer Text  counts
0                          CI function          No     513
1                          CI function         Yes     373
2                             bathing?          No    2827
3                             bathing?         Yes     408
4                            dressing?          No    2824
5                            dressing?         Yes     423
6                              feeding          No    2851
7                              feeding         Yes     160
8                         housekeeping          No    2803
9                         housekeeping         Yes     717
10                      preparing food          No    2604
11                      preparing food         Yes     593
12  responsibility for own medications          No    2793
13  responsibility for own medications         Yes     625
14                            shopping          No      35
15                            shopping         Yes      49
16                           toileting          No    2843
17                           toileting         Yes     239
18                        transferring          No    2834
19                        transferring         Yes     904
20                using transportation          No    2816
21                using transportation         Yes     483

The first column of numbers was added automatically; it is not in my data set.
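That leading column looks like the default integer index that `reset_index` creates, rather than a real data column. A minimal sketch, assuming a small stand-in frame built from the first rows shown above (the variable name `indexed` is only for illustration):

```python
import pandas as pd

# Small stand-in for the grouped frame (values copied from the first rows above).
grouped = pd.DataFrame({
    "Question Text": ["CI function", "CI function", "bathing?", "bathing?"],
    "Answer Text": ["No", "Yes", "No", "Yes"],
    "counts": [513, 373, 2827, 408],
})

# The 0, 1, 2, ... labels are the default RangeIndex, not a data column.
print(grouped.index)

# Moving the text columns into the index replaces the integer labels.
indexed = grouped.set_index(["Question Text", "Answer Text"])
print(indexed)
```

Because pandas uses the index for the x axis of a bar plot, a frame indexed by integers is plotted with integer tick labels.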

Here is the bar chart created by this code: [image]

As you can see in the bar chart, all bars have the same color, and the x axis shows the numbers I mentioned. That is not what I want. What I want looks like the chart at this link:

I'm going to explain the changes I want to make to the picture I have uploaded here.

Instead of 0, 1, ... on the x axis, the chart should show the Question Text column. In detail: the data frame has two rows for CI function, one for Yes and one for No. Instead of 0 and 1, I want a single CI function label with two bars in different colors, one showing the count of No (1596) and the other showing the count of Yes (1376).

The next item will be bathing?, again with one bar showing 17965 and another showing 702.

That way I should have about ten groups, each containing two bars side by side, like the chart at the link above.

I tried various approaches like the one at the link above, but my chart either does not look like that or I get an error.

Thanks :)

Update 1: when I applied your code:

import matplotlib.pyplot as plt
data.groupby(['Question Text','Answer Text']).sum().unstack().plot(kind='bar')
plt.show()

I got this error:

  Traceback (most recent call last):
  File "C:/Users/M193053/PycharmProjects/ADL-distribution/test.py", line 52, in <module>
    data.groupby(['Question Text','Answer Text']).sum().unstack().plot(kind='bar')
  File "C:\Users\M193053\Documents\Anaconda3\envs\conda3\lib\site-packages\pandas\plotting\_core.py", line 2941, in __call__
    sort_columns=sort_columns, **kwds)
  File "C:\Users\M193053\Documents\Anaconda3\envs\conda3\lib\site-packages\pandas\plotting\_core.py", line 1977, in plot_frame
    **kwds)
  File "C:\Users\M193053\Documents\Anaconda3\envs\conda3\lib\site-packages\pandas\plotting\_core.py", line 1804, in _plot
    plot_obj.generate()
  File "C:\Users\M193053\Documents\Anaconda3\envs\conda3\lib\site-packages\pandas\plotting\_core.py", line 258, in generate
    self._compute_plot_data()
  File "C:\Users\M193053\Documents\Anaconda3\envs\conda3\lib\site-packages\pandas\plotting\_core.py", line 373, in _compute_plot_data
    'plot'.format(numeric_data.__class__.__name__))
TypeError: Empty 'DataFrame': no numeric data to plot

But when I use this code:

grouped = data.groupby(['Question Text','Answer Text']).size().reset_index(name='counts')

import matplotlib.pyplot as plt
grouped.groupby(['Question Text','Answer Text']).sum().unstack().plot(kind='bar')
plt.show()

it seems OK to me, like this: [image]

But it does not seem logical to apply groupby twice, so I am still not sure what I should do. Thanks for taking the time :)
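A second groupby may not be needed. The TypeError above is likely raised because the raw frame has no numeric column left to plot after `sum()`, whereas `size()` counts rows directly. The sketch below assumes a tiny hypothetical frame standing in for ADLCI.csv; the rows and the name `counts` are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical raw rows standing in for ADLCI.csv (one row per recorded answer).
data = pd.DataFrame({
    "Clinic Number": [1, 1, 2, 2, 3, 3],
    "Question Text": ["bathing?", "feeding", "bathing?", "feeding", "bathing?", "feeding"],
    "Answer Text": ["No", "Yes", "No", "No", "Yes", "No"],
})

# size() counts rows per (question, answer) pair;
# unstack() turns the answers into columns, one row per question.
counts = (
    data.groupby(["Question Text", "Answer Text"])
        .size()
        .unstack(fill_value=0)
)
print(counts)

# One group of bars per question, one colored bar per answer.
ax = counts.plot(kind="bar", title="Functional Status Count")
ax.set_ylabel("counts")
plt.tight_layout()
# plt.show() displays the figure in an interactive session.
```

Since the questions end up in the index of `counts`, they appear as the x-axis labels, and each answer column gets its own color and legend entry.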

Update 2

This is my data frame, obtained with this code:

grouped = data.groupby(['Question Text','Answer Text']).size().reset_index(name='counts')

0                          CI function          No     513
1                          CI function         Yes     373
2                             bathing?          No    2827
3                             bathing?         Yes     408
4                            dressing?          No    2824
5                            dressing?         Yes     423
6                              feeding          No    2851
7                              feeding         Yes     160
8                         housekeeping          No    2803
9                         housekeeping         Yes     717
10                      preparing food          No    2604
11                      preparing food         Yes     593
12  responsibility for own medications          No    2793
13  responsibility for own medications         Yes     625
14                            shopping          No      35
15                            shopping         Yes      49
16                           toileting          No    2843
17                           toileting         Yes     239
18                        transferring          No    2834
19                        transferring         Yes     904
20                using transportation          No    2816
21                using transportation         Yes     483

And this is the data frame obtained by combining your code with mine:

grouped = data.groupby(['Question Text','Answer Text']).size().reset_index(name='counts')
print(grouped)
import matplotlib.pyplot as plt
final = grouped.groupby(['Question Text','Answer Text']).sum()
print(final)


Question Text                      Answer Text        
CI function                        No              513
                                   Yes             373
bathing?                           No             2827
                                   Yes             408
dressing?                          No             2824
                                   Yes             423
feeding                            No             2851
                                   Yes             160
housekeeping                       No             2803
                                   Yes             717
preparing food                     No             2604
                                   Yes             593
responsibility for own medications No             2793
                                   Yes             625
shopping                           No               35
                                   Yes              49
toileting                          No             2843
                                   Yes             239
transferring                       No             2834
                                   Yes             904
using transportation               No             2816
                                   Yes             483

Update 3

The original data frame has 200000 rows like this:

1                             bathing?          No       3529933
2                            dressing?          No       3529933
3                              feeding          No       3529933
4                         housekeeping          No       3529933
5   responsibility for own medications          No       3529933
6                 using transportation          No       3529933
7                            toileting          No       3529933
8                         transferring          No       3529933
10                      preparing food          No       3529933
11                            bathing?         NaN       2864155
12                           dressing?         NaN       2864155
13                             feeding         NaN       2864155
14                        housekeeping         NaN       2864155
15  responsibility for own medications         NaN       2864155
16                           toileting         NaN       2864155
17                        transferring         NaN       2864155
19                      preparing food         NaN       2864155
20                using transportation         Yes       2864155
21                            bathing?         NaN       2921299
22                           dressing?         NaN       2921299

Upvotes: 1

Views: 1557

Answers (1)

Joe

Reputation: 12417

You can do it like this (df is the data frame you posted):

import matplotlib
import matplotlib.pyplot as plt
matplotlib.style.use('ggplot')
df.groupby(['Question Text','Answer Text']).sum().unstack().plot(kind='bar')
plt.show()

Output: [image]

You can also rotate the x-axis labels this way:

plt.xticks(rotation=45)

However, I suggest making the labels shorter so the chart is clearer.
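One possible way to shorten the labels is to rename the index before plotting. This is only a sketch: the rows are copied from the question's grouped frame, while the `short_names` mapping and the name `wide` are hypothetical choices for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# A few rows copied from the question's grouped frame.
df = pd.DataFrame({
    "Question Text": ["responsibility for own medications"] * 2
                     + ["using transportation"] * 2,
    "Answer Text": ["No", "Yes", "No", "Yes"],
    "counts": [2793, 625, 2816, 483],
})

# Hypothetical short names; pick whatever wording suits your audience.
short_names = {
    "responsibility for own medications": "medications",
    "using transportation": "transport",
}

# One row per question, one column per answer, then shorten the row labels.
wide = df.pivot(index="Question Text", columns="Answer Text", values="counts")
wide = wide.rename(index=short_names)

# rot=45 rotates the tick labels, similar to plt.xticks(rotation=45).
ax = wide.plot(kind="bar", rot=45)
ax.set_xlabel("")
plt.tight_layout()
# plt.show() displays the figure in an interactive session.
```

Labels missing from the mapping are left unchanged by `rename`, so only the longest ones need an entry.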

Upvotes: 1
