Billy

Reputation: 1

How to do two categories in a single bar chart (subplot)

I need help with my code for a bar chart. My first problem is trying to have the code sort between Active and Inactive wells. My second problem is creating the two subcategories (Active and Inactive) on the same bar chart. Here is what I have written so far. Any help would be great.

Data for the code

df = pd.read_csv('Data.CSV')
Active= df['Wells Status']
Inactive= df['Wells Status']
  

fig, ax = plt.subplots()
rects1 = ax.bar(x - width/2, Active, width, label='Active')
rects2 = ax.bar(x + width/2, Inactive, width, label='Inactive')


plt.xlabel('County', fontsize = 20)
plt.ylabel('Total of Wells', fontsize = 20)
plt.title('Wells by County',fontsize=30)
     
    plt.show()

Upvotes: 0

Views: 115

Answers (1)

Kooch

Reputation: 11

If I understand correctly, you are trying to count the number of active and inactive wells in each county.

To start off, there is a typo in the column name you are using. In your data the column is 'Well Status', not 'Wells Status'.

Next, you will need to filter the data by ACTIVE/INACTIVE status instead of just selecting the whole column with df['Well Status']. You can do this with the following lines of code.

Active = df[df['Well Status'].str.match('ACTIVE')]
Inactive = df[df['Well Status'].str.match('INACTIVE')]

This returns all of the DataFrame's columns, but only the rows where 'Well Status' is ACTIVE or INACTIVE, respectively. Note that str.match anchors the pattern at the start of the string, so 'ACTIVE' does not also match 'INACTIVE'.

To answer your second problem, we need to take the Active and Inactive DataFrames and filter each one by county. The data you provided has only one county, so you could just use the following.
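As a quick check of the filtering step, here is a minimal sketch on a small made-up DataFrame (the rows are hypothetical, not your actual data). It uses an exact comparison with == as an alternative to str.match, which avoids any regular-expression behavior.

```python
import pandas as pd

# Hypothetical sample rows; the real data would come from Data.CSV.
df = pd.DataFrame({
    "County": ["TERRY (TX)", "TERRY (TX)", "TERRY (TX)", "TERRY (TX)"],
    "Well Status": ["ACTIVE", "INACTIVE", "ACTIVE", "ACTIVE"],
})

# Boolean masks keep only the rows whose status matches exactly.
active = df[df["Well Status"] == "ACTIVE"]
inactive = df[df["Well Status"] == "INACTIVE"]

print(len(active), len(inactive))  # 3 1
```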

# Use == here: str.match would treat '(TX)' as a regex group and fail to match the literal parentheses.
terry_active = Active[Active['County'] == 'TERRY (TX)']
terry_inactive = Inactive[Inactive['County'] == 'TERRY (TX)']

If there are more counties, you could use the same method and just change the variable names and the county/parish you are filtering by. This is of course hard-coded, so for three or more counties I would set up a function or a for loop to go through each county. In my experience, hard-coding is really only worthwhile for quick projects. I generally try to write more general, reusable code, but in a pinch some hard-coding works, and it has often helped me work out the more general version.
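One possible way to avoid hard-coding each county is to let pandas do the counting for you. The sketch below uses pd.crosstab instead of a hand-written loop, and it assumes the column names 'County' and 'Well Status'. The sample rows, including the second county, are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the second county is invented for illustration.
df = pd.DataFrame({
    "County": ["TERRY (TX)", "TERRY (TX)", "TERRY (TX)",
               "GAINES (TX)", "GAINES (TX)", "GAINES (TX)"],
    "Well Status": ["ACTIVE", "INACTIVE", "ACTIVE",
                    "ACTIVE", "INACTIVE", "INACTIVE"],
})

# One row per county, one column per status, cell values are well counts.
counts = pd.crosstab(df["County"], df["Well Status"])
print(counts)
```

Each county's totals can then be read with something like counts.loc['TERRY (TX)', 'ACTIVE'], with no per-county variables needed.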

Then to plot you could use the following code.

import numpy as np
import matplotlib.pyplot as plt

counties = ['Terry (TX)']
x = np.arange(len(counties))
fig, ax = plt.subplots()
ax.bar(x - 0.5/2, len(terry_active), label='Active', width=0.5)
ax.bar(x + 0.5/2, len(terry_inactive), label='Inactive', width=0.5)
ax.set_xticks(x)
ax.set_xticklabels(counties)
ax.set_xlabel('County')
ax.set_ylabel('# of Wells')
ax.set_title('Wells by County')

plt.legend()
plt.show()

This is again hard-coded. To handle more than the one county in the provided data, you would need to add the other counties to the counties list and repeat the filtering steps to find the total number of wells for each. Then add as many more ax.bar() lines as needed so that each bar is drawn.
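For several counties, a more general version might look like the sketch below. It is only an illustration under a few assumptions: the column names are 'County' and 'Well Status', the sample rows (including the second county) are made up, and the counts come from pd.crosstab rather than from per-county variables.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample rows; the second county is invented for illustration.
df = pd.DataFrame({
    "County": ["TERRY (TX)", "TERRY (TX)", "TERRY (TX)",
               "GAINES (TX)", "GAINES (TX)", "GAINES (TX)"],
    "Well Status": ["ACTIVE", "INACTIVE", "ACTIVE",
                    "ACTIVE", "INACTIVE", "INACTIVE"],
})

# Count wells per county and status; make sure both columns exist
# even if one status never appears in the data.
counts = pd.crosstab(df["County"], df["Well Status"])
counts = counts.reindex(columns=["ACTIVE", "INACTIVE"], fill_value=0)

x = np.arange(len(counts.index))  # one group of bars per county
width = 0.4

fig, ax = plt.subplots()
ax.bar(x - width / 2, counts["ACTIVE"], width, label="Active")
ax.bar(x + width / 2, counts["INACTIVE"], width, label="Inactive")
ax.set_xticks(x)
ax.set_xticklabels(list(counts.index))
ax.set_xlabel("County")
ax.set_ylabel("# of Wells")
ax.set_title("Wells by County")
ax.legend()
```

Call plt.show() afterwards to display the figure. Because the bar heights come straight from the counts table, adding more counties to the data should not require any extra ax.bar() lines.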

Upvotes: 1
