TestGuest

Reputation: 603

Subplots with a different number of subplots per row in Python

I am very new to Python. My code (classification with feature selection) is below; I have left out the data because it is rather high-dimensional, but I believe the problem does not depend on the data. My question is twofold: I want axis labels on every subplot, and I would like to know how to lay out subplots so that the number of subplots can differ per row (I have 14 subplots, currently arranged in three rows):

import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold
from sklearn.feature_selection import RFECV
from sklearn.datasets import make_classification
from sklearn import preprocessing
import scipy.io as sio
import numpy as np
import os

allData = sio.loadmat('Alldatav2.mat')
allFeatures = allData['featuresAll2']

# loop over subjects
n_subject = [0,1,2,3,4,5,6,7,8,9,10,11,12,13]

fig, axs = plt.subplots(3,5,figsize=(15, 6))
plt.xlabel("Number of features selected")
plt.ylabel("Cross validation score (nb of correct classifications)")
fig.subplots_adjust()
axs = axs.ravel()

for i, j in zip(n_subject, range(15)):
    #print("For Subject : ", i+1)
    y = allData['labels']
    X = allFeatures[i*120:(i+1)*120,:]

    svc = SVC(kernel="linear",C=1)
    rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(2),
              scoring='accuracy')
    rfecv.fit(X, y.ravel())


    axs[j].plot(range(1, len(rfecv.grid_scores_) + 1), rfecv.grid_scores_)
plt.show()


# loop over subjects
def mean(numbers):
    return float(sum(numbers)) / max(len(numbers), 1)

n_subject = [0,1,2,3,4,5,6,7,8,9,10,11,12,13]
avg_scores = []

for i in n_subject:
    print("For Subject : ", i+1)
    y = allData['labels']
    X = allFeatures[i*120:(i+1)*120,:]

    svc = SVC(kernel="linear",C=1)
    rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(10),
              scoring='accuracy')
    rfecv.fit(X, y.ravel())
    print("Optimal number of features : %d" % rfecv.n_features_)
    print("Ranking of Features : ", rfecv.ranking_)
    avg_score = rfecv.grid_scores_.max()
    print("Best CV Score : ", avg_score)
    avg_scores.append(avg_score)
    print("------------------------------------------")
print("Average Accuracy over all Subjects : ", mean(avg_scores))

Upvotes: 1

Views: 2035

Answers (1)

Vikrant Kamble

Reputation: 11

To label each subplot, first create lists that hold the labels:

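# One label per subplot. Here every subplot reuses the labels from the question;
# replace individual entries if the subplots need different labels.
xlabelList = ["Number of features selected"] * 14
ylabelList = ["Cross validation score (nb of correct classifications)"] * 14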
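The reason only one subplot is labelled in the question is that plt.xlabel and plt.ylabel act on the current axes only, which after plt.subplots is the last one created. A minimal sketch of labelling every subplot through its own axes object is below. The random scores are placeholders for the RFECV results, and the shortened y label and the "Subject N" titles are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

n_plots = 14  # number of subjects in the question
fig, axs = plt.subplots(3, 5, figsize=(15, 6))
axs = axs.ravel()  # flatten the 3x5 grid so axs[j] works with a single index

rng = np.random.default_rng(0)
for j in range(n_plots):
    scores = rng.random(10)  # placeholder data, standing in for the CV scores
    axs[j].plot(range(1, len(scores) + 1), scores)
    # Label this particular subplot rather than the "current" axes.
    axs[j].set_xlabel("Number of features selected")
    axs[j].set_ylabel("CV score")
    axs[j].set_title(f"Subject {j + 1}")

# Hide the grid cells that hold no data (here only the 15th).
for ax in axs[n_plots:]:
    ax.set_visible(False)

fig.tight_layout()  # keep the labels from overlapping neighbouring subplots
```

Calling plt.show() or fig.savefig(...) afterwards displays or stores the figure.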

You also don't need the extra variable n_subject for looping. For plotting, I would make the following changes:

for j in range(14):

    #print("For Subject : ", j+1)
    y = allData['labels']
    X = allFeatures[j*120:(j+1)*120,:]

    svc = SVC(kernel="linear",C=1)
    rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(2),
          scoring='accuracy')
    rfecv.fit(X, y.ravel())

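    # Map the flat index j to (row, col). This needs the 2-D axs array
    # returned by plt.subplots(3, 5), so do not call axs.ravel() beforehand.
    locInd = np.unravel_index(j, (3, 5))
    # Note: scikit-learn >= 1.2 removed grid_scores_; use
    # rfecv.cv_results_['mean_test_score'] there instead.
    axs[locInd].plot(range(1, len(rfecv.grid_scores_) + 1), rfecv.grid_scores_)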
    axs[locInd].set_xlabel(xlabelList[j])
    axs[locInd].set_ylabel(ylabelList[j])
plt.show()
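For the second part of the question (a different number of subplots per row), one possible approach is matplotlib's GridSpec: build a fine column grid and let each subplot span several columns. The sketch below assumes a 5/5/4 layout with the shorter row centred; the names plots_per_row, span, and axes are made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec

plots_per_row = [5, 5, 4]            # assumed layout: 14 subplots in three rows
span = 2                             # grid columns occupied by each subplot
n_cols = span * max(plots_per_row)   # 10 grid columns in total

fig = plt.figure(figsize=(15, 6))
gs = GridSpec(len(plots_per_row), n_cols, figure=fig)

axes = []
for row, n_in_row in enumerate(plots_per_row):
    # Shift shorter rows right by half of the unused columns to centre them.
    offset = (n_cols - span * n_in_row) // 2
    for k in range(n_in_row):
        start = offset + span * k
        axes.append(fig.add_subplot(gs[row, start:start + span]))

for j, ax in enumerate(axes):
    ax.set_title(f"Subject {j + 1}")
    ax.set_xlabel("Number of features selected")

fig.tight_layout()
```

The list axes can then be indexed with j exactly like the flattened axs in the question, so the plotting loop itself does not need to change.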

Upvotes: 1
