Reputation: 69
I am new to machine learning and am dealing with a fairly complicated issue. I have a 3D NumPy array called "psd_data" containing EEG data from a human subject who performed motor imagery trials. The array has shape (240, 16, 129), which stands for (trials, channels, PSD features). I also have a 1D NumPy array called labels that holds the label of each trial and has shape (240,).
I need to perform automatic feature selection and then classification, and so far I am having trouble with the feature selection. I tried this:
import pandas as pd
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2
X = psd_data  # independent columns
y = labels  # target
# SelectKBest class to extract top 15 best features
bestfeatures = SelectKBest(score_func=chi2, k=15)
fit = bestfeatures.fit(X,y)
dfscores = pd.DataFrame(fit.scores_)
dfcolumns = pd.DataFrame(X.columns)
#concat two dataframes for better visualization
featureScores = pd.concat([dfcolumns,dfscores],axis=1)
featureScores.columns = ['Specs','Score'] #naming the dataframe columns
print(featureScores.nlargest(15,'Score')) #print 15 best features
But I am getting an error:
ValueError: Found array with dim 3. Estimator expected <= 2.
Do you have any suggestions on how to manipulate the 3D array "psd_data" correctly in order to get a useful result?
Upvotes: 0
Views: 331
Reputation: 51
Here is what worked for me:
# Reduce features
import numpy as np
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import mutual_info_classif
topK = 20
SKB = SelectKBest(mutual_info_classif, k=topK)
# Flatten the training data from 3D to 2D: one row per (instance, time step)
num_instances, num_time_steps, num_features = train_data.shape
train_data = np.reshape(train_data, (-1, num_features))
# Repeat each label once per time step so labels stay aligned with the flattened rows
new_trClass = np.repeat(y_train, num_time_steps)
train_data_skb = SKB.fit_transform(train_data, new_trClass)
# Restore the 3D layout, now with only topK features per time step
train_data_skb = np.reshape(train_data_skb, (num_instances, num_time_steps, topK))
# Apply the already-fitted selector to the test data
num_instances, num_time_steps, num_features = test_data.shape
test_data = np.reshape(test_data, (-1, num_features))
test_data_skb = SKB.transform(test_data)
test_data_skb = np.reshape(test_data_skb, (num_instances, num_time_steps, topK))
# Boolean mask marking which of the original features were selected
feat_indices = SKB.get_support()
In short, you need to reshape your arrays to two dimensions before passing them to SelectKBest, then reshape the result back to 3D. Hope this helps.
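For the (240, 16, 129) array in the question, a different way to get down to two dimensions is to keep one row per trial and treat every (channel, PSD bin) pair as its own feature. The sketch below is only an illustration under a few assumptions: the data are random stand-ins for psd_data and labels, two classes are assumed, and f_classif is swapped in as the scoring function (chi2 should also work as long as the PSD values are non-negative). The selected column indices are mapped back to channel and bin with np.unravel_index.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in for the question's data (assumption: random values,
# two classes); replace with the real psd_data and labels.
rng = np.random.default_rng(0)
n_trials, n_channels, n_bins = 240, 16, 129
psd_data = rng.random((n_trials, n_channels, n_bins))
labels = rng.integers(0, 2, size=n_trials)

# Flatten (trials, channels, bins) -> (trials, channels * bins).
# Each trial stays a single row, so the labels need no repetition.
X = psd_data.reshape(n_trials, -1)

k = 15
selector = SelectKBest(score_func=f_classif, k=k)
X_selected = selector.fit_transform(X, labels)  # shape (240, 15)

# Map the flat column indices back to (channel, frequency bin) pairs.
flat_idx = selector.get_support(indices=True)
channels, bins = np.unravel_index(flat_idx, (n_channels, n_bins))
for ch, b, score in zip(channels, bins, selector.scores_[flat_idx]):
    print(f"channel {ch:2d}, bin {b:3d}, F-score {score:.3f}")
```

With this layout the selector ranks 16 * 129 = 2064 candidate features against only 240 trials, so it is probably safer to fit it on the training split only and reuse it with transform on the test split, as in the answer's code.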
Upvotes: 0