Reputation: 1
I'm trying to write a function for a deep learning problem involving satellite image classification. I have searched through a lot of libraries without finding what I need. I tried scikit-learn, but I feel that it is not what I need.
Any hint about a specialised function that I may have missed?
Upvotes: 0
Views: 1017
Reputation: 377
It seems to be a common problem: stratify_by is there but partition_by is not, meaning that the two sets should be non-overlapping on the value of a specific variable, such as video_id or patient_id.
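For what it's worth, scikit-learn does ship group-aware splitters that cover this kind of partitioning. Below is a minimal sketch using GroupShuffleSplit. The toy arrays and the patient_ids name are made up for illustration; in practice the group labels would come from your own metadata.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Toy data (hypothetical): 10 samples from 5 patients, 2 samples each
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])
patient_ids = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5])

# One split that holds out a share of the *groups*, not of the samples
splitter = GroupShuffleSplit(n_splits=1, test_size=0.4, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=patient_ids))

X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

# Every patient ends up entirely on one side of the split
train_groups = set(patient_ids[train_idx])
test_groups = set(patient_ids[test_idx])
```

Because the split is made over groups, the resulting sample proportions are only approximate when groups differ in size.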
Upvotes: 0
Reputation: 2553
This should do the trick. You can use the permutation array on the X and y data separately if you like.
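import numpy as np

# 50% train, 20% validation, remaining 30% test
num_tr, num_va = int(len(data) * 0.5), int(len(data) * 0.2)
# Random ordering of the row indices; data must be a NumPy array
perm = np.random.permutation(len(data))
tr_data = data[perm[:num_tr]]
va_data = data[perm[num_tr:num_tr + num_va]]
te_data = data[perm[num_tr + num_va:]]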
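To illustrate the remark about X and y, here is a rough sketch of applying one permutation to both arrays so that rows and labels stay aligned. The toy X and y and the fixed seed are assumptions for the example, not part of the original snippet.

```python
import numpy as np

# Hypothetical toy data: 10 samples with 3 features each, plus labels
X = np.arange(30).reshape(10, 3)
y = np.arange(10)

rng = np.random.default_rng(0)  # seeded so the split is reproducible
perm = rng.permutation(len(X))

# Same 50/20/30 proportions as above
num_tr, num_va = int(len(X) * 0.5), int(len(X) * 0.2)
tr_idx = perm[:num_tr]
va_idx = perm[num_tr:num_tr + num_va]
te_idx = perm[num_tr + num_va:]

# Index X and y with the same index arrays so rows and labels stay paired
X_tr, y_tr = X[tr_idx], y[tr_idx]
X_va, y_va = X[va_idx], y[va_idx]
X_te, y_te = X[te_idx], y[te_idx]
```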
Upvotes: 0
Reputation: 302
The sklearn train_test_split function seems to fit all your needs.
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
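Note that train_test_split only produces two sets per call, so a train/validation/test split needs two calls. A minimal sketch, assuming made-up toy data and a 50/20/30 split chosen purely for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 samples, 2 features, balanced binary labels
X = np.arange(200).reshape(100, 2)
y = np.array([0, 1] * 50)

# First call: 50% for training, 50% held out
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.5, stratify=y, random_state=0
)

# Second call on the held-out half: 40% of it (20% overall) for validation,
# the remaining 60% of it (30% overall) for testing
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, train_size=0.4, stratify=y_rest, random_state=0
)
```

Passing stratify keeps the class proportions roughly equal across the sets; drop it if the labels are not categorical.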
Upvotes: 0