How to one-hot-encode unordered discrete data in Python?

Question

Problem

There seems to be no easy way to one-hot-encode data that has no order. My question is, what is the best way to one-hot-encode values that have no particular order? And if there is no standardised way to do this, why should one-hot-encoded features be ordered?

Example

I am trying to one-hot-encode a set of features where the values are custom objects. My object looks like this:

class MyObject(object):
    def __init__(self, identity):
        self.identity = identity

    def __hash__(self):
        return self.identity

    def __eq__(self, other):
        return self.identity == other.identity

In this setting, any two instances of MyObject can be compared for equality. Suppose we have the following list of objects:

objects = [MyObject(0), MyObject(1), MyObject(0)]

Calling set(objects) yields a set of two objects, namely MyObject(0) and MyObject(1). This is the behaviour I expect. Therefore, when I one-hot-encode this data, I would expect something of the form:

index   MyObject_0  MyObject_1
    0            1           0
    1            0           1
    2            1           0

However, all the solutions I tried require the values being encoded to be orderable, whereas no order is defined in my case. I think a one-hot encoding should still be possible without an order, because in that case it does not matter which one-hot-encoded column comes before another.
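To make the expectation concrete, here is a minimal pure-Python sketch of the encoding described above. It assumes columns are assigned in order of first appearance, which is only one possible convention, and `one_hot_unordered` is a hypothetical helper name rather than a function from any library. The values only need `__hash__` and `__eq__`; they are never compared with `<`.

```python
def one_hot_unordered(values):
    """One-hot-encode hashable values without ever comparing them with '<'."""
    # Assign each distinct value a column index in order of first appearance.
    # Only __hash__ and __eq__ are used, so no ordering is required.
    columns = {}
    for value in values:
        columns.setdefault(value, len(columns))

    # Build one row per input value, with a single 1 in that value's column.
    rows = []
    for value in values:
        row = [0] * len(columns)
        row[columns[value]] = 1
        rows.append(row)
    return rows, list(columns)


class MyObject(object):
    def __init__(self, identity):
        self.identity = identity

    def __hash__(self):
        return hash(self.identity)

    def __eq__(self, other):
        return self.identity == other.identity


rows, categories = one_hot_unordered([MyObject(0), MyObject(1), MyObject(0)])
print(rows)  # [[1, 0], [0, 1], [1, 0]]
```

The second return value lists the distinct objects in column order, so the columns can still be labelled afterwards.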

Attempted solutions

Pandas dataframe

My first attempted solution was using pandas' get_dummies() function.

import pandas as pd

objects   = [MyObject(0), MyObject(1), MyObject(0)]
dataframe = pd.DataFrame({'MyObjectFeature': objects})
dummies   = pd.get_dummies(dataframe)

However, this example gives a TypeError:

TypeError: 'values' is not ordered, please explicitly specify the categories order by passing in a categories argument.

Scikit-learn LabelEncoder & OneHotEncoder

My second attempt was to use Scikit-learn's LabelEncoder to encode the values before passing them to a OneHotEncoder object. However, LabelEncoder runs into the same problem as the pandas approach.

from sklearn.preprocessing import LabelEncoder, OneHotEncoder

objects = [MyObject(0), MyObject(1), MyObject(0)]
encoder = LabelEncoder()
dummies = encoder.fit_transform(objects)

This example also gives a TypeError:

TypeError: '<' not supported between instances of 'MyObject' and 'MyObject'

Custom solution

I also created my own UnorderedLabelEncoder object to encode labels without requiring an order. This works fine, but I would like to know whether there is a standard solution to my problem, i.e. one using well-known libraries. If not, is there a reason for requiring ordered features?

import numpy as np

class UnorderedLabelEncoder(object):

    def __init__(self):
        """ UnorderedLabelEncoder is capable of handling any
            hashable object including None values.
            """
        self.classes_ = dict()

    def fit(self, y):
        """ Fit label encoder.

            Parameters
            ----------
            y : array-like of shape (n_samples,)
                Target values.

            Returns
            -------
            self : returns an instance of self.
            """
        self.classes_ = {o:i for i, o in enumerate(set(y))}
        return self

    def fit_transform(self, y):
        """ Fit label encoder and return encoded labels.

            Parameters
            ----------
            y : array-like of shape [n_samples]
                Target values.

            Returns
            -------
            y : array-like of shape [n_samples]
        """
        self.fit(y)
        return self.transform(y)

    def transform(self, y):
        """ Transform labels to normalized encoding.

            Parameters
            ----------
            y : array-like of shape [n_samples]
                Target values.

            Returns
            -------
            y : array-like of shape [n_samples]
        """
        return np.array([self.classes_.get(x, -1) for x in y])
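The integer labels produced by an encoder like the one above can then be expanded into one-hot rows. The following is a minimal sketch in pure Python, assuming the labels are integers in `range(n_classes)` and that -1 marks a value not seen during fitting (as in `transform` above). `labels_to_one_hot` is a hypothetical helper name, and mapping unseen values to an all-zero row is an assumption, not an established convention.

```python
def labels_to_one_hot(labels, n_classes):
    """Expand integer labels into one-hot rows; out-of-range labels give all zeros."""
    rows = []
    for label in labels:
        row = [0] * n_classes
        # A label of -1 (unseen value) falls outside the range and leaves the row empty.
        if 0 <= label < n_classes:
            row[label] = 1
        rows.append(row)
    return rows


encoded = labels_to_one_hot([0, 1, 0, -1], n_classes=2)
print(encoded)  # [[1, 0], [0, 1], [1, 0], [0, 0]]
```

Because the labels come from enumerating a set, which label belongs to which object is arbitrary, but it is consistent for everything encoded with the same fitted encoder.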

Question

Just to reiterate: My question is, what is the best way to one-hot-encode values that have no particular order? And if there is no standardised way to do this, why should one-hot-encoded features be ordered?