Data Cardinality error but all x trains have the same length as y labels

Question

Whenever I run model.fit, I get a data cardinality error, even though each of my x training inputs has the same length as my y labels (100). This is the exact message:

ValueError: Data cardinality is ambiguous: x sizes: 100, 158, 105, 158, 104, 117, 47, 94, 76, 69, 84, 325, 192, 564, 70, 77, 232, 677, 91, 177, 79, 122, 113, 142, 169, 208, 83, 142, 73, 202, 41, 247, 117, 149, 169, 91, 60, 76, 194, 232, 156, 61, 191, 164, 79, 218, 106, 52, 604, 181, 154, 100, 108, 41, 44, 97, 162, 57, 224, 78, 296, 171, 43, 184, 114, 485, 107, 219, 96, 257, 127, 82, 150, 294, 47, 121, 74, 130, 331, 76, 368, 83, 223, 42, 169, 222, 109, 115, 86, 221, 191, 132, 598, 281, 329, 100, 38, 61, 60, 49, 725, 158, 105, 158, 104, 117, 47, 94, 76, 69, 84, 325, 192, 564, 70, 77, 232, 677, 91, 177, 79, 122, 113, 142, 169, 208, 83, 142, 73, 202, 41, 247, 117, 149, 169, 91, 60, 76, 194, 232, 156, 61, 191, 164, 79, 218, 106, 52, 604, 181, 154, 100, 108, 41, 44, 97, 162, 57, 224, 78, 296, 171, 43, 184, 114, 485, 107, 219, 96, 257, 127, 82, 150, 294, 47, 121, 74, 130, 331, 76, 368, 83, 223, 42, 169, 222, 109, 115, 86, 221, 191, 132, 598, 281, 329, 100, 38, 61, 60, 49, 725, 158, 105, 158, 104, 117, 47, 94, 76, 69, 84, 325, 192, 564, 70, 77, 232, 677, 91, 177, 79, 122, 113, 142, 169, 208, 83, 142, 73, 202, 41, 247, 117, 149, 169, 91, 60, 76, 194, 232, 156, 61, 191, 164, 79, 218, 106, 52, 604, 181, 154, 100, 108, 41, 44, 97, 162, 57, 224, 78, 296, 171, 43, 184, 114, 485, 107, 219, 96, 257, 127, 82, 150, 294, 47, 121, 74, 130, 331, 76, 368, 83, 223, 42, 169, 222, 109, 115, 86, 221, 191, 132, 598, 281, 329, 100, 38, 61, 60, 49, 725 y sizes: 100

Here is my code:


# Imports assumed from the names used below (the original snippet omitted them)
import pickle

import numpy as np
from datasets import DownloadConfig
from tensorflow.keras.models import load_model
from tensorflow.keras.optimizers import Adam

# Load the saved model without its compile state, then recompile it
model = load_model("./project/dependencies/multimodal_sentiment/model", compile=False)
model.compile(optimizer=Adam(), loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
download_config = DownloadConfig(cache_dir="./cache_dir")

# Load the IEMOCAP Emotion Recognition dataset
with open("./project/dependencies/multimodal_sentiment/compiled_training_data.pkl", "rb") as datafile:
    dataset = pickle.load(datafile)

reshaped_data = [
    [i[0] for i in dataset], 
    [i[1] for i in dataset],
]

pitch_train = [i["pitch_data"] for i in reshaped_data[0]]
volume_train = [i["volume_data"] for i in reshaped_data[0]]
mfcc_train = [i["mfcc_data"] for i in reshaped_data[0]]
dialogue_train = np.array([i["dialogue_embedding"] for i in reshaped_data[0]])
y_labels = np.array(reshaped_data[1])

print(f"pitch: {set(map(type, pitch_train))}, volume: {set(map(type, volume_train))}, mfcc: {set(map(type, mfcc_train))}, dialogue: {set(map(type, dialogue_train))}, y_labels: {set(map(type, y_labels))}")
print("Length of pitch_train:", len(pitch_train))
print("Length of volume_train:", len(volume_train))
print("Length of mfcc_train:", len(mfcc_train))
print("Length of dialogue_train:", len(dialogue_train))
print("Length of y_labels:", len(y_labels))


history = model.fit(
    {
        "pitch_data": pitch_train,
        "volume_data": volume_train,
        "mfcc_data": mfcc_train,
        "dialogue_embedding": dialogue_train
    },
    y_labels,
    epochs=5,
    batch_size=32
)

The first print statement reports numpy ndarrays as the element type for every input, and the length print statements all return 100. Each dialogue_embedding (in dialogue_train) has a fixed length of 384. The pitch and volume sequences vary in length from sample to sample, but within a given sample they have the same length. Here are the shapes of each feature in the first sample:

pitch_data: (158,)
volume_data: (158,)
mfcc_data: (158, 13)
dialogue_embedding: (384,)
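
One plausible reading of the error, offered as a sketch rather than a confirmed diagnosis: pitch_train, volume_train and mfcc_train are Python lists of arrays with different lengths, so Keras may be treating every list element as a separate input. That would explain why the x sizes in the message repeat the per-sample lengths (158, 105, ...) instead of showing 100. Padding each list into a single array with a common time dimension is one way around this. The helper name pad_to_array, the zero padding value, and the toy data below are assumptions, not part of the original code.

```python
import numpy as np

def pad_to_array(sequences, pad_value=0.0):
    """Stack variable-length arrays into one array of shape (n, max_len, ...)."""
    max_len = max(len(s) for s in sequences)
    # Keep any trailing feature dimensions (e.g. the 13 MFCC coefficients)
    feature_shape = np.asarray(sequences[0]).shape[1:]
    out = np.full((len(sequences), max_len) + feature_shape, pad_value, dtype=np.float32)
    for i, seq in enumerate(sequences):
        seq = np.asarray(seq, dtype=np.float32)
        out[i, :len(seq)] = seq  # copy the real values, leave the tail padded
    return out

# Toy stand-ins for pitch_train and mfcc_train (hypothetical data)
pitch_demo = [np.ones(158), np.ones(105), np.ones(47)]
mfcc_demo = [np.ones((158, 13)), np.ones((105, 13)), np.ones((47, 13))]

pitch_padded = pad_to_array(pitch_demo)
mfcc_padded = pad_to_array(mfcc_demo)
print(pitch_padded.shape)  # (3, 158)
print(mfcc_padded.shape)   # (3, 158, 13)
```

With real data, pitch_train, volume_train and mfcc_train would each be passed through the helper before calling model.fit. Keras also ships a pad_sequences utility that does similar padding, and a Masking layer can be used so that padded timesteps are ignored; whether that fits this model is untested here.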

This is how I am receiving inputs in my model:

pitch_input = Input(shape=(None, 1), name="pitch_data")

volume_input = Input(shape=(None, 1), name="volume_data")

mfcc_input = Input(shape=(None, 13), name="mfcc_data")

dialogue_embedding_input = Input(shape=(384, 1), name="dialogue_embedding")
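
The Input layers for pitch_data, volume_data and dialogue_embedding each declare a trailing feature axis of size 1, while the per-sample shapes listed above have no such axis. A minimal sketch of adding it with NumPy follows, assuming the data is already stacked into regular arrays; the toy batch size of 4 and the variable names are illustrative only, and depending on the Keras version this step may or may not be required.

```python
import numpy as np

# Toy batches standing in for already-padded training data (hypothetical)
pitch_batch = np.zeros((4, 158), dtype=np.float32)     # (samples, timesteps)
dialogue_batch = np.zeros((4, 384), dtype=np.float32)  # (samples, embedding)

# Append a trailing axis so the arrays match shape=(None, 1) and shape=(384, 1)
pitch_ready = pitch_batch[..., np.newaxis]
dialogue_ready = np.expand_dims(dialogue_batch, axis=-1)

print(pitch_ready.shape)     # (4, 158, 1)
print(dialogue_ready.shape)  # (4, 384, 1)
```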

I'm new to ML, so let me know if there is any additional information I can provide to help diagnose the issue.
