new_one

Reputation: 105

InvalidArgumentError: Expected image (JPEG, PNG, or GIF), got unknown format starting with 'B2.jpg'

I have a dataset of TFRecord files that I'm trying to parse with this code:

image_size = [224,224]
def read_tfrecord(tf_record):
    features = {
        "filename": tf.io.FixedLenFeature([], tf.string), # tf.string means bytestring
        "fun": tf.io.FixedLenFeature([], tf.string),  
        "label": tf.io.VarLenFeature(tf.int64),
    }
    tf_record = tf.parse_single_example(tf_record, features)

    filename = tf.image.decode_jpeg(tf_record['filename'], channels=3)
    filename = tf.cast(filename, tf.float32) / 255.0  # convert image to floats in [0, 1] range
    filename = tf.reshape(filename, [*image_size, 3]) # explicit size will be needed for TPU

    label = tf.cast(tf_record['label'],tf.float32)

    return filename, label

def load_dataset(filenames):
    option_no_order = tf.data.Options()
    option_no_order.experimental_deterministic = False

    dataset = tf.data.Dataset.from_tensor_slices(filenames)
    dataset = dataset.with_options(option_no_order)
    #dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=16)
    dataset = dataset.interleave(tf.data.TFRecordDataset, cycle_length=32, num_parallel_calls=AUTO) # faster
    dataset = dataset.map(read_tfrecord, num_parallel_calls=AUTO)
    return dataset

train_data=load_dataset(train_filenames)
val_data=load_dataset(val_filenames)
test_data=load_dataset(test_filenames)

After running this code, inspecting train_data gives:
<DatasetV1Adapter shapes: ((224, 224, 3), (?,)), types: (tf.float32, tf.float32)>

I then tried to display some images from the dataset with:

def display_9_images_from_dataset(dataset):
    subplot=331
    plt.figure(figsize=(13,13))
    images, labels = dataset_to_numpy_util(dataset, 9)
    for i, image in enumerate(images):
        title = CLASSES[np.argmax(labels[i], axis=-1)]
        subplot = display_one_flower(image, title, subplot)
        if i >= 8:
            break

    plt.tight_layout()
    plt.subplots_adjust(wspace=0.1, hspace=0.1)
    plt.show()



def dataset_to_numpy_util(dataset, N):
    dataset = dataset.batch(N)

    if tf.executing_eagerly():
        # In eager mode, iterate over the Dataset directly.
        for images, labels in dataset:
            numpy_images = images.numpy()
            numpy_labels = labels.numpy()
            break

    else:
        # In non-eager mode, get the TF node that yields the next item
        # and run it in a tf.Session.
        get_next_item = dataset.make_one_shot_iterator().get_next()
        with tf.Session() as ses:
            numpy_images, numpy_labels = ses.run(get_next_item)

    return numpy_images, numpy_labels


display_9_images_from_dataset(train_data)

But I get this error:

InvalidArgumentError: Expected image (JPEG, PNG, or GIF), got unknown format starting with 'B2.jpg' [[{{node DecodeJpeg}}]] [[IteratorGetNext_3]]

I'm a bit confused. First, the message mentions a .jpg file while the function asks for JPEG, which as far as I understand are the same format.
Second, I'm not sure how to view the images, or even how to tell whether I parsed the records correctly.

Upvotes: 0

Views: 6631

Answers (1)

karandeep36

Reputation: 336

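The ".jpg" and ".jpeg" extensions refer to the same format, and tf.image.decode_jpeg never looks at an extension: it expects the encoded image bytes themselves.

The error says the input starts with 'B2.jpg', which means the string stored under "filename" is the file name, not the file's contents.

Decode the feature that actually holds the image bytes instead. If the records only store file names, load each file with tf.io.read_file before decoding. Renaming the files will not change anything.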
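
The error message quotes the first bytes of the string handed to the decoder, so a quick way to see which feature really contains image data is to look at those leading bytes: encoded JPEG data begins with the bytes FF D8 FF, PNG data with a fixed 8-byte signature, and GIF data with "GIF87a" or "GIF89a". The sketch below is plain Python with no TensorFlow dependency. The helper name sniff_image_format and the sample values are made up for illustration and are not part of any library.

```python
# Hypothetical helper (not a TensorFlow API): guess an image format
# from the leading "magic" bytes of an encoded image.
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
GIF_MAGICS = (b"GIF87a", b"GIF89a")


def sniff_image_format(data: bytes) -> str:
    """Return 'jpeg', 'png', 'gif', or 'unknown' for the given bytes."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(GIF_MAGICS):
        return "gif"
    return "unknown"


# Made-up sample values: a file name versus the start of real JPEG data.
print(sniff_image_format(b"B2.jpg"))                        # unknown
print(sniff_image_format(b"\xff\xd8\xff\xe0\x00\x10JFIF"))  # jpeg
```

Applied to the raw values of "filename" and "fun" from a single record, a check like this would show which of the two, if either, can be passed to tf.image.decode_jpeg.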

Upvotes: 2
