BeginnersMindTruly

Reputation: 731

Convert image bytestring entry in pandas dataframe to image in opencv

I am trying to convert image data saved in a rosbag file to numpy arrays and opencv images for further processing. I cannot use cv_bridge or any of the other ROS utilities.

I read the rosbag using the bagpy module and convert the data to a pandas dataframe:

import numpy as np
import cv2
import bagpy
from bagpy import bagreader
import matplotlib.pyplot as plt
import pandas as pd
import csv

b = bagreader('camera.bag')
image_csv = b.message_by_topic('/left/image')
df_limage = pd.read_csv('camera/left-image.csv')

Because the rosbag stores images as type bytestring, the df_limage dataframe looks like:

>>> df_limage.head()
time           height    width   encoding    is_bigendian    data
1.593039e+09   1080      1920    rgb8        0               b' \'\n"*\x0c$\'\x14\x1f...

When I try to examine the image stored in the data column, I see that each image is stored as a string:

>>> type(df_limage['data'][0])
str
>>> len(df_limage['data'][0])
15547333
>>> print(df_limage['data'][0])
b' \'\n"*\x0c$\'\x14\x1f#\x0f\x1d!\x12 %\x16\x1f\'\x0e\x1c%\x0b\x1c&\x12\x19#\x10\x1e#\x13\x1f$\x14##\x16!!\x13$$"$$"&*\x12$(\x1...

When I try to decode this using code from another answer, I get a deprecation warning and cv2.imdecode returns None:

>>> nparr = np.fromstring(df_limage['data'][0], np.uint8)
DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead
>>> img_np = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
>>> type(img_np)
NoneType

I think this is because the string isn't being read correctly as a bytestring and nparr hasn't been reshaped into a 3-channel RGB image of dimensions (1080 x 1920). The size of nparr is 15547333, so it can't be reshaped into a (1080 x 1920 x 3) image which leads me to believe that the np.fromstring call isn't correct.
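The length mismatch can be checked with a few lines. This is a minimal sketch using a small made-up byte string instead of the real frame, and it assumes the CSV cell holds the text repr of a bytes object (which the leading b' suggests). The repr spends up to four characters on each non-printable byte, so the stored text is longer than the data it describes:

```python
# Expected size of one raw rgb8 frame, from the height and width columns.
height, width, channels = 1080, 1920, 3
expected_len = height * width * channels  # 6220800 bytes

# Hypothetical stand-in for image data: every possible byte value once.
raw = bytes(range(256))

# The text form of a bytes object, as it would appear in a CSV cell.
text = repr(raw)

print(expected_len)
print(len(raw), len(text))  # the text is longer than the bytes it describes
print(text[:2])             # the text starts with the b' prefix
```

A string of 15547333 characters is therefore plausible for a frame of 6220800 bytes, but it has to be turned back into bytes before it can be reshaped.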

How do I take a bytestring that is represented as a str with a leading "b'", convert it back to a real bytestring, then into an array, and finally into an opencv image?

Thanks

Upvotes: 0

Views: 1118

Answers (2)

BeginnersMindTruly

Reputation: 731

This is what I ended up having to do (without using the ast library):

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt

# read image data as raw string from csv
>>> df = pd.read_csv('camera_data.csv')
>>> df.head()
    Time    data
0   11578   b' \'\n"*\x0c$\'\x14\x1f#\x0f\x1d!\x12 %\x16...
1   11579   b'\x19)\n\x15%\x07 (\x0f\x1d&\x0c\x16$\x18\x15...
2   11580   b'\x1a)\x04\x17&\x01\x17&\x13\x16%\x12\x1f...
3   11581   b'\x18%\x03\x19&\x04!$\x03\x1f"\x01\x1e#\x11\...

# access the text of the first image in column df['data']
# read back from the CSV, it is the repr of a bytes object: 'b\' \\\'\\n"*\\x0c$\\\'\\x14\\x1f#...'
raw_string = df['data'][0]

# strip the leading b' and the trailing ', then encode so the escape sequences can be processed
byte_string = raw_string[2:-1].encode('latin1')

# interpret the escape sequences (\x0c, \n, \' ...) as the characters they stand for
unescaped_string = byte_string.decode('unicode_escape')

# convert back to a byte string, one byte per character
byte_string = unescaped_string.encode('latin1')

# convert the bytes to a numpy array of uint8
# (np.frombuffer replaces the deprecated binary mode of np.fromstring)
nparr = np.frombuffer(byte_string, dtype=np.uint8)

# convert to 3 channel rgb image array of (H x W x 3)
rgb = nparr.reshape((1080, 1920, -1))

# show image in matplotlib
plt.imshow(rgb)
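For comparison, here is a minimal sketch of the ast-based route that the steps above avoid. It assumes the CSV cell holds exactly the repr of a bytes object. The helper name bytes_from_repr and the tiny sample are hypothetical; ast.literal_eval comes from the standard library and only evaluates literals. It has not been timed on a full 1080p frame here.

```python
import ast

def bytes_from_repr(text):
    """Turn the text "b'...'" (the repr of a bytes object) back into bytes."""
    value = ast.literal_eval(text)
    if not isinstance(value, bytes):
        raise ValueError("cell does not contain a bytes literal")
    return value

# Hypothetical stand-in for one CSV cell: a 2 x 2 rgb8 image stored as text.
original = bytes(range(12))
cell = repr(original)

recovered = bytes_from_repr(cell)
print(recovered == original, len(recovered))
```

The recovered bytes can then go through the same np.frombuffer and reshape steps as above.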

Upvotes: 0

Mark Setchell

Reputation: 207630

Your image is pure rgb8 pixels in a bytes type. That means:

  • it isn't a str and you shouldn't treat it as such, and
  • it isn't JPEG-encoded or PNG-encoded, so you shouldn't be passing it to cv2.imdecode() because that decompresses images and turns them into Numpy arrays of pixels, which is nearly what you already have.

So, you have a number of contiguous bytes representing pixels. The length of your bytes should be 1920x1080x3, i.e. one byte per channel for 3 channels of 1080p dimensions. We need to make a Numpy array and then reshape it from a long line into 1080p:

na = np.frombuffer(YOURBYTES, dtype=np.uint8).reshape((1080,1920,3))

General rule:

Part 1

You should generally only be calling cv2.imdecode() on things that look like either a PNG:

b'\x89PNG\r\n\x1a\n\x00\x00...'

or a JPEG:

b'\xff\xd8\xff\xe0\x00\x10JFIF...'

or something that starts with a TIFF (b'II' or b'MM') or BMP (b'BM') magic signature.
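The rule in Part 1 can be sketched as a small check that runs before deciding whether to call cv2.imdecode(). The function name guess_image_format is hypothetical, and the table only holds the signatures listed above. The two-byte TIFF and BMP signatures can match raw pixel data by chance, so treat the result as a hint:

```python
def guess_image_format(buf):
    """Return a guessed container format for buf, or None if nothing matches."""
    signatures = {
        b'\x89PNG\r\n\x1a\n': 'png',
        b'\xff\xd8\xff': 'jpeg',
        b'II': 'tiff',  # little-endian TIFF
        b'MM': 'tiff',  # big-endian TIFF
        b'BM': 'bmp',
    }
    for magic, name in signatures.items():
        if buf.startswith(magic):
            return name
    return None

print(guess_image_format(b'\x89PNG\r\n\x1a\n\x00\x00'))
print(guess_image_format(b' \'\n"*\x0c$\'\x14\x1f'))  # start of the raw rgb8 data
```

A result of None for the buffer from the question is consistent with it being raw pixels, not an encoded file.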

Part 2

If your buffer begins with a base64-encoded version of either of the above, i.e. iVBORw0KGgo (PNG) or /9j/ (JPEG), you need to base64-decode it, then call cv2.imdecode() on the result of that.

from base64 import b64decode
import numpy as np
import cv2

# Extract JPEG-encoded image from base64-encoded string
JPEG = b64decode(YOURDATA)

# Decode JPEG back into Numpy array
na = cv2.imdecode(np.frombuffer(JPEG,dtype=np.uint8), cv2.IMREAD_COLOR)
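The base64 prefixes mentioned in Part 2 follow directly from the magic bytes in Part 1. A short sketch using only the standard library to reproduce them (the variable names are illustrative):

```python
from base64 import b64encode

png_magic = b'\x89PNG\r\n\x1a\n'
jpeg_magic = b'\xff\xd8\xff'

# Encode only the signatures to see what a base64 payload starts with.
png_prefix = b64encode(png_magic).decode('ascii')
jpeg_prefix = b64encode(jpeg_magic).decode('ascii')

print(png_prefix)
print(jpeg_prefix)
```

The trailing "=" on the PNG prefix is padding from encoding the 8 signature bytes on their own. It does not appear at the start of a longer payload.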

Part 3

If your data is bytes type and already has the same length as the dimensions of your image, i.e. len(YOURBYTES) == height*width*nChannels like you have, that means it is pure, uncompressed pixels, so you just need the first part of this answer:

na = np.frombuffer(YOURBYTES, dtype=np.uint8).reshape((1080,1920,3))

Note that, unlike in Parts 1 and 2 above, the reshaping is necessary here because there was no JPEG or PNG metadata telling us the height and width of the image.
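Part 3 can be wrapped in a small helper that checks the length before reshaping. This is a sketch, not a fixed recipe: the name raw_to_image is hypothetical, and the height, width and channel count are assumed to come from the message metadata (the height, width and encoding columns in the question).

```python
import numpy as np

def raw_to_image(buf, height, width, channels=3):
    """Reshape uncompressed 8-bit pixel bytes into a (height, width, channels) array."""
    expected = height * width * channels
    if len(buf) != expected:
        raise ValueError(f"got {len(buf)} bytes, expected {expected}")
    # dtype must be given explicitly; np.frombuffer defaults to float64.
    return np.frombuffer(buf, dtype=np.uint8).reshape((height, width, channels))

# Hypothetical 2 x 4 rgb8 image standing in for a real 1080 x 1920 frame.
pixels = bytes(range(24))
img = raw_to_image(pixels, height=2, width=4)
print(img.shape, img.dtype)
```

The array returned by np.frombuffer on a bytes object is read-only, so call .copy() before modifying it. OpenCV display and write functions expect BGR channel order, so an rgb8 frame would need cv2.cvtColor(img, cv2.COLOR_RGB2BGR) before cv2.imshow() or cv2.imwrite().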

Upvotes: 0
