Reputation: 731
I am trying to convert image data saved in a rosbag file to numpy arrays and opencv images for further processing. I cannot use cv_bridge or any of the other ROS utils.
I read the rosbag using the bagpy module here, and convert the data to a pandas dataframe:
import numpy as np
import cv2
import bagpy
from bagpy import bagreader
import matplotlib.pyplot as plt
import pandas as pd
import csv
b = bagreader('camera.bag')
image_csv = b.message_by_topic('/left/image')
df_limage = pd.read_csv('camera/left-image.csv')
Because the rosbag stores images as type bytestring, the df_limage dataframe looks like:
>>> df_limage.head()
time height width encoding is_bigendian data
1.593039e+09 1080 1920 rgb8 0 b' \'\n"*\x0c$\'\x14\x1f...
When I try to examine the image stored in the data column, I see that each image is stored as a string:
>>> type(df_limage['data'][0])
str
>>> len(df_limage['data'][0])
15547333
>>> print(df_limage['data'][0])
b' \'\n"*\x0c$\'\x14\x1f#\x0f\x1d!\x12 %\x16\x1f\'\x0e\x1c%\x0b\x1c&\x12\x19#\x10\x1e#\x13\x1f$\x14##\x16!!\x13$$"$$"&*\x12$(\x1...
When I try to decode this using code from this answer, I get warnings and NoneType returns:
>>> nparr = np.fromstring(df_limage['data'][0], np.uint8)
DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead
>>> img_np = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
>>> type(img_np)
NoneType
I think this is because the string isn't being read correctly as a bytestring and nparr hasn't been reshaped into a 3-channel RGB image of dimensions (1080 x 1920). The size of nparr is 15547333, so it can't be reshaped into a (1080 x 1920 x 3) image, which leads me to believe that the np.fromstring call isn't correct.
How do I take a byte string that is represented as a string with a leading "b'", convert it back to an actual byte string so I can then convert it into an array, and then an opencv image?
Thanks
Upvotes: 0
Views: 1118
Reputation: 731
This is what I ended up having to do (without using the ast library):
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
# read image data as raw string from csv
>>> df = pd.read_csv('camera_data.csv')
>>> df.head()
Time data
0 11578 b' \'\n"*\x0c$\'\x14\x1f#\x0f\x1d!\x12 %\x16...
1 11579 b'\x19)\n\x15%\x07 (\x0f\x1d&\x0c\x16$\x18\x15...
2 11580 b'\x1a)\x04\x17&\x01\x17&\x13\x16%\x12\x1f...
3 11581 b'\x18%\x03\x19&\x04!$\x03\x1f"\x01\x1e#\x11\...
# access the raw string representation of first image string in column df['data']
# raw string appears as: 'b\' \\\'\\n"*\\x0c$\\\'\\x14\\x1f#...'
raw_string = df['data'][0]
# convert to byte string with escape characters included
byte_string = raw_string[2:-1].encode('latin1')
# remove escaped characters
escaped_string = byte_string.decode('unicode_escape')
# convert back to byte string without escaped characters
byte_string = escaped_string.encode('latin1')
# convert string to numpy array
# this will throw a warning to use np.frombuffer
nparr = np.fromstring(byte_string, np.uint8)
# convert to 3 channel rgb image array of (H x W x 3)
rgb = nparr.reshape((1080, 1920, -1))
# show image in matplotlib
plt.imshow(rgb)
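Since the original goal was an OpenCV image, and the rosbag encoding is rgb8 while OpenCV expects BGR channel order, a minimal follow-on sketch using the rgb array from above could be:
import cv2

# OpenCV functions expect BGR channel order, so swap channels first
bgr = cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR)

# e.g. write the frame to disk with OpenCV
cv2.imwrite('frame0.png', bgr)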
Upvotes: 0
Reputation: 207630
Your image is pure rgb8 pixels in a bytes type. That means it is not a str and you shouldn't treat it as such, and you shouldn't be passing it to cv2.imdecode(), because that decompresses images and turns them into Numpy arrays of pixels, which is nearly what you already have.
So, you have a number of contiguous bytes representing pixels. The length of your bytes should be 1920x1080x3, i.e. one byte per channel for 3 channels of 1080p dimensions. We need to make a Numpy array and then reshape it from a long line into a 1080p image:
na = np.frombuffer(YOURBYTES, dtype=np.uint8).reshape((1080,1920,3))
You should generally only be calling cv2.imdecode() on things that look like either a PNG:
b'\x89PNG\r\n\x1a\n\x00\x00...'
or a JPEG:
b'\xff\xd8\xff\xe0\x00\x10JFIF...'
or something with a TIFF (b'II' or b'MM') or BMP (b'BM') magic signature.
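A small sniffing helper along these lines could choose between the two paths automatically, assuming the payload is already a bytes object (the looks_like_compressed_image() and to_array() names are just illustrative):
import cv2
import numpy as np

def looks_like_compressed_image(buf: bytes) -> bool:
    """True if the buffer starts with a PNG, JPEG, TIFF or BMP magic signature."""
    return buf.startswith((b'\x89PNG\r\n\x1a\n',   # PNG
                           b'\xff\xd8\xff',         # JPEG
                           b'II', b'MM',            # TIFF (little/big endian)
                           b'BM'))                  # BMP

def to_array(buf: bytes, height=1080, width=1920):
    # Compressed formats go through cv2.imdecode(); raw rgb8 buffers are reshaped directly
    if looks_like_compressed_image(buf):
        return cv2.imdecode(np.frombuffer(buf, dtype=np.uint8), cv2.IMREAD_COLOR)
    return np.frombuffer(buf, dtype=np.uint8).reshape((height, width, 3))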
If your buffer begins with a base64-encoded version of either of the above, i.e. iVBORw0KGgo= (PNG) or /9 (JPEG), you need to base64-decode it first, then call cv2.imdecode() on the result of that:
from base64 import b64decode
import numpy as np
import cv2
# Extract JPEG-encoded image from base64-encoded string
JPEG = b64decode(YOURDATA)
# Decode JPEG back into Numpy array
na = cv2.imdecode(np.frombuffer(JPEG,dtype=np.uint8), cv2.IMREAD_COLOR)
If your data is bytes type and already has the same length as the dimensions of your image, i.e. len(YOURBYTES) == height*width*nChannels like you have, that means it is pure, uncompressed pixels, so you just need the first part of this answer:
na = np.frombuffer(YOURBYTES, dtype=np.uint8).reshape((1080,1920,3))
Note that, unlike in Parts 1 and 2 above, the reshaping is necessary here because there was no JPEG or PNG metadata telling us the height and width of the image.
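Since the CSV exported by bagpy in the question carries height and width columns for every message, the reshape dimensions don't have to be hard-coded either; a possible sketch, assuming the df_limage dataframe from the question and a raw_bytes object holding the data field already converted back to real bytes (as in the other answer), would be:
import numpy as np

# height/width are stored per message in the CSV exported by bagpy
h = int(df_limage['height'][0])
w = int(df_limage['width'][0])

# raw_bytes: the decoded bytes for one frame (hypothetical name, see the other answer)
na = np.frombuffer(raw_bytes, dtype=np.uint8).reshape((h, w, 3))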
Upvotes: 0