Reputation: 121
I've written a Python script that reads a black-and-white bitmap image and stores the value of each pixel as a hex value from 0x00 to 0xFF in a .txt file. The values are stored as a continuous 1D array separated by commas, and to avoid very wide files each line holds at most 16 elements, e.g.:
v01, v02, ... , v15, v16,
v17, v18, ... , v31, v32,
...
0x00, 0x00, ... , 0x00, 0x00,
0x00, 0x00, ... , 0x00, 0x00,
...
Note that the last element of each line is also followed by a comma.
Of course the .txt file doesn't keep the original dimensions of the bitmap, but it is not an issue because it will be later used in a micro-controller firmware, which knows the original dimensions and takes care of properly reading the 1D array.
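For reference, the dumping step can be sketched roughly like this (a minimal sketch assuming the pixels are already available as a flat list of 0-255 integers; the function name and arguments are made up, and the bitmap-reading step of the original script is omitted):

```python
def pixels_to_hex_txt(pixels, txt_path, per_line=16):
    """Write pixel values (0-255) as comma-separated hex, 16 per line."""
    with open(txt_path, 'w') as out:
        for i in range(0, len(pixels), per_line):
            chunk = pixels[i:i + per_line]
            # every value, including the last on each line, is followed by a comma
            out.write(', '.join('0x%02X' % v for v in chunk) + ',\n')
```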
Now, in order to verify that the conversion is done properly, I need to write a script that reads the file and stores the values in a numpy array, which is later used to display the image with matplotlib. I've tried the following code:
from numpy import genfromtxt

my_data = genfromtxt('file.txt', delimiter=',')
print(my_data)
The issue with that is, apart from the wrong dimensions, that the hex values are not read as numbers and that the element after the last comma of each row is also read (the line-break character, I guess). I get something like:
[nan, nan, ... , nan, nan," "
...]
I need a way of reading the .txt file, converting the values from the "0x00" format to numeric values, and storing them in an m x n numpy array (m and n are known parameters, the original bitmap size):
[[0, 0, ... , 0, 0]
[0, 0, ... , 0, 0]
...]
Any suggestions on how to do so?
Update
While writing the question I was only working with files whose widths were multiples of 16 pixels, which guaranteed that my csv outputs always had 16 elements in every row. But after some testing I came across a picture whose size made the last row of the csv shorter than 16 elements. In that case I was not able to use the solution provided by @taras, but the answer was still correct as per my initial question.
Finally I ended up with the following code; maybe not as elegant, but it does the trick:
with open(filename, "r") as f:
    pixels = [x.split(',') for x in f.readlines()]
for p in pixels:
    del p[-1]  # drop the token left after each line's trailing comma
pixels = [int(p, 16) for row in pixels for p in row]
pixels = np.asarray(pixels, dtype=np.uint8).reshape(h, w)
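A slightly more defensive variant (an alternative sketch, not the code above) keeps only non-empty tokens instead of deleting the last element of each row, so it also parses cleanly if the final line happens to lack a trailing comma:

```python
import numpy as np

def read_hex_txt(filename, h, w):
    # split the whole file on commas and keep only non-empty tokens;
    # int(x, 16) ignores surrounding whitespace, including newlines
    with open(filename) as f:
        tokens = f.read().split(',')
    values = [int(t, 16) for t in tokens if t.strip()]
    return np.asarray(values, dtype=np.uint8).reshape(h, w)
```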
I'm keeping both answers in case somebody finds them useful.
Upvotes: 1
Views: 181
Reputation: 6915
Since you have a fixed number of columns, you can use that to read only the first 16 columns (which also lets you strip the trailing comma) and convert each column from hex with a converters dict that applies int(x, 16):
import numpy as np
fname = 'file.txt'
num_cols = 16
data = np.loadtxt(fname, usecols=range(num_cols), dtype=np.uint8, delimiter=',',
                  converters={k: lambda x: int(x, 16) for k in range(num_cols)})
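A quick round-trip check of this approach might look like the following (the sample file contents and names here are made up for illustration):

```python
import numpy as np

# create a small sample file: two rows of 16 hex values, trailing commas included
with open('sample.txt', 'w') as f:
    for row in range(2):
        f.write(', '.join('0x%02X' % (row * 16 + col) for col in range(16)) + ',\n')

num_cols = 16
# usecols drops the empty field after each row's trailing comma
data = np.loadtxt('sample.txt', usecols=range(num_cols), dtype=np.uint8, delimiter=',',
                  converters={k: lambda x: int(x, 16) for k in range(num_cols)})
print(data.shape)  # (2, 16)
```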
Edit:
If the number of elements in the file is not a multiple of 16, you can preprocess the data with regular Python code and then convert it into a numpy array:
import numpy as np
fname = 'file.txt'
with open(fname) as fp:
    data = fp.read().replace('\n', '')
# skip the empty token produced by the trailing comma
np.array([int(x, 16) for x in data.split(',') if x.strip()])
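Since the bitmap dimensions are known, the resulting 1D array can then be reshaped to the requested m x n layout; for example (h, w, and the sample data below are made up):

```python
import numpy as np

# sample flattened data: 6 hex values with trailing commas, newlines removed
data = '0x0A, 0x0B, 0x0C,\n0x0D, 0x0E, 0x0F,'.replace('\n', '')
h, w = 2, 3  # known bitmap height and width
# skip the empty token after the final comma, then reshape to h rows x w columns
arr = np.array([int(x, 16) for x in data.split(',') if x.strip()],
               dtype=np.uint8).reshape(h, w)
```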
Upvotes: 1