Sooraj

Reputation: 615

Matlab slow while reading 24 bit integers

I found that reading data packed in 24-bit integer format using Matlab's 'fread' with the 'int24' option takes a lot of time. If I read the data as 'int32', 'int16' or 'int8' instead, reading is much faster than with 'int24'. Is there a better way to reduce the time for reading 24-bit integer data?

To give a feel for the problem, sample code is given below.

clear all; close all; clc;

% generate some data and write it as a binary file
n=10000000;
x=randn(n,1);
fp=fopen('file1.bin', 'w');
fwrite(fp, x);
fclose(fp);

% read data in 24-bit format and measure the time
% please note that the data we get here will be different from 'x'.
% The sole purpose of the code is to demonstrate the delay in reading
% 'int24'

tic;
fp=fopen('file1.bin', 'r');
y1=fread(fp, n, 'int24');
fclose(fp);
toc;


% read data in 32-bit format and measure the time
% please note that the data we get here will be different from 'x'.
% This block is only the 'int32' baseline for comparison with the
% 'int24' timing above
tic;
fp=fopen('file1.bin', 'r');
y2=fread(fp, n, 'int32');
fclose(fp);
toc;

The output reads:

Elapsed time is 1.066489 seconds.
Elapsed time is 0.047944 seconds.

Although the 'int32' call requests more data (32*n bits rather than 24*n bits), it is about 22 times faster than the 'int24' call.
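
One thing worth checking when reproducing this benchmark: fwrite called without a precision argument writes each element as a single uint8 byte, so the test file holds n bytes, and an fread call that requests n multi-byte values stops at end of file. The sketch below shows this on a smaller file; the temporary file name and the 1000-element size are assumptions chosen for illustration.

```matlab
% Sketch: how many bytes does fwrite produce by default, and how many
% values can fread actually return from that file?
tmp = [tempname '.bin'];            % hypothetical scratch file
fp = fopen(tmp, 'w');
fwrite(fp, randn(1000, 1));         % no precision given, so 'uint8' is used
fclose(fp);

info = dir(tmp);
fprintf('File size: %d bytes\n', info.bytes);

fp = fopen(tmp, 'r');
[y, count] = fread(fp, 1000, 'int32');   % requests 1000 values of 4 bytes each
fclose(fp);
fprintf('Values returned: %d\n', count); % limited by the file size
delete(tmp);
```

The second output of fread is the number of values actually read, which makes it easy to confirm how much of the file each precision consumes.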

Upvotes: 5

Views: 1780

Answers (2)

gnovice

Reputation: 125864

I was able to achieve about a 4x speedup by reading the data as unsigned 8-bit integers and combining each set of three bytes into the equivalent 24-bit number. Note that this assumes unsigned, little-endian values, so you would have to modify it to account for signed or big-endian data:

>> tic;
>> fp = fopen('file1.bin', 'r');
>> y1 = fread(fp, n, 'bit24');
>> fclose(fp);
>> toc;
Elapsed time is 0.593552 seconds.

>> tic;
>> fp = fopen('file1.bin', 'r');
>> y2 = double(fread(fp, n, '*uint8'));  % This is fastest, for some reason
>> y2 = [1 256 65536]*reshape([y2; zeros(mod(-numel(y2), 3), 1)], 3, []);  % pad to a multiple of 3 bytes
>> fclose(fp);
>> toc;
Elapsed time is 0.143388 seconds.

>> isequal(y1,y2.')  % Test for equality of the values

ans =

     1

In the code above I just padded y2 with zeroes to match the size of y1. The vector y2 also ends up being a row vector instead of a column vector, which can be changed with a simple transpose if needed. For some reason having fread first output the values as uint8, then converting them to double was faster than any other option (i.e. outputting directly to double by making the last argument 'uint8' or 'uint8=>double').
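
As a rough sketch of the modification mentioned above for signed or big-endian data, the byte-combining step could be wrapped in a small function. The name bytes24ToDouble and its two flag arguments are hypothetical, not part of MATLAB, and the sketch assumes two's complement values and zero-pads an incomplete final triple in the same way as the code above.

```matlab
% bytes24ToDouble.m (hypothetical helper, not a MATLAB built-in)
% Combine raw bytes into 24-bit integers, returned as a double column vector.
function vals = bytes24ToDouble(raw, isSigned, isBigEndian)
    raw = double(raw(:));                       % byte values 0..255 as a column
    nPad = mod(-numel(raw), 3);                 % 0, 1 or 2 bytes needed to complete the last triple
    b = reshape([raw; zeros(nPad, 1)], 3, []);  % one 3-byte value per column
    if isBigEndian
        weights = [65536 256 1];                % most significant byte first
    else
        weights = [1 256 65536];                % least significant byte first
    end
    vals = (weights * b).';                     % unsigned result, 0..2^24-1
    if isSigned
        neg = vals >= 2^23;                     % top bit set: negative in two's complement
        vals(neg) = vals(neg) - 2^24;
    end
end
```

With the bytes read as in the code above, a call such as bytes24ToDouble(fread(fp, Inf, '*uint8'), true, false) would decode signed little-endian data. As a quick check, the three bytes 255 255 255 decode to -1 when isSigned is true.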

Upvotes: 2

HebeleHododo

Reputation: 3640

First, it is better to use bitn instead of int*.

If you change int24 to bit32, the code will run just as slowly. So I think the cause is not how many bits you read, but rather how fread handles the bitn precisions internally.
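
One way to check this on your own machine is to time the same file with several precision strings. This is only a sketch: the temporary file name, the 3e6-byte file size and the list of precisions are assumptions chosen for illustration, and the timings will vary with MATLAB version and hardware.

```matlab
% Sketch: time fread on the same file for several precision strings.
fname = [tempname '.bin'];                    % hypothetical scratch file
fp = fopen(fname, 'w');
fwrite(fp, randi([0 255], 3e6, 1), 'uint8');  % 3e6 bytes: divisible by 2, 3 and 4
fclose(fp);

precisions = {'int8', 'int16', 'int32', 'bit24', 'bit32'};
for k = 1:numel(precisions)
    fp = fopen(fname, 'r');
    tic;
    y = fread(fp, Inf, precisions{k});        % read the whole file
    t = toc;
    fclose(fp);
    fprintf('%-6s %8d values  %.4f s\n', precisions{k}, numel(y), t);
end
delete(fname);
```

If 'bit32' comes out noticeably slower than 'int32' even though both read 4 bytes per value, that would support the idea that the slowdown is tied to the bitn code path rather than to the amount of data.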

Upvotes: 1
