Reputation: 615
I found that reading data packed in 24-bit integer format with MATLAB's 'fread' and the 'int24' option takes a lot of time. If I read the data as 'int32', 'int16', or 'int8' instead, the read is much faster than with 'int24'. Is there a better way to reduce the time for reading 24-bit integer data?
To give a feel for the problem, sample code is given below.
clear all; close all; clc;
% generate some data and write it as a binary file
n=10000000;
x=randn(n,1);
fp=fopen('file1.bin', 'w');
fwrite(fp, x);
fclose(fp);
% read data in 24-bit format and measure the time
% please note that the data we get here will be different from 'x'.
% The sole purpose of the code is to demonstrate the delay in reading
% 'int24'
tic;
fp=fopen('file1.bin', 'r');
y1=fread(fp, n, 'int24');
fclose(fp);
toc;
% read data in 32-bit format and measure the time
% please note that the data we get here will be different from 'x'.
% The sole purpose of the code is to demonstrate the delay in reading
% 'int24'
tic;
fp=fopen('file1.bin', 'r');
y2=fread(fp, n, 'int32');
fclose(fp);
toc;
The output reads:
Elapsed time is 1.066489 seconds.
Elapsed time is 0.047944 seconds.
Though the 'int32' version reads more data (32*n bits), it is about 22 times faster than the 'int24' version.
Upvotes: 5
Views: 1780
Reputation: 125864
I was able to achieve about a 4x speedup by reading the data as unsigned 8-bit integers and combining each set of three bytes into the equivalent 24-bit number. Note that this assumes unsigned, little-endian values, so you would have to modify it to account for signed or big-endian data:
>> tic;
>> fp = fopen('file1.bin', 'r');
>> y1 = fread(fp, n, 'bit24');
>> fclose(fp);
>> toc;
Elapsed time is 0.593552 seconds.
>> tic;
>> fp = fopen('file1.bin', 'r');
>> y2 = double(fread(fp, n, '*uint8')); % This is fastest, for some reason
>> y2 = [1 256 65536]*reshape([y2; zeros(mod(-numel(y2), 3), 1)], 3, []); % pad only when the byte count is not a multiple of 3
>> fclose(fp);
>> toc;
Elapsed time is 0.143388 seconds.
>> isequal(y1,y2.') % Test for equality of the values
ans =
1
In the code above I just padded y2 with zeros to match the size of y1. The vector y2 also ends up being a row vector instead of a column vector, which can be changed with a simple transpose if needed. For some reason, having fread first output the values as uint8 and then converting them to double was faster than any other option (i.e. outputting directly to double by making the last argument 'uint8' or 'uint8=>double').
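As a rough sketch of the modification for signed data mentioned above: the same byte-combining trick can be followed by a two's-complement correction, where any value with the top bit set (2^23 or larger) has 2^24 subtracted from it. The sample values, the temporary file, and the variable names below are made up for illustration, and the data is assumed to be little-endian.

```matlab
% Illustrative data (not from the question): three signed 24-bit values
% written to a temporary file as little-endian bytes.
vals = [-1; 5; -8388608];                     % values we expect to decode
u = mod(vals, 2^24);                          % two's-complement bit patterns
bytes = uint8([mod(u, 256), mod(floor(u/256), 256), floor(u/65536)].');
fname = [tempname '.bin'];
fp = fopen(fname, 'w');
fwrite(fp, bytes(:), 'uint8');                % low byte first for each value
fclose(fp);

% Read the raw bytes, then combine each group of three.
fp = fopen(fname, 'r');
b = double(fread(fp, Inf, '*uint8'));
fclose(fp);
delete(fname);

b = b(1:3*floor(numel(b)/3));                 % drop any incomplete trailing group
y = [1 256 65536] * reshape(b, 3, []);        % unsigned little-endian value
% For big-endian data the weights would be [65536 256 1] instead.
neg = y >= 2^23;                              % top bit set means negative
y(neg) = y(neg) - 2^24;                       % two's-complement correction
y = y.';                                      % column vector, like fread returns
```

Unlike the padded version above, this sketch discards a trailing partial group rather than zero-filling it, so the element count can differ by one from what fread returns with 'bit24' on a file whose size is not a multiple of 3.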
Upvotes: 2
Reputation: 3640
First, it is better to use bitn instead of int*.
If you change int24 to bit32, the code will run just as slowly. So I think it is not about how many bits you read, but rather about the internal nature of bitn.
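One way to check this on your own machine is to time the same 32-bit read with both precision strings. This is only a sketch: the file name, the element count, and the variable names are arbitrary choices, and the timings will vary with MATLAB version and hardware.

```matlab
% Write n signed 32-bit integers to a temporary file.
n = 1e6;
fname = [tempname '.bin'];
fp = fopen(fname, 'w');
fwrite(fp, randi([-1000 1000], n, 1), 'int32');
fclose(fp);

% Time the read with the 'int32' precision string.
fp = fopen(fname, 'r');
tic; a = fread(fp, n, 'int32'); tInt = toc;
fclose(fp);

% Time the same read with the 'bit32' precision string.
fp = fopen(fname, 'r');
tic; b = fread(fp, n, 'bit32'); tBit = toc;
fclose(fp);
delete(fname);

fprintf('int32: %.4f s, bit32: %.4f s\n', tInt, tBit);
```

If the bitn path is the slow one, tBit should come out noticeably larger than tInt even though both calls read the same number of bytes.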
Upvotes: 1