Pratik Thaker

Reputation: 657

Read byte data from a file into multiple integers

I'm trying to read data from a file containing contiguous byte values for 4-byte integers. For example, the integers 1, 2, 3 would be stored in a file containing the bytes:

00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000010 00000000 00000000 00000000 00000011 

I want to read this and assign each number to a different variable, for example a = 1, b = 2, and c = 3. How do I do this?

Any help with how to use the read and unpack methods would be appreciated. A very brief explanation of why your code works would also be welcome.

This file is generated by a program written in Java. I am dumping bytes because speed is key, but if the process of reading into separate integers gets easier by adding a delimiter byte or something similar, I'd be open to that suggestion as well.

Upvotes: 1

Views: 852

Answers (3)

Darek Nędza

Reputation: 1420

Here is a way that does not use unpack.

After you read the line into a string (str):

arr = []
str = str.gsub(/\s/, '') # delete all whitespace
len = str.length         # length of the cleaned string
i = 0

while i < len # walk the string from index 0 to the end
    arr << str[i...(i + 32)].to_i(2) # take 32 characters with a range ('string'[0...2] is 'st') and convert them to an Integer in base 2 with `to_i(base)`
    i += 32 # move to the next number (32 characters later for a 4-byte integer)
end
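
The loop above can also be sketched more compactly with String#scan, which cuts the cleaned string into fixed-width chunks. This is only a sketch: it assumes the input is text made of the characters 0 and 1 (not raw bytes) and that each integer is 32 bits wide, as in the question. The variable names are illustrative.

```ruby
# Assumption: the input is text made of '0'/'1' characters, 32 per integer.
line = "00000000 00000000 00000000 00000001 " \
       "00000000 00000000 00000000 00000010 " \
       "00000000 00000000 00000000 00000011 "

bits = line.delete(" \n")   # drop spaces and newlines

# Cut the string into 32-character chunks and parse each one as base 2.
numbers = bits.scan(/.{32}/).map { |chunk| chunk.to_i(2) }

a, b, c = numbers
puts a, b, c
```

The parallel assignment on the last lines gives the separate variables a, b and c the question asks for.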

If you store the numbers in a format like "1 2 3", your code should be faster because (compared with my solution) you don't need gsub and you don't need to calculate where each number starts.
Nevertheless, I suggest you benchmark the code you get from this topic. And if you are aiming for speed, you can try extending your code with C.

Here is a Ruby solution for that format:

str = "1 2 3 4"
arr = str.split # split the string on whitespace (same as `str.split(' ')`)
# result: ["1", "2", "3", "4"]
numbers = arr.collect { |el| el.to_i } # call `to_i` on each string in `arr` and store the results in a new array (`arr` is unchanged)
# result: [1, 2, 3, 4]

Of course, you can write it as a one-liner like this:

numbers = str.split.collect(&:to_i)

or like this:

numbers = str.split.collect { |el| el.to_i }
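
The benchmarking suggested above could be sketched with Ruby's standard Benchmark module. The sample data here is made up (10,000 integers in each text format), and the timings will vary by machine and Ruby version, so treat this as a starting point rather than a result.

```ruby
require 'benchmark'

# Assumption: made-up sample data, 10_000 integers in each text format.
values  = (1..10_000).to_a
decimal = values.join(' ')                                        # "1 2 3 ..."
binary  = values.map { |v| v.to_s(2).rjust(32, '0') }.join(' ')   # 32-digit binary text

from_decimal = nil
from_binary  = nil

Benchmark.bm(8) do |x|
  # Decimal text: split on whitespace and parse each token.
  x.report('decimal') { from_decimal = decimal.split.map(&:to_i) }
  # Binary text: strip spaces, cut into 32-character chunks, parse base 2.
  x.report('binary') do
    from_binary = binary.delete(' ').scan(/.{32}/).map { |s| s.to_i(2) }
  end
end
```

Both blocks produce the same array of integers, so the comparison measures only the parsing cost of each format.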

Upvotes: 1

Patrick Oscity

Reputation: 54674

I recommend using the bindata gem:

require 'bindata'

class MyBinaryFormat < BinData::Record
  endian :big  # the file stores big-endian integers
  uint32 :a
  uint32 :b
  uint32 :c
end

io = File.open('/path/to/binary/file', 'rb')  # open in binary mode
result = MyBinaryFormat.read(io)

puts result.a  # 1
puts result.b  # 2
puts result.c  # 3

If you cannot use gems, you can use String#unpack. You will need to use the N format which stands for "Integer, 32-bit unsigned, network (big-endian) byte order" (see Ruby Documentation). By using * you tell Ruby to convert the bytes into the specified type until it runs out of data. Here's how you would use it:

io = File.open('/path/to/binary/file', 'rb')  # open in binary mode
a, b, c = io.read(12).unpack('N*')  #=> 1, 2, 3

If you need to read more, adjust the argument to read accordingly (here 3 * 4 = 12 bytes).
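
When the number of integers is not known in advance, one option is to read the whole file and unpack everything at once. The sketch below writes a temporary sample file first so it can run on its own; the temp file stands in for the real one. It also shows the `l>` directive (32-bit signed, big-endian), which may fit better than `N` if the Java program writes negative values, since Java's int is signed.

```ruby
require 'tempfile'

# Build a sample file like the one described in the question: raw big-endian
# 4-byte integers with no delimiter. (The temp file is a stand-in.)
sample = Tempfile.new('ints')
sample.binmode
sample.write([1, 2, 3, -4].pack('l>*'))
sample.close

# File.binread returns the whole file as a binary string.
data = File.binread(sample.path)

unsigned = data.unpack('N*')   # 32-bit unsigned, big-endian
signed   = data.unpack('l>*')  # 32-bit signed, big-endian

p unsigned  # => [1, 2, 3, 4294967292]
p signed    # => [1, 2, 3, -4]

sample.unlink  # remove the temporary file
```

Reading the whole file at once is simple but holds all of it in memory; for very large files, reading fixed-size chunks with read may be preferable.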

Upvotes: 3

Малъ Скрылевъ

Reputation: 16507

You can use string operations to compute a number from its binary representation. Suppose your file contains the following text:

00000000 00000001 00000000 00000010 00000000 00000011 

And the code looks like this:

# Read the file and split it on whitespace, which gives:
# => ["00000000", "00000001", "00000000", "00000010", "00000000", "00000011"]
values =
IO.read( '1.1' ).split( /\s+/ ).map do | binary |
   i = -1
   binary.split( '' ).reverse.reduce( 0 ) do | sum, digit | # reduce the binary digits to a number
      i += 1
      sum + ( digit.to_i << i ) # add the digit shifted to its bit position
   end
end
# => [0, 1, 0, 2, 0, 3]
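
For comparison, the manual bit-shifting reduction computes the same values as the built-in String#to_i with base 2. The sketch below uses an in-memory array (`words`, a stand-in for the strings read from the file) to show the two side by side.

```ruby
# Assumption: `words` stands in for the strings read from the file.
words = %w[00000000 00000001 00000000 00000010 00000000 00000011]

# Manual reduction: add each bit shifted to its position,
# starting from the least significant (rightmost) digit.
manual = words.map do |binary|
  binary.chars.reverse.each_with_index.sum { |digit, i| digit.to_i << i }
end

# Built-in equivalent: parse the string as a base-2 number.
builtin = words.map { |binary| binary.to_i(2) }

p manual   # => [0, 1, 0, 2, 0, 3]
p builtin  # => [0, 1, 0, 2, 0, 3]
```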

The following code passes all the values previously stored in the array to the function proc_func, expanding the array into separate arguments:

def proc_func a, b, c, d, e, f
   puts a, b, c, d, e, f
end

proc_func(*values) # the splat expands the array into six arguments

# 0
# 1
# 0
# 2
# 0
# 3

Upvotes: 1
