obskyr

Reputation: 1458

How to read a certain number of characters (as opposed to bytes) in Crystal?

In Crystal, if I have a string (or a file), how do I read a certain number of characters at a time? Using functions like IO#read, IO#gets, IO#read_string, and IO#read_utf8, one can specify a certain number of bytes to read, but not a certain number of UTF-8 characters (or ones of another encoding).

In Python, for example, one might do this:

from io import StringIO

s = StringIO("abcdefgh")
while True:
    chunk = s.read(4)
    if not chunk: break

Or, in the case of a file, this:

with open("example.txt", 'r') as f:
    while True:
        chunk = f.read(4)
        if not chunk: break

Generally, I'd expect IO::Memory to be the class to use for the string case, but as far as I can tell, its methods don't allow for this. How would one do this in an efficient and idiomatic fashion (for both strings and files – perhaps the answer is different for each) in Crystal?

Upvotes: 3

Views: 283

Answers (4)

nolyoly

Reputation: 136

In addition to the answers already given: for strings in Crystal, you can take a given number of characters with a range. Note that `..` is inclusive, so `0..4` covers five characters:

  my_string = "A foo, a bar."
  my_string[0..4] # => "A foo"
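
To read a whole string in fixed-size pieces with the same kind of indexing, something like the following sketch could work. It uses `String#[](start, count)`, which counts characters rather than bytes; the variable names and the chunk size of 4 are only illustrative.

```crystal
text = "A foo, a bar."
chunk_size = 4 # illustrative chunk size, counted in characters

chunks = [] of String
offset = 0
while offset < text.size
  # text[start, count] returns up to `count` characters starting at `start`
  chunks << text[offset, chunk_size]
  offset += chunk_size
end

p chunks # => ["A fo", "o, a", " bar", "."]
```

For strings containing non-ASCII characters, indexing by character position may need to scan the string, so iterating with `each_char` is probably the better fit for long inputs.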

Upvotes: 0

Johannes Müller

Reputation: 5661

There is currently no shortcut implementation for this in Crystal.

You can read individual chars with IO#read_char or consecutive ones with IO#each_char.

So a basic implementation would be:

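io = IO::Memory.new("€abcdefgh")

# Build a string from up to 4 characters (not bytes).
string = String.build do |builder|
  4.times do
    char = io.read_char
    break unless char # read_char returns nil at the end of the input
    builder << char
  end
end

puts string # => €abc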

Whether you use a memory IO, a file, or any other IO is irrelevant; the behaviour is the same.
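
To get something closer to the Python `read(4)` loop from the question, the loop above could be wrapped in a small helper. This is only a sketch: `read_chars` is a hypothetical name, not a standard library method, and returning `nil` at the end of the input is a design choice made here so that it can drive a `while` loop.

```crystal
# Hypothetical helper (not part of the standard library): reads up to
# `count` characters from `io`. Returns nil once nothing more can be read.
def read_chars(io : IO, count : Int) : String?
  result = String.build do |builder|
    count.times do
      if char = io.read_char
        builder << char
      else
        break # end of input reached before `count` characters
      end
    end
  end
  result.empty? ? nil : result
end

io = IO::Memory.new("€abcdefgh")
chunks = [] of String
while chunk = read_chars(io, 4)
  chunks << chunk
end

p chunks # => ["€abc", "defg", "h"]
```

Because the helper only relies on `IO#read_char`, the same call should work with a `File` opened for reading in place of the `IO::Memory`.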

Upvotes: 3

Peter Bauer

Reputation: 294

io = IO::Memory.new("€€€abc€€€")    # UTF-8 string from memory
# or:
# io = File.open("test.txt", "r")   # UTF-8 string from a file
iter = io.each_char.each_slice(4)   # read at most 4 chars at a time
iter.each do |slice|                # each slice is an Array(Char)
  puts slice
  puts slice.join                   # join into a String
end

Output:
['€', '€', '€', 'a']
€€€a
['b', 'c', '€', '€']
bc€€
['€']
€

Upvotes: 1

marzhaev

Reputation: 497

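This workaround seems to work for me, at least for ASCII input (as the question notes, IO#gets takes its limit in bytes, and it also stops at a newline):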

io = IO::Memory.new("abcdefghz")
chars_to_read = 2 # Number of chars to read
while true
    chunk = io.gets(chars_to_read) # Grab the chunk of type String?
    break if chunk.nil? # Break if nothing else to read aka `nil`
end

Upvotes: -1
