Reputation: 343
Using Rust, I want to take a slice of bytes from a Vec and display them as hex on the console. I can make this work using the itertools format function and println!, but I cannot figure out how it works. Here is the code, simplified:
use itertools::Itertools;
// Create a vec of bytes
let mut buf = vec![0u8; 1024];
// ... fill the vec with some data; it doesn't matter how (I'm reading from a socket) ...
// Create a slice into the vec
let bytes = &buf[..5];
// Print the slice, using format from itertools, example output could be: 30 27 02 01 00
println!("{:02x}", bytes.iter().format(" "));
(as an aside, I realize I can use the much simpler itertools join function, but in this case I don't want the default 0x## style formatting, as it is somewhat bulky)
How on earth does this work under the covers? I know itertools format is creating a "Format" struct, and I can see the source code here, https://github.com/rust-itertools/itertools/blob/master/src/format.rs , but I am none the wiser. I suspect the answer has to do with "macro_rules! impl_format" but that is just about where my head explodes.
Can some Rust expert explain the magic? I hate to blindly copy and paste code without a clue. Am I abusing itertools? Maybe there is a better, simpler way to go about this.
Upvotes: 1
Views: 3274
Reputation: 65832
I suspect the answer has to do with "macro_rules! impl_format" but that is just about where my head explodes.
The impl_format! macro is used to implement the various formatting traits.
impl_format!{Display Debug
UpperExp LowerExp UpperHex LowerHex Octal Binary Pointer}
The author has chosen to write a macro because the implementations all look the same. The way repetitions work in macros means that macros can be very helpful even when they are used only once (here, we could do the same by invoking the macro once for each trait, but that's not true in general).
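To make the repetition idea concrete, here is a minimal sketch of the same pattern using only the standard library. Wrapper and impl_forward are hypothetical names invented for this illustration; they are not part of itertools. A single invocation expands to one impl block per listed trait:

```rust
use std::fmt;

// Hypothetical wrapper type, used only for this illustration.
struct Wrapper(u8);

// The `$( ... )*` repetition emits one impl block for every trait name
// passed to the macro.
macro_rules! impl_forward {
    ($($fmt_trait:ident)*) => {
        $(
            impl fmt::$fmt_trait for Wrapper {
                fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                    // Forward to the inner value's implementation of the same trait.
                    fmt::$fmt_trait::fmt(&self.0, f)
                }
            }
        )*
    };
}

impl_forward! { LowerHex UpperHex Binary }

fn main() {
    let w = Wrapper(0x2a);
    println!("{:02x} {:02X} {:08b}", w, w, w);
}
```

This prints 2a 2A 00101010: three trait implementations from a single macro invocation.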
Let's expand the implementation of LowerHex for Format and look at it:
impl<'a, I> fmt::LowerHex for Format<'a, I>
    where I: Iterator,
          I::Item: fmt::LowerHex,
{
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        self.format(f, fmt::LowerHex::fmt)
    }
}
The fmt method calls another method, format, defined in the same module.
impl<'a, I> Format<'a, I>
    where I: Iterator,
{
    fn format<F>(&self, f: &mut fmt::Formatter, mut cb: F) -> fmt::Result
        where F: FnMut(&I::Item, &mut fmt::Formatter) -> fmt::Result,
    {
        let mut iter = match self.inner.borrow_mut().take() {
            Some(t) => t,
            None => panic!("Format: was already formatted once"),
        };

        if let Some(fst) = iter.next() {
            cb(&fst, f)?;
            for elt in iter {
                if self.sep.len() > 0 {
                    f.write_str(self.sep)?;
                }
                cb(&elt, f)?;
            }
        }
        Ok(())
    }
}
format takes two arguments: the formatter (f) and a formatting function (cb, for callback). The formatting function here is fmt::LowerHex::fmt. This is the fmt method from the LowerHex trait; how does the compiler figure out which LowerHex implementation to use? It's inferred from format's type signature. The type of cb is F, and F must implement FnMut(&I::Item, &mut fmt::Formatter) -> fmt::Result. Notice the type of the first argument: &I::Item (I is the type of the iterator that Format wraps). LowerHex::fmt's signature is:
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result;
For any type Self that implements LowerHex, this function will implement FnMut(&Self, &mut fmt::Formatter) -> fmt::Result. Thus, the compiler infers that Self == I::Item.
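The inference can be reproduced without itertools. The following is a minimal sketch, not the library's code: Joined, format_with and hex_line are hypothetical names, and the type borrows a slice instead of owning an iterator. The point is that fmt::LowerHex::fmt is passed with no type annotation, and the bound on F is what pins Self to the element type:

```rust
use std::fmt;

// Hypothetical stand-in for itertools' Format, borrowing a slice
// instead of owning an iterator.
struct Joined<'a, T> {
    items: &'a [T],
    sep: &'a str,
}

impl<'a, T> Joined<'a, T> {
    // Generic over the callback, in the same spirit as Format::format.
    fn format_with<F>(&self, f: &mut fmt::Formatter<'_>, mut cb: F) -> fmt::Result
    where
        F: FnMut(&T, &mut fmt::Formatter<'_>) -> fmt::Result,
    {
        for (i, item) in self.items.iter().enumerate() {
            if i > 0 {
                f.write_str(self.sep)?;
            }
            cb(item, f)?;
        }
        Ok(())
    }
}

impl<'a, T: fmt::LowerHex> fmt::LowerHex for Joined<'a, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // No annotation needed: the first parameter of cb is &T, so the
        // compiler selects <T as fmt::LowerHex>::fmt.
        self.format_with(f, fmt::LowerHex::fmt)
    }
}

impl<'a, T: fmt::Binary> fmt::Binary for Joined<'a, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.format_with(f, fmt::Binary::fmt)
    }
}

// Formats a byte slice as space-separated, zero-padded hex.
fn hex_line(bytes: &[u8]) -> String {
    format!("{:02x}", Joined { items: bytes, sep: " " })
}

fn main() {
    println!("{}", hex_line(&[0x30, 0x27, 0x02, 0x01, 0x00]));
}
```

With the bytes from the question this prints 30 27 02 01 00.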
One important thing to note here is that the formatting attributes (e.g. the 02 in your formatting string) are stored in the Formatter. Implementations of e.g. LowerHex will use methods such as Formatter::width to retrieve an attribute. The trick here is that the same formatter is used to format multiple values (with the same attributes).
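A small probe shows what an implementation can read back from the Formatter. Probe is a hypothetical type made up for this sketch; Formatter::width, Formatter::sign_aware_zero_pad and Formatter::alternate are standard library methods:

```rust
use std::fmt;

// Hypothetical probe type: instead of printing a value, it reports the
// attributes it finds in the Formatter it was handed.
struct Probe;

impl fmt::LowerHex for Probe {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Read the attributes first, then write, so the borrows of `f`
        // do not overlap.
        let width = f.width();
        let zero_pad = f.sign_aware_zero_pad();
        let alternate = f.alternate();
        write!(f, "width={:?} zero_pad={} alternate={}", width, zero_pad, alternate)
    }
}

fn main() {
    println!("{:02x}", Probe);
    println!("{:#x}", Probe);
}
```

The first line prints width=Some(2) zero_pad=true alternate=false and the second prints width=None zero_pad=false alternate=true. Because these attributes live in the Formatter rather than in the value, handing the same Formatter to every element applies 02 to each of them.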
In Rust, methods can be called in two ways: using method syntax and using function syntax. These two functions are equivalent:
use std::fmt;

pub fn method_syntax(f: &mut fmt::Formatter) -> fmt::Result {
    use fmt::LowerHex;
    let x = 42u8;
    x.fmt(f)
}

pub fn function_syntax(f: &mut fmt::Formatter) -> fmt::Result {
    let x = 42u8;
    fmt::LowerHex::fmt(&x, f)
}
When format is called with fmt::LowerHex::fmt, cb refers to fmt::LowerHex::fmt. format must use function call syntax because there's no guarantee that the callback is even a method!
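To illustrate that last point, here is a sketch in which the same callback slot accepts either a trait method or a closure. FormatWith and render are hypothetical names for this example only, and the standard library is all it needs:

```rust
use std::fmt;

// Hypothetical wrapper: Display is implemented by calling a stored callback
// on each element.
struct FormatWith<'a, T, F> {
    items: &'a [T],
    sep: &'a str,
    cb: F,
}

impl<'a, T, F> fmt::Display for FormatWith<'a, T, F>
where
    F: Fn(&T, &mut fmt::Formatter<'_>) -> fmt::Result,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (i, item) in self.items.iter().enumerate() {
            if i > 0 {
                f.write_str(self.sep)?;
            }
            // cb is an arbitrary callable, so it can only be invoked with
            // function call syntax.
            (self.cb)(item, f)?;
        }
        Ok(())
    }
}

// Renders the items to a String using the given callback.
fn render<T, F>(items: &[T], sep: &str, cb: F) -> String
where
    F: Fn(&T, &mut fmt::Formatter<'_>) -> fmt::Result,
{
    FormatWith { items, sep, cb }.to_string()
}

fn main() {
    let bytes = [0x30u8, 0x27, 0x02];
    // A trait method, named with function syntax.
    println!("{}", render(&bytes, " ", fmt::LowerHex::fmt));
    // A closure, which is not a method of any type.
    println!("{}", render(&bytes, " ", |b, f| write!(f, "[{:02x}]", b)));
}
```

The first call prints 30 27 2 (no width is set, because to_string uses a default formatter) and the second prints [30] [27] [02].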
am I abusing itertools
Not at all; in fact, this is precisely how format is meant to be used.
maybe there is a better, simpler way to go about this
There are simpler ways, sure, but using format is very efficient because it writes each element straight to the formatter and doesn't allocate dynamic memory.
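For comparison, here are two standard-library-only alternatives, sketched under the assumption that a String result is acceptable. hex_joined and hex_into are hypothetical helper names. The first is arguably the simplest to read but allocates a String per byte; the second writes into a single buffer:

```rust
use std::fmt::Write;

// Simple but allocation-heavy: one String per byte, a Vec to hold them,
// and the joined result.
fn hex_joined(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}

// One buffer: each byte is written directly into the output String.
fn hex_into(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 3);
    for (i, b) in bytes.iter().enumerate() {
        if i > 0 {
            out.push(' ');
        }
        write!(out, "{:02x}", b).expect("writing to a String cannot fail");
    }
    out
}

fn main() {
    let bytes = [0x30u8, 0x27, 0x02, 0x01, 0x00];
    println!("{}", hex_joined(&bytes));
    println!("{}", hex_into(&bytes));
}
```

Both print 30 27 02 01 00. Unlike the itertools version, both build a String before anything reaches the console.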
Upvotes: 2