BitTickler

Reputation: 11875

Is there a way to init a non-trivial static std::collections::HashMap without making it static mut?

In this code, A does not need to be static mut, but the compiler forces B to be static mut:

use std::collections::HashMap;
use std::iter::FromIterator;

static A: [u32; 21] = [
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
];
static mut B: Option<HashMap<u32, String>> = None;

fn init_tables() {
    let hm = HashMap::<u32, String>::from_iter(A.iter().map(|&i| (i, (i + 10u32).to_string())));
    unsafe {
        B = Some(hm);
    }
}

fn main() {
    init_tables();
    println!("{:?} len: {}", A, A.len());
    unsafe {
        println!("{:?}", B);
    }
}

This is the only way I have found to get close to what I actually want: a global, immutable HashMap to be used by several functions, without littering all my code with unsafe blocks.

I know that a global variable is a bad idea for multi-threaded applications, but mine is single threaded, so why should I pay the price for an eventuality which will never arise?

Since I use rustc directly and not cargo, I don't want the "help" of extern crates like lazy_static. I tried to decipher what the macro in that crate does, but to no avail.

I also tried to write this with thread_local! and a RefCell, but I had trouble using A to initialize B in that version.

In more general terms, the question could be "How to get stuff into the initvars section of a program in Rust?"

If you can show me how to initialize B directly (without a function like init_tables()), your answer is probably right.

If a function like init_tables() is inevitable, is there a trick like an accessor function to reduce the unsafe litter in my program?

Upvotes: 1

Views: 253

Answers (1)

edwardw

Reputation: 13942

How to get stuff into the initvars section of a program in Rust?

It turns out that rustc puts static data in the .rodata section and static mut data in the .data section of the generated binary:

#[no_mangle]
static DATA: std::ops::Range<u32> = 0..20;

fn main() { DATA.len(); }

$ rustc static.rs
$ objdump -t -j .rodata static
static:     file format elf64-x86-64

SYMBOL TABLE:
0000000000025000 l    d  .rodata    0000000000000000              .rodata
0000000000025490 l     O .rodata    0000000000000039              str.0
0000000000026a70 l     O .rodata    0000000000000400              elf_crc32.crc32_table
0000000000026870 l     O .rodata    0000000000000200              elf_zlib_default_dist_table
0000000000026590 l     O .rodata    00000000000002e0              elf_zlib_default_table
0000000000025060 g     O .rodata    0000000000000008              DATA
0000000000027f2c g     O .rodata    0000000000000100              _ZN4core3str15UTF8_CHAR_WIDTH17h6f9f810be98aa5f2E

So changing from static mut to static at the source level changes where the data lands in the generated binary. The .rodata section is read-only, and trying to write to it will segfault the program.

If init_tables() is of the judgement day category (inevitable)

It is probably inevitable. Since the default .rodata placement won't work, one has to control the link section directly:

use std::collections::HashMap;
use std::iter::FromIterator;

static A: std::ops::Range<u32> = 0..20;
#[link_section = ".bss"]
static B: Option<HashMap<u32, String>> = None;

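fn init_tables() {
    let data = HashMap::from_iter(A.clone().map(|i| (i, (i + 10).to_string())));
    unsafe {
        // Cast away constness to write into the non-`mut` static. This relies on B
        // living in a writable section; the language itself does not sanction
        // writing through a pointer derived from a shared reference.
        let b: *mut Option<HashMap<u32, String>> = &B as *const _ as *mut _;
        (&mut *b).replace(data);
    }
}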

fn main() {
    init_tables();
    println!("{:?} len: {}", A, A.len());
    println!("{:#?} 5 => {:?}", B, B.as_ref().unwrap().get(&5));
}
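
If the toolchain is recent enough (this assumes Rust 1.70 or later, where std::sync::OnceLock is stable), the accessor-function idea from the question can probably be done with the standard library alone, without unsafe and without any extern crate. The following is only a sketch under that assumption; the function name tables is made up for illustration:

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

static A: std::ops::Range<u32> = 0..20;

// Hypothetical accessor: builds the map on the first call and hands out
// a shared `&'static` reference on every call after that.
fn tables() -> &'static HashMap<u32, String> {
    // The cell is private to the accessor, so nothing else can touch it.
    static B: OnceLock<HashMap<u32, String>> = OnceLock::new();
    B.get_or_init(|| A.clone().map(|i| (i, (i + 10).to_string())).collect())
}

fn main() {
    println!("len: {}", tables().len());
    println!("5 => {:?}", tables().get(&5));
}
```

Callers only ever see an immutable reference, so no unsafe block is needed at the use sites, and there is no separate init_tables() call to forget.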

I don't want the "help" of extern crates like lazy_static

Actually, lazy_static isn't that complicated. It makes clever use of the Deref trait. Here is a much simplified standalone version, which is more ergonomic than the first example:

use std::collections::HashMap;
use std::iter::FromIterator;
use std::ops::Deref;
use std::sync::Once;

static A: std::ops::Range<u32> = 0..20;
static B: BImpl = BImpl;
struct BImpl;
impl Deref for BImpl {
    type Target = HashMap<u32, String>;

    #[inline(always)]
    fn deref(&self) -> &Self::Target {
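        static LAZY: (Option<HashMap<u32, String>>, Once) = (None, Once::new());
        // `Once` guarantees the closure runs at most once, even across threads.
        LAZY.1.call_once(|| unsafe {
            // Same const-to-mut cast as in the first example.
            let x: *mut Option<Self::Target> = &LAZY.0 as *const _ as *mut _;
            (&mut *x).replace(init_tables());
        });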

        LAZY.0.as_ref().unwrap()
    }
}

fn init_tables() -> HashMap<u32, String> {
    HashMap::from_iter(A.clone().map(|i| (i, (i + 10).to_string())))
}

fn main() {
    println!("{:?} len: {}", A, A.len());
    println!("{:#?} 5 => {:?}", *B, B.get(&5));
}
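
On newer toolchains the same Deref-based pattern appears to be available directly in the standard library. The sketch below assumes Rust 1.80 or later, where std::sync::LazyLock is stable; it is not what lazy_static itself does internally, just a std-only way to get similar ergonomics without unsafe:

```rust
use std::collections::HashMap;
use std::sync::LazyLock;

static A: std::ops::Range<u32> = 0..20;

// The closure runs on first access to B; afterwards B dereferences
// to the already-built map.
static B: LazyLock<HashMap<u32, String>> =
    LazyLock::new(|| A.clone().map(|i| (i, (i + 10).to_string())).collect());

fn main() {
    println!("len: {}", B.len());
    println!("5 => {:?}", B.get(&5));
}
```

Here B is a plain static (not static mut), and it is used as if it were the HashMap itself, which is close to what the question asks for.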

Upvotes: 2
