Reputation: 97048
I have a large JSON file containing a moderate number of very large strings, and I don't want to hold it all in memory at once.
Is it possible to use serde_json (or similar) to load everything except these large strings, and to load the strings themselves lazily? For example, the parser could store each string's file offset and length instead of its contents, and then provide a function to actually read the string.
Upvotes: -1
Views: 79
Reputation: 1048
You will need a custom parser to load the strings lazily, but serde_json::from_reader(_) together with std::io::BufReader helps a bit, because the file is parsed as a stream instead of first being read into one large String.
Edit: changed print_mem_stat() to also report peak allocation.
use std::io::{BufReader, BufWriter, Read};
use peak_alloc::PeakAlloc;

// Route all allocations through PeakAlloc so usage can be queried.
#[global_allocator]
static PEAK_ALLOC: PeakAlloc = PeakAlloc;

fn print_mem_stat() {
    let current_mem = PEAK_ALLOC.current_usage_as_mb();
    println!("This program currently uses {} MB of RAM.", current_mem);
    // Note: the peak is reported in GB, unlike the current usage above (MB).
    let peak_mem = PEAK_ALLOC.peak_usage_as_gb();
    println!("The max amount that was used {}", peak_mem);
    println!();
}

fn main() {
    println!("start");
    print_mem_stat();
    let path = std::env::current_dir().unwrap().join("data.json");

    // Write the test file: 2^26 i32 values, about 256 MB in memory.
    {
        let data = (0..(1 << 26)).collect::<Vec<_>>();
        let f = std::fs::OpenOptions::new().write(true).truncate(true).create(true).open(&path).unwrap();
        let wtr = BufWriter::new(f);
        serde_json::to_writer_pretty(wtr, &data).unwrap();
        println!("writing data");
        print_mem_stat();
    }

    // Parse from a BufReader: the file contents are streamed.
    {
        let f = std::fs::OpenOptions::new().read(true).open(&path).unwrap();
        let rdr = BufReader::new(f);
        // `_v` stays alive until the end of the block, so it is counted below.
        let _v: Vec<i32> = serde_json::from_reader(rdr).unwrap();
        println!("reading data from reader");
        print_mem_stat();
    }

    // Read the whole file into a String before parsing.
    {
        let f = std::fs::OpenOptions::new().read(true).open(&path).unwrap();
        let mut rdr = BufReader::new(f);
        let mut content = String::new();
        rdr.read_to_string(&mut content).unwrap();
        let _v: Vec<i32> = serde_json::from_str(&content).unwrap();
        println!("reading data as from string");
        print_mem_stat();
    }

    std::fs::remove_file(path).unwrap();
}

start
This program currently uses 0.0010881424 MB of RAM.
The max amount that was used 0.0000010626391
writing data
This program currently uses 256.00116 MB of RAM.
The max amount that was used 0.25000876
reading data from reader
This program currently uses 256.00116 MB of RAM.
The max amount that was used 0.37500876
reading data as from string
This program currently uses 1013.4126 MB of RAM.
The max amount that was used 1.1146607
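
As for the custom parser itself, here is a rough sketch of the offset-and-length idea from the question, using only the standard library. It is not a serde_json feature: `StringSpan`, `index_large_strings`, and `read_raw` are hypothetical names made up for this example. The scanner only tracks quotes and backslash escapes, assumes the input is valid JSON, and returns the raw bytes between the quotes, so escape sequences such as \n or \" are still encoded and would need decoding afterwards.

```rust
use std::io::{self, BufReader, Cursor, Read, Seek, SeekFrom};

/// Location of a string literal's raw (still JSON-escaped) bytes, quotes excluded.
/// Hypothetical type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct StringSpan {
    offset: u64,
    len: u64,
}

/// Stream through the JSON once and record every string literal whose raw
/// length is at least `min_len` bytes. Only the spans are kept in memory.
fn index_large_strings<R: Read>(reader: R, min_len: u64) -> io::Result<Vec<StringSpan>> {
    let mut spans = Vec::new();
    let mut pos: u64 = 0;
    let mut start: Option<u64> = None; // Some(offset) while inside a string
    let mut escaped = false; // true right after a backslash inside a string

    for byte in BufReader::new(reader).bytes() {
        let b = byte?;
        match (start, b) {
            // The byte after a backslash never terminates the string.
            (Some(_), _) if escaped => escaped = false,
            (Some(_), b'\\') => escaped = true,
            // Unescaped closing quote: the string ends just before `pos`.
            (Some(s), b'"') => {
                let len = pos - s;
                if len >= min_len {
                    spans.push(StringSpan { offset: s, len });
                }
                start = None;
            }
            // Opening quote: the content starts at the next byte.
            (None, b'"') => start = Some(pos + 1),
            _ => {}
        }
        pos += 1;
    }
    Ok(spans)
}

/// Load one string on demand by seeking to its recorded position.
fn read_raw<S: Read + Seek>(source: &mut S, span: StringSpan) -> io::Result<String> {
    source.seek(SeekFrom::Start(span.offset))?;
    let mut buf = vec![0u8; span.len as usize];
    source.read_exact(&mut buf)?;
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

fn main() -> io::Result<()> {
    // A Cursor stands in for std::fs::File here; both implement Read + Seek.
    let json = r#"{"id": 7, "blob": "AAAAAAAAAAAAAAAAAAAA", "note": "a \"quoted\" word here"}"#;
    let mut file = Cursor::new(json.as_bytes());

    let spans = index_large_strings(&mut file, 16)?;
    for span in &spans {
        println!("{:?} -> {}", span, read_raw(&mut file, *span)?);
    }
    Ok(())
}
```

Object keys are strings too, so they are indexed as well whenever they reach `min_len` bytes. To get a structured view of the remaining small values, this index would still have to be combined with a second pass that skips the recorded spans, which is not shown here.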
Upvotes: -2