Reputation: 9129
I have a big (~2 GB) YAML file. I use the yaml-cpp library and the YAML::LoadFile function, but I run out of RAM when loading it.
What is the easiest way to split this file into several smaller ones so that each small file is itself a valid YAML file (perhaps using standard Linux tools)?
Upvotes: 1
Views: 3657
Reputation: 76722
If you have multiple documents in your file, you can split on the --- marker at the beginning of a line.
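A minimal sketch of that marker-based split, streaming the file line by line so it never has to fit in memory. The function names and the part_NNNNN.yaml naming scheme are made up for illustration. It assumes the file has no %YAML or %TAG directives; a directive placed before a --- marker would end up at the tail of the preceding part.

```python
import os


def is_document_start(line):
    # A document start marker is "---" at column 0, followed by
    # whitespace or the end of the line.
    return line.startswith("---") and (len(line) == 3 or line[3] in " \t\r\n")


def split_documents(src_path, out_dir):
    """Stream src_path and write each YAML document to its own file."""
    paths = []
    out = None
    with open(src_path, encoding="utf-8") as src:
        for line in src:
            # Open a new part at every marker, and also for any content
            # that precedes the first marker.
            if out is None or is_document_start(line):
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, f"part_{len(paths):05d}.yaml")
                out = open(path, "w", encoding="utf-8")
                paths.append(path)
            out.write(line)
    if out is not None:
        out.close()
    return paths
```

Each part keeps its own --- line, so it can be loaded on its own afterwards.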
If you don't have multiple documents (or if you do, but the individual documents are still too big), your document has either a mapping or a sequence at the top level (in theory it could also be a multi-line scalar, but that is unlikely).
If the top level of your document is in flow style (a mapping with { }, a sequence with [ ]), then how to split it depends heavily on how the file is laid out. But if it is in block style, you can easily find the individual keys of the top-level mapping, or the elements of the top-level sequence: they all have the same indentation as the first element (most likely zero indent).
Split your YAML document based on the above information and process each element on its own.
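A rough sketch of the indentation-based split for a block-style document, grouping a fixed number of top-level elements per chunk. The names split_top_level and max_items are hypothetical. It assumes the top level has zero indent, there is no leading --- line or directive, max_items is at least 1, and no anchors are referenced by aliases in another chunk (those would break once separated).

```python
def starts_with_dash(line):
    # "- item" or a bare "-" at column 0 marks a block sequence entry.
    return line.startswith("-") and (len(line) == 1 or line[1] in " \t\r\n")


def split_top_level(lines, max_items):
    """Yield lists of lines holding at most max_items top-level elements."""
    chunk, count, top_is_sequence = [], 0, None
    for line in lines:
        # Blank, indented, and comment lines never start a top-level element.
        at_column_zero = bool(line.strip()) and line[0] not in " \t#"
        if at_column_zero:
            if top_is_sequence is None:
                # The first real line tells us whether the top level
                # is a sequence or a mapping.
                top_is_sequence = starts_with_dash(line)
            # In a top-level mapping, "- x" at column 0 is a sequence
            # value of the previous key, not a new element.
            if starts_with_dash(line) == top_is_sequence:
                if count == max_items:
                    yield chunk
                    chunk, count = [], 0
                count += 1
        chunk.append(line)
    if chunk:
        yield chunk
```

Passing an open file object as lines keeps memory use bounded by the size of one chunk; each yielded chunk can be written to its own file and is a smaller mapping or sequence of the same kind as the original.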
Upvotes: 2