Reputation: 13
I am trying to use the docx_rust crate to merge a list of Word documents into one. I tried the code below, but I am running into a lifetime problem with doc, because it is dropped at the end of each iteration of the for loop.
use docx_rust::{Docx, DocxFile};
use std::path::Path;

fn merge_docs<'a>(file_paths: &[&str], output_path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let mut merged_doc = Docx::default();
    for path in file_paths {
        let doc = DocxFile::from_file(Path::new(path))?; // No Rc here initially
        let mut p_doc = doc.parse().unwrap();
        merged_doc.document.body.content.extend(p_doc.document.body.content.drain(..));
    } // doc is dropped here, but merged_doc still refers to it
    merged_doc.write_file(Path::new(output_path))?;
    Ok(())
}
I tried pushing each opened doc into a vector, but that did not extend the lifetime.
I also tried wrapping them in Rc::new(), but that did not work either.
Any solution is welcome!
Upvotes: 0
Views: 92
Reputation: 680
I tried running the above code on my local machine. What worked for me was:
use docx_rust::{Docx, DocxFile};
use std::path::Path;

fn merge_docs(file_paths: &[&str], output_path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let mut merged_doc = Docx::default();
    let mut docs: Vec<DocxFile> = Vec::new();
    // First loop: load every file so that the vector owns all of them.
    for path in file_paths {
        let doc = DocxFile::from_file(Path::new(path))?; // No Rc here initially
        docs.push(doc);
    }
    // Second loop: only borrow from the vector, which is no longer modified.
    for doc in &docs {
        let mut p_doc = doc.parse().unwrap();
        merged_doc.document.body.content.extend(p_doc.document.body.content.drain(..));
    }
    merged_doc.write_file(Path::new(output_path))?;
    Ok(())
}
The differences are:
There are two loops instead of one.
The first loop loads every DocxFile into a vector, so the vector owns the files and keeps them alive for the rest of the function.
The second loop takes a reference to each element of the vector.
It then performs the parse and extend operations on that reference.
Because the vector owns every DocxFile and is not modified again, the files that p_doc borrows from in the second loop are never moved or dropped while merged_doc still refers to them.
The underlying reason for the error is that p_doc holds references into doc. From the docx_rust documentation:
To reduce allocations, DocxFile::parse returns a Docx struct contains references to DocxFile itself. It means you have to make sure that DocxFile lives as long as its returned Docx:
So with the original code, whenever we reach the end of the loop body, doc is dropped, but merged_doc (via extend) is still holding references into that doc.
{
    let doc = DocxFile::from_file(Path::new(path))?; // No Rc here initially
    let mut p_doc = doc.parse().unwrap();
    merged_doc.document.body.content.extend(p_doc.document.body.content.drain(..));
}
Because of this lifetime issue, the above code will not compile, which means we have to make doc live longer.
If you try to push doc into a vec inside the same loop to extend its lifetime, Rust still complains: merged_doc (via p_doc) holds references into doc, and pushing doc moves it, which would invalidate those references. So this approach does not work either.
{
    let doc = DocxFile::from_file(Path::new(path))?; // No Rc here initially
    let mut p_doc = doc.parse()?;
    merged_doc.document.body.content.extend(p_doc.document.body.content.drain(..));
    docs.push(doc); // rejected: doc is moved while merged_doc still borrows from it
}
This is why the two-loop approach works: we first move every DocxFile into the vec and only afterwards take references, so no later move can invalidate them.
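The same ownership pattern can be reproduced without docx_rust. The sketch below uses only the standard library; RawFile and Parsed are hypothetical stand-ins for DocxFile and Docx (they are not part of the crate), chosen only to mimic a parse method whose result borrows from its owner.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for `DocxFile`: it owns the raw text.
struct RawFile {
    text: String,
}

/// Hypothetical stand-in for `Docx<'a>`: it borrows from a `RawFile`.
#[derive(Default)]
struct Parsed<'a> {
    content: Vec<Cow<'a, str>>,
}

impl RawFile {
    fn load(text: &str) -> RawFile {
        RawFile { text: text.to_string() }
    }

    /// The returned value borrows from `self`, so `self` must outlive it.
    fn parse(&self) -> Parsed<'_> {
        Parsed {
            content: self.text.lines().map(Cow::Borrowed).collect(),
        }
    }
}

/// Phase 1: create all owners up front.
fn load_all(texts: &[&str]) -> Vec<RawFile> {
    texts.iter().map(|t| RawFile::load(t)).collect()
}

/// Phase 2: only borrow from the owners. Nothing in `files` moves or is
/// dropped while `merged` holds references into it.
fn merge<'a>(files: &'a [RawFile]) -> Parsed<'a> {
    let mut merged = Parsed::default();
    for file in files {
        let mut parsed = file.parse();
        merged.content.extend(parsed.content.drain(..));
    }
    merged
}
```

Calling load_all and then merge(&files) mirrors the two loops in the answer: the signature of merge ties the result to the slice of owners, so the compiler can see that the owners outlive the merged value.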
For further details, see the documentation for error E0505: https://doc.rust-lang.org/error_codes/E0505.html. About its example, it says:
Here, the function eat takes ownership of x. However, x cannot be moved because the borrow to _ref_to_val needs to last till the function borrow. To fix that you can do a few different things:
Try to avoid moving the variable.
Release borrow before move.
Implement the Copy trait on the type.
Our solution uses the first option: avoid moving the variable.
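For readers who have not opened the linked page, here is a small sketch in the spirit of that example. The Value type and the function bodies are my own illustration, not copied from the documentation. The commented-out line is the move that triggers E0505, and the two functions show the first two fixes from the list.

```rust
struct Value {
    n: u32,
}

/// Takes ownership of its argument, so calling it moves the value.
fn eat(val: Value) -> u32 {
    val.n
}

/// Only borrows its argument.
fn borrow(val: &Value) -> u32 {
    val.n
}

/// Fix 1: avoid moving the variable while a reference to it is alive.
fn avoid_move() -> u32 {
    let x = Value { n: 7 };
    let ref_to_val = &x;
    // eat(x); // error[E0505]: cannot move out of `x` because it is borrowed
    let a = borrow(&x); // borrowing again is fine
    a + borrow(ref_to_val) // the first borrow is still used here
}

/// Fix 2: release the borrow before the move.
fn release_then_move() -> u32 {
    let x = Value { n: 7 };
    let b = {
        let ref_to_val = &x;
        borrow(ref_to_val)
    }; // the borrow ends with this block
    b + eat(x) // moving `x` is allowed now
}
```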
P.S. The borrow checker will also reject the vector approach if we push to the vector after the second loop while merged_doc is still in use (for example, before the call to write_file), because merged_doc still borrows from the vector.
Let's look at this simple example:
for path in file_paths {
    let doc = DocxFile::from_file(Path::new(path))?; // No Rc here initially
    docs.push(doc);
}
for doc in &docs {
    let mut p_doc = doc.parse()?;
    merged_doc.document.body.content.extend(p_doc.document.body.content.drain(..));
}
{
    let doc = DocxFile::from_file("origin.docx").unwrap();
    docs.push(doc); // rejected if merged_doc is used after this point
}
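The restriction only lasts as long as the borrowing value is in use. Below is a minimal standard-library sketch of that rule, where a Vec<String> stands in for the vector of DocxFile and a Vec<&str> stands in for merged_doc. The function name and its parameters are hypothetical and only serve the illustration.

```rust
/// Returns the merged text and the final number of owners.
fn merge_then_push(inputs: &[&str], late: &str) -> (String, usize) {
    // Phase 1: the owners (stand-ins for `DocxFile`).
    let mut owners: Vec<String> = inputs.iter().map(|s| s.to_string()).collect();

    // Phase 2: `merged` borrows from `owners` (stand-in for `Docx<'a>`).
    let mut merged: Vec<&str> = Vec::new();
    for owner in &owners {
        merged.extend(owner.split_whitespace());
    }

    // owners.push(late.to_string());
    // ^ error[E0502] if uncommented: `owners` is still immutably borrowed,
    //   because `merged` is used below.

    // Last use of `merged` (think of it as the `write_file` call).
    let output = merged.join(" ");

    // The shared borrow has ended, so mutating `owners` is allowed again.
    owners.push(late.to_string());
    (output, owners.len())
}
```

In other words, a late push is accepted once the merged value has been used for the last time, which in the original function would be after the output has been written.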
Upvotes: 2