Seruccia

Reputation: 13

Why can't I extend the lifetime of a variable in my function?

I am trying to use the docx_rust crate to merge a list of Word documents into one. I tried the code below, but I get a lifetime error on doc because it is dropped at the end of each iteration of the for loop.

use docx_rust::{Docx, DocxFile};
use std::path::Path;

fn merge_docs<'a>(file_paths: &[&str], output_path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let mut merged_doc = Docx::default();

    for path in file_paths {
        let doc = DocxFile::from_file(Path::new(path))?;  // No Rc here initially
        let mut p_doc = doc.parse().unwrap();
        merged_doc.document.body.content.extend(p_doc.document.body.content.drain(..));
    }

    merged_doc.write_file(Path::new(output_path))?;

    Ok(())
}

I tried pushing each opened doc into a vector, but that did not extend the lifetime. I also tried wrapping each one in Rc::new(), but that did not work either.

Any solution is welcome!

Upvotes: 0

Views: 92

Answers (1)

pratikpc

Reputation: 680

I ran the code above on my local machine. This version worked for me:

use docx_rust::{Docx, DocxFile};
use std::path::Path;

fn merge_docs(file_paths: &[&str], output_path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let mut merged_doc = Docx::default();
    // Owns every DocxFile until the function returns.
    let mut docs: Vec<DocxFile> = Vec::new();

    // First loop: load all files. Nothing borrows from `docs` yet.
    for path in file_paths {
        let doc = DocxFile::from_file(Path::new(path))?;
        docs.push(doc);
    }
    // Second loop: parse by reference. Each parsed Docx borrows from an element of `docs`.
    for doc in &docs {
        let mut p_doc = doc.parse()?;
        merged_doc.document.body.content.extend(p_doc.document.body.content.drain(..));
    }

    merged_doc.write_file(Path::new(output_path))?;

    Ok(())
}

The differences are:

  • There are two loops instead of one.

  • The first loop moves every DocxFile into a vector, so each one lives until the end of the function rather than the end of a loop iteration.

  • The second loop iterates over references to the vector's elements and performs the parse and extend on each one.

  • The vector now owns the DocxFile values, so they outlive every Docx parsed from them.

  • By the time the second loop borrows a DocxFile for p_doc, that DocxFile is never moved again.
    The root cause of the errors is that p_doc borrows from doc:

    To reduce allocations, DocxFile::parse returns a Docx struct contains references to DocxFile itself. It means you have to make sure that DocxFile lives as long as its returned Docx:

  • In the original code, doc is dropped at the end of each loop iteration, but merged_doc, through extend, would still hold references into it.

      {
          let doc = DocxFile::from_file(Path::new(path))?;
          let mut p_doc = doc.parse().unwrap();
          merged_doc.document.body.content.extend(p_doc.document.body.content.drain(..));
      } // `doc` is dropped here while merged_doc still holds references into it
    
  • The borrow checker rejects the code above, which means doc has to live longer.

  • Pushing doc into a vec within the same iteration does not work either: merged_doc, via p_doc, still holds references into doc, and Rust does not allow a value to be moved while it is borrowed. So this approach fails too.

      {
          let doc = DocxFile::from_file(Path::new(path))?;
          let mut p_doc = doc.parse()?;
          merged_doc.document.body.content.extend(p_doc.document.body.content.drain(..));
          // Rejected (E0505): `doc` is moved while merged_doc still borrows from it.
          docs.push(doc);
      }
    
  • This is why the two-loop approach works: every DocxFile is moved into the vec first and borrowed only afterwards, so no move can happen while a reference exists.
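
The same borrow pattern can be reproduced without docx_rust, which may make the rule easier to see. The sketch below is only a stand-in: SourceFile, Parsed, and merge_all are hypothetical names I made up, with SourceFile playing the role of DocxFile and Parsed the role of the borrowing Docx. It is not the crate's API.

```rust
// Hypothetical stand-ins for DocxFile and Docx; not the docx_rust API.
struct SourceFile {
    text: String,
}

// Borrows from the SourceFile it was parsed from, the way a parsed Docx
// borrows from its DocxFile.
struct Parsed<'a> {
    words: Vec<&'a str>,
}

impl SourceFile {
    fn new(text: &str) -> Self {
        SourceFile { text: text.to_string() }
    }

    // The returned value holds references into `self.text`.
    fn parse(&self) -> Parsed<'_> {
        Parsed { words: self.text.split_whitespace().collect() }
    }
}

// Loop 1 moves every file into `files` before anything borrows from it.
// Loop 2 only borrows, so the references collected in `merged` stay valid.
fn merge_all(texts: &[&str]) -> String {
    let mut files: Vec<SourceFile> = Vec::new();
    for text in texts {
        files.push(SourceFile::new(text));
    }

    let mut merged: Vec<&str> = Vec::new();
    for file in &files {
        let mut parsed = file.parse();
        merged.extend(parsed.words.drain(..));
    }

    // `files` is still alive here, so reading `merged` is fine.
    merged.join(" ")
}
```

If the push and the parse are put back into a single loop, this sketch should fail to compile for the same reason as the original merge_docs.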

For further details, see https://doc.rust-lang.org/error_codes/E0505.html. About its own example, that page says:

Here, the function eat takes ownership of x. However, x cannot be moved because the borrow to _ref_to_val needs to last till the function borrow. To fix that you can do a few different things:

  • Try to avoid moving the variable.

  • Release borrow before move.

  • Implement the Copy trait on the type.

Our solution uses the first option: avoid moving the variable.
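
To make the "avoid moving" option concrete, here is a minimal sketch that reuses the names from the quoted explanation (eat, borrow, x, and a reference to x). The field n, the return values, and the wrapper eat_then_borrow are my own additions for illustration. The only change that matters is that eat takes a reference instead of taking ownership.

```rust
struct Value {
    n: u32,
}

// Reads the value through a shared reference.
fn borrow(val: &Value) -> u32 {
    val.n
}

// Takes a reference rather than ownership, so calling it moves nothing.
fn eat(val: &Value) -> u32 {
    val.n + 1
}

fn eat_then_borrow(n: u32) -> (u32, u32) {
    let x = Value { n };
    let ref_to_val: &Value = &x;
    let eaten = eat(&x); // no move: `x` is only borrowed a second time
    let seen = borrow(ref_to_val); // the earlier reference is still valid
    (eaten, seen)
}
```

Changing eat back to take Value by value and calling eat(x) should bring the E0505 error back, because x would then be moved while ref_to_val is still in use.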

P.S. The borrow checker will also reject the vector approach if we push to the vector after the loop that takes the borrows.

Let's look at this simple example:

for path in file_paths {
    let doc = DocxFile::from_file(Path::new(path))?;
    docs.push(doc);
}
for doc in &docs {
    let mut p_doc = doc.parse()?;
    merged_doc.document.body.content.extend(p_doc.document.body.content.drain(..));
}
{
    let doc = DocxFile::from_file("origin.docx").unwrap();
    // Rejected if merged_doc is used afterwards: pushing needs a mutable
    // borrow of `docs`, but merged_doc still holds shared borrows into it.
    docs.push(doc);
}
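
For contrast, pushing is accepted again once nothing borrows from the vector any more, which is the "release borrow before move" idea. The sketch below uses plain String values instead of DocxFile, and push_after_borrow_ends is a hypothetical helper name. In merge_docs terms, it roughly corresponds to finishing all work with merged_doc before touching docs again.

```rust
fn push_after_borrow_ends(mut docs: Vec<String>, extra: &str) -> (usize, usize) {
    let merged_len = {
        // `merged` borrows from `docs` only inside this block.
        let mut merged: Vec<&str> = Vec::new();
        for doc in &docs {
            merged.extend(doc.split_whitespace());
        }
        merged.len()
    }; // `merged` is dropped here, which ends the borrow of `docs`

    // No outstanding borrows, so mutating the vector is allowed.
    docs.push(extra.to_string());
    (merged_len, docs.len())
}
```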

Upvotes: 2
