m-ketan

Reputation: 1308

Can't read .docx file after uploading (Node.js)

I'm trying to upload a .docx file to an Express server using the express-fileupload package and then read its contents. The upload works fine, but when I read the file back, it prints unreadable gibberish. Here is the code:

app.post('/upload', (req, res, next) => {
  let file = req.files.file;

  file.mv(`${__dirname}/public/${req.body.filename}`, function(err) {
    if (err) {
      return res.status(500).send(err);
    }

    fs.readFile(`${__dirname}/public/${req.body.filename}`, 'utf8', function (err,data) {
      if (err) {
        return console.log(err);
      }
      console.log(data); // prints broken text/gibberish
    });

    res.json({ /* data to be returned */ });
  });

});

What I want is to be able to read the .docx file and do operations on the text inside it.

Upvotes: 0

Views: 706

Answers (1)

ThiefMaster

Reputation: 318548

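docx files don't contain human-readable text. They are actually ZIP archives containing many different XML files, and even the text content of those XML files isn't easy to work with. That is why reading one as 'utf8' prints gibberish: you are decoding compressed binary data as if it were text.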
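A quick way to confirm this for yourself is to look at the first bytes of the uploaded file. The following is a minimal sketch, not part of the original code: the helper names `looksLikeZip` and `fileLooksLikeZip` are made up for illustration. It reads the file without an encoding, so you get a raw Buffer, and checks for the ZIP local file header signature (the bytes "PK" followed by 0x03 0x04).

```javascript
const fs = require('fs');

// ZIP local file header signature: the bytes "PK\x03\x04".
const ZIP_SIGNATURE = Buffer.from([0x50, 0x4b, 0x03, 0x04]);

// Hypothetical helper: true if the buffer starts with the ZIP signature.
function looksLikeZip(buffer) {
  return buffer.length >= 4 && buffer.subarray(0, 4).equals(ZIP_SIGNATURE);
}

// Hypothetical helper: reads the file as raw bytes (no 'utf8' argument)
// and applies the same check.
function fileLooksLikeZip(filePath) {
  return looksLikeZip(fs.readFileSync(filePath));
}
```

If `fileLooksLikeZip` returns true for the uploaded file, the upload itself is probably intact and the gibberish comes only from treating the archive as text.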

If you want to read or even modify text inside a docx file, you need to find a library that can read/write the format.
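To give a rough idea of what such a library has to deal with, here is a sketch that pulls paragraph text out of the main document XML once it has already been unzipped. It assumes the markup uses the usual `w:` prefix, with paragraphs in `<w:p>` elements and text fragments in `<w:t>` elements. The function names are made up for illustration, and it deliberately ignores tables, tabs, line breaks, headers, footnotes and everything else, so treat it as a demonstration and not as a real parser.

```javascript
// Decode the five predefined XML entities. '&amp;' is handled last so that
// '&amp;lt;' becomes the literal text '&lt;' and not '<'.
function decodeXmlEntities(s) {
  return s
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&');
}

// Hypothetical helper: returns one string per <w:p> paragraph found in the XML.
function extractParagraphs(documentXml) {
  const paragraphs = [];
  // '<w:p' followed by a space or '>' so that '<w:pPr>' is not mistaken for a paragraph.
  const paragraphPattern = /<w:p[ >][\s\S]*?<\/w:p>/g;
  // '<w:t>' or '<w:t attr="...">', but not '<w:tab/>' or '<w:tbl>'.
  const textPattern = /<w:t(?:\s[^>]*)?>([\s\S]*?)<\/w:t>/g;

  for (const [paragraphXml] of documentXml.matchAll(paragraphPattern)) {
    let text = '';
    // One sentence is often split across several runs, so join the fragments.
    for (const match of paragraphXml.matchAll(textPattern)) {
      text += decodeXmlEntities(match[1]);
    }
    paragraphs.push(text);
  }
  return paragraphs;
}

// Made-up sample resembling a fragment of the main document XML.
const sample =
  '<w:document><w:body>' +
  '<w:p><w:pPr><w:jc w:val="left"/></w:pPr>' +
  '<w:r><w:t>Hel</w:t></w:r>' +
  '<w:r><w:rPr><w:b/></w:rPr><w:t xml:space="preserve">lo, </w:t></w:r>' +
  '<w:r><w:t>world</w:t></w:r></w:p>' +
  '<w:p><w:r><w:t>Tom &amp; Jerry</w:t></w:r></w:p>' +
  '</w:body></w:document>';

console.log(extractParagraphs(sample)); // [ 'Hello, world', 'Tom & Jerry' ]
```

Note how "Hello, world" is spread over three separate runs with formatting markup in between. Handling that reliably, along with the unzipping step, is the reason to use an existing package instead; mammoth, for example, provides an `extractRawText` function for this purpose.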

Upvotes: 3
