Andy Wu

Reputation: 1

Vue Chrome Extension with Tesseract.js

Recently I've been working on a Chrome extension that uses Vue for the frontend. The Vue boilerplate that lets the extension run in the browser uses webpack and is generated with:

vue init kocal/vue-web-extension name

which gives this project structure:

.
├── dist
│   └── <the built extension>
├── node_modules
│   └── <one or two files and folders>
├── package.json
├── package-lock.json
├── scripts
│   ├── build-zip.js
│   └── remove-evals.js
├── src
│   ├── background.js
│   ├── icons
│   │   ├── icon_128.png
│   │   ├── icon_48.png
│   │   └── icon.xcf
│   ├── manifest.json
│   └── popup
│       ├── App.vue
│       ├── popup.html
│       └── popup.js
└── webpack.config.js

The problem with this setup is that I'm now trying to implement OCR using tesseract.js, and because Chrome extensions don't let you load scripts from CDNs or other remote sources, I need to ship the tesseract.js files locally. I looked through this link about downloading locally and also referenced tesseract.js' example of using tesseract.js in a Chrome extension (https://github.com/jeromewu/tesseract.js-chrome-extension). However, when I load the library I keep hitting this error:

tesseract.min.js:688 Uncaught Error: ReferenceError: window is not defined
    at eval (tesseract.min.js:688)
    at Worker.e.onmessage (tesseract.min.js:1579)

The tesseract code I currently have in a Vue file (App.vue) is below; the problem seems to happen at await worker.load():

const { createWorker } = Tesseract;
const worker = createWorker({
  workerPath: chrome.runtime.getURL("js/worker.min.js"),
  langPath: chrome.runtime.getURL("traineddata"),
  corePath: chrome.runtime.getURL("js/tesseract-core.wasm.js")
});

async function extract() {
  console.log("test1");
  await worker.load();
  console.log("test2");
  await worker.loadLanguage("eng");
  await worker.initialize("eng");
  const {
    data: { text }
  } = await worker.recognize("https://tesseract.projectnaptha.com/img/eng_bw.png");
  console.log(text);
  await worker.terminate();
}

extract();

The HTML page (tab.html) includes:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <link rel="stylesheet" href="tab.css" />
    <script src="../js/tesseract.min.js"></script>
  </head>
  <body>
    <div id="app"></div>

    <script src="tab.js"></script>
  </body>
</html>

and the JS file (tab.js):

import Vue from "vue";
import App from "./App";


/* eslint-disable no-new */
new Vue({
  el: "#app",

  render: h => h(App)
});

My current file structure looks like this: [image: File structure]

I've been stuck on this problem for quite a while now, so any help would be greatly appreciated!

Upvotes: 0

Views: 908

Answers (1)

zufrieden

Reputation: 40

Although I can't help you with your question per se (and it's been over six months without anyone else answering), I thought I'd let you know how I solved my similar problem.

I too wanted an OCR function in a Chrome extension and started out by digging into tesseract. When I couldn't solve it, I moved on and used OCRAD.js and GOCR.js for my project instead. Although they are perhaps not quite as powerful as tesseract, I'm fully satisfied with my result.

Both OCRAD and GOCR are simple to include via your manifest.json, and you then call them in your script as plain functions: OCRAD(image) or GOCR(image).
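As a rough sketch of what that call can look like: the helper below is hypothetical (the name recognizeText and its engine parameter are my own, not part of either library). It assumes ocrad.js or gocr.js is listed before your own script in manifest.json, so that the global OCRAD or GOCR function already exists when your code runs.

```javascript
// Hypothetical helper, not part of OCRAD.js or GOCR.js.
// Assumption: ocrad.js (or gocr.js) was loaded before this script,
// so the global OCRAD (or GOCR) function is defined.
function recognizeText(image, engine = globalThis.OCRAD) {
  if (typeof engine !== "function") {
    throw new Error("No OCR engine loaded; check the script order in manifest.json");
  }
  // Both libraries are called as plain functions: OCRAD(image) / GOCR(image).
  const text = engine(image);
  // Trim surrounding whitespace and newlines from the recognized text.
  return String(text).trim();
}
```

In the extension you would call something like recognizeText(canvas) for OCRAD, or recognizeText(canvas, GOCR) to use GOCR instead, where canvas is whatever image source you are passing to the library.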

OCRAD has a nice demo page where you can test the functionality for your desired images before using it: https://antimatter15.com/ocrad.js/demo.html

Upvotes: 0
