Reputation: 97
When I run the following script on a Dataproc cluster:
import nltk
nltk.download('wordnet')
the nltk_data is downloaded only on the master node, not on the worker nodes. As a result, when I submit a PySpark job to Dataproc, it fails to read the data on the worker nodes.
What solutions do you suggest? How can I download nltk_data on the worker nodes too?
Upvotes: 1
Views: 134
Reputation: 4455
You can use init actions to do this on all cluster nodes: https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/init-actions
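A minimal sketch of such an init action, assuming the default python/pip on the nodes is the interpreter PySpark uses (on some images you may need pip3 or the conda pip instead); the script name, bucket, cluster name, and region below are placeholders:

#!/bin/bash
# install-nltk.sh -- init action; Dataproc runs it on every node (master and workers)
set -euxo pipefail

# Install NLTK for the interpreter that PySpark will use
pip install nltk

# Download wordnet into a system-wide directory that NLTK searches by default
python -c "import nltk; nltk.download('wordnet', download_dir='/usr/share/nltk_data')"

Stage the script in Cloud Storage and pass it when creating the cluster:

gsutil cp install-nltk.sh gs://your-bucket/install-nltk.sh
gcloud dataproc clusters create your-cluster \
    --region=us-central1 \
    --initialization-actions=gs://your-bucket/install-nltk.sh

Note that init actions only run at cluster creation time, so for an existing cluster you would need to recreate it (or run the same commands manually on each node).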
Upvotes: 1