Reputation: 340
I am trying to read different language encoding models like GloVe, fastText and word2vec to detect sarcasm, but I am unable to read Google's language encoding file. It gives a permission denied error. What should I do?
I tried different encodings and gave the file all permissions as well, but still no luck.
EMBEDDING_FILE = 'C:/Users/Abhishek/Documents/sarcasm/GoogleNews-vectors-negative300.bin/'
def get_coefs(word, *arr): return word, np.asarray(arr, dtype='float32')
embeddings_index = dict(get_coefs(*o.rstrip().rsplit(' ')) for o in open(EMBEDDING_FILE,encoding="ISO-8859-1"))
embed_size = 300
word_index = tokenizer.word_index
nb_words = min(max_features, len(word_index))
embedding_matrix = np.zeros((nb_words, embed_size))
for word, i in word_index.items():
    if i >= max_features: continue
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None: embedding_matrix[i] = embedding_vector
PermissionError Traceback (most recent call last)
<ipython-input-10-5d122ae40ef0> in <module>
1 EMBEDDING_FILE = 'C:/Users/Abhishek/Documents/sarcasm/GoogleNews-vectors-negative300.bin/'
2 def get_coefs(word, *arr): return word, np.asarray(arr, dtype='float32')
----> 3 embeddings_index = dict(get_coefs(*o.rstrip().rsplit(' ')) for o in open(EMBEDDING_FILE,encoding="ISO-8859-1"))
4 embed_size = 300
5 word_index = tokenizer.word_index
PermissionError: [Errno 13] Permission denied: 'C:/Users/Abhishek/Documents/sarcasm/GoogleNews-vectors-negative300.bin/'
Upvotes: 1
Views: 1361
Reputation: 54203
You would likely get the same IO-related error no matter how, or for what purpose, you try to open the file, so this isn't really a question about nlp, or word2vec, or even jupyter-notebook.
Note that errors we'd think of as something else sometimes get reported as "permission" problems, because at some level you can't perform that operation on that kind of path or file.
You've specified the file path as 'C:/Users/Abhishek/Documents/sarcasm/GoogleNews-vectors-negative300.bin/', with a trailing / that usually indicates something is a directory. That could be a problem.
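To test the trailing-slash theory, something like the following rough sketch might help. It uses only the standard library, and `classify_path` and `open_binary` are hypothetical helper names made up for this illustration. On Windows, calling open() on something that is actually a directory is typically reported as PermissionError (Errno 13), which would match the traceback in the question.

```python
from pathlib import Path


def classify_path(path_str):
    """Return 'file', 'directory', or 'missing' for a path string.

    Hypothetical helper for illustration only.
    """
    # Drop trailing separators so 'vectors.bin/' is checked as 'vectors.bin'.
    p = Path(path_str.rstrip("/\\"))
    if p.is_file():
        return "file"
    if p.is_dir():
        return "directory"
    return "missing"


def open_binary(path_str):
    """Open the path in binary mode, but only if it is a regular file."""
    kind = classify_path(path_str)
    if kind != "file":
        # Fail with a clearer message than the OS-level error would give.
        raise ValueError(f"{path_str!r} is {kind}, not a regular file")
    return open(path_str.rstrip("/\\"), "rb")
```

If `classify_path` reports 'directory' for the path in the question, the .bin name probably refers to a folder (for example, one created when the download was extracted) and the real file is inside it. Separately, this download is in the binary word2vec format, so even with a correct path a line-by-line text read is unlikely to parse it; gensim's `KeyedVectors.load_word2vec_format(path, binary=True)` is the usual loader.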
Also, I believe this particular file is usually 3+ GB in size, and some DOS-descended filesystems, or a Python interpreter which is only 32-bit, might have problems handling files over certain sizes such as 2 GB or 4 GB.
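To rule out the size-related possibility, a quick check like this sketch could be run in the same notebook. It uses only the standard library; `interpreter_bits` and `file_size_gib` are hypothetical names chosen for this example.

```python
import os
import struct


def interpreter_bits():
    """Return the pointer width of the running Python build (32 or 64)."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


def file_size_gib(path):
    """Return the size of the file at `path` in GiB (2**30 bytes)."""
    return os.path.getsize(path) / 2**30
```

If `interpreter_bits()` returns 32 and `file_size_gib` reports a value above 2 for the embedding file, it may be worth trying a 64-bit Python install before looking at anything else.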
Upvotes: 2