Xxxo

Reputation: 1931

Unknown word/character in txt file from word2vec

I recently came across </s> listed as a separate word in a vocabulary created by word2vec.

I tried searching the web for it, but search engines do not let me search for that exact string.

So, does anyone know what this character is?

Upvotes: 0

Views: 76

Answers (1)

kampta

Reputation: 4898

If you look at line 82 of the word2vec source code,

if (ch == '\n') {
  strcpy(word, (char *)"</s>");
  return;
}

</s> is simply a token used by Mikolov et al. to mark the end of a line (more precisely, the \n character). I don't think it has any special HTML/LaTeX meaning, nor does it appear on the ASCII chart.
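
To make the effect concrete, here is a simplified sketch of a tokenizer that behaves the same way: whitespace separates words, and each newline is emitted as its own </s> token. It is not the actual word2vec code. The names next_token, EOS_TOKEN and MAX_TOKEN are made up for illustration, and it reads from a string rather than a file to keep it short.

```c
#include <assert.h>
#include <string.h>

#define EOS_TOKEN "</s>"  /* marker emitted in place of a newline */
#define MAX_TOKEN 100     /* hypothetical cap on token length, including the NUL */

/* Hypothetical helper: copies the next token from *cursor into word
 * (which must hold at least MAX_TOKEN bytes) and advances *cursor.
 * Returns 1 if a token was produced, 0 at the end of the input. */
static int next_token(const char **cursor, char *word) {
    assert(cursor != NULL && *cursor != NULL && word != NULL);
    const char *p = *cursor;
    size_t len = 0;

    /* Skip separators, but not newlines: those become tokens. */
    while (*p == ' ' || *p == '\t' || *p == '\r') p++;

    if (*p == '\0') {
        *cursor = p;
        return 0;
    }
    if (*p == '\n') {
        /* A newline is reported as the end-of-sentence token. */
        strcpy(word, EOS_TOKEN);
        *cursor = p + 1;
        return 1;
    }
    /* Ordinary word: copy up to the next separator, truncating if too long. */
    while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\r' && *p != '\n') {
        if (len < MAX_TOKEN - 1) word[len++] = *p;
        p++;
    }
    word[len] = '\0';
    *cursor = p;
    return 1;
}
```

Calling next_token repeatedly on a two-line text yields the words of the first line, then </s>, then the words of the second line, then </s> again. That is why </s> ends up in the vocabulary as if it were an ordinary word.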

Upvotes: 1
