Reputation: 23
The question is in the title.
I have joined sentences into one large text, which I then pass to analyze_sentiment.
The goal is to get the sentiment for each individual sentence, exactly the ones that were originally joined.
I first strip all punctuation, lowercase the characters, capitalize each sentence, end each one with a period (.), and join them with a space.
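For reference, the preprocessing steps described above can be sketched roughly like this. This is only an illustration of those steps, not my exact code: the function names `normalize` and `join_sentences` are made up for this example, and using `string.punctuation` as the set of characters to strip is an assumption.

```python
import string


def normalize(sentence: str) -> str:
    """Strip punctuation, lowercase, capitalize the first letter, end with a period."""
    # Remove every ASCII punctuation character (assumed definition of "punctuation").
    cleaned = sentence.translate(str.maketrans("", "", string.punctuation))
    # Lowercase and collapse any runs of whitespace left behind.
    cleaned = " ".join(cleaned.lower().split())
    # Skip sentences that are empty after cleaning.
    return cleaned.capitalize() + "." if cleaned else ""


def join_sentences(sentences):
    """Join the normalized sentences with a single space."""
    return " ".join(s for s in map(normalize, sentences) if s)
```

The joined string is what then gets sent to analyze_sentiment as a single document.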
Here is an example of two sentences that Google considers to be a single sentence.
She answered my questions with ease Thx. Tyler was so considerate.
However,
She answered my questions with ease Thx. Sam was so considerate.
works correctly.
You can try this yourself by going to their natural-language page and trying the API.
If I knew the splitting conditions, I could format my original sentences accordingly.
Upvotes: 2
Views: 862
Reputation: 189
It looks like the sentence boundary model gets confused. I will open a bug for this on the Google side.
If you need the sentiment for each sentence, though, you can send the sentences to the API individually, so the sentence boundary issue doesn't get in your way. Are you concatenating the sentences to save on quota, billing, or latency? In terms of how the model works and how the sentiment score is calculated, there is no difference between sending the sentences individually and sending them all in one big chunk.
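A minimal sketch of the one-request-per-sentence approach, assuming the google-cloud-language Python client (`language_v1`) is installed and credentials are already configured. The helper names `make_google_analyzer` and `sentiment_per_sentence` are hypothetical and only for illustration. The analyzer is passed in as a plain callable so the loop can be tried with a stub before touching the real API.

```python
from typing import Callable, Iterable, List, Tuple

Score = Tuple[float, float]  # (score, magnitude)


def make_google_analyzer() -> Callable[[str], Score]:
    """Build an analyzer backed by the Natural Language API (needs credentials)."""
    # Imported lazily so the rest of this sketch loads without the package.
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()  # reuse one client for all calls

    def analyze(text: str) -> Score:
        document = language_v1.Document(
            content=text, type_=language_v1.Document.Type.PLAIN_TEXT
        )
        response = client.analyze_sentiment(request={"document": document})
        sentiment = response.document_sentiment
        return sentiment.score, sentiment.magnitude

    return analyze


def sentiment_per_sentence(
    sentences: Iterable[str], analyze: Callable[[str], Score]
) -> List[Tuple[str, Score]]:
    """Send each sentence as its own document, so no boundary detection is involved."""
    return [(sentence, analyze(sentence)) for sentence in sentences]
```

Usage would be something like `sentiment_per_sentence(my_sentences, make_google_analyzer())`. Keep in mind that this issues one request per sentence, which matters if quota or latency is the reason for concatenating.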
Upvotes: 2