ski

Reputation: 23

How does Google Language API split text into sentences to assign sentiment?

The question is in the title.

I have joined several sentences into one large text, which I then pass to analyze_sentiment. The goal is to pull the sentiment for each individual sentence - exactly the ones originally joined.

I first strip all punctuation, lowercase the text, capitalize the first letter of each sentence, end each sentence with a period, and join the sentences with a single space.
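For reference, the preprocessing described above can be sketched roughly as follows. This is only one reading of those steps, not the exact code from the question; the helper names `normalize_sentence` and `join_sentences` are hypothetical, and "punctuation" is assumed to mean the ASCII characters in `string.punctuation`.

```python
import string


def normalize_sentence(raw: str) -> str:
    """Strip punctuation, lowercase, capitalize the first letter, end with a period."""
    # Assumption: "punctuation" means the ASCII set in string.punctuation.
    cleaned = raw.translate(str.maketrans("", "", string.punctuation))
    # Lowercase and collapse any runs of whitespace left behind.
    cleaned = " ".join(cleaned.lower().split())
    if not cleaned:
        return ""
    return cleaned[0].upper() + cleaned[1:] + "."


def join_sentences(raw_sentences):
    """Join the normalized sentences with a single space, skipping empty ones."""
    normalized = [normalize_sentence(s) for s in raw_sentences]
    return " ".join(s for s in normalized if s)


if __name__ == "__main__":
    print(join_sentences(["she answered my questions with ease, thx!",
                          "tyler was so considerate"]))
```

After this step the only sentence delimiter left in the text is a period followed by a space, so any split the API makes has to rely on that alone.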

Here is an example of two sentences that Google considers to be a single sentence.

She answered my questions with ease Thx. Tyler was so considerate.

However,

She answered my questions with ease Thx. Sam was so considerate.

works correctly.

You can try this yourself by going to their natural-language page and trying the API.

If I knew the splitting conditions, I could format my original sentences accordingly.

Upvotes: 2

Views: 862

Answers (1)

Mona Attariyan

Reputation: 189

It looks like the sentence boundary model gets confused by this input. I will open a bug for this on the Google side.

If you need the sentiment for each sentence, though, you can send the sentences to the API individually, so the sentence boundary issue doesn't get in your way. Are you concatenating the sentences to save on quota, billing, or latency? In terms of how the model works and how the sentiment score is calculated, there is no difference between sending the sentences individually and sending them all in one big chunk.
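A minimal sketch of the one-request-per-sentence approach, assuming the `google-cloud-language` Python client (`language_v1`) with credentials already configured. The helper names `make_google_analyzer` and `sentiment_per_sentence` are hypothetical; the analyzer is passed in as a plain callable so the loop can be tried without network access.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# An analyzer maps a text to (score, magnitude).
Analyzer = Callable[[str], Tuple[float, float]]


def make_google_analyzer() -> Analyzer:
    """Build an analyzer backed by the Cloud Natural Language API."""
    # Imported lazily so the rest of the sketch runs without the library installed.
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()  # reused across requests

    def analyze(text: str) -> Tuple[float, float]:
        document = language_v1.Document(
            content=text, type_=language_v1.Document.Type.PLAIN_TEXT
        )
        response = client.analyze_sentiment(request={"document": document})
        sentiment = response.document_sentiment
        return sentiment.score, sentiment.magnitude

    return analyze


def sentiment_per_sentence(
    sentences: Iterable[str], analyze: Optional[Analyzer] = None
) -> List[Tuple[str, float, float]]:
    """Send each sentence as its own request; return (sentence, score, magnitude)."""
    if analyze is None:
        analyze = make_google_analyzer()
    return [(sentence, *analyze(sentence)) for sentence in sentences]
```

Because every sentence goes out as a separate document, the results line up one-to-one with your original sentences regardless of how the boundary model would have split the joined text. The trade-off is one request per sentence, which matters if quota or latency is the concern.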

Upvotes: 2
