Reputation: 3329
I have a string that contains a paragraph, possibly with line breaks. I want to get only the first sentence in the string. I thought I would try
indexOf(". ")
that is, a dot followed by a space.
The problem is that this won't work on text such as "firstName. LastName".
I'm using .NET. Is there a good method available to achieve this? I'm also tagging Java to see if that helps narrow down my search.
Upvotes: 1
Views: 2290
Reputation: 259
This can be done with a very simple implementation using String.substring():
String example = "Hello world. This is example. " ;
System.out.print(example.substring(0, example.indexOf(".")+1)); // --> Hello world.
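A slightly more defensive sketch of the same idea, in case the string has no period at all (indexOf then returns -1, and the one-liner above gives back an empty string). The class and method names are made up for illustration, and returning the whole trimmed string when no period is found is my assumption about what you'd want. It still cuts at the first period, so "firstName. LastName" is not solved by this.

```java
public class FirstSentenceByDot {
    // Returns the text up to and including the first '.', trimmed.
    // Assumption: if there is no '.', the whole trimmed string is returned.
    static String firstSentence(String text) {
        int dot = text.indexOf('.');
        if (dot < 0) {
            return text.trim();
        }
        return text.substring(0, dot + 1).trim();
    }

    public static void main(String[] args) {
        System.out.println(firstSentence("Hello world. This is example. ")); // Hello world.
        System.out.println(firstSentence("No period here"));                 // No period here
        // Still cuts too early on the case from the question:
        System.out.println(firstSentence("firstName. LastName is here."));   // firstName.
    }
}
```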
Upvotes: 2
Reputation: 512
You need to somehow mark the end of a sentence. As you already noted, a "." doesn't do that reliably, since it also appears in other places ("Hi, my name is Mr. Pudelhund."). If possible, I would recommend marking sentence ends with a character that won't otherwise appear in the text.
Edit: The other method is good as well, but way more complicated. If you can't edit the string you are using though, that method beats mine ;)
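A minimal sketch of the marker idea, assuming you control how the text is produced and can put a dedicated end-of-sentence character after each sentence. The '|' marker and the class/method names are hypothetical; any character that never occurs in your text would do.

```java
public class FirstSentenceByMarker {
    // Hypothetical end-of-sentence marker; pick one that never appears in the text.
    static final char SENTENCE_END = '|';

    // Returns everything before the first marker, trimmed.
    // Assumption: if no marker is present, the whole trimmed string is returned.
    static String firstSentence(String marked) {
        int end = marked.indexOf(SENTENCE_END);
        String first = (end < 0) ? marked : marked.substring(0, end);
        return first.trim();
    }

    public static void main(String[] args) {
        String text = "Hi, my name is Mr. Pudelhund.| I live in a small town.|";
        System.out.println(firstSentence(text)); // Hi, my name is Mr. Pudelhund.
    }
}
```

The period inside "Mr." no longer matters here, because only the marker decides where the sentence ends.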
Upvotes: 2
Reputation: 837946
What you need is a Natural Language Processing (NLP) toolkit. It's very hard to write one yourself, as it requires a lot of research and data collection, but luckily it has already been done for you.
.NET
SharpNLP is a collection of natural language processing tools written in C#. Currently it provides the following NLP tools:
- a sentence splitter
- ...
Java
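One Java option that ships with the standard library is java.text.BreakIterator. To be clear, it is rule-based rather than a trained NLP sentence detector, so treat the following as a sketch under that assumption and not as the toolkit this answer has in mind. The class and method names are made up for illustration, and the choice of Locale.ENGLISH is an assumption about your text.

```java
import java.text.BreakIterator;
import java.util.Locale;

public class FirstSentenceByBreakIterator {
    // Returns the first sentence as delimited by BreakIterator, trimmed.
    // Assumption: if no boundary is found, the whole trimmed string is returned.
    static String firstSentence(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        int end = it.next();
        if (end == BreakIterator.DONE) {
            return text.trim();
        }
        return text.substring(start, end).trim();
    }

    public static void main(String[] args) {
        System.out.println(firstSentence("Hello world. This is example. ")); // Hello world.
    }
}
```

Abbreviations such as "Mr." can still trip up rule-based splitting, so check it against your own data before relying on it.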
Upvotes: 2