BTG123

Reputation: 147

Sentence Splitting in Python and making it an Ordered Dict

I want to split numbered sentences into their number and their text, and store the result in an ordered dict.

For example:

1.This is my text.
2.This is 2nd Text.

I want to separate the numbers from the text and store them in an ordered dict like this:

Ordered Dict 

"1":"This is my text"
"2":"This is 2nd text"

I tried .split() but it didn't work for me. How can I do this?

d = OrderedDict()
text_data = [ "1.This is my text.","2.This is 2nd text"]
for i, f in enumerate(text_data):
    id = f.split('.')
    d[id] = text_data[i]
    print(i, " :: ", id, " =>\n", d[id], "\n" + "*" * 100 + "\n")

Where am I going wrong in building the OrderedDict?

Upvotes: 1

Views: 207

Answers (3)

U13-Forward

Reputation: 71600

Or maybe:

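from collections import OrderedDict
# Assumed input (s is not defined in the original answer): one string, sentences separated by '. '
s = "1.This is my text. 2.This is 2nd Text."
news='.'.join(s.split('. ')).split('.')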
d=OrderedDict(list(dict(news[i:i+2] for i in range(len(news)-2)).items())[::2])
for k,v in d.items():
    print('"%s": "%s"'%(k,v))

Output:

"1": "This is my text"
"2": "This is 2nd Text"
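
The one-liner above is fairly dense, so here is a more readable sketch of the same idea. It assumes the input is a single string `s` whose sentences are separated by `'. '`; the variable names are only illustrative.

```python
from collections import OrderedDict

# Assumed input: one string, sentences separated by ". "
s = "1.This is my text. 2.This is 2nd Text."

d = OrderedDict()
for part in s.split('. '):
    # Drop a trailing dot, then cut only at the first remaining dot
    num, text = part.rstrip('.').split('.', 1)
    d[num] = text

for k, v in d.items():
    print('"%s": "%s"' % (k, v))
```

This should yield the same two entries as the output shown above.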

Upvotes: 0

Ma0

Reputation: 15204

I would recommend the following:

from collections import OrderedDict


d = OrderedDict()
text_data = [ "1.This is my text.", "2.This is 2nd text"]

for sentence in text_data:  # Note 1
    num, text = sentence.rstrip('.').split('.', 1)  # Notes 2 & 3
    d[num] = text

Notes:

  1. You do not need the i from enumerate, so remove it and iterate over the list directly.
  2. rstrip before you split. Each sentence may end with a dot ('.') that would interfere with the split. If you want to keep the trailing dot (when it exists), simply remove the .rstrip('.') part.
  3. Pass a second argument to split to limit how many cuts it makes. Think of the case '3. A sentence. With a dot in between.': without the limit you get more than two pieces and the unpacking fails.

The above produces:

for k, v in d.items():
    print('{!r}: {!r}'.format(k, v))

# '1': 'This is my text'
# '2': 'This is 2nd text'
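
To illustrate Note 3, here is a small sketch using the example sentence from that note. The variable names are made up for the demonstration.

```python
sentence = '3. A sentence. With a dot in between.'

# Without a limit, split cuts at every dot, giving three pieces
parts_all = sentence.rstrip('.').split('.')
print(parts_all)

# With maxsplit=1, only the first dot is a cut point, giving exactly two pieces
num, text = sentence.rstrip('.').split('.', 1)
print(repr(num), repr(text))
```

Note that text keeps the space that followed the first dot; call text.strip() if you do not want it.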

Upvotes: 0

Rakesh

Reputation: 82785

You are very close. str.split returns a list, so after splitting the string on the dot, access the individual elements by index instead of using the whole list as the key.

Ex:

from collections import OrderedDict

d = OrderedDict()
text_data = [ "1.This is my text.","2.This is 2nd text"]
for i, f in enumerate(text_data):
    val = f.split('.')           #str.split
    d[val[0]] = val[1]           #Use Index. 

for k, v in d.items():
    print(k, v)
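
For reference, a short sketch of why the code in the question fails: the list returned by split cannot be used as a dictionary key, while its first element (a string) can. The names parts and error are only for illustration.

```python
from collections import OrderedDict

d = OrderedDict()
parts = "1.This is my text.".split('.')
print(parts)  # a list of three strings; the last one is empty because of the trailing dot

try:
    d[parts] = "value"      # a list is unhashable, so this raises TypeError
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

d[parts[0]] = parts[1]      # a string key works
print(d)
```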

Upvotes: 1
