Reputation:
What is the best way to count a character such as an apostrophe only when it appears inside a word such as "shouldn't"?
For example, "I shouldn't do that" counts once, but "'I will not do that'" counts zero.
Basically, how can I use counts to count apostrophes inside words and not those used as quotes?
I haven't had much success so far. I can only use a basic for loop to count every apostrophe, but I can't narrow it down to specific cases.
    for sentence in split_sentences:
        for w in sentence:
            for p in punctuation:
                if p == w:
                    if p in counts:
                        counts[p] += 1
                    else:
                        counts[p] = 1
                else:
                    pass
Given a list of words, it should count the punctuation only when it is inside a word, not around it. So "Shouldn't" will count but "'should'" will not.
Upvotes: 2
Views: 720
Reputation: 5120
You can use the regular expression [a-zA-Z]'[a-zA-Z] to find all single quotes that are surrounded by letters.
The requirement for the hyphen isn't completely clear to me. If it has the same requirement (i.e. it only counts when surrounded by letters), then the regular expression [a-zA-Z]['-][a-zA-Z] will do the trick: it counts quotes as well as hyphens.
If you should count all hyphens, then you could just use the str.count method (e.g. "test-string".count("-") returns 1).
Here is some example code, assuming the hyphens must also be counted only if they are surrounded by letters:
    import re

    TEST_SENTENCES = (
        "I shouldn't do that",
        "'I will not do that'",
        "Test-hyphen",
    )

    # A quote or hyphen with a letter directly on each side
    PATTERN = re.compile(r"[a-zA-Z]['-][a-zA-Z]")

    for sentence in TEST_SENTENCES:
        print(len(PATTERN.findall(sentence)))
Output:
1
0
1
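One caveat with the pattern above: each match consumes the letters on both sides, so two marks separated by a single letter (as in "rock'n'roll") are counted only once. A possible variant, sketched here with lookarounds so the neighbouring letters are checked but not consumed, also tallies each character separately. The names INNER and count_inner are made up for this illustration.

```python
import re
from collections import Counter

# Lookbehind/lookahead require a letter on each side without consuming it,
# so adjacent matches such as the two quotes in "rock'n'roll" are both found.
INNER = re.compile(r"(?<=[a-zA-Z])['-](?=[a-zA-Z])")


def count_inner(sentence):
    """Return a Counter of quotes/hyphens that sit between two letters."""
    return Counter(INNER.findall(sentence))


result = count_inner("rock'n'roll isn't well-known")
quoted = count_inner("'I will not do that'")
print(result["'"], result["-"], sum(quoted.values()))
```

Here result holds three apostrophes and one hyphen, while the fully quoted sentence yields an empty counter.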
Upvotes: 0
Reputation: 42796
You can check if it is inside the word:
    for sentence in split_sentences:
        for w in sentence:
            for p in punctuation:
                if p in w and w[0] != p and w[-1] != p:
                    if p in counts:
                        counts[p] += 1
                    else:
                        counts[p] = 1
                else:
                    pass
The important line is this: if p in w and w[0] != p and w[-1] != p:
We have 3 rules for it to count:
1. p is in the word w
2. w does not start (w[0]) with the punctuation p
3. w does not end (w[-1]) with the punctuation p
A more Pythonic way of doing this would be to use the available str methods, endswith and startswith. They also work on an empty string, where w[0] would raise an IndexError:
...
if p in w and not w.startswith(p) and not w.endswith(p):
...
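For reference, here is a minimal self-contained sketch of this check. It assumes each sentence is a plain string split on whitespace and that the punctuation to count is "'-"; the function name count_inner_punctuation is invented for the example. Note that this approach counts a mark at most once per word, and it skips a word that is itself quoted even if it contains an inner apostrophe.

```python
from collections import Counter


def count_inner_punctuation(sentences, punctuation="'-"):
    """Count punctuation marks that occur inside a word, not at its edges."""
    counts = Counter()
    for sentence in sentences:
        for w in sentence.split():  # split the sentence into words
            for p in punctuation:
                # p must be in the word, but not as its first or last character
                if p in w and not w.startswith(p) and not w.endswith(p):
                    counts[p] += 1
    return counts


counts = count_inner_punctuation(
    ["I shouldn't do that", "'I will not do that'", "Test-hyphen"]
)
print(counts["'"], counts["-"])
```

Using a Counter avoids the explicit "if p in counts" branch, since missing keys start at zero.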
Upvotes: 3