Reputation: 5999
I'm trying to automate a task with a Python script but ran into some strange behavior. I searched SO for the same problem, but what I found is slightly different, so I'm asking here with a simplified example.
I have a file called test1.txt, shown below.
"https://papers.nips.cc/paper/7286-efficient-algorithms-for-non-convex-isotonic-regression-through-submodular-optimization" ## Efficient Algorithms for Non-convex Isotonic Regression through Submodular Optimization
"https://papers.nips.cc/paper/7287-structure-aware-convolutional-neural-networks" ## Structure-Aware Convolutional Neural Networks
"https://papers.nips.cc/paper/7288-kalman-normalization-normalizing-internal-representations-across-network-layers" ## Kalman Normalization: Normalizing Internal Representations Across Network Layers
"https://papers.nips.cc/paper/7289-hogwild-gibbs-can-be-panaccurate" ## HOGWILD!-Gibbs can be PanAccurate
and the Python script quest.py
import re
with open('test1.txt') as f:
    for line in f:
        #print line
        link = re.sub(" ##.*","",line)
        print link
        link1 = link.strip('\"')
        print link1
When I execute it with python quest.py, I get
"https://papers.nips.cc/paper/7286-efficient-algorithms-for-non-convex-isotonic-regression-through-submodular-optimization"
https://papers.nips.cc/paper/7286-efficient-algorithms-for-non-convex-isotonic-regression-through-submodular-optimization"
"https://papers.nips.cc/paper/7287-structure-aware-convolutional-neural-networks"
https://papers.nips.cc/paper/7287-structure-aware-convolutional-neural-networks"
"https://papers.nips.cc/paper/7288-kalman-normalization-normalizing-internal-representations-across-network-layers"
https://papers.nips.cc/paper/7288-kalman-normalization-normalizing-internal-representations-across-network-layers"
"https://papers.nips.cc/paper/7289-hogwild-gibbs-can-be-panaccurate"
https://papers.nips.cc/paper/7289-hogwild-gibbs-can-be-panaccurate"
I want to print the link first with the surrounding double quotes (link) and then without them (link1). Why do I still see the trailing double quote in link1?
Upvotes: 2
Views: 87
Reputation: 18697
Python's str.strip([chars]) will remove leading and trailing chars, but will stop once it reaches a character not in chars.
Looks like your link ends with a newline char, and stripping stops before even reaching the double quote. (Hint: print adds only one newline, and in your output you have two.)
To strip double quotes and newline chars:
link1 = link.strip('"\n')
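A minimal sketch of the difference, assuming a line shaped like the ones in test1.txt. The sample string and the variable names only_quotes and quotes_and_newline are made up for illustration, and repr() is used so the newline is visible in the output:

```python
# Hypothetical sample, shaped like one line of test1.txt after the re.sub call.
link = '"https://papers.nips.cc/paper/7287-structure-aware-convolutional-neural-networks"\n'

# Only the leading quote is removed: on the right-hand side strip() meets
# '\n' first, which is not in the character set, so it stops there.
only_quotes = link.strip('"')

# With both characters in the set, the newline and the trailing quote go.
quotes_and_newline = link.strip('"\n')

print(repr(only_quotes))
print(repr(quotes_and_newline))
```

The argument to strip is a set of characters, not a literal prefix or suffix, so the order of the characters in '"\n' does not matter.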
Also worth mentioning (as @glibdud notes in the comments): the links end with a newline because iterating over a file does not strip newlines, and neither does the sub expression (because . does not match the newline by default; to make it match, add the re.DOTALL regex flag).
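As a rough illustration of the re.DOTALL point, the sketch below reads one line from an in-memory stand-in for the file (io.StringIO is an assumption made here so that no file is needed on disk) and compares the substitution with and without the flag. The names kept_newline and no_newline are illustrative only:

```python
import io
import re

# Hypothetical in-memory stand-in for test1.txt.
sample = io.StringIO(
    '"https://papers.nips.cc/paper/7289-hogwild-gibbs-can-be-panaccurate"'
    ' ## HOGWILD!-Gibbs can be PanAccurate\n'
)

# Iterating over a file-like object yields lines with '\n' still attached.
line = next(iter(sample))

# Without DOTALL, '.' does not match '\n', so the newline survives.
kept_newline = re.sub(" ##.*", "", line)

# With DOTALL, '.*' consumes the newline as well.
no_newline = re.sub(" ##.*", "", line, flags=re.DOTALL)

# Now strip('"') is enough, because the quote is the last character.
link1 = no_newline.strip('"')

print(repr(kept_newline))
print(repr(no_newline))
print(repr(link1))
```

Whether to use the flag or to strip the newline explicitly is a matter of taste; the flag removes everything after " ##", including any trailing whitespace on the line.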
Upvotes: 2
Reputation: 20500
Strip both the double quotes and the newline when you want to print without quotes, and strip only the newline when you want to print with the double quotes:
import re
with open('file.txt') as f:
    for line in f:
        # Skip blank lines
        if line.strip():
            #print line
            link = re.sub(" ##.*", "", line)
            # Print with double quotes (strip only the newline)
            print link.strip('\n')
            # Print without double quotes (strip the quotes and the newline)
            print link.strip('"\n')
            # Stripping only the quotes leaves the trailing one, because the newline comes after it
            #print link.strip("\"")
The output will then be
"https://papers.nips.cc/paper/7286-efficient-algorithms-for-non-convex-isotonic-regression-through-submodular-optimization"
https://papers.nips.cc/paper/7286-efficient-algorithms-for-non-convex-isotonic-regression-through-submodular-optimization
"https://papers.nips.cc/paper/7287-structure-aware-convolutional-neural-networks"
https://papers.nips.cc/paper/7287-structure-aware-convolutional-neural-networks
"https://papers.nips.cc/paper/7288-kalman-normalization-normalizing-internal-representations-across-network-layers"
https://papers.nips.cc/paper/7288-kalman-normalization-normalizing-internal-representations-across-network-layers
"https://papers.nips.cc/paper/7289-hogwild-gibbs-can-be-panaccurate"
https://papers.nips.cc/paper/7289-hogwild-gibbs-can-be-panaccurate
Upvotes: 1