Reputation: 3230
I am trying to call this curl command from python3. From bash, it works fine:
curl -LH "Accept: text/bibliography; style=bibtex" http://dx.doi.org/10.1103/PhysRevLett.117.126802
yielding the expected result:
@article{Chang_2016, title={Observation of the Quantum Anomalous Hall Insulator to Anderson Insulator Quantum Phase Transition and its Scaling Behavior}, volume={117}, ISSN={1079-7114}, url={http://dx.doi.org/10.1103/PhysRevLett.117.126802}, DOI={10.1103/physrevlett.117.126802}, number={12}, journal={Physical Review Letters}, publisher={American Physical Society (APS)}, author={Chang, Cui-Zu and Zhao, Weiwei and Li, Jian and Jain, J. K. and Liu, Chaoxing and Moodera, Jagadeesh S. and Chan, Moses H. W.}, year={2016}, month={Sep}}
In python3, I am doing:
import subprocess
doi = "http://dx.doi.org/10.1103/PhysRevLett.117.126802"
try:
    subprocess.call(["curl", "-LH", '"Accept: text/bibliography; style=bibtex"', doi])
except ExplicitException:
    print("DOI is not available")
    self.Messages.on_warn_clicked("DOI is not given",
                                  "Search google instead")
which gives this error:
<html><body><h1>400 Bad request</h1>
Your browser sent an invalid request.
</body></html>
What is going wrong here?
Upvotes: 1
Views: 267
Reputation: 140307
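You have 3 problems here:
1. You are adding your own quotes to the arguments you pass to subprocess. It already does that for you when necessary, since you pass a list of arguments and not the unsplit command line (good practice, keep doing it, but drop the unnecessary quoting).
2. subprocess.call does not let you capture or parse the output in Python, which is a problem for number 3.
3. The site sometimes crashes while responding and returns junk HTML instead of the BibTeX entry, so the request may have to be retried.

For the first problem:
subprocess.call(["curl", "-LH", '"Accept: text/bibliography; style=bibtex"', doi])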
should be
subprocess.call(["curl", "-LH", 'Accept: text/bibliography; style=bibtex', doi])
Otherwise the quotes are applied twice, and your Accept: xxx argument reaches curl with literal quotes around it, which curl does not expect.
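The effect of the extra quotes can be checked without curl or a network connection. The sketch below is only an illustration: received_args is a hypothetical helper (not part of the question or of subprocess) that starts a child Python process and prints the arguments that process actually received.

```python
import subprocess
import sys

def received_args(args):
    """Hypothetical helper: return the repr of the argument list a child process sees."""
    # The child is a Python one-liner that prints its own arguments.
    child = [sys.executable, "-c", "import sys; print(repr(sys.argv[1:]))"]
    result = subprocess.run(child + args, stdout=subprocess.PIPE, check=True)
    return result.stdout.decode("utf-8").strip()

# Extra quotes inside the list element: they become part of the argument itself.
quoted = received_args(['"Accept: text/bibliography; style=bibtex"'])
# No extra quotes: the child receives the bare header text.
unquoted = received_args(['Accept: text/bibliography; style=bibtex'])

print(quoted)
print(unquoted)
```

In the first case the child sees the double quotes as the first and last characters of the argument, which is what curl then sends as a malformed header.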
Demo of the non-working quoted version:
import subprocess
doi = "http://dx.doi.org/10.1103/PhysRevLett.117.126802"
#### this is wrong because of the quoting ####
p = subprocess.Popen(["curl", "-LH", '"Accept: text/bibliography; style=bibtex"', doi],
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
# error is None here because stderr is merged into stdout
output, error = p.communicate()
print(output)
Result:
b' some stats then ... <html><body><h1>400 Bad request</h1>\nYour browser sent an invalid request.\n</body></html>\n\r\n'
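To illustrate why subprocess.call is not enough when the output has to be inspected, here is a small sketch comparing it with subprocess.run and stdout=subprocess.PIPE. The child command is a stand-in Python one-liner (an assumption, chosen so the example runs without curl or a network connection).

```python
import subprocess
import sys

# Stand-in child command (assumption): prints one line, like curl prints the response body.
cmd = [sys.executable, "-c", "print('hello from child')"]

# subprocess.call only returns the exit status; the output is discarded here
# to keep the demo quiet, and it is never available to Python either way.
status = subprocess.call(cmd, stdout=subprocess.DEVNULL)

# With stdout=PIPE, subprocess.run hands the child's output back as bytes.
result = subprocess.run(cmd, stdout=subprocess.PIPE)
text = result.stdout.decode("utf-8").strip()

print(status, repr(text))
```

Popen followed by communicate(), as used in the demo, gives the same captured bytes; subprocess.run is just a shorter way to write it.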
I have implemented a retry mechanism which parses the output and retries until correct output is found:
import subprocess, sys
doi = "http://dx.doi.org/10.1103/PhysRevLett.117.126802"
while True:
    p = subprocess.Popen(["curl", "-LH", 'Accept: text/bibliography; style=bibtex', doi],
                         stdout=subprocess.PIPE)
    # error is None here because stderr is not piped
    output, error = p.communicate()
    # decode as UTF-8: decoding as latin-1 garbles non-ASCII characters in author names
    output = output.decode("utf-8", errors="replace")
    if "java.util.concurrent.FutureTask.run" in output:
        # site crashed when responding: junk HTML output: retry
        sys.stderr.write("Wrong answer: retrying\n")
    else:
        print(output)
        break
Result:
Wrong answer: retrying <==== here the site threw a big HTML exception output
@article{Chang_2016, title={Observation of the Quantum Anomalous Hall Insulator to Anderson Insulator Quantum Phase Transition and its Scaling Behavior}, volume={117}, ISSN={1079-7114}, url={http://dx.doi.org/10.1103/PhysRevLett.117.126802}, DOI={10.1103/physrevlett.117.126802}, number={12}, journal={Physical Review Letters}, publisher={American Physical Society (APS)}, author={Chang, Cui-Zu and Zhao, Weiwei and Li, Jian and Jain, J. K. and Liu, Chaoxing and Moodera, Jagadeesh S. and Chan, Moses H. W.}, year={2016}, month={Sep}}
So it works, it's just a site problem, but with my python wrapper you are able to re-submit the request until it yields the proper answer.
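If an endless loop is a concern, the same idea can be written with an upper bound on the number of attempts and a pause between them. This is only a sketch under a few assumptions: fetch_bibtex and retry_until_valid are hypothetical names, the limit of 5 attempts and the 1 second delay are arbitrary, the response is assumed to be UTF-8, and the -s flag is added to silence curl's progress output.

```python
import subprocess
import time

# Marker that identifies the junk HTML page, same string as in the loop above.
JUNK_MARKER = "java.util.concurrent.FutureTask.run"

def fetch_bibtex(doi):
    """Run curl once and return its standard output as text (hypothetical helper)."""
    result = subprocess.run(
        ["curl", "-sLH", "Accept: text/bibliography; style=bibtex", doi],
        stdout=subprocess.PIPE,
    )
    return result.stdout.decode("utf-8", errors="replace")

def retry_until_valid(fetch, max_attempts=5, delay=1.0):
    """Call fetch() until its output lacks the junk marker; return None if every attempt fails."""
    for attempt in range(max_attempts):
        output = fetch()
        if JUNK_MARKER not in output:
            return output
        if attempt < max_attempts - 1:
            time.sleep(delay)  # give the site a moment before asking again
    return None
```

It would be called as retry_until_valid(lambda: fetch_bibtex(doi)); a None result means the site kept returning the junk page. Passing the fetch step in as a function also makes the retry logic easy to try out without touching the network.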
Upvotes: 1