Reputation: 575
Not sure if this is a rookie mistake, but I am facing a strange issue. I have a Unicode string, classifier = u"''", which I am checking for emptiness.
The following code block:
    if classifier:
        pass  # do something
    else:
        pass  # else do something else
never reaches the else block, because the string contains the two characters '' and is therefore non-empty. I don't have control over the source generating the classifier string.
If I could somehow get at what is inside the embedded '', I could check that for emptiness, but I am not sure how. If it helps, classifier is read from the HttpRequest object: classifier = request.GET.get('c', '').
EDIT:
classifier[1:-1] returns u'', which can now be checked for emptiness. Is there a built-in method I can use instead?
I will go with this approach for now, but I am leaving the post open for any better suggestions.
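One caveat with the slice: it drops the first and last character whatever they are, so a value that arrives without the quotes is silently truncated. A minimal sketch of a guarded variant, assuming the source wraps a value in at most one pair of single quotes; unwrap_single_quotes and the sample value jdk15 are made up for illustration:

```python
def unwrap_single_quotes(value):
    """Drop one pair of wrapping single quotes, if present (hypothetical helper)."""
    if len(value) >= 2 and value[0] == "'" and value[-1] == "'":
        return value[1:-1]
    return value


# The plain slice works on the quoted empty string...
print(repr("''"[1:-1]))                     # ''
# ...but eats real characters when the value is not quoted.
print(repr("jdk15"[1:-1]))                  # 'dk1'

# The guarded version leaves unquoted values alone.
print(repr(unwrap_single_quotes("''")))     # ''
print(repr(unwrap_single_quotes("jdk15")))  # 'jdk15'
```

Whether one pair of quotes is really what the source adds is still a guess until more sample values are seen.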
thanks,
Upvotes: 0
Views: 94
Reputation: 365925
You have to actually know what the data means before you can decide how to parse it. Just randomly hacking at it until it works for one example isn't going to help.
So, you're getting the string out of a URL, and it looks like this:
http:///a=maven&v=1.1.0&classifier=''&ttype=pom
Normally, when given a URL, the right thing to do is call urlparse.urlparse and then call urlparse.parse_qs on the query. But that won't actually help here, because this is not actually a valid URL.
Well, it is a valid URL, but it's one with a path <someurl>/a=maven&v=1.1.0&classifier=''&ttype=pom, not one with a path <someurl>/ and a query a=maven&v=1.1.0&classifier=''&ttype=pom. You need a ? to set off the query.
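To see the difference the ? makes, here is a small sketch. It uses the Python 3 names in urllib.parse (in Python 2 the same functions live in the urlparse module), and example.com is a placeholder host, since the real one isn't shown:

```python
from urllib.parse import urlparse, parse_qs  # Python 2: from urlparse import urlparse, parse_qs

# example.com is a placeholder; the host in the original URL is not shown.
without_q = "http://example.com/a=maven&v=1.1.0&classifier=''&ttype=pom"
with_q = "http://example.com/?a=maven&v=1.1.0&classifier=''&ttype=pom"

# Without the "?", everything after the host is treated as the path.
parts = urlparse(without_q)
print(repr(parts.path))    # "/a=maven&v=1.1.0&classifier=''&ttype=pom"
print(repr(parts.query))   # ''

# With the "?", the parameters are parsed, but the quotes are still
# part of the value: classifier is the two-character string ''.
params = parse_qs(urlparse(with_q).query)
print(params["classifier"])  # ["''"]
```

So even with a well-formed query, the parser hands back the quotes literally; they have to be dealt with separately.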
And, on top of that, the query is clearly not generated correctly. You don't quote empty strings in a query. You don't quote anything; you percent-escape special characters (and, if the URL is embedded in HTML, entity-escape the ampersands). So, unless the URL literally means that the classifier is '' rather than the empty string, it's wrong.
And, if it weren't wrong, you wouldn't be asking these questions.
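For comparison, this is roughly what a correctly generated query looks like. A sketch using urlencode from Python 3's urllib.parse (urllib.urlencode in Python 2), with the parameter names taken from the URL above:

```python
from urllib.parse import urlencode, parse_qs

# An empty value is encoded as nothing at all after the "=".
query = urlencode([("a", "maven"), ("v", "1.1.0"), ("classifier", ""), ("ttype", "pom")])
print(query)  # a=maven&v=1.1.0&classifier=&ttype=pom

# parse_qs drops blank values by default; keep_blank_values=True keeps them.
print(parse_qs(query).get("classifier"))                      # None
print(parse_qs(query, keep_blank_values=True)["classifier"])  # ['']

# If the value really were two quote characters, they would be percent-escaped.
literal = urlencode({"classifier": "''"})
print(literal)  # classifier=%27%27
```

Either way, a receiver that does request.GET.get('c', '') on a correctly encoded empty value gets an empty string, and the plain truthiness check works.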
If you have any control over how these URLs are getting generated, obviously you want to get that fixed. If you can't control it, but at least know how they're being generated, you can write code to reverse that to get the original values. But if you don't even know that, you have to guess.
You ideally need more than one example to guess. Are they quoting just empty strings, or are they also, e.g., quoting strings with " characters or spaces or ampersands in them? If it's the latter, you can probably just strip("'"), but if it's the former, that will be incorrect in any case where the original data actually starts or ends with quotes.
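To make the risk concrete, here is what strip("'") does to a few values. The sample strings are invented for illustration, not taken from the real source:

```python
# Invented sample values; the real source may never produce some of these.
samples = ["''", "'sources'", "it's", "rock'n'", "''''"]

# str.strip("'") removes every leading and trailing quote, never interior ones.
stripped = {raw: raw.strip("'") for raw in samples}
for raw in samples:
    print(repr(raw), "->", repr(stripped[raw]))
```

The first two come out as you'd hope (empty string and sources), and an interior quote as in it's is left alone. But rock'n' loses a trailing quote that may have been real data, and '''' collapses to the empty string, which is only right if that is not how the source encodes a value that is itself ''.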
Upvotes: 1
Reputation: 6978
You could do this:
    if classifier.strip("'"):
        pass  # do something
    else:
        pass  # else do something else
Upvotes: 2