Saugat Mukherjee

Reputation: 990

Split a string with delimiters in the value

I am trying to parse the response from a REST call. The response headers come back as a dictionary. The ultimate goal is to decode all the properties (the value of x-ms-properties) to strings.

The response looks like this:

{'Last-Modified': 'Mon, 06 May 2019 09:32:13 GMT', 'ETag': '"0x8D6D205B880F304"', 'Server': 'abc', 'x-ms-properties': 'anotherprop=dGVzdA==,source=YWJj', 'x-ms-namespace-enabled': 'true', 'x-ms-request-id': '45839301-401f-0003-1202-04d929000000', 'x-ms-version': '2018-03-28', 'Date': 'Mon, 06 May 2019 11:54:29 GMT'}

I would like to parse the value of the key x-ms-properties. As you can see, it is a comma-separated list of key=value pairs, and each value is base64 encoded.

I can decode a hard-coded value such as dGVzdA== with the following code:

import base64
b1="dGVzdA=="
# Decoding the Base64 bytes
d = base64.b64decode(b1)
# Decoding the bytes to string
s2 = d.decode("UTF-8")
print(s2)

But how do I parse the response and then do this generically?

I have read other forum posts and tried something like this:

originalresp={'Last-Modified': 'Mon, 06 May 2019 09:32:13 GMT', 'ETag': '"0x8D6D205B880F304"', 'Server': 'abc', 'x-ms-properties': 'anotherprop=dGVzdA==,source=YWJj', 'x-ms-namespace-enabled': 'true', 'x-ms-request-id': '45839301-401f-0003-1202-04d929000000', 'x-ms-version': '2018-03-28', 'Date': 'Mon, 06 May 2019 11:54:29 GMT'}

properties=originalresp["x-ms-properties"]

dict(item.split("=") for item in properties.split(","))

But of course it fails, because the base64 padding puts extra "=" characters in the value (here "==").

How can I get the value for each key and then proceed to the decoding?

Upvotes: 0

Views: 449

Answers (2)

Devesh Kumar Singh

Reputation: 20490

The only thing missing in your code is telling split("=") to split on the first equals sign only, which you can do with item.split("=", 1).

From the docs: https://docs.python.org/3/library/stdtypes.html#str.split

str.split(sep=None, maxsplit=-1)
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements).
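To make the difference concrete, here is a small sketch using one pair from the question. The variable names (pair, split_all, split_once) are only illustrative and not part of the original code.

```python
pair = "anotherprop=dGVzdA=="

# Without maxsplit, every "=" is a split point, including the base64 padding,
# so dict() would reject this item because it is not a 2-element pair.
split_all = pair.split("=")
print(split_all)   # ['anotherprop', 'dGVzdA', '', '']

# With maxsplit=1, only the first "=" is used and the padding stays in the value.
split_once = pair.split("=", 1)
print(split_once)  # ['anotherprop', 'dGVzdA==']
```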

So making this change, we see

originalresp={'Last-Modified': 'Mon, 06 May 2019 09:32:13 GMT', 'ETag': '"0x8D6D205B880F304"', 'Server': 'abc', 'x-ms-properties': 'anotherprop=dGVzdA==,source=YWJj', 'x-ms-namespace-enabled': 'true', 'x-ms-request-id': '45839301-401f-0003-1202-04d929000000', 'x-ms-version': '2018-03-28', 'Date': 'Mon, 06 May 2019 11:54:29 GMT'}

properties=originalresp["x-ms-properties"]

#Changed the split on equals here with maxsplit=1
dct = dict(item.split("=",1) for item in properties.split(","))
print(dct)

The output will be

{'anotherprop': 'dGVzdA==', 'source': 'YWJj'}

Now your original code will work as expected :)

import base64
# Decoding the Base64 bytes
d = base64.b64decode(dct['anotherprop'])
# Decoding the bytes to string
s2 = d.decode("UTF-8")
print(s2)

The output will be test
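To decode every property generically rather than one key at a time, the two steps can be combined. This is a minimal sketch, assuming every value in x-ms-properties is base64-encoded UTF-8 text as in the question; decode_properties is a hypothetical helper name, not something from a library.

```python
import base64

def decode_properties(raw):
    """Turn 'k1=<base64>,k2=<base64>' into {'k1': 'text', 'k2': 'text'}.

    Hypothetical helper; assumes each value is base64-encoded UTF-8 text.
    """
    decoded = {}
    for item in raw.split(","):
        if not item:
            continue  # skip empty pieces, e.g. when the header is empty
        # Split on the first "=" only, so base64 padding stays in the value.
        key, value = item.split("=", 1)
        decoded[key.strip()] = base64.b64decode(value).decode("utf-8")
    return decoded

props = decode_properties("anotherprop=dGVzdA==,source=YWJj")
print(props)  # {'anotherprop': 'test', 'source': 'abc'}
```

With the response from the question, you would call it as decode_properties(originalresp["x-ms-properties"]).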

Upvotes: 0

Rakesh

Reputation: 82765

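If the response reaches you as a string rather than a dict, you can parse it with the ast module (ast.literal_eval) and then look up the key.

Example: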

import ast

originalresp="""{'Last-Modified': 'Mon, 06 May 2019 09:32:13 GMT', 'ETag': '"0x8D6D205B880F304"', 'Server': 'abc', 'x-ms-properties': 'anotherprop=dGVzdA==,source=YWJj', 'x-ms-namespace-enabled': 'true', 'x-ms-request-id': '45839301-401f-0003-1202-04d929000000', 'x-ms-version': '2018-03-28', 'Date': 'Mon, 06 May 2019 11:54:29 GMT'}"""
originalresp = ast.literal_eval(originalresp)
print(originalresp["x-ms-properties"])

Output:

anotherprop=dGVzdA==,source=YWJj

Upvotes: 1
