Reputation: 141
I saved my network data in a HAR file. Now I want to extract every dictionary whose content contains a specific word, and save those dictionaries to an array. The HAR file contains multiple similar dicts with that value, and I want to build an array of all the matching responses.
I am fairly new to Python (and coding in general), so an explain-like-I'm-five kind of explanation would help me a lot.
Upvotes: 14
Views: 24233
Reputation: 12992
You can use the haralyzer module. You can install it easily using pip like so:
pip install haralyzer
The following code uses this sample har file:
>>> import json
>>> from haralyzer import HarParser, HarPage
>>>
>>> with open('sample.har', 'r', encoding='utf-8') as f:
...     har_parser = HarParser(json.loads(f.read()))
>>>
>>> data = har_parser.har_data
>>> type(data)
<class 'dict'>
>>>
>>> data.keys()
dict_keys(['version', 'creator', 'pages', 'entries'])
>>>
>>> har_parser.har_data["pages"]
[{'startedDateTime': '2013-08-24T20:16:16.997Z', 'id': 'page_1', 'title': 'http://ericduran.github.io/chromeHAR/', 'pageTimings': {'onContentLoad': 317, 'onLoad': 406}}]
For more info, check the official GitHub repository.
Upvotes: 17
Reputation: 2529
Tacking on to the answer from Anwarvic, entries in the HAR file that have a text-based content type contain the actual content in the key entry -> response -> content -> text. So, here is an example printing the content of all such entries.
# .... initialize har parser as per documentation ....
for page in har_parser.pages:
    for entry in page.entries:
        # Be careful accessing the text property: it will not exist for non text-based responses.
        print(entry['response']['content'].get('text', ''))
From there you can use the in operator or a regex to see if the response text of the entry matches the text you are looking for.
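To tie this back to the original question, here is a minimal sketch that collects every matching entry into a list. It uses only the standard json module rather than haralyzer, and it assumes the usual HAR layout where entries live under log -> entries. The names load_har and find_entries are made up for this example, and it does not decode bodies that the HAR file stores base64-encoded.

```python
import json


def load_har(path):
    """Read a HAR file from disk and return it as a plain dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def find_entries(har, keyword):
    """Return a list of every entry dict whose response text contains keyword."""
    matches = []
    for entry in har["log"]["entries"]:
        content = entry.get("response", {}).get("content", {})
        # "text" is missing for non text-based responses, so fall back to "".
        text = content.get("text") or ""
        if keyword in text:
            matches.append(entry)
    return matches
```

You would call it as find_entries(load_har('sample.har'), 'your_word'). Each item in the returned list is the whole entry dict (request, response, timings), so you can pull out whatever part you need afterwards.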
Upvotes: 1