user3206440
user3206440

Reputation: 5059

python - asn1 parsed text to json

With text given in this link, need to extract data as follows

  1. Each record starts with YYYY Mmm dd hh:mm:ss.ms, for example 2019 Aug 31 09:17:36.550
  2. Each record has a header starting from line #1 above and ending with a blank line
  3. The record data is contained in lines below Interpreted PDU:
  4. The records of interest are the ones with record header first line having 0xB821 NR5G RRC OTA Packet -- RRC_RECONFIG

Is it possible to extract selected record headers and text below #3 above as an array of nested json in the format as below - snipped for brevity, really need to have the entire text data as JSON.

data = [{"time": "2019 Aug 31  09:17:36.550", "PDU Number": "RRC_RECONFIG Message", "Physical Cell ID": 0, "rrc-TransactionIdentifier": 1, "criticalExtensions rrcReconfiguration": {"secondaryCellGroup": {"cellGroupId": 1, "rlc-BearerToAddModList": [{"logicalChannelIdentity": 1, "servedRadioBearer drb-Identity": 2, "rlc-Config am": {"ul-AM-RLC": {"sn-FieldLength": "size18", "t-PollRetransmit": "ms40", "pollPDU": "p32", "pollByte": "kB25", "maxRetxThreshold": "t32"}, "dl-AM-RLC": {"sn-FieldLength": "size18", "t-Reassembly": "ms40", "t-StatusProhibit": "ms20"}}}]}}  }, next records data here]

Note that the input text is parsed output of ASN1 data specifications in 3GPP 38.331 section 6.3.2. I'm not sure normal python text parsing is the right way to handle this or should one use something like asn1tools library ? If so an example usage on this data would be helpful.

Upvotes: 1

Views: 3636

Answers (2)

Gerd
Gerd

Reputation: 2813

Given that you have a textual representation of the input data, you might take a look at the parse library. This allows you to find a pattern in a string and assign contents to variables.

Here is an example for extracting the time, PDU Number and Physical Cell ID data fields:

import parse

with open('w9s2MJK4.txt', 'r') as f:
  input = f.read()

data = []
pattern = parse.compile('\n{year:d} {month:w} {day:d}  {hour:d}:{min:d}:{sec:d}.{ms:d}{}Physical Cell ID = {pcid:d}{}PDU Number = {pdu:w} {pdutype:w}')

for s in pattern.findall(input):
  record = {}
  record['time'] = '{} {} {} {:02d}:{:02d}:{:02d}.{:03d}'.format(s.named['year'], s.named['month'], s.named['day'], s.named['hour'], s.named['min'], s.named['sec'], s.named['ms'])
  record['PDU Number'] = '{} {}'.format(s.named['pdu'], s.named['pdutype'])
  record['Physical Cell ID'] = s.named['pcid']
  data.append(record)

Since you have quite a complicated structure and a large number of data fields, this might become a bit cumbersome, but personally I would prefer this approach over regular expressions. Maybe there is also a smarter method to parse the date (which unfortunately seems not to have one of the standard formats supported by the library).

Upvotes: 0

YaFred
YaFred

Reputation: 10008

Unfortunately, it is unlikely that somebody will come with a straight answer to your question (which is very similar to How to extract data from asn1 data file and load it into a dataframe?)

The text of your link is obviously a log file where ASN.1 value notation was used to make the messages human readable. So trying to decode these messages from their textual form is unusual and you will probably not find tooling for that.

In theory, the generic method would be this one:

  1. Gather the ASN.1 DEFINITIONS (schema) that were used to create the ASN.1 messages
  2. Compile these DEFINITIONS with an ASN.1 tool (aka compiler) to generate an object model in your favorite language (python). The tool would provide the specific code to encode and decode ... you would use ASN.1 values decoders.
  3. Add your custom code (either to the object model or plugged in the ASN.1 compiler) to encode your JSON objects

As you see, it is a very long shot (I can expand if this explanation is too short or unclear)

Unless your task is repetivite and/or the number of messages is big, try the methods you already know (manual search, regex) to search the log file.

If you want to see what it takes to create ASN.1 tools, you can find a few (not that many as ASN.1 is not particularly young and popular). Check out https://github.com/etingof/pyasn1 (python)

I created my own for fun in Java and I am adding the ASN.1 value decoders to illustrate my answer: https://github.com/yafred/asn1-tool (branch text-asn-value-support)

Upvotes: 2

Related Questions