Reputation: 257
I have a "json file" with the following objects:
combinedRecognizedPhrases
recognizedPhrases
json example:
{
"source": "https://example.com",
"timestamp": "2021-04-12T19:34:24Z",
"durationInTicks": 1082400000,
"duration": "PT1M48.24S",
"combinedRecognizedPhrases": [
{
"channel": 0,
"lexical": "aaa",
"itn": "aaa",
"maskedITN": "aaa",
"display": "aaa"
}
],
"recognizedPhrases": [
{
"recognitionStatus": "Success",
"channel": 0,
"speaker": 1,
"offset": "PT2.18S",
"duration": "PT3.88S",
"offsetInTicks": 21800000,
"durationInTicks": 38800000,
"nBest": [
{
"confidence": 0.9306252,
"lexical": "gracias por llamar",
"itn": "gracias por llamar",
"maskedITN": "gracias por llamar",
"display": "¿Gracias por llamar",
"words": [
{
"word": "gracias",
"offset": "PT2.18S",
"duration": "PT0.37S",
"offsetInTicks": 21800000,
"durationInTicks": 3700000,
"confidence": 0.930625
},
{
"word": "por",
"offset": "PT2.55S",
"duration": "PT0.18S",
"offsetInTicks": 25500000,
"durationInTicks": 1800000,
"confidence": 0.930625
},
{
"word": "llamar",
"offset": "PT2.73S",
"duration": "PT0.22S",
"offsetInTicks": 27300000,
"durationInTicks": 2200000,
"confidence": 0.930625
}
]
}
]
},
{
"recognitionStatus": "Success",
"channel": 0,
"speaker": 2,
"offset": "PT6.85S",
"duration": "PT5.63S",
"offsetInTicks": 68500000,
"durationInTicks": 56300000,
"nBest": [
{
"confidence": 0.9306253,
"lexical": "quiero hacer un pago",
"itn": "quiero hacer un pago",
"maskedITN": "quiero hacer un pago",
"display": "quiero hacer un pago"
}
]
},
{
"recognitionStatus": "Success",
"channel": 0,
"speaker": 2,
"offset": "PT13.29S",
"duration": "PT3.81S",
"offsetInTicks": 132900000,
"durationInTicks": 38100000,
"nBest": [
{
"confidence": 0.93062526,
"lexical": "no sé bien la cantidad",
"itn": "no sé bien la cantidad",
"maskedITN": "no sé bien la cantidad",
"display": "no sé bien la cantidad"
}
]
}
]
}
In the example the recognizedPhrases
object has values from 0
to 2
. Each of these values has information describing it:
"recognitionStatus": "Success",
"channel": 0,
"speaker": 1,
"offset": "PT2.18S",
"duration": "PT3.88S",
"offsetInTicks": 21800000,
"durationInTicks": 38800000
There is also an object inside recognizedPhrases
called nBest
which contains the following information:
"confidence": 0.9306252,
"lexical": "thank you for calling",
"itn": "thank you for calling",
"maskedITN": "thank you for calling",
"display": "thank you for calling".
I need to organize the information available in each recognizedPhrases/[0]
or 1
or 2
etc. /nBest/[1]/display
In a DF with a column named "speaker 1" when speaker=1 and another column when "speaker 2".
For example: If recognizedPhrases/[0]
object contains "speaker": 1
and recognizedPhrases/[1]
also has "speaker": 1
those phrases should be concatenated in the speaker 1 column of my df.
Edit 1: I have already tried the following:
with open('file.json','r') as f:
j = json.load(f)
test = pd.json_normalize(j, record_path=['recognizedPhrases'], meta=['source', 'durationInTicks', 'duration'], record_prefix='_')
which give me the following DF:
The problem with this DF is that it has one row for each time each speaker speaks. In the example I am using, speaker1 speaks once, and speaker 2 speaks twice, this code would generate 3 rows, and that is not what I am looking for. In addition, what each speaker says is inside the _nBest
dictionary and requires additional code to get what was said.
What I am trying to get is a DF where all the information is in a single row. Here is an example of what I am looking for:
Comment on @DSteman answer: This approach does one good thing and that is it allows me to separate speakers. However, there are two things I need to improve. First, this approach creates two rows. I need all the information to be in one row. Second, that in the speaker 1 column there is something that speaker 2 said.
Third, with this approach there is a lot of necessary information left out (see above the picture of the output I am looking for).
Upvotes: 3
Views: 76
Reputation: 1668
If the JSON object is assigned to variable 'x' as a dict, the following generates a df where the columns are the speakers and rows are the phrases.
speakers = [i.get('speaker') for i in x.get('recognizedPhrases')]
phrases = [i.get('nBest')[0].get('display') for i in x.get('recognizedPhrases')]
columns = list(set(speakers))
data = [[x[1] for x in zip(speakers, phrases) if x[0] == i] for i in columns]
df = pd.DataFrame(data, columns=columns)
Upvotes: 1