Georgi Stoyanov

Reputation: 644

Python program that reads and extracts specific information from a JSON file generated by ffprobe

I want to create a simple Python script that runs an ffprobe command and then extracts specific information from the JSON file it generates. The ffprobe command is:

ffprobe -v quiet -print_format json -show_format -show_streams example.mp4 > output.json

This command writes the file's stream and format information to output.json. I then want to read that file and pull specific values out of it. For example, the JSON file has the following format:

{
    "streams": [
        {
            "index": 0,
            "codec_name": "h264",
            "codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
            "profile": "High 4:2:2 Intra",
            "codec_type": "video",
            "codec_time_base": "1/100",
            "codec_tag_string": "[0][0][0][0]",
            "codec_tag": "0x0000",
            "width": 3840,
            "height": 2160,
            "coded_width": 3840,
            "coded_height": 2160,
            "has_b_frames": 0,
            "sample_aspect_ratio": "1:1",
            "display_aspect_ratio": "16:9",
            "pix_fmt": "yuv422p10le",
            "level": 52,
            "color_range": "tv",
            "color_space": "bt709",
            "color_transfer": "bt709",
            "color_primaries": "bt709",
            "chroma_location": "left",
            "field_order": "progressive",
            "refs": 1,
            "is_avc": "false",
            "nal_length_size": "0",
            "r_frame_rate": "50/1",
            "avg_frame_rate": "50/1",
            "time_base": "1/50",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 15000,
            "duration": "300.000000",
            "bits_per_raw_sample": "10",
            "disposition": {
                "default": 0,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0
            },
            "tags": {
                "file_package_umid": "0x060A2B340101010501010D0013EC94F152947134B6EC94F10052947134B6EC01",
                "file_package_name": "Source Package"
            }
        }
    ],
    "format": {
        "filename": "300sec.mxf",
        "nb_streams": 1,
        "nb_programs": 0,
        "format_name": "mxf",
        "format_long_name": "MXF (Material eXchange Format)",
        "start_time": "0.000000",
        "duration": "300.000000",
        "size": "16772788991",
        "bit_rate": "447274373",
        "probe_score": 100,
        "tags": {
            "uid": "adab4424-2f25-4dc7-92ff-29bd000c0000",
            "generation_uid": "adab4424-2f25-4dc7-92ff-29bd000c0001",
            "company_name": "FFmpeg",
            "product_name": "OP1a Muxer",
            "product_version": "57.66.101",
            "product_uid": "adab4424-2f25-4dc7-92ff-29bd000c0002",
            "modification_date": "0-01-02T00:00:00.000000Z",
            "material_package_umid": "0x060A2B340101010501010D0013EC94F152947134B6EC94F10052947134B6EC00",
            "timecode": "00:00:00:00"
        }
    }
}

Ideally, the Python script would ask for the path to the input file when it runs.

Unfortunately I have very little experience with Python, but it comes pre-installed on my system and I want to use it. The other option is to write a bash script that does the same thing. Any suggestions will be greatly appreciated.

So far I have figured out how to open the JSON file and extract data from it:

import json

# The with statement closes the file automatically, so no explicit close() is needed.
with open('output.json') as json_data:
    data = json.load(json_data)

print(data["streams"][0]["codec_name"])
print(data["streams"][0]["profile"])

Upvotes: 1

Views: 786

Answers (1)

Rahul R

Reputation: 31

Try this:

import json
import sys

# Pass the file name as a command-line argument.
# Reading the whole JSON at once is fine as long as the file isn't too large.
with open(sys.argv[1], 'r') as f:
    file_content = f.read()

# Parse the JSON text into a dictionary.
data = json.loads(file_content)
print(data["streams"][0]["codec_name"])
print(data["streams"][0]["profile"])
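Indexing data["streams"][0] assumes that the first stream is the one you want and that every key is present; a file with several streams, or a stream without a "profile" entry, would print the wrong values or raise a KeyError. A minimal sketch of a more defensive lookup follows. The helper names first_stream_of_type and describe are made up for illustration, and the sample dict is a trimmed copy of the JSON from the question:

```python
def first_stream_of_type(data, codec_type="video"):
    """Return the first stream dict whose codec_type matches, or None."""
    for stream in data.get("streams", []):
        if stream.get("codec_type") == codec_type:
            return stream
    return None


def describe(stream, keys=("codec_name", "profile", "width", "height")):
    """Collect the requested keys; missing keys map to None instead of raising KeyError."""
    return {key: stream.get(key) for key in keys}


# Trimmed copy of the ffprobe output shown in the question (illustration only).
sample = {
    "streams": [
        {"index": 0, "codec_name": "h264", "profile": "High 4:2:2 Intra",
         "codec_type": "video", "width": 3840, "height": 2160}
    ],
    "format": {"filename": "300sec.mxf", "duration": "300.000000"},
}

video = first_stream_of_type(sample)
if video is not None:
    for key, value in describe(video).items():
        print(f"{key}: {value}")
```

In the script above, the two helpers could stand in for the final two print lines, with data passed in place of sample; the way the script is invoked does not change.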

Run as:

python script.py /path/to/output.json
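To skip the intermediate output.json and have the script ask for the input path, as the question describes, ffprobe can also be started from Python with the standard subprocess module. This is only a sketch: it assumes ffprobe is on the PATH and Python 3.7 or newer (for capture_output), and the function names are hypothetical.

```python
import json
import subprocess


def build_ffprobe_command(path):
    """Same flags as the command in the question, minus the shell redirect."""
    return ["ffprobe", "-v", "quiet", "-print_format", "json",
            "-show_format", "-show_streams", path]


def probe(path):
    """Run ffprobe on path and return its JSON output as a dict.

    check=True raises subprocess.CalledProcessError if ffprobe exits
    with a non-zero status (for example, when the file cannot be read).
    """
    result = subprocess.run(build_ffprobe_command(path),
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def main(ask=input):
    """Prompt for a media file and print two fields from its first stream."""
    path = ask("Path to the input file: ").strip()
    data = probe(path)
    stream = data["streams"][0]
    print(stream.get("codec_name"))
    print(stream.get("profile"))

# Call main() at the bottom of the script to run it interactively.
```

Passing the command as a list rather than a single shell string means paths containing spaces need no extra quoting.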

Upvotes: 1
