Ashutosh Kumar

Reputation: 109

How to store the header data of a DICOM file in a pandas dataframe?

I am trying to read DICOM files using pydicom in Python and want to store the header data into a pandas dataframe. How do I extract the data element value for this purpose?

So far I have created a dataframe whose columns are the tag names in the DICOM file. I can access each data element, but I only need to store its value, not the entire element. I tried converting the element to a string and splitting it, but that does not work either, because the string representations of different tags have different lengths.

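import pydicom as dicom
import pandas as pd

refDs = dicom.dcmread('000000.dcm')
info_header = refDs.dir()

# Empty dataframe with one column per tag name
df = pd.DataFrame(columns=info_header)
print(df)

info_data = []
for i in info_header:
    if i in refDs:
        # String form of the whole data element, split on spaces
        info_data.append(str(refDs.data_element(i)).split(" ")[0])

print(info_data[0], len(info_data))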

I put the data elements in a list because I could not put them into the dataframe directly. The output of the above code is:

(0008, 0050) Accession Number                    SH: '1091888302507299' 89

But I only want to store the data inside the quotes.

Upvotes: 3

Views: 5479

Answers (1)

gil-c

Reputation: 51

This works for me:

import pydicom as dicom
import pandas as pd

ds = dicom.dcmread('path_to_file')  # dcmread replaces the deprecated read_file
df = pd.DataFrame(ds.values())
df[0] = df[0].apply(lambda x: dicom.dataelem.DataElement_from_raw(x) if isinstance(x, dicom.dataelem.RawDataElement) else x)
df['name'] = df[0].apply(lambda x: x.name)
df['value'] = df[0].apply(lambda x: x.value)
df = df[['name', 'value']]

Finally, you can transpose it so that each tag name becomes a column:

df = df.set_index('name').T.reset_index(drop=True)

Nested fields would require more work if you also need them.
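
For the nested fields, one possible approach is a small recursive helper that flattens sequence elements into indexed column names. This is only a minimal sketch, not tested against real files: `flatten_header` and `Elem` are hypothetical names, and `Elem` is a stand-in so the example runs without a DICOM file. It assumes that iterating a pydicom dataset yields elements with `keyword`, `VR` and `value` attributes, and that a sequence element (VR "SQ") holds a list of datasets. The Modality and UID values are made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a pydicom data element, so the sketch runs
# without a DICOM file. Only keyword, VR and value are modelled.
Elem = namedtuple("Elem", ["keyword", "VR", "value"])


def flatten_header(elements, prefix=""):
    """Return {column_name: value} for an iterable of element-like objects.

    Sequence elements (VR == "SQ") are expanded recursively; each item gets
    an index in its column name, e.g. "SomeSequence[0].SomeKeyword".
    """
    row = {}
    for elem in elements:
        name = prefix + elem.keyword
        if elem.VR == "SQ":
            # A sequence holds a list of nested datasets; recurse into each.
            for i, item in enumerate(elem.value):
                row.update(flatten_header(item, prefix=f"{name}[{i}]."))
        else:
            row[name] = elem.value
    return row


# Illustrative header: the accession number is from the question,
# the other values are invented for the example.
header = [
    Elem("AccessionNumber", "SH", "1091888302507299"),
    Elem("Modality", "CS", "CT"),
    Elem("ReferencedStudySequence", "SQ", [
        [Elem("ReferencedSOPClassUID", "UI", "1.2.3")],
    ]),
]

row = flatten_header(header)
print(row)
```

With a real file, something like `pd.DataFrame([flatten_header(ds)])` should then give one row per file, if the assumptions above hold. Private elements may have an empty keyword, and large elements such as pixel data are probably worth skipping, so some filtering inside the loop may be needed.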

Upvotes: 4
