CatGirl19

Reputation: 209

Nesting with xml and python

I'm trying to access the description tag within the supplemental-guidance tag. I'm currently printing out the following information for each NIST control: number, title, baseline-impact and priority. Now I'm trying to print out the description within the supplemental-guidance tag, but I can't seem to access it on its own without matching all description tags. Thank you for the help in advance! (complete NIST file here)

Note: Not all controls have 3 baseline-impact tags, so saying Element[8][0] won't work.

xmlFile.xml


<?xml version="1.0" encoding="UTF-8"?>
<controls>
  <control>
    <family>ACCESS CONTROL</family>
    <number>AC-1</number>
    <title>ACCESS CONTROL POLICY AND PROCEDURES</title>
    <priority>P1</priority>
    <baseline-impact>LOW</baseline-impact>
    <baseline-impact>MODERATE</baseline-impact>
    <baseline-impact>HIGH</baseline-impact>
    <statement>
      <description>The organization:</description>
      <statement>
        <number>AC-1a.</number>
        <description>
        Develops, documents, and disseminates to [Assignment: organization-defined personnel or roles]:
        </description>
        <statement>
          <number>AC-1a.1.</number>
          <description>
          An access control policy that addresses purpose, scope, roles, responsibilities, management commitment, coordination among organizational entities, and compliance; and
          </description>
        </statement>
        <statement>
          <number>AC-1a.2.</number>
          <description>
          Procedures to facilitate the implementation of the access control policy and associated access controls; and
          </description>
        </statement>
      </statement>
      <statement>
        <number>AC-1b.</number>
        <description>Reviews and updates the current:</description>
        <statement>
          <number>AC-1b.1.</number>
          <description>
          Access control policy [Assignment: organization-defined frequency]; and
          </description>
        </statement>
        <statement>
          <number>AC-1b.2.</number>
          <description>
          Access control procedures [Assignment: organization-defined frequency].
          </description>
        </statement>
      </statement>
    </statement>
    <supplemental-guidance>
      <description>
      This control addresses the establishment of policy and procedures for the effective implementation of selected security controls and control enhancements in the AC family. Policy and procedures reflect applicable federal laws, Executive Orders, directives, regulations, policies, standards, and guidance. Security program policies and procedures at the organization level may make the need for system-specific policies and procedures unnecessary. The policy can be included as part of the general information security policy for organizations or conversely, can be represented by multiple policies reflecting the complex nature of certain organizations. The procedures can be established for the security program in general and for particular information systems, if needed. The organizational risk management strategy is a key factor in establishing policy and procedures.
      </description>
      <related>PM-9</related>
    </supplemental-guidance>
    <references>
      <reference>
        <item xml:lang="en-US" href="https://csrc.nist.gov/publications/search?keywords-lg=800-12">NIST Special Publication 800-12</item>
      </reference>
      <reference>
        <item xml:lang="en-US" href="https://csrc.nist.gov/publications/search?keywords-lg=800-100">NIST Special Publication 800-100</item>
      </reference>
    </references>
  </control>
</controls>

ExportXMLtoExcel.py

import xml.etree.ElementTree as ET 
import csv


xmlFile='/Users/username/Desktop/xmlFile.xml'
tree = ET.parse(xmlFile) 
root = tree.getroot()

# open a file for writing
excelFile = open('/Users/username/Desktop/security_controls.csv', 'w', newline='')

# creates the csv writer object / variable to write to csv
csvwriter = csv.writer(excelFile)
# list that contains the header
list_head = []
count = 0

for element in root.findall('control'):
    list_nodes=[]
    # address_list = []
    if count == 0:
        number = element.find('number').tag
        list_head.append(number)
        title = element.find('title').tag
        list_head.append(title)
        priority = element.find('priority').tag
        list_head.append(priority)

        # baseline_impact = element.find('baseline-impact').tag
        # list_head.append(baseline_impact)

        baseline_impact = element[4].tag
        list_head.append(baseline_impact)

        supplemental_guidance = element.find('supplemental-guidance').tag
        list_head.append(supplemental_guidance)

        reference = element.find('references').tag
        list_head.append(reference)

        csvwriter.writerow(list_head)
        count = count + 1

    number = element.find('number').text
    list_nodes.append(number)

    title = element.find('title').text
    list_nodes.append(title)

    if element.find('priority') is not None:
        priority = element.find('priority').text
        list_nodes.append(priority)
    else:
        priority = 'none'
        list_nodes.append(priority)

    # collect every baseline-impact value; a control may have fewer than three
    impacts = [impact.text for impact in element.findall('baseline-impact')]
    if impacts:
        baseline_impact = ', '.join(impacts)
    else:
        baseline_impact = 'NONE'
    list_nodes.append(baseline_impact)

    if element.find('supplemental-guidance') is not None:
        # trying to drill into the nested elements within the 'supplemental-guidance' tag
        # and print out the description
        pass

    csvwriter.writerow(list_nodes)
excelFile.close()

Upvotes: 0

Views: 55

Answers (1)

salparadise

Reputation: 5805

I'm trying to access the description tag within the supplemental-guidance tag

You can use XPath to do exactly this by targeting the specific node.

'.//supplemental-guidance/description' should get you what you need.

Demo using your xml output:

In [20]: tree = ET.parse('/tmp/so.xml')

In [21]: root = tree.getroot()

In [22]: for element in root.findall('control'):
    ...:     print(element.find('.//supplemental-guidance/description').text)
    ...:

      This control addresses the establishment of policy and procedures for the effective implementation of selected security controls and control enhancements in the AC family. Policy and procedures reflect applicable federal laws, Executive Orders, directives, regulations, policies, standards, and guidance. Security program policies and procedures at the organization level may make the need for system-specific policies and procedures unnecessary. The policy can be included as part of the general information security policy for organizations or conversely, can be represented by multiple policies reflecting the complex nature of certain organizations. The procedures can be established for the security program in general and for particular information systems, if needed. The organizational risk management strategy is a key factor in establishing policy and procedures.
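
If some controls have no supplemental-guidance at all, find() returns None and calling .text on it raises an AttributeError. Here is a minimal sketch of one way to guard against that with findtext() and a default, and to strip the surrounding whitespace before writing the value to the CSV. The helper name guidance_text, the shortened sample XML and the second control (XX-0) are made up for illustration, and the direct-child path assumes supplemental-guidance sits directly under control, as it does in your sample.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for xmlFile.xml. The second control (XX-0) is invented
# here purely to exercise the "no supplemental-guidance" case.
SAMPLE = """<controls>
  <control>
    <number>AC-1</number>
    <statement>
      <description>The organization:</description>
    </statement>
    <supplemental-guidance>
      <description>
      This control addresses the establishment of policy and procedures.
      </description>
      <related>PM-9</related>
    </supplemental-guidance>
  </control>
  <control>
    <number>XX-0</number>
    <statement>
      <description>Placeholder control without guidance.</description>
    </statement>
  </control>
</controls>"""


def guidance_text(control, default="NONE"):
    """Return the stripped supplemental-guidance description, or default."""
    # findtext() returns `default` when nothing matches the path, so there is
    # no None to check. The path only follows direct children, which keeps
    # the descriptions nested under <statement> out of the result.
    text = control.findtext("supplemental-guidance/description", default=default)
    # An empty or whitespace-only description also falls back to the default.
    return text.strip() or default


root = ET.fromstring(SAMPLE)
rows = [(c.findtext("number"), guidance_text(c)) for c in root.findall("control")]
for number, guidance in rows:
    print(number, "|", guidance)
```

In your loop this would become list_nodes.append(guidance_text(element)) in place of the empty if block.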

Upvotes: 2
