Parsing XML to CSV with Python

Question

I'd like to parse an XML file into CSV and display it in a format like:

I have successfully extracted each element's text into the CSV file. I'd like to match up the namelink and description values into rows, with the text from each element in its own column, as can be seen in the table.

The original XML file:

My current attempt:

# Importing the required libraries
import xml.etree.ElementTree as Xet
import pandas as pd

# Parsing the XML file
xmlparse = Xet.parse('NiktoReportTest.xml')
root = xmlparse.getroot()

cols = ["namelink", "description"]
rows = []


x = []
for elm in root.findall("./niktoscan/scandetails/item/namelink"):
    x.append(elm.text)

y = []
for value in root.findall("./niktoscan/scandetails/item/description"):
    y.append(value.text)


rows.append({"namelink": x,
             "description": y})

df = pd.DataFrame(rows, columns=cols)

# Writing dataframe to csv
df.to_csv('output.csv')

The current output of the CSV file:

,namelink,description
0,"['http://127.0.0.1:80/', 'http://127.0.0.1:80/', 'http://127.0.0.1:80/', 'http://127.0.0.1:80/', 'http://127.0.0.1:80/', 'http://127.0.0.1:80/./', 'http://127.0.0.1:80/./', 'http://127.0.0.1:80//', 'http://127.0.0.1:80//', 'http://127.0.0.1:80/%2e/', 'http://127.0.0.1:80/%2e/', 'http://127.0.0.1:80///etc/hosts', 'http://127.0.0.1:80///', 'http://127.0.0.1:80/server-status', 'http://127.0.0.1:80/?PageServices', 'http://127.0.0.1:80/?wp-cs-dump', 'http://127.0.0.1:80///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////', 'http://127.0.0.1:80///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////', 'http://127.0.0.1:80/wp-content/themes/twentyeleven/images/headers/server.php?filesrc=/etc/hosts', 'http://127.0.0.1:80/wordpresswp-content/themes/twentyeleven/images/headers/server.php?filesrc=/etc/hosts', 'http://127.0.0.1:80/wp-includes/Requests/Utility/content-post.php?filesrc=/etc/hosts', 'http://127.0.0.1:80/wordpresswp-includes/Requests/Utility/content-post.php?filesrc=/etc/hosts', 'http://127.0.0.1:80/wp-includes/js/tinymce/themes/modern/Meuhy.php?filesrc=/etc/hosts', 'http://127.0.0.1:80/wordpresswp-includes/js/tinymce/themes/modern/Meuhy.php?filesrc=/etc/hosts', 'http://127.0.0.1:80/assets/mobirise/css/meta.php?filesrc=', 'http://127.0.0.1:80/login.cgi?cli=aa%20aa%27cat%20/etc/hosts', 'http://127.0.0.1:80/shell?cat+/etc/hosts']","['The anti-clickjacking X-Frame-Options header is not present.', 'The X-XSS-Protection header is not defined. This header can hint to the user agent to protect against some forms of XSS', 'The X-Content-Type-Options header is not set. 
This could allow the user agent to render the content of the site in a different fashion to the MIME type', '/: Directory indexing found.', 'Allowed HTTP Methods: POST, OPTIONS, HEAD, GET ', '/./: Directory indexing found.', ""/./: Appending '/./' to a directory allows indexing"", '//: Directory indexing found.', '//: Apache on Red Hat Linux release 9 reveals the root directory listing by default if there is no index page.', '/%2e/: Directory indexing found.', '/%2e/: Weblogic allows source code or directory listing, upgrade to v6.0 SP1 or higher. BID-2513.', ""///etc/hosts: The server install allows reading of any system file by adding an extra '/' to the URL."", '///: Directory indexing found.', '/server-status: This reveals Apache information. Comment out appropriate line in the Apache conf file or restrict access to allowed sources.', ""/?PageServices: The remote server may allow directory listings through Web Publisher by forcing the server to show all files via 'open directory browsing'. Web Publisher should be disabled. CVE-1999-0269."", ""/?wp-cs-dump: The remote server may allow directory listings through Web Publisher by forcing the server to show all files via 'open directory browsing'. Web Publisher should be disabled. 
CVE-1999-0269."", '///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////: Directory indexing found.', ""///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////: Abyss 1.03 reveals directory listing when 	 /'s are requested."", '/wp-content/themes/twentyeleven/images/headers/server.php?filesrc=/etc/hosts: A PHP backdoor file manager was found.', '/wordpresswp-content/themes/twentyeleven/images/headers/server.php?filesrc=/etc/hosts: A PHP backdoor file manager was found.', '/wp-includes/Requests/Utility/content-post.php?filesrc=/etc/hosts: A PHP backdoor file manager was found.', '/wordpresswp-includes/Requests/Utility/content-post.php?filesrc=/etc/hosts: A PHP backdoor file manager was found.', '/wp-includes/js/tinymce/themes/modern/Meuhy.php?filesrc=/etc/hosts: A PHP backdoor file manager was found.', '/wordpresswp-includes/js/tinymce/themes/modern/Meuhy.php?filesrc=/etc/hosts: A PHP backdoor file manager was found.', '/assets/mobirise/css/meta.php?filesrc=: A PHP backdoor file manager was found.', '/login.cgi?cli=aa%20aa%27cat%20/etc/hosts: Some D-Link router remote command execution.', '/shell?cat+/etc/hosts: A backdoor was identified.']"
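
For reference, the single row above comes from `rows.append(...)` being called once with the two complete lists, so each CSV cell holds a whole list. A minimal sketch of one way to get one row per item is to loop over each `item` element and read its `namelink` and `description` together. The inline XML below is a hypothetical miniature sample, assumed to follow the same `niktoscan/scandetails/item` layout as the XPath in the attempt. The helper names `items_to_rows` and `rows_to_csv` are made up for illustration. It uses the standard-library `csv` module instead of pandas so that it runs without extra packages; the same list of dicts could be passed to `pd.DataFrame(rows, columns=cols)`.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature sample. The real report's layout is assumed to
# match the XPath used in the attempt (niktoscan/scandetails/item).
SAMPLE_XML = """<niktoscans>
  <niktoscan>
    <scandetails>
      <item>
        <description>The anti-clickjacking X-Frame-Options header is not present.</description>
        <namelink>http://127.0.0.1:80/</namelink>
      </item>
      <item>
        <description>//: Directory indexing found.</description>
        <namelink>http://127.0.0.1:80//</namelink>
      </item>
    </scandetails>
  </niktoscan>
</niktoscans>"""


def items_to_rows(root):
    """Build one dict per <item>, so each namelink stays with its description."""
    rows = []
    for item in root.findall("./niktoscan/scandetails/item"):
        rows.append({
            # findtext returns None when the child is missing, hence the fallback.
            "namelink": (item.findtext("namelink") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return rows


def rows_to_csv(rows, cols=("namelink", "description")):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(cols))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


root = ET.fromstring(SAMPLE_XML)
rows = items_to_rows(root)
csv_text = rows_to_csv(rows)
print(csv_text)
```

With the real file, `ET.parse('NiktoReportTest.xml').getroot()` would replace `ET.fromstring(SAMPLE_XML)`, assuming the report matches the layout above.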
