Sc-python-leaner

Reputation: 259

For loop into a pandas dataframe

I have the following piece of code. It works and prints out the data as it should. I'm trying (unsuccessfully) to put the results into a dataframe so I can export them to a CSV file. I am looping through a JSON file and the results are correct; I just need the two columns that are printed to go into a dataframe instead. I took out the code that was causing the error so that it will run.

import json
import requests
import re 
import pandas as pd

data = {}
df = pd.DataFrame(columns=['subtechnique', 'name'])
df

RE_FOR_SUB_TECHNIQUE = r"(T\d+)\.(\d+)"
r = requests.get('https://raw.githubusercontent.com/mitre/cti/master/enterprise-attack/enterprise-attack.json', verify=False)

data = r.json()

objects = data['objects']
for obj in objects:
    ext_ref = obj.get('external_references',[])
    revoked = obj.get('revoked') or '*****'
    subtechnique = obj.get('x_mitre_is_subtechnique')
    name = obj.get('name')    
    for ref in ext_ref:
        ext_id = ref.get('external_id') or ''
        if ext_id:
            re_match = re.match(RE_FOR_SUB_TECHNIQUE, ext_id)
            if re_match:
                technique = re_match.group(1)
                sub_technique = re_match.group(2)
                print('{},{}'.format(technique+'.'+sub_technique, name))
     

Alternatively, is there an easier way to take each row produced by the loop and append it to a CSV file?

Any help is appreciated.

Thanks

Upvotes: 1

Views: 45

Answers (1)

Edunne

Reputation: 234

In this instance, it's likely easier to write the CSV file directly with the standard-library csv module, rather than go through pandas:

import csv

# newline="" lets the csv module control line endings (avoids blank rows on Windows)
with open("enterprise_attack.csv", "w", newline="") as f:
    my_writer = csv.writer(f)
    for obj in objects:
        ext_ref = obj.get('external_references',[])
        revoked = obj.get('revoked') or '*****'
        subtechnique = obj.get('x_mitre_is_subtechnique')
        name = obj.get('name')
        for ref in ext_ref:
            ext_id = ref.get('external_id') or ''
            if ext_id:
                re_match = re.match(RE_FOR_SUB_TECHNIQUE, ext_id)
                if re_match:
                    technique = re_match.group(1)
                    sub_technique = re_match.group(2)
                    # Still print each row, and also write it to the CSV file
                    print('{},{}'.format(technique+'.'+sub_technique, name))
                    my_writer.writerow([technique+"."+sub_technique, name])
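
If you do want the results in a DataFrame, as the question asks, one option is to collect the rows in a list inside the loop and build the frame once at the end, rather than appending to the DataFrame row by row. The following is only a minimal sketch: `subtechnique_rows` and `sample_objects` are hypothetical names introduced here, and the sample data is made up to stand in for `data['objects']` so that the snippet runs without a download.

```python
import re

import pandas as pd

# Same pattern as in the question: matches IDs such as "T1234.001".
RE_FOR_SUB_TECHNIQUE = r"(T\d+)\.(\d+)"


def subtechnique_rows(objects):
    """Collect one {'subtechnique', 'name'} dict per matching external_id."""
    rows = []
    for obj in objects:
        name = obj.get('name')
        for ref in obj.get('external_references', []):
            ext_id = ref.get('external_id') or ''
            re_match = re.match(RE_FOR_SUB_TECHNIQUE, ext_id)
            if re_match:
                # group(0) is the whole match, i.e. technique + '.' + sub_technique
                rows.append({'subtechnique': re_match.group(0), 'name': name})
    return rows


# Made-up sample standing in for data['objects'] (an assumption, not real ATT&CK data).
sample_objects = [
    {'name': 'Example sub-technique',
     'external_references': [{'external_id': 'T1234.001'}]},
    {'name': 'Example technique',
     'external_references': [{'external_id': 'T1234'}]},
    {'name': 'No references'},
]

# Build the DataFrame once, from the collected rows.
df = pd.DataFrame(subtechnique_rows(sample_objects), columns=['subtechnique', 'name'])
print(df)
```

With the real data you would pass `objects` instead of `sample_objects`, and `df.to_csv("enterprise_attack.csv", index=False)` would then write the file.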

Note that opening the file in "w" mode overwrites the output of any previous runs. If you wish to keep the output of multiple runs, change the file mode to "a":

with open("enterprise_attack.csv", "a", newline="") as f:
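
One thing to watch with append mode: if you add a header row (the code above does not write one), it should only be written the first time, otherwise it repeats on every run. A rough sketch of one way to handle that is below. The helper name `append_rows`, the header names and the demo rows are all assumptions made for illustration, and the demo writes to a temporary directory rather than your real file.

```python
import csv
import os
import tempfile


def append_rows(path, rows, header=('subtechnique', 'name')):
    """Append rows to a CSV file, writing the header only if the file is new or empty."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(header)
        writer.writerows(rows)


# Demonstration with made-up rows: two "runs" appending to the same file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "enterprise_attack.csv")
    append_rows(demo_path, [("T1234.001", "Example one")])
    append_rows(demo_path, [("T1234.002", "Example two")])
    with open(demo_path, newline="") as f:
        demo_contents = list(csv.reader(f))

print(demo_contents)
```

Keep in mind that appending on every run will also duplicate rows that were already written by an earlier run.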

Upvotes: 2
