cyber_raven
cyber_raven

Reputation: 27

Open CSV, sort a specific column and overwrite existing CSV

I'm been stuck on this one for a while. I'm trying to open a csv, sort by severity (Critical, High, Medium, Low) then overwrite the existing file. I'd also like to ignore the first wrote or add a header row.

Original CSV

IP Address  Severity    Score
10.0.0.1    High        2302
172.65.0.1  Low         310
192.168.0.1 Critical    5402
127.0.0.1   Medium      1672`

Modified/Sorted CSV

IP Address  Severity    Score
192.168.0.1 Critical    5402
10.0.0.1    High        2302
127.0.0.1   Medium      1672 
172.65.0.1  Low         310

Code

import csv
crit_sev = "Critical"
high_sev = "High"
med_sev = "Medium"
low_sev = "Low"
reader = csv.reader(open('sample.csv', 'r'))
row=0
my_list = []
for row in reader:
    if row[1] == crit_sev:
        my_list.append(row)
    elif row[1] == high_sev:
        my_list.append(row)
    elif row[1] == med_sev:
        my_list.append(row)
    elif row[1] == low_sev:
        my_list.append(row)

writer = csv.writer(open("sample.csv", 'w'))
header = ['IP Address', 'Severity', 'Score']
writer.writerow([header])
for word in my_list:
    writer.writerow([word])

Any help would be appreciated.

Upvotes: 1

Views: 1283

Answers (2)

Martin Evans
Martin Evans

Reputation: 46759

You could use Python's csv library to do this as follows:

import socket     
import csv

severity = {"Critical" : 0, "High" : 1, "Medium" : 2, "Low" : 3}     

with open('sample.csv', 'rb') as f_input:
    csv_input = csv.reader(f_input)
    header = next(csv_input)
    data = sorted(csv_input, key=lambda x: (severity[x[1]], socket.inet_aton(x[0])))

with open('sample.csv', 'wb') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(header)
    csv_output.writerows(data)

This preserves the existing header and sorts the entries based on the severity column. Next it also (optionally) sorts by IP address (which may or may not be useful to you) using socket.inet_aton() to convert the IP address into a sortable number.

For example:

IP Address,Severity,Score
10.168.0.1,Critical,5402
192.168.0.1,Critical,5402
10.0.0.1,High,2302
127.0.0.1,Medium,1672
172.65.0.1,Low,310

Upvotes: 1

DYZ
DYZ

Reputation: 57033

Here's a pandas solution:

import pandas as pd
# Read the CSV file
data = pd.read_csv('sample.csv')

# Configure the levels of severity
levels = pd.Series({"Critical" : 0, "High" : 1, "Medium" : 2, "Low" : 3})
levels.name='Severity'

# Add numeric severity data to the table
augmented = data.join(levels,on='Severity',rsuffix='_')

# Sort and select the original columns
sorted_df = augmented.sort_values('Severity_')[['IP Address', 'Severity','Score']]

# Overwrite the original file
sorted_df.to_csv('sample.csv',index=False)

Upvotes: 2

Related Questions