Python pandas to extract only desired column and desired values from the columns

Question

I'm trying to use pandas to parse my CSV file, which has multiple columns, but I only need certain ones. I want to grab 'Platform ID', whose value may start with CS-Unix-*; then 'Target system address', which contains server names that could be anything, so I need the whole value printed; and finally 'Failure reason', which is a symptom and could also be anything, so I need that printed too.

I wrote CS-Unix-* because anything could follow CS-Unix-. For the other field I used * to print the value, since the server name could again be different.

My data format looks like this:

Platform ID               Target system address       Failure reason
CS-Unix-RootAccounts-SSH  Serer1                       xyz

Below is what I'm trying, but it isn't working.

import csv
import pandas as pd

pd.set_option('display.height', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

data = pd.read_csv('/home/karn/plura/Test/Python_Pnada/Cyber_July.csv', usecols=['Platform ID', 'Target system address', 'Failure reason'])
#data.drop(data.index[0], inplace=True)
hostnames = data[(data['Platform ID']=='CDS-Unix-*') | (data['Target system address'] == '*' )]['Failure reason']
print(hostnames)

Please guide me on how to move forward.

gaganso · Accepted Answer

This should return the target system address and failure reason for all rows whose Platform ID is of the form CS-Unix-*.

hostnames = data[data['Platform ID'].str.startswith("CS-Unix-")][['Target system address','Failure reason']]

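str.startswith() returns a boolean Series indicating, for each element, whether it starts with the string passed to it as a parameter. Indexing data with that Series keeps only the rows where it is True. Note that comparing with == 'CS-Unix-*' does not work, because == tests for exact equality and treats * as a literal character, not a wildcard.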
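As a rough, self-contained sketch of the same idea, the snippet below builds a small in-memory CSV in place of the real file. The sample rows and the names csv_text and mask are made up for illustration and are not from the question. It also passes na=False, on the assumption that rows with an empty Platform ID should simply be skipped.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the CSV file in the question.
csv_text = """Platform ID,Target system address,Failure reason
CS-Unix-RootAccounts-SSH,Server1,xyz
CS-Windows-Local,Server2,abc
,Server3,missing platform
CS-Unix-Oracle,Server4,timeout
"""

data = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["Platform ID", "Target system address", "Failure reason"],
)

# Boolean mask: True where the Platform ID starts with the prefix.
# na=False makes rows with a missing Platform ID count as non-matching.
mask = data["Platform ID"].str.startswith("CS-Unix-", na=False)

# Keep the matching rows and only the two columns of interest.
hostnames = data.loc[mask, ["Target system address", "Failure reason"]]
print(hostnames)
```

Using .loc with the mask and a column list does the row filter and the column selection in one step; it should be equivalent to the chained indexing in the answer above.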
