user2023
user2023

Reputation: 478

How to get the data sorted based on column values before comma

In an unsorted large CSV file, How could i sort the Resource Name column so i can get all the same name results aligned sequentially lined up One after another like grep in my example below.

In the Resource Name column we need select the data before comma , in order to get it required result.

My trial code:

#!/home/uatacto/bin/python
# -*- coding: utf-8 -*-

import pandas as pd

cols = ['Status', 'Resource Name', 'Appliance',  'State',  'Description', 'Corrective Action']
df = pd.read_csv("ResourceAlertsReport.csv", usecols=cols)
#df['Resource Name'] = df['Resource Name'].str.lower()
df = df.apply(lambda x: x.astype(str).str.lower())

print(df)

Resulted DataFrame Structure:

        Status                                      Resource Name                     Appliance   State                                        Description                                  Corrective Action
0     critical                                     enc8005, bay 4                        ov8002  active  there are failed entries in the ilo integrated...  log into the ilo and view the integrated manag...
1     critical                                     enc8013, bay 9                        ov8004  active              the bios/hardware status is critical.  log into the ilo and view the integrated manag...
2     critical                                     enc8013, bay 9                        ov8004  locked  the ilo for the server hardware in bay 9 is re...  check the ilo system information - health summ...
3     critical                                            enc8002                        ov8001  locked  encountered connectivity problems with synergy...  refresh the enclosure, reseat the appliance, r...
4     critical                                    enc3061, bay 12                        ov3004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
5     critical                                            enc8002                        ov8001  locked  encountered connectivity problems with synergy...  refresh the enclosure, reseat the appliance, r...
6     critical                                     enc3070, bay 3                        ov3007  active  service event: the state of the processor in s...  a support ticket has been opened by the hpe su...
7     critical                                     enc2029, bay 3                        ov2002  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
8     critical                                     enc8058, bay 6                        ov8013  active  there are failed entries in the ilo integrated...  log into the ilo and view the integrated manag...
9     critical                                     enc8056, bay 8                        ov8013  active  there are failed entries in the ilo integrated...  log into the ilo and view the integrated manag...
10    critical                                     enc3054, bay 4                        ov3002  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
11    critical                                            enc8002                        ov8001  locked  encountered connectivity problems with synergy...  refresh the enclosure, reseat the appliance, r...
12    critical                                     enc8005, bay 4                        ov8002  locked  the product id and/or serial number of the ser...  if the server is running a workload, schedule ...
13    critical                                     enc1019, bay 1                        ov1001  active  the firmware integrity scan or boot time valid...  if configured for automatic recovery, the repa...
14    critical                                    enc8048, bay 11                        ov8010  active              the bios/hardware status is critical.  log into the ilo and view the integrated manag...
15    critical                                     enc8043, bay 5                        ov8008  active  service event: the physical status of a drive ...  a support ticket has been opened by the hpe su...
16    critical                                     enc8043, bay 5                        ov8008  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
17    critical                                    enc2009, bay 14         ov c7000 enclosures 1  active  there are failed entries in the ilo integrated...  log into the ilo and view the integrated manag...
18    critical                                     enc3054, bay 9                        ov3002  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...

Sorted result with grep:

$ python test_ov.py | grep enc8013
1     critical                                     enc8013, bay 9                        ov8004  active              the bios/hardware status is critical.  log into the ilo and view the integrated manag...
2     critical                                     enc8013, bay 9                        ov8004  locked  the ilo for the server hardware in bay 9 is re...  check the ilo system information - health summ...
324   critical                                     enc8013, bay 3                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
389   critical                                    enc8013, bay 11                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
436   critical                                     enc8013, bay 8                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
491   critical                                     enc8013, bay 7                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
499   critical                                     enc8013, bay 5                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
643   critical                                     enc8013, bay 2                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
716   critical                                     enc8013, bay 1                        ov8004  locked            unable to discover the server hardware.  verify your request. if credentials were speci...
717   critical                                     enc8013, bay 1                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
1197  critical                                     enc8013, bay 3                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
1208  critical                                     enc8013, bay 5                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
1266  critical                                     enc8013, bay 7                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
1329  critical                                     enc8013, bay 2                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
1362  critical                                    enc8013, bay 11                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
1365  critical                                     enc8013, bay 1                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
1502  critical                                     enc8013, bay 8                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
1685  critical                                     enc8013, bay 2                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
1723  critical                                     enc8013, bay 1                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...
1735  critical                                     enc8013, bay 3                        ov8004  active  remote insight/ integrated lights-out self tes...  to decode the error-code look into cpqsm2cntlr...

any help will be much appreciated.

Upvotes: 0

Views: 58

Answers (3)

BoomBoxBoy
BoomBoxBoy

Reputation: 1885

You should be able to do that with the following.

df["sort_column"] = df['Resource Name'].str.extract(r'(\S+)[,]|(\S+)', flags=0, expand=False).fillna(method='bfill', axis=1).iloc[:, 0]
df.sort_values('sort_column', inplace=True)

Upvotes: 1

Krishnakanth Allika
Krishnakanth Allika

Reputation: 790

FYI, aligning names sequentially is sorting but what you did using grep is filtering/searching. Both are different tasks.

For filtering (like you did with grep):

df = df[df['Resource Name'].str.contains('enc8013')]

Output:

    Status      Resource Name
1   critical    enc8013
2   critical    enc8013, bay 9

For sorting by resource name:

df['resource_name_only'] = df['Resource Name'].str.split(',').str[0]
df = df.sort_values(by=['resource_name_only'])

Output:

    Status      Resource Name       resource_name_only
0   critical    enc8005, bay 4      enc8005
1   critical    enc8013             enc8013
2   critical    enc8013, bay 9      enc8013

Upvotes: 1

mosc9575
mosc9575

Reputation: 6347

You can create two columns out of the 'Resource Name' using the extract() method with an regular expression and sort your values afterwards.

from io import StringIO
import pandas as pd

t = """ Status   Resource Name
0     critical   enc8005, bay 4
1     critical   enc8013, bay 9
2     critical   enc8013, bay 9
3     critical   enc8002
"""

df = pd.read_csv(StringIO(t), sep='\s\s+')

ans = df['Resource Name'].str.extract(r'(?P<a>\w+\d+)\,?\s*(?P<b>\w+\s+\d+)?') # this creates two columns 'a' and 'b'
df['a'] = ans['a']
df['b'] = ans['b']
df = df.sort_values(['a', 'b'])
>>> df
     Status   Resource Name        a       b
3  critical  enc8002         enc8002     NaN
0  critical  enc8005, bay 4  enc8005   bay 4
1  critical  enc8013, bay 9  enc8013   bay 9
2  critical  enc8013, bay 9  enc8013   bay 9

Upvotes: 0

Related Questions