Ali Khan

Reputation: 59

pandas dataframe - does filtering / selecting cols by String preserve order?

I have a DataFrame with, say, 10 columns, 5 of which start with the string 'Region'. I need a resulting DataFrame that contains only those columns (the ones starting with 'Region'). I also need the original column order to be preserved: if the columns in the original df are ordered 'Region 1', 'Region 2', 'Region 3', the result should keep that order and not come back as 'Region 3', 'Region 2', 'Region 1'.

Would following the accepted answer to the question below preserve the order, or is there some other method to achieve that?

stackoverflow - find-column-whose-name-contains-a-specific-string

Upvotes: 0

Views: 446

Answers (3)

Umar.H

Reputation: 23099

If your DataFrame looks like this:

print(df)


   Region 3  Region 2  Region 1  Custom  UnwantedCol
0         0         0         0       0            0

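we can use the built-in sorted function to order the Region columns by their number:

# Map each 'Region N' column name to its integer N, then sort by N.
nat_cols_sort = dict(sorted(
    {col: int(col.split(" ")[1]) for col in df.filter(regex='^Region').columns}.items(),
    key=lambda x: x[1],
))

# Select the columns in sorted order (the dict keeps insertion order).
print(df[list(nat_cols_sort.keys())])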

   Region 1  Region 2  Region 3
0         0         0         0

Upvotes: 1

BENY

Reputation: 323316

Two steps. First, use filter:

s=df.filter(like='Region')
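As a minimal sketch of how filter treats column order (the frame and its column names below are made up for illustration, not taken from the question's data): like='Region' keeps every column whose name contains the substring, while regex='^Region' keeps only names that start with it. In both cases the surviving columns stay in their original relative order.

```python
import pandas as pd

# Hypothetical frame; column names are illustrative only.
df = pd.DataFrame(
    [[0, 1, 2, 3, 4]],
    columns=["Region 3", "Custom", "Region 1", "Other Region", "Region 2"],
)

# like= keeps every column whose label contains the substring.
contains = df.filter(like="Region")

# regex= with a leading ^ keeps only labels that start with the prefix.
starts = df.filter(regex="^Region")

print(list(contains.columns))
print(list(starts.columns))
```

If the requirement is strictly "starts with 'Region'", the regex form is probably the closer match, since like= would also pick up a name such as 'Other Region'.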

Upvotes: 2

Artyom Akselrod

Reputation: 976

Yes, it will. df.columns is an ordered pandas Index, and iterating over it yields the column names in their original order, just as iterating over a list does. So you can use the answer from the linked question:

region_cols = [col for col in df.columns if 'Region' in col]

df[region_cols] will then be the DataFrame you need.
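One caveat: 'Region' in col matches the substring anywhere in the name, whereas the question asks for columns that start with 'Region'. A small sketch of a prefix-only variant follows, assuming every column label is a string; the sample frame is hypothetical.

```python
import pandas as pd

# Hypothetical frame; assumes all column labels are strings.
df = pd.DataFrame(
    [[0, 1, 2, 3]],
    columns=["Region 3", "Custom", "Region 1", "Region 2"],
)

# List comprehension: iterates df.columns in order, keeps prefix matches.
region_cols = [col for col in df.columns if col.startswith("Region")]

# Equivalent boolean-mask form using the vectorised string accessor.
mask = df.columns.str.startswith("Region")
result = df.loc[:, mask]

print(region_cols)
print(list(result.columns))
```

Both forms select the columns in the order they appear in df, so no extra sorting step is needed just to preserve it.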

Upvotes: 2
