Ruan

Reputation: 25

In Python, how do I read only the columns where a value starts with a certain string?

Say I have a table like:

col1    col2    col3    col4
a       b       c       [d
e       [f      g       h
i       j       k       l
m       n       o       [p

I want to load only the columns that contain a value that starts with a left bracket, [.

So I want the following to be returned as a dataframe:

col2     col4
b        [d
[f       h
j        l
n        [p

Upvotes: 0

Views: 328

Answers (4)

Himmat

Reputation: 166

Please try this; I hope it works for you:

import pandas as pd

df = pd.DataFrame(
    [['a', 'b', 'c', '[d'], ['e', '[f', 'g', 'h'], ['i', 'j', 'k', 'l'], ['m', 'n', 'o', '[p']],
    columns=['col1', 'col2', 'col3', 'col4'])

# Collect every column in which at least one value contains '['
cols = []
for col in df.columns:
    if df[col].str.contains('[', regex=False).any():
        cols.append(col)

df[cols]

Output

  col2 col4
0    b   [d
1   [f    h
2    j    l
3    n   [p

Upvotes: 0

U13-Forward

Reputation: 71610

Use this:

s=df.applymap(lambda x: '[' in x).any()
print(df[s[s].index])

Output:

  col2 col4
0    b   [d
1   [f    h
2    j    l
3    n   [p
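Two caveats about the snippet above: `'[' in x` tests whether the bracket appears anywhere in the value (not only at the start), and it raises a `TypeError` for non-string cells such as integers. Also, `DataFrame.applymap` was deprecated in pandas 2.1 in favour of `DataFrame.map`. A minimal sketch that tries to cover all three points, assuming the question's sample data (the `elementwise` name is just an illustrative helper):

```python
import pandas as pd

df = pd.DataFrame(
    [["a", "b", "c", "[d"],
     ["e", "[f", "g", "h"],
     ["i", "j", "k", "l"],
     ["m", "n", "o", "[p"]],
    columns=["col1", "col2", "col3", "col4"],
)

# Use DataFrame.map where it exists (pandas >= 2.1), else fall back to applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

# True for a column if any cell is a string that begins with "[".
mask = elementwise(lambda x: isinstance(x, str) and x.startswith("[")).any()

result = df.loc[:, mask]
print(result)
```

The `isinstance` check is a design choice: non-string cells are treated as "does not match" instead of being converted to strings first.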

Upvotes: 0

anky

Reputation: 75100

I want to load only the columns that contain a value that starts with a left bracket [

For this you need Series.str.startswith():

df.loc[:,df.apply(lambda x: x.str.startswith('[')).any()]

  col2 col4
0    b   [d
1   [f    h
2    j    l
3    n   [p

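Note that there is a difference between startswith and contains: startswith matches only at the beginning of each string, while contains matches the pattern anywhere in it. The docs for both methods explain this in more detail.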
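To make the difference concrete, here is a small sketch. It assumes the question's table with one modification that is not in the original data: `col3` gets a value `g[` that has a bracket at the end, so the two methods disagree on it.

```python
import pandas as pd

df = pd.DataFrame(
    [["a", "b", "c", "[d"],
     ["e", "[f", "g[", "h"],   # "g[" is a made-up value added for illustration
     ["i", "j", "k", "l"],
     ["m", "n", "o", "[p"]],
    columns=["col1", "col2", "col3", "col4"],
)

# startswith: True only when a value begins with "["
starts = df.apply(lambda s: s.str.startswith("[")).any()

# contains: True when "[" appears anywhere in a value
# (regex=False because "[" is a special character in regular expressions)
contains = df.apply(lambda s: s.str.contains("[", regex=False)).any()

starts_cols = starts[starts].index.tolist()
contains_cols = contains[contains].index.tolist()

print(starts_cols)    # ['col2', 'col4']
print(contains_cols)  # ['col2', 'col3', 'col4']
```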

Upvotes: 1

Jeril

Reputation: 8541

Can you try the following:

>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2, 4], [4, 5, 6], [7, '[8', 9]])
>>> df = df.astype('str')
>>> df
   0   1  2
0  1   2  4
1  4   5  6
2  7  [8  9
>>> df[df.columns[[df[i].str.contains('[', regex=False).any() for i in df.columns]]]
    1
0   2
1   5
2  [8
>>>
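Since the question is about loading the data: as far as I know, `read_csv` can only select columns by name or position (`usecols`), not by cell values, so one option is to read everything as strings and filter afterwards. A rough sketch, assuming the table is whitespace-delimited text (the `io.StringIO` object stands in for a real file path):

```python
import io
import pandas as pd

# Stand-in for a file on disk; replace with a path in real use.
raw = io.StringIO(
    "col1 col2 col3 col4\n"
    "a b c [d\n"
    "e [f g h\n"
    "i j k l\n"
    "m n o [p\n"
)

# dtype=str keeps every cell as a string, so no astype('str') is needed later.
df = pd.read_csv(raw, sep=r"\s+", dtype=str)

# na=False treats missing cells as "does not start with '['".
mask = df.apply(lambda s: s.str.startswith("[", na=False)).any()

filtered = df.loc[:, mask]
print(filtered)
```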

Upvotes: 0
