Drop rows with multiple elements in pandas

I have a text file, file1, with the following contents:

col0 col1 
g1   text
g2   text,text
g3   text,text,text
g4   text
g5   text,text,text,text,text

I need to modify it using pandas to remove all rows that contain multiple text values. The output should look like this:

col0 col1 
g1   text
g4   text

The only difference is that my real files have ~300,000 rows in total.

Upvotes: 2

Views: 73

Answers (2)

piRSquared

Reputation: 294488

This answer builds on @MaxU's idea but generalizes it: counting the commas lets you change how many text values are allowed. A cell with n values contains n - 1 commas, so `< 1` keeps only the rows with a single value.

df[df.col1.str.count(',') < 1]

  col0  col1
0   g1  text
3   g4  text
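
To make the threshold adjustable, the comma count can be wrapped in a small helper. This is only a sketch, assuming col1 holds plain strings with no missing values; `keep_rows_with_at_most`, `max_items` and `sep` are names made up for illustration, not part of pandas.

```python
import re

import pandas as pd


def keep_rows_with_at_most(df, column, max_items=1, sep=","):
    """Keep rows whose `column` holds at most `max_items` sep-separated values."""
    # n values are joined by n - 1 separators, so "at most max_items values"
    # is the same as "fewer than max_items separators".
    # str.count takes a regex pattern, so the separator is escaped first.
    n_separators = df[column].str.count(re.escape(sep))
    return df[n_separators < max_items]


df = pd.DataFrame({
    "col0": ["g1", "g2", "g3", "g4", "g5"],
    "col1": ["text", "text,text", "text,text,text", "text",
             "text,text,text,text,text"],
})

single = keep_rows_with_at_most(df, "col1")                # same as `< 1`
up_to_two = keep_rows_with_at_most(df, "col1", max_items=2)

print(single)
print(up_to_two)
```

With the sample data, `max_items=1` keeps g1 and g4, while `max_items=2` also keeps g2.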


Upvotes: 2

MaxU - stand with Ukraine

Reputation: 210882

If col1 contains flat strings:

In [94]: df
Out[94]:
  col0                      col1
0   g1                      text
1   g2                 text,text
2   g3            text,text,text
3   g4                      text
4   g5  text,text,text,text,text

In [95]: df = df.loc[~df.col1.str.contains(',')]

In [96]: df
Out[96]:
  col0  col1
0   g1  text
3   g4  text

If col1 contains lists:

In [105]: df
Out[105]:
  col0                            col1
0   g1                          [text]
1   g2                    [text, text]
2   g3              [text, text, text]
3   g4                          [text]
4   g5  [text, text, text, text, text]

In [106]: df.col1.str.len() < 2
Out[106]:
0     True
1    False
2    False
3     True
4    False
Name: col1, dtype: bool

In [107]: df[df.col1.str.len() < 2]
Out[107]:
  col0    col1
0   g1  [text]
3   g4  [text]
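
For the file from the question, the whole round trip might look roughly like the sketch below. It assumes the file is whitespace-delimited exactly as shown and that no value contains spaces. An in-memory string stands in for file1 and an `io.StringIO` buffer stands in for the output file, so the snippet runs on its own; with real files you would pass paths instead.

```python
import io

import pandas as pd

# Stand-in for the contents of file1 (assumed whitespace-delimited).
raw = """col0 col1
g1   text
g2   text,text
g3   text,text,text
g4   text
g5   text,text,text,text,text
"""

# sep=r"\s+" splits on any run of whitespace between the two columns.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Keep only rows whose col1 has no comma, i.e. a single value.
# regex=False treats "," as a literal string rather than a pattern.
filtered = df[~df["col1"].str.contains(",", regex=False)]

# Write the result back out as tab-separated text, without the index.
out = io.StringIO()
filtered.to_csv(out, sep="\t", index=False)
print(out.getvalue())
```

The filter is a vectorized string operation, so a file of around 300,000 rows should not need any special handling.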

Upvotes: 3
