Reputation: 858
I have a CSV file exported from a database. Some rows in a column contain more than one item. I would like to count every item across all rows of a column.
For example, one row of column3 contains four items:
column1, column2, column3, column4
aaa, bbb,ccc,ddd ddd ddd ddd, eee
bbb, ccc,eee, ddd, eee
fff, ccc, eee,ddd, eee
df["column3"].value_counts()
I expected this to give 6.
df["column3"].str.split('\n', expand=True)
This does not work.
Upvotes: 0
Views: 46
Reputation: 18315
After splitting you can sum the lengths of the results:
>>> df.column3.str.split().str.len().sum()
6
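Alternatively, sum the number of whitespace runs (plus 1 per row) without splitting. Use a raw string for the pattern so that "\s" is not treated as an invalid string escape:
>>> df.column3.str.count(r"\s+").add(1).sum()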
6
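A minimal, self-contained sketch comparing the two approaches. The sample frame is an assumption reconstructed from the question (column3 holding "ddd ddd ddd ddd" in the first row), and the names `padded`, `split_total` and `count_total` are only illustrative:

```python
import pandas as pd

# Hypothetical sample: column3 holds whitespace-separated items
df = pd.DataFrame({"column3": ["ddd ddd ddd ddd", "ddd", "ddd"]})

# Approach 1: split on whitespace, then sum the list lengths
split_total = int(df["column3"].str.split().str.len().sum())

# Approach 2: count whitespace runs and add 1 per row
count_total = int(df["column3"].str.count(r"\s+").add(1).sum())

# The two only agree when cells have no leading/trailing whitespace.
# With padding, the counting approach overcounts unless you strip first.
padded = pd.Series([" ddd ddd "])
padded_split = int(padded.str.split().str.len().sum())
padded_count = int(padded.str.count(r"\s+").add(1).sum())
padded_stripped = int(padded.str.strip().str.count(r"\s+").add(1).sum())

print(split_total, count_total)
print(padded_split, padded_count, padded_stripped)
```

If the cells may carry stray spaces (for example after a comma in the CSV), the split-based version is probably the safer choice, since `str.split()` with no arguments ignores leading and trailing whitespace.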
To do this for every column:
>>> df.apply(lambda s: s.str.count(r"\s+").add(1).sum())
column1 3
column2 3
column3 6
column4 3
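The per-column version can be reproduced with a small frame. This is a sketch under the assumption that every column holds strings; the data below is hypothetical and only loosely mirrors the question, and `totals` is an illustrative name. It uses the split-based count, which gives the same result here:

```python
import pandas as pd

# Hypothetical data: only column3 has a cell with several items
df = pd.DataFrame({
    "column1": ["aaa", "bbb", "fff"],
    "column2": ["bbb", "ccc", "ccc"],
    "column3": ["ddd ddd ddd ddd", "ddd", "ddd"],
    "column4": ["eee", "eee", "eee"],
})

# apply() passes each column as a Series; count its items and sum them
totals = df.apply(lambda s: s.str.split().str.len().sum())

print(totals)
```

Note that the `.str` accessor raises an error on non-string columns, so for a mixed frame you would likely need to select the string columns first.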
Upvotes: 1