Word length count with for/if loop pandas

Question

I have a dataframe and need to compute the average word length of the Word column for each Concept separately, split by whether the Note column contains "tupi".

For each Concept in a df: 
  if Note contains ("tupi") -> count word length for these Words.    
  if not -> count word length for others

  print (Concept + " tupi " + word_length)
  print (Concept + " not tupi " + word_length)

And the output should be something like:

ANTEATER tupi 5.034

ANTEATER not tupi 4.56
_______
WILD CAT tupi 4.55

WILD CAT not tupi 3.44

Input dataframe example:

Language	Concept	Word	Borrowing	Note
First	ANTEATER	tamanduá	YES	loan from tupi
Second	ANTEATER	uãiarú
Third	ANTEATER	atãn
Fourth	ANTEATER	aatãm	YES	loan from tupi
Fifth	WILD CAT	tamano	YES
Sixth	WILD CAT	sdfsg	YES
Seventh	WILD CAT	tamano	YES	loan from tupi
Eigth	WILD CAT	sdfsg	YES	loan from tupi

Shaido · Accepted Answer

You can do this entirely in pandas without the need for a for-loop.

Create a boolean column tupi that indicates whether the Note column contains 'tupi'.
Create a Word Length column holding the length of each word in the Word column.

Now, use groupby and compute the average word length of each Concept with and without 'tupi' in the Note column:

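# True where Note mentions 'tupi'; na=False treats missing notes as False
df['tupi'] = df['Note'].str.contains('tupi', na=False)
# Number of characters in each word
df['Word Length'] = df['Word'].str.len()
# Mean word length for each (Concept, tupi) pair
df.groupby(['Concept', 'tupi'])['Word Length'].mean()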

Resulting Series (indexed by Concept and tupi) from the given data:

Concept   tupi 
ANTEATER  False    5.0
          True     6.5
WILD CAT  False    5.5
          True     5.5
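
If you want the exact "Concept tupi / not tupi" lines from the question, here is a minimal end-to-end sketch. It rebuilds the sample data by hand, so the dataframe construction, the names means and lines, and the two-decimal formatting are my assumptions rather than part of the question. It also assumes the accented letters are stored as single precomposed characters, since str.len() counts code points.

```python
import pandas as pd

# Sample rows copied from the question; empty cells become None (missing).
df = pd.DataFrame(
    {
        "Language": ["First", "Second", "Third", "Fourth",
                     "Fifth", "Sixth", "Seventh", "Eigth"],
        "Concept": ["ANTEATER"] * 4 + ["WILD CAT"] * 4,
        "Word": ["tamanduá", "uãiarú", "atãn", "aatãm",
                 "tamano", "sdfsg", "tamano", "sdfsg"],
        "Borrowing": ["YES", None, None, "YES", "YES", "YES", "YES", "YES"],
        "Note": ["loan from tupi", None, None, "loan from tupi",
                 None, None, "loan from tupi", "loan from tupi"],
    }
)

# Same three steps as above.
df["tupi"] = df["Note"].str.contains("tupi", na=False)
df["Word Length"] = df["Word"].str.len()
means = df.groupby(["Concept", "tupi"])["Word Length"].mean()

# Turn each (Concept, tupi) group into one printable line.
lines = []
for (concept, is_tupi), avg in means.items():
    label = "tupi" if is_tupi else "not tupi"
    lines.append(f"{concept} {label} {avg:.2f}")

print("\n".join(lines))
# ANTEATER not tupi 5.00
# ANTEATER tupi 6.50
# WILD CAT not tupi 5.50
# WILD CAT tupi 5.50
```

groupby sorts its keys, so the "not tupi" (False) row comes before the "tupi" (True) row for each Concept. If you need the opposite order, sorting the tupi index level in descending order before the loop should do it.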
