Reputation: 49
I want to build a new column that contains the number of times any term from the ai_functional list occurs in a text column.
List given is:
> ai_functional = ["natural language processing", "nlp", "A I ", "Aritificial intelligence",
>     "stemming", "lemmatization", "lemmatization", "information extraction",
>     "text mining", "text analytics", "data-mining"]
The result I want is as follows:
> text                                            counter
>
> 1. More details A I Artificial Intelligence     2
> 2. NLP works very well these days               1
> 3. receiving information at the right time      1
The code I have been using is:
def func(stringans):
    for x in ai_tech:
        count = stringans.count(x)
    return count

df['counter'] = df['text'].apply(func)
Can someone please help me with this? I am really stuck, because every time I apply this I get 0 in the counter column.
Upvotes: 0
Views: 156
Reputation: 54148
As you assign count = stringans.count(x) on every iteration, you erase the previous value. You want to sum up the different counts instead:
def func(stringans):
    count = 0
    for x in ai_tech:
        count += stringans.count(x)
    return count

# with sum and generator
def func(stringans):
    return sum(stringans.count(x) for x in ai_tech)
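To see the difference concretely, here is a minimal sketch comparing the two loops on one sentence. The names func_overwrite and func_sum are made up for this illustration, and the term list is a shortened, lowercased stand-in for the one in the question.

```python
# Shortened, lowercased term list (an assumption for this demo only)
ai_tech = ["natural language processing", "nlp",
           "artificial intelligence", "text mining"]

def func_overwrite(stringans):
    count = 0
    for x in ai_tech:
        count = stringans.count(x)   # replaced on every pass
    return count                     # only the last term's count survives

def func_sum(stringans):
    count = 0
    for x in ai_tech:
        count += stringans.count(x)  # accumulated across all terms
    return count

text = "nlp and artificial intelligence"
print(func_overwrite(text))  # 0, because "text mining" does not occur
print(func_sum(text))        # 2
```

With plain assignment the result depends only on the last term in the list, which is why the column ends up as 0 whenever that term is absent.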
After fixing some typos in ai_tech and lowercasing both the text and the terms with .lower(), the counter column contains 2, 1, 0. The last row has no term in common with the list.
import pandas as pd

ai_tech = ["natural language processing", "nlp", "A I ", "Artificial intelligence",
           "stemming", "lemmatization", "information extraction",
           "text mining", "text analytics", "data - mining"]

df = pd.DataFrame([["1. More details A I Artificial Intelligence"],
                   ["2. NLP works very well these days"],
                   ["3. receiving information at the right time"]], columns=["text"])

def func(stringans):
    return sum(stringans.lower().count(x.lower()) for x in ai_tech)

df['counter'] = df['text'].apply(func)
print(df)

# ------------------
                                          text  counter
0  1. More details A I Artificial Intelligence        2
1            2. NLP works very well these days        1
2   3. receiving information at the right time        0
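One caveat: str.count matches substrings anywhere, so a term such as "nlp" would also be counted inside a longer token. If whole-word matching is wanted, a possible variant is sketched below using the standard re module. This is a different technique from the answer above, not a drop-in equivalent: it counts non-overlapping, whole-word matches in a single pass. The name count_terms and the stripping of surrounding spaces from the terms are assumptions made for this sketch.

```python
import re

ai_tech = ["natural language processing", "nlp", "A I ", "Artificial intelligence",
           "stemming", "lemmatization", "information extraction",
           "text mining", "text analytics", "data - mining"]

# Strip stray spaces and try longer terms first, so a longer phrase
# takes precedence over a shorter term starting at the same position.
terms = sorted((t.strip() for t in ai_tech), key=len, reverse=True)

# \b on both sides restricts matches to whole words or phrases
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
    re.IGNORECASE,
)

def count_terms(text):
    """Count non-overlapping, case-insensitive whole-word matches."""
    return sum(1 for _ in pattern.finditer(text))

print(count_terms("1. More details A I Artificial Intelligence"))  # 2
print(count_terms("The nlptool package"))                          # 0
```

It can be applied the same way as before, with df['counter'] = df['text'].apply(count_terms).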
Upvotes: 1