Reputation: 43
I have a pandas dataframe with words in the first column. I want to add columns to the same dataframe holding the number of occurrences of each letter in each word.
The dataframe should look something like:
Word A B C D E ...
BED 0 1 0 1 1
Is there an easy way to do this and update it for new words added to the dataframe? It should create a column for a letter if one doesn't exist yet.
I've tried this -
for i in range(len(df)):
    u = df.iat[i, 0]
    for j in u:
        df.iat[i, j] = u.count(j)
Doesn't work...
Upvotes: 4
Views: 822
Reputation: 18647
You could use collections.Counter in a list comprehension, then reindex using string.ascii_uppercase:
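from collections import Counter
from string import ascii_uppercase
import pandas as pd
# Count the letters of each upper-cased word, then force columns A-Z and fill missing letters with 0
df = df[['Word']].join(pd.DataFrame([Counter(word) for word in df['Word'].str.upper()])
                       .reindex(list(ascii_uppercase), axis=1).fillna(0).astype(int))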
[output]
print(df)
Word A B C D E F G H I ... Q R S T U V W X Y Z
0 BED 0 1 0 1 1 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
[1 rows x 27 columns]
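To cover the "update it for new words" part of the question, one option is to wrap the same Counter approach in a small function and rerun it whenever rows are added. This is only a sketch: the helper name add_letter_counts and its word_col parameter are made up for illustration, and it assumes recomputing all counts on each update is acceptable. It also passes index=df.index so the counts line up with a non-default index, and keeps any non A-Z character as an extra column. As a side note, the loop in the question fails because iat only accepts integer positions, so a letter cannot be used as the column argument.

```python
from collections import Counter
from string import ascii_uppercase

import pandas as pd


def add_letter_counts(df, word_col="Word"):
    """Return df[[word_col]] joined with one count column per letter (hypothetical helper)."""
    words = df[word_col].astype(str).str.upper()
    # One Counter per word; reuse df's index so the join aligns row by row.
    counts = pd.DataFrame([Counter(w) for w in words], index=df.index)
    # Always include A-Z, plus any other character that actually occurs.
    extra = sorted(set(counts.columns) - set(ascii_uppercase))
    counts = counts.reindex(columns=list(ascii_uppercase) + extra)
    return df[[word_col]].join(counts.fillna(0).astype(int))


df = pd.DataFrame({"Word": ["BED"]})
df = add_letter_counts(df)

# Append a new word, then recompute the counts for the whole frame.
new_rows = pd.DataFrame({"Word": ["Apple"]})
df = pd.concat([df[["Word"]], new_rows], ignore_index=True)
df = add_letter_counts(df)
```

Recomputing everything keeps the code simple; for a very large frame you might prefer to count only the new rows and concatenate the result instead.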
Upvotes: 5