Onik Rahman

Reputation: 49

How to iterate over a column to build a dictionary and then create a DataFrame

I am trying to iterate over a column to count how often each word occurs in its sentences.

I have a column:

word
"one two three four four six"
"seven eight nine ten eleven"
"twelve thirteen fourteen"
"..."

I have used this code for a single row:

text = df['word'][0]
wordss = text.split()  # words of the first row only
wfreq = [wordss.count(w) for w in wordss]  # count for each word, in order
ini_dict = dict(zip(wordss, wfreq))  # repeated words collapse into one key

keys, values = zip(*ini_dict.items())

print("keys : ", str(keys))
print("values : ", str(values))

The output I receive:

keys :  ('one', 'two', 'three', 'four', 'six')
values :  (1, 1, 1, 2, 1)

My objective is to iterate over the whole column and then create a dataframe.

I have used this code at the end to achieve the desired dataframe.

df = pd.DataFrame.from_dict(ini_dict.items())
df.columns = ['Words', 'n']
df
Words n
one 1
two 1
three 1
four 2
six 1

I would like to first iterate over the whole 'word' column to build a dictionary, and then have a dataframe that contains all the keys and values from that column. Does anyone have a solution?

Upvotes: 0

Views: 502

Answers (1)

Try something like:

import pandas as pd

words = ["one two three four four six", "seven eight nine ten eleven", "twelve thirteen fourteen"]
df = pd.DataFrame(words, columns=["word"])
print(df)
wordss = []
for i in df['word']:
    wordss += i.split()
wfreq = [wordss.count(w) for w in wordss]
ini_dict = dict(zip(wordss, wfreq))

keys, values = zip(*ini_dict.items())

print("keys : ", str(keys))
print("values : ", str(values))

I iterated through the whole column and collected the words from every row into one list. It gave me this output:

                          word
0  one two three four four six
1  seven eight nine ten eleven
2     twelve thirteen fourteen
keys :  ('one', 'two', 'three', 'four', 'six', 'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen')
values :  (1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1)
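The list.count approach above rescans the whole word list once per word, which can get slow on a large column. A possible alternative, as a minimal sketch, is collections.Counter from the standard library, which tallies in a single pass. The names counter and counts_df are only illustrative and are not part of the original code.

```python
from collections import Counter

import pandas as pd

words = [
    "one two three four four six",
    "seven eight nine ten eleven",
    "twelve thirteen fourteen",
]
df = pd.DataFrame(words, columns=["word"])

# Tally every whitespace-separated word across all rows in one pass.
counter = Counter()
for sentence in df["word"]:
    counter.update(sentence.split())

# Build a two-column frame with the headers used in the question.
counts_df = pd.DataFrame(list(counter.items()), columns=["Words", "n"])
print(counts_df)
```

Counter is a dict subclass, so it can be used anywhere ini_dict is used above.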

Then you can create a DataFrame with pandas like this:

ndf = pd.DataFrame(ini_dict, index=["count"])
print(ndf.transpose())

It will give you something like:

          count
one           1
two           1
three         1
four          2
six           1
seven         1
eight         1
nine          1
ten           1
eleven        1
twelve        1
thirteen      1
fourteen      1
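If you would rather skip the explicit loop and the intermediate dictionary, the same table can probably be built with pandas alone. This is a sketch under the assumption that words are separated by whitespace and that your pandas version has Series.explode (0.25 or later). The name word_counts is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "word": [
            "one two three four four six",
            "seven eight nine ten eleven",
            "twelve thirteen fourteen",
        ]
    }
)

word_counts = (
    df["word"]
    .str.split()            # each row becomes a list of words
    .explode()              # one word per row
    .value_counts()         # occurrences per word, most frequent first
    .rename_axis("Words")   # name the index so it becomes the 'Words' column
    .reset_index(name="n")  # move the index into a column, call the counts 'n'
)
print(word_counts)
```

This gives the Words / n layout from the question directly, sorted by count instead of by first appearance.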

Upvotes: 0
