Mitali Patel

Reputation: 66

Building a Word Counter for Analysis

I'm trying to build a Python program similar to wordcounter.net (https://wordcounter.net/). I have an Excel file with one column containing the text to be analyzed. Using pandas and other functions, I created a single-word frequency counter.

But now I need to extend it to find multi-word patterns.

For example, suppose the text is "Happy face sad face mellow little baby sweet Happy face face mellow sad face mellow".

Here, it should be able to find patterns such as two-word density:

....

Three-word density:

....

I also tried:

for match in re.finditer(pattern, line):

But the pattern still has to be supplied manually, and I want the program to find the patterns automatically.

Can anyone help with how to proceed?

Upvotes: 0

Views: 59

Answers (1)

Алексей Р

Reputation: 7627

text = 'Happy face sad face mellow little baby sweet Happy face face mellow sad face mellow'

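# Count how often each word occurs.
d = {}
for s in text.split():
    d.setdefault(s, 0)
    d[s] += 1
# Group the words by their count: {count: [words with that count]}.
out = {}
for k, v in d.items():
    out.setdefault(v, []).append(k)
# Print the groups, highest count first.
for i in sorted(out.keys(), reverse=True):
    print(f'{i} word density:')
    print(f'\t{out[i]}')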

Output

5 word density:
    ['face']
3 word density:
    ['mellow']
2 word density:
    ['Happy', 'sad']
1 word density:
    ['little', 'baby', 'sweet']
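
Note that the loop above tallies single words and then groups them by how often they occur. As a side note, the same tally can probably be written more compactly with `collections.Counter`. This is only a minimal sketch; the name `word_counts` is chosen here for illustration and is not part of the original answer.

```python
from collections import Counter

text = 'Happy face sad face mellow little baby sweet Happy face face mellow sad face mellow'

# Counter tallies each whitespace-separated token in one pass.
word_counts = Counter(text.split())

# most_common(k) returns the k (word, count) pairs with the highest counts.
for word, count in word_counts.most_common(2):
    print(f'{word}: {count}')
```

For this sample text, the two most frequent words are `face` (5) and `mellow` (3), matching the output above.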

Edit2

from collections import Counter


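def freq(lst, n):
    """Print every run of n consecutive words in lst with its count."""
    lstn = []
    # Slide a window of n words over the list and join each window into a phrase.
    for i in range(len(lst) - (n - 1)):
        lstn.append(" ".join([lst[i + x] for x in range(n)]))
    # Count identical phrases.
    out = Counter(lstn)
    print(f'{n} word density:')
    for k, v in out.items():
        print(f'\t"{k}" {v}')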


text = 'Happy face sad face mellow little baby sweet Happy face face mellow sad face mellow'
lst = text.split()

freq(lst, 2)
freq(lst, 3)

Output

2 word density:
    "Happy face" 2
    "face sad" 1
    "sad face" 2
    "face mellow" 3
    "mellow little" 1
    "little baby" 1
    "baby sweet" 1
    "sweet Happy" 1
    "face face" 1
    "mellow sad" 1
3 word density:
    "Happy face sad" 1
    "face sad face" 1
    "sad face mellow" 2
    "face mellow little" 1
    "mellow little baby" 1
    "little baby sweet" 1
    "baby sweet Happy" 1
    "sweet Happy face" 1
    "Happy face face" 1
    "face face mellow" 1
    "face mellow sad" 1
    "mellow sad face" 1

Upvotes: 2
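
Building on the n-word counting in the answer above, a ranked listing with percentages (closer to what a word-counter page displays) might be sketched as follows. Everything here is an assumption for illustration: `ngram_density` is a hypothetical helper, the lowercasing step is optional, and the percentage is defined as count divided by the total number of n-word phrases, which may not match how wordcounter.net computes it. The function takes a list of texts so that each spreadsheet cell can be passed separately and phrases never span two rows.

```python
from collections import Counter


def ngram_density(texts, n, top=None):
    """Return (phrase, count, percent) tuples for n-word phrases, most frequent first.

    Hypothetical helper: lowercasing and the percent formula
    (count / total n-word phrases) are assumptions, not from the answer.
    """
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        # zip over n staggered slices yields every run of n consecutive words;
        # counting per text keeps phrases from spanning two rows.
        staggered = (words[i:] for i in range(n))
        counts.update(' '.join(gram) for gram in zip(*staggered))
    total = sum(counts.values())
    return [(phrase, count, round(100 * count / total, 1))
            for phrase, count in counts.most_common(top)]


rows = ['Happy face sad face mellow little baby sweet Happy face face mellow sad face mellow']
for phrase, count, pct in ngram_density(rows, 2, top=3):
    print(f'"{phrase}" {count} ({pct}%)')
```

With pandas, the rows could presumably come from the Excel column, for example `pd.read_excel(path)[column].dropna().astype(str)`, where `path` and `column` are placeholders for the actual file and column name.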
