Group values in column based on common words

Question

I have a dataframe:

ID    message
1     request body:  dwfkjn34241
2     request body:  jnwg3425
3     request body: ,  qwefn2
4     received an error
5     
6     received an error


I want to group the values in the message column based on common words. The first three rows can be considered one group, even though they differ slightly, and the fourth and sixth rows form another group. How can I group the values in the message column using word and structural similarity as the criterion? What is a good method for this? The dataframe above is only an example, so I am more interested in methods that suit the general problem than in, say, a solution based on regular expressions.
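
To make the idea concrete, here is a minimal sketch of one possible approach, not a definitive method: treat each message as a set of purely alphabetic words (so ID-like tokens such as dwfkjn34241 are ignored), measure overlap with Jaccard similarity, and greedily assign each message to the first group whose first member is similar enough. The names tokenize, jaccard and group_messages, the 0.5 threshold, and the -1 label for empty messages are all assumptions made for illustration.

```python
import string


def tokenize(message):
    """Return the set of lowercase, purely alphabetic words in a message."""
    words = (w.strip(string.punctuation) for w in message.lower().split())
    # Tokens containing digits (e.g. request IDs) are dropped on purpose.
    return {w for w in words if w.isalpha()}


def jaccard(a, b):
    """Jaccard similarity of two word sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_messages(messages, threshold=0.5):
    """Greedily label messages by word overlap with each group's first member."""
    representatives = []  # word set of the first message seen in each group
    labels = []
    for message in messages:
        tokens = tokenize(message)
        if not tokens:
            labels.append(-1)  # assumed label for messages with no words
            continue
        for group_id, rep in enumerate(representatives):
            if jaccard(tokens, rep) >= threshold:
                labels.append(group_id)
                break
        else:
            # No existing group is similar enough, so start a new one.
            representatives.append(tokens)
            labels.append(len(representatives) - 1)
    return labels


messages = [
    "request body:  dwfkjn34241",
    "request body:  jnwg3425",
    "request body: ,  qwefn2",
    "received an error",
    "",
    "received an error",
]
labels = group_messages(messages)
print(labels)  # [0, 0, 0, 1, -1, 1]
```

With a dataframe, the labels could be attached with something like df["group"] = group_messages(df["message"].fillna("").tolist()). The greedy pass depends on row order and on the chosen threshold, so for larger or noisier data a character-level measure such as difflib.SequenceMatcher or a proper clustering algorithm may suit better.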


Answers (1)
