Reputation: 140
I have a set of name tokens, and also data in which the names appear run together. For example, if a name has three tokens, "abc def ghi", and I am given the string "abcdef" or "abcdefghi", I would like to identify the valid tokens that make up that combined string. Can I build a dictionary of name tokens and use NLP techniques or Python libraries to achieve this? Any suggestions on how to get started would be appreciated.
Upvotes: -3
Views: 323
Reputation: 1256
If you need to find a substring in a string, all you need is a list of tokens and a loop:
tokens = ['abc', 'def', 'ghi']
name = 'abcdef'
for token in tokens:
    if token in name:
        print(token, 'is part of', name)
If you also need the position of the substring within the string, name.find(token) returns its starting index, or -1 if the token is not present.
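The loop above only reports which tokens occur somewhere in the name. If you want to split the combined string into a sequence of known tokens that covers it completely, here is a minimal sketch of a dictionary-based segmentation. The function name `segment` and the helper `split_from` are my own, not from any library, and it assumes tokens must match exactly (same case, no typos):

```python
from functools import lru_cache

def segment(name, tokens):
    """Return every way to split `name` into a sequence of known tokens."""
    vocab = set(tokens)

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the string means one complete split was found.
        if start == len(name):
            return [[]]
        results = []
        # Try every prefix of the remaining text that is a known token.
        for end in range(start + 1, len(name) + 1):
            piece = name[start:end]
            if piece in vocab:
                for rest in split_from(end):
                    results.append([piece] + rest)
        return results

    return split_from(0)

tokens = ['abc', 'def', 'ghi']
print(segment('abcdef', tokens))     # [['abc', 'def']]
print(segment('abcdefghi', tokens))  # [['abc', 'def', 'ghi']]
print(segment('abcxyz', tokens))     # []
```

An empty list means the string cannot be fully covered by the dictionary. If your real data contains misspellings or case differences, you would probably need to normalise the input or add fuzzy matching on top of this.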
Upvotes: 0