Parth Shah

Reputation: 15

How to return item in list after list is matched against data set in Python?

I am trying to check each query in a data set against specific keyword lists and return the list item that matched. For example, here is my list:

season = ['winter','spring','summer','fall','autumn']

Sample Query set:

fall crafts for kids
winter crafts for kids
spring crafts for kids
fall craft ideas for kids
summer crafts for kids
autumn crafts for kids
easy winter crafts for kids
spring craft ideas for kids
fun summer crafts for kids
winter craft ideas for kids

Desired output:

Fall
Winter
Spring
Fall
Summer
Autumn
Winter
Spring
Summer
Winter

So far I'm able to tag each query with the name of the list it matched, for example:

Top Queries                  Volume  Intent
fall crafts for kids         33100   Season
winter crafts for kids       2900    Season
spring crafts for kids       1600    Season
fall craft ideas for kids    1000    Season
summer crafts for kids       1000    Season
autumn crafts for kids       880     Season
easy winter crafts for kids  880     Season
spring craft ideas for kids  880     Season
fun summer crafts for kids   480     Season
winter craft ideas for kids  480     Season

But I would like to map each query to the matching item in the list rather than to the list name. How can this be done?

Upvotes: 0

Views: 71

Answers (3)

Alain T.

Reputation: 42143

You could use a regular expression built from your list of keywords:

import re

seasons = ['winter','spring','summer','fall','autumn']
pattern = re.compile(r"\b("+"|".join(map(re.escape,seasons))+r")\b")

qSet = """fall crafts for kids
winter crafts for kids
spring crafts for kids
fall craft ideas for kids
summer crafts for kids
autumn crafts for kids
easy winter crafts for kids
spring craft ideas for kids
fun summer crafts for kids
winter craft ideas for kids""".split("\n")

for q in qSet:
    print(q, ":", *pattern.findall(q))

fall crafts for kids : fall
winter crafts for kids : winter
spring crafts for kids : spring
fall craft ideas for kids : fall
summer crafts for kids : summer
autumn crafts for kids : autumn
easy winter crafts for kids : winter
spring craft ideas for kids : spring
fun summer crafts for kids : summer
winter craft ideas for kids : winter

The regular expression in pattern matches any of the keywords, but only as a whole word in the sentence. For example, the expression '\b(winter|spring|summer|fall|autumn)\b' will not pick up 'fall' in 'watching rainbows under the waterfall in summer', although it will still find 'summer'.
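
The pattern as written is case-sensitive, and the output in the question is capitalized. Here is a minimal sketch of one way to handle both, assuming you want only the first matching season per query and None when nothing matches. The helper name first_season is hypothetical and not part of the answer above:

```python
import re

seasons = ['winter', 'spring', 'summer', 'fall', 'autumn']

# Same whole-word pattern, compiled with re.IGNORECASE so that
# 'Winter' or 'sumMer' also match.
pattern = re.compile(
    r"\b(" + "|".join(map(re.escape, seasons)) + r")\b",
    re.IGNORECASE,
)

def first_season(query):
    """Return the first season found in query, capitalized, or None."""
    match = pattern.search(query)
    return match.group(1).capitalize() if match else None

print(first_season("easy Winter crafts for kids"))  # Winter
print(first_season("waterfall hikes"))              # None
```

Returning None for non-matching queries keeps the result aligned with the input, which matters if the values are written back next to each query.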

Upvotes: 0

RJ Adriaansen

Reputation: 9619

Or you can just use a list comprehension:

seasons = ["winter", "spring", "summer", "fall", "autumn"]

queries = [
    "fall crafts for kids",
    "winter crafts for kids",
    "spring crafts for kids",
    "fall craft ideas for kids",
    "sumMer crafts for kids",
    "autumn crafts for kids",
    "easy winter crafts for kids",
    "spring craft ideas for kids",
    "fun summer crafts for kids",
    "winter craft ideas for kids",
]

hits = [word.lower() for query in queries for word in query.split() if word.lower() in seasons]

Result:

['fall','winter','spring','fall','summer','autumn','winter','spring','summer','winter']
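
The flat list above lines up with the queries only when every query contains exactly one season. Here is a possible variant, assuming you want one entry per query (the first season found, or None), with a shortened sample list for illustration:

```python
seasons = ["winter", "spring", "summer", "fall", "autumn"]
queries = [
    "fall crafts for kids",
    "sumMer crafts for kids",
    "craft ideas for kids",  # no season in this one
]

# One entry per query: the first word that is a season, or None if there is none.
hits = [
    next((word for word in query.lower().split() if word in seasons), None)
    for query in queries
]

print(hits)  # ['fall', 'summer', None]
```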

Upvotes: 0

AKX

Reputation: 169022

Sure thing.

We can define a neat little function that takes a query string and an iterable of strings, and returns all strings from that iterable which are found in the query.

def find_matching_keywords(query, keywords):
    return {keyword for keyword in keywords if keyword in query}

Then let's plug in some data...

seasons = ["winter", "spring", "summer", "fall", "autumn"]

queries = [
    "fall crafts for kids",
    "winter crafts for kids",
    "spring crafts for kids",
    "fall craft ideas for kids",
    "sumMer crafts for kids",
    "autumn crafts for kids",
    "easy winter crafts for kids",
    "spring craft ideas for kids",
    "fun summer crafts for kids",
    "winter craft ideas for kids",
]

and map over the queries with a dictionary comprehension (note I'm lower-casing the query, to make the matching case-insensitive):

query_to_keywords = {
    query: find_matching_keywords(query.lower(), seasons)
    for query in queries
}

and finally we can print things out (you'd probably do something other than just print these, but for the sake of illustration...):

for query, keywords in query_to_keywords.items():
    print(query, keywords)

The output is

fall crafts for kids {'fall'}
winter crafts for kids {'winter'}
spring crafts for kids {'spring'}
fall craft ideas for kids {'fall'}
sumMer crafts for kids {'summer'}
autumn crafts for kids {'autumn'}
easy winter crafts for kids {'winter'}
spring craft ideas for kids {'spring'}
fun summer crafts for kids {'summer'}
winter craft ideas for kids {'winter'}
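
One caveat: `keyword in query` is a substring test, so 'fall' would also be found in 'waterfall'. If that matters for your data, a whole-word variant could look like the sketch below. The function name find_matching_words is made up for this example, and it assumes words are separated by whitespace:

```python
def find_matching_words(query, keywords):
    """Return the keywords that appear as whole words in the query."""
    # Splitting on whitespace leaves punctuation attached to words,
    # so "fall," would not match "fall"; this sketch ignores that case.
    return set(keywords) & set(query.lower().split())

seasons = ["winter", "spring", "summer", "fall", "autumn"]

print(find_matching_words("waterfall crafts for summer", seasons))  # {'summer'}
```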

If you need several categories of keywords (e.g. seasons, adjectives, ...), you might replace seasons with a dict mapping each category to its keyword list:

category_to_keywords = {
    "season": ["winter", "spring", "summer", "fall", "autumn"],
    "difficulty": ["easy", "hard"],
}

Then, an additional function to map over that...

def find_matching_keywords_with_categories(query, category_to_keywords):
    unfiltered_result = {
        category: find_matching_keywords(query, keywords)
        for category, keywords
        in category_to_keywords.items()
    }
    return {
        category: keywords
        for (category, keywords)
        in unfiltered_result.items()
        if keywords
    }

and when called, à la

query_to_keywords = {
    query: find_matching_keywords_with_categories(query.lower(), category_to_keywords)
    for query in queries
}

and printed with the same loop as before, we'll end up with

fall crafts for kids {'season': {'fall'}}
winter crafts for kids {'season': {'winter'}}
spring crafts for kids {'season': {'spring'}}
fall craft ideas for kids {'season': {'fall'}}
sumMer crafts for kids {'season': {'summer'}}
autumn crafts for kids {'season': {'autumn'}}
easy winter crafts for kids {'season': {'winter'}, 'difficulty': {'easy'}}
spring craft ideas for kids {'season': {'spring'}}
fun summer crafts for kids {'season': {'summer'}}
winter craft ideas for kids {'season': {'winter'}}
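
If you want rows like the table in the question rather than nested dicts, the result can be flattened. This is a rough sketch, assuming a (query, category, keyword) layout is what you're after; the sample dict below just repeats two entries from the output above:

```python
# Sample of the nested result shown above (query -> category -> keywords).
query_to_keywords = {
    "fall crafts for kids": {"season": {"fall"}},
    "easy winter crafts for kids": {"season": {"winter"}, "difficulty": {"easy"}},
}

# Flatten to (query, category, keyword) rows.
# sorted() gives a stable order, since sets are unordered.
rows = [
    (query, category, keyword)
    for query, categories in query_to_keywords.items()
    for category, keywords in categories.items()
    for keyword in sorted(keywords)
]

for row in rows:
    print(*row, sep="\t")
```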

Upvotes: 1
