Reputation: 1
I am currently working on a project to help intetify entites that work well in certain google channels, like discover. I have the URLs that ranked over the last year. What I wanted to do now is extract the names of certain people from those URLs (p.ex. /maite-kelly-singer-sick-giovanni-zarrella-show.html). I already created a set of names to match if those URLs contain the names. The next step would be to check each URL and if one of the names is included return it in a cell. If more than one name matches then return them in separate cells one after the other. Sheets example
I've been doing research for days now and haven't come up with a good idea yet. I think the best solution would be an app script, but I just started learning to write functions, so I'm not even sure where to look for something to me seemingly complex like this.
I really appreciate all help, even if it's just links to some ressources. Thank you!
Upvotes: 0
Views: 136
Reputation: 14502
Basically this algorithm can extract words from URLs, compare them with given names and add these names to the row of the result 'table':
const names = [
['giovanni', 'zarrella'],
['maite', 'kelly']
]
const urls = [
'/maite-kelly-singer-sick-giovanni-zarrella-show.html)',
'/kelly-maite-singer-sick-giovanni-zarrella-show.html)',
'/kelly-maite-singer-giovanni-sick-zarrella-show.html)',
'/giovanni-zarrella-show.html)',
'/maite-kelly-singer-sick-show.html)',
'/maite-singer-sick-show.html)',
]
const table = [];
for (let i in urls) {
table.push([urls[i]]);
let url = urls[i].toLowerCase().replace(/[^a-z]/g, '-'); // replace all non-letters to dashes
let words = url.split('-'); // split the string by dashes
for (let n in names) {
if (words.includes(names[n][0]) && words.includes(names[n][1])) {
table[i].push(names[n].join('-'));
}
}
}
console.log(table);
From the given names and urls it will return the table (2D-array) like this:
┌─────────┬─────────────────────────────────────────────────────────┬─────────────────────┬───────────────┐
│ (index) │ 0 │ 1 │ 2 │
├─────────┼─────────────────────────────────────────────────────────┼─────────────────────┼───────────────┤
│ 0 │ '/maite-kelly-singer-sick-giovanni-zarrella-show.html)' │ 'giovanni-zarrella' │ 'maite-kelly' │
│ 1 │ '/kelly-maite-singer-sick-giovanni-zarrella-show.html)' │ 'giovanni-zarrella' │ 'maite-kelly' │
│ 2 │ '/kelly-maite-singer-giovanni-sick-zarrella-show.html)' │ 'giovanni-zarrella' │ 'maite-kelly' │
│ 3 │ '/giovanni-zarrella-show.html)' │ 'giovanni-zarrella' │ │
│ 4 │ '/maite-kelly-singer-sick-show.html)' │ 'maite-kelly' │ │
│ 5 │ '/maite-singer-sick-show.html)' │ │ │
└─────────┴─────────────────────────────────────────────────────────┴─────────────────────┴───────────────┘
All you need is to grab your names and urls from your sheet and put the result table back on this (or another) sheet.
A final implementation depends on how exactly your spreadsheet looks like.
Update
Just figured out a little bit shorter and clean variant of the main part of the code:
const table = [];
for (let url of urls) {
let row = [url];
let words = url.toLowerCase().match(/\w+/g);
for (let name of names) {
if (words.includes(name[0]) && words.includes(name[1])) {
row.push(name.join('-'));
}
}
table.push(row);
}
console.log(table);
Upvotes: 1