Reputation: 37
I scrape sites for a database with a Chrome extension and need assistance with a JavaScript clean-up function.
e.g
my target output is:
_60789694386.html
Everything past .html needs to be removed, but since it is different in each URL, I'm lost.
The output is a .csv file, on which I run a JavaScript script to clean up the data.
this.values[8] = this.values[8].replace("https://www.alibaba.com/product-detail/","");
this.values[8] is how I target the column in the script (column 8 holds the URL).
Upvotes: 0
Views: 101
Reputation: 10280
Well, you can use split.
var final = this.values[8].split('.html')[0]
split gives you an array of items split by a string, in your case '.html', then you take the first one.
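One thing to watch: the first item returned by split no longer contains '.html' itself, so it has to be appended again if the suffix should stay. A minimal sketch of that, assuming the sample URL quoted elsewhere in this thread (the helper name trimAfterHtml is made up for illustration):

```javascript
// Hypothetical helper: keep everything up to and including the first ".html".
function trimAfterHtml(url) {
  var parts = url.split('.html');
  // No ".html" in the string: split returns a single item, so return the input unchanged.
  if (parts.length === 1) {
    return url;
  }
  // parts[0] is the text before ".html"; put the suffix back.
  return parts[0] + '.html';
}

var sampleUrl = "https://www.alibaba.com/product-detail/_60789694386.html?spm=a2700.galleryofferlist.normalList.1.5be41470uWBNGm&s=p";
console.log(trimAfterHtml(sampleUrl));
// https://www.alibaba.com/product-detail/_60789694386.html
```

This only trims the tail; the "https://www.alibaba.com/product-detail/" prefix is still removed by the replace call from the question.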
Upvotes: 3
Reputation: 233
You can use a regex to get it done. As far as I know, you can do something like:
var v = "https://www.alibaba.com/product-detail/_60789694386.html?spm=a2700.galleryofferlist.normalList.1.5be41470uWBNGm&s=p";
// Take everything after the last slash: "_60789694386.html?spm=..."
var result = v.match(/[^\/]+$/)[0];
// Cut off the query string (this assumes the URL contains a "?")
result = result.substring(0, result.indexOf('?'));
console.log(result); // prints _60789694386.html
Upvotes: 0
Reputation: 20039
An alternative, without using split:
var link = "https://www.alibaba.com/product-detail/_60789694386.html?spm=a2700.galleryofferlist.normalList.1.5be41470uWBNGm&s=p"
var result = link.replace('https://www.alibaba.com/product-detail/', '').replace(/\?.*$/, '');
console.log(result);
Upvotes: 0
Reputation: 324790
For when you don't care about readability...
this.values[8] = new URL(this.values[8]).pathname.split("/").pop().replace(".html","");
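For readers who do care about readability, the same idea can be spelled out step by step. This is only a sketch: the helper name lastPathSegment is made up, and it assumes the column always holds a valid absolute URL, since the URL constructor throws otherwise.

```javascript
// Hypothetical helper: return the last path segment of a URL,
// ignoring the query string and hash.
function lastPathSegment(rawUrl) {
  var url = new URL(rawUrl);               // throws a TypeError if rawUrl is not a valid absolute URL
  var segments = url.pathname.split('/');  // pathname excludes everything from "?" onwards
  return segments.pop();                   // last segment of the path
}

var productUrl = "https://www.alibaba.com/product-detail/_60789694386.html?spm=a2700.galleryofferlist.normalList.1.5be41470uWBNGm&s=p";
var fileName = lastPathSegment(productUrl);

console.log(fileName);                      // _60789694386.html
console.log(fileName.replace('.html', '')); // _60789694386
```

The one-liner above also strips ".html"; leave the final replace out if the target output should keep the suffix, as in the question.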
Upvotes: 0
Reputation: 347
I'm not sure I understood your problem, but try this:
var s = 'https://www.alibaba.com/product-detail/_60789694386.html?spm=a2700.galleryofferlist.normalList.1.5be41470uWBNGm&s=p'
s = s.substring(0, s.indexOf('?'));
console.log( s );
Upvotes: 0
Reputation: 11
Consider using substr
this.values[8] = this.values[8].substr(0,this.values[8].indexOf('?'))
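One caveat: indexOf returns -1 when the string has no '?', and substr(0, -1) then yields an empty string, so any row without a query string would be wiped. A small guarded sketch (the helper name stripQuery is made up for illustration):

```javascript
// Hypothetical helper: drop the query string if there is one,
// otherwise return the URL unchanged.
function stripQuery(url) {
  var queryStart = url.indexOf('?');
  // -1 means there is no "?" in the string, so there is nothing to cut.
  if (queryStart === -1) {
    return url;
  }
  return url.slice(0, queryStart);
}

console.log(stripQuery("https://www.alibaba.com/product-detail/_60789694386.html?spm=a2700.galleryofferlist.normalList.1.5be41470uWBNGm&s=p"));
// https://www.alibaba.com/product-detail/_60789694386.html
console.log(stripQuery("https://www.alibaba.com/product-detail/_60789694386.html"));
// https://www.alibaba.com/product-detail/_60789694386.html
```

In the clean-up script this would be used as this.values[8] = stripQuery(this.values[8]), assuming the helper is defined in the same script.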
Upvotes: 1
Reputation: 160
You can use the split method to cut the text at the ?, as in this example:
var link = "https://www.alibaba.com/product-detail/_60789694386.html?spm=a2700.galleryofferlist.normalList.1.5be41470uWBNGm&s=p"
var result = link.split('?')[0].replace("https://www.alibaba.com/product-detail/","");
console.log(result);
Upvotes: 0