Reputation: 1335
I have a regex problem that is bugging me, and I have no clue how to solve it.
I have an input field containing text, and I would like to extract certain values from it: a title, a description, a price, and a special price.
Examples of the input are given in the edit further down.
The CoffeeScript pattern I'm using:
pattern = ///
([^$]+)
(#(.+?)#+)
([\$]\d+\. \d+)
([\%\$]\d+\. \d+)
///
params = [title,description,oldPrice,newPrice]=input_txt.match(pattern)[1..4]
It does not work. It only matches if I enter all the values in the given order, and every one of the substrings has to be present.
What I would like is the ability to extract the segments if they are provided (so each one is optional), regardless of their order. How can I extract optional sequences from a string? EDIT: I have added some examples.
exmp1:
Kindle #Amazon's ebook reader# $79.00
this should be extracted as
title:Kindle
description: Amazon's ebook reader
oldPrice:$79.00
exmp2:
Nike Sneaker's $109.00 %$89.00
this should be extracted as
title:Nike Sneaker's
oldPrice:$109.00
newPrice:$89.00
exmp3:
$100.00 Just dance 3 #for XBox#
this should be extracted to
title: Just dance 3
description: for XBox
oldPrice:$100.00
Any help would be great ...
Upvotes: 0
Views: 1141
Reputation: 707426
You can use this code, which looks for and removes each separate piece as it is matched; whatever is left over becomes the title:
function extractParts(str) {
    var parts = {};

    // Match re against str; if found, return capture group 1 and strip the whole match from str.
    function removePiece(re) {
        var result;
        var matches = str.match(re);
        if (matches) {
            result = matches[1];
            str = str.replace(re, "");
        }
        return result;
    }

    // find and remove each piece we're looking for
    parts.description = removePiece(/#([^#]+)#/); // #text#
    // (?:^|[^%]) also accepts a price at the very start of the string, e.g. "$100.00 Just dance 3"
    parts.oldPrice = removePiece(/(?:^|[^%])(\$\d+\.\d+)/); // $4.56
    parts.newPrice = removePiece(/%(\$\d+\.\d+)/); // %$3.78

    // fix up whitespace
    parts.title = str.replace(/\s+/g, " ").replace(/^\s+/, "").replace(/\s+$/, "");
    return parts;
}

var pieces = extractParts("Kindle #Amazon's ebook reader# $79.00");
And, you can see a demo in action here: http://jsfiddle.net/jfriend00/d8NNr/.
Upvotes: 1
Reputation: 40810
The nature of regular grammars makes it hard to solve your problem with a single pattern, because the pieces are optional and can appear in any order. As a workaround, the simplest solution would be to just execute a regex 4 times, once per piece:
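The code that originally followed this answer is not reproduced here, so what follows is only a rough sketch of the idea, not the answerer's own code. It assumes the input format from the question (`#...#` for the description, `$n.nn` for the old price, `%$n.nn` for the new price). The name `extractFields` and the helper `first` are made up for illustration. Three of the passes are plain matches; the fourth takes the title as whatever remains once the other pieces are stripped out.

```javascript
// Hypothetical helper: every field is looked up independently,
// so neither the order nor the presence of a field matters.
function extractFields(input) {
    // Return capture group 1 of re, or undefined when the piece is absent.
    function first(re) {
        var m = input.match(re);
        return m ? m[1] : undefined;
    }

    var description = first(/#([^#]+)#/);           // #text#
    var newPrice = first(/%(\$\d+\.\d+)/);          // %$3.78
    var oldPrice = first(/(?:^|[^%])(\$\d+\.\d+)/); // $4.56, not preceded by %

    // Title: strip the description and both prices, then tidy the whitespace.
    var title = input
        .replace(/#[^#]*#/g, " ")
        .replace(/%?\$\d+\.\d+/g, " ")
        .replace(/\s+/g, " ")
        .trim();

    return {
        title: title || undefined,
        description: description,
        oldPrice: oldPrice,
        newPrice: newPrice
    };
}

var fields = extractFields("$100.00 Just dance 3 #for XBox#");
console.log(fields.title, "|", fields.description, "|", fields.oldPrice);
```

Because each pattern runs on the untouched input, a missing piece simply comes back as `undefined` instead of making the whole match fail.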
Upvotes: 4