Reputation: 135
I have an Excel file with thousands of rows like:
Basic medical supply - AIT - needs documents from trasnport - drugs
Basic medical supply - TTT - needs documents from trasnport - patiens and other stuff
random string from excel - TTP - other text like always - marijuana per rectum
How can I write a regex that takes the string after the 3rd "-" character? In this example that would be "drugs", "patiens and other stuff" and "marijuana per rectum".
It cannot be done with jQuery etc.; it needs to be pure JS because I am scripting it in Pentago BI. The strings will be very random; only the 3 " - " separators are constant. Is such a thing possible? Maybe slice off everything up to the 3rd " - ", save whatever is left to a variable, and trim whitespace and non-letter characters?
EDIT:
I have just noticed that in the files every second row in this column contains a product code:
550-1008-000000-405.02.04.03
I have to split this too, e.g. into 550-1008 and 405.02.04.03, so I have to remove the zeros in the middle of the string and get 2 substrings.
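One way the product-code split could look, as a minimal sketch. It assumes every code has the shape shown above (two numeric fields, a filler field made only of zeros, then the rest). splitProductCode is a made-up helper name, not something provided by Pentago BI.

```javascript
// Hypothetical helper: splits a product code into its two useful parts.
// Assumed layout: <digits>-<digits>-<zeros>-<rest>
function splitProductCode(code) {
  var match = /^(\d+-\d+)-0+-(.+)$/.exec(code);
  // Returns null when the row does not look like a product code.
  return match ? [match[1], match[2]] : null;
}

var parts = splitProductCode('550-1008-000000-405.02.04.03');
console.log(parts[0]); // 550-1008
console.log(parts[1]); // 405.02.04.03
```

Rows that are not product codes return null, so the same loop could tell the two kinds of row apart.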
Upvotes: 0
Views: 77
Reputation: 17238
The following regex does the trick:
^([^-]+-){3}(.*)
The result is in $2, so a standalone code snippet would look like:
var s = [
    'Basic medical supply - AIT - needs documents from trasnport - drugs',
    'Basic medical supply - TTT - needs documents from trasnport - patiens and other stuff',
    'random string from excel - TTP - other text like always - marijuana per rectum'
];
var res, i;
for (i = 0; i < s.length; i++) {
    // $2 holds everything after the third "-"; trim() drops the leading space.
    res = s[i].replace(/^([^-]+-){3}(.*)/, "$2").trim();
    console.log("#" + i + ": '" + s[i] + "' -> '" + res + "'\n");
}
Live test on regex101.
Explanation
The regex is based on a sequence of arbitrary characters other than "-", followed by a single "-". It matches 3 consecutive occurrences of this base sequence and assigns the remainder of the line to capture group 2.
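To make the capture groups visible, here is a small sketch that uses exec instead of replace. afterThirdHyphen is a made-up helper name; the trim() call and the null return for non-matching lines are my additions, not part of the regex itself.

```javascript
// Hypothetical helper: returns the text after the third "-",
// or null when the line does not contain three "-" terminated prefixes.
function afterThirdHyphen(line) {
  var match = /^([^-]+-){3}(.*)/.exec(line);
  // match[1] is the last repetition of the base sequence,
  // match[2] is the remainder of the line.
  return match ? match[2].trim() : null;
}

console.log(afterThirdHyphen('Basic medical supply - AIT - needs documents from trasnport - drugs')); // drugs
console.log(afterThirdHyphen('only one - hyphen')); // null
```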
Caveat
Beware of consecutive "-" characters: this solution is not compatible with such strings, and the proper way to handle them might depend on your data (e.g. may "--" occur as an ASCII representation of a dash "—"?).
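If the separator in your data is really " - " (space, hyphen, space), as the question suggests, one possible workaround is to match on that whole separator rather than on a bare "-". This is only a sketch under that assumption; afterThirdSeparator and the sample string are made up for illustration.

```javascript
// Hypothetical helper. Assumption: fields are separated by " - "
// (space, hyphen, space), so "--" or "anti-x" inside a field is not a separator.
function afterThirdSeparator(line) {
  // Lazily skip three "<anything> - " prefixes, then capture the rest.
  var match = /^(?:.*? - ){3}(.*)$/.exec(line);
  return match ? match[1].trim() : null;
}

// Made-up sample row with hyphens inside the fields:
console.log(afterThirdSeparator('pre--fix - TTT - needs documents - anti-inflammatory drugs'));
// anti-inflammatory drugs
```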
Upvotes: 1
Reputation: 1583
The regex could just be /^.+-.+-.+-\s*([^-]+)\s*$/
or, even easier, /-\s*([^-]+)\s*$/
which takes everything after the last "-". Or could that string contain "-" itself?
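A quick sketch of why that last question matters: with the shorter regex, a "-" inside the final part cuts the result short. afterLastHyphen and the second sample string are made up for illustration.

```javascript
// Hypothetical helper built on the shorter regex above.
function afterLastHyphen(line) {
  var match = /-\s*([^-]+)\s*$/.exec(line);
  return match ? match[1] : null;
}

console.log(afterLastHyphen('Basic medical supply - AIT - needs documents from trasnport - drugs'));
// drugs
console.log(afterLastHyphen('a - b - c - anti-inflammatory drugs'));
// inflammatory drugs (the "anti" before the last "-" is lost)
```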
Upvotes: 2
Reputation: 4876
This works simply, without any time-consuming regex:
var s = [
'Basic medical supply - AIT - needs documents from trasnport - drugs'
, 'Basic medical supply - TTT - needs documents from trasnport - patiens and other stuff'
, 'random string from excel - TTP - other text like always - marijuana per rectum'
];
s.forEach(function(el){
console.log(el.split("-")[3].trim());
});
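Note that split("-")[3] keeps only the fourth piece, so anything after a further "-" is dropped. If the last part may itself contain "-", a variant like the following might help. It is a sketch, and afterThirdHyphenSplit is a made-up name.

```javascript
// Hypothetical helper: same split-based idea, but rejoins everything
// from the fourth piece onward so hyphens in the tail survive.
function afterThirdHyphenSplit(line) {
  var parts = line.split('-');
  if (parts.length < 4) {
    return null; // fewer than three "-" in the line
  }
  return parts.slice(3).join('-').trim();
}

console.log(afterThirdHyphenSplit('a - b - c - anti-inflammatory drugs'));
// anti-inflammatory drugs
```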
Upvotes: 0
Reputation: 2656
If you really want a regular expression you can use this:
var s = 'Basic medical supply - AIT - needs documents from trasnport - drugs';
var regex = /[^-]+-[^-]+-[^-]+-(.*)/;
var match = regex.exec(s);
console.log(match[1].trim()); // outputs "drugs" (trim() removes the leading space)
But I'd prefer the solution from @VinodLouis (in the comments) without using regular expressions...
Upvotes: 0