Reputation: 33
I'm trying to build a regex to capture the useful part of my S3 upload filenames. I used a regex generator, and so far I have this pattern (which throws an error in JavaScript):
/[A-Za-z]++[^\.\w][^\.]++|(?<=_)\w++(?=\.)/g
Here are some example strings that I am working with (with the required match for each):
"MTxoZbRRUu9BfQLvAWwP_Bruntwood Leeds Digital Festival ad.pdf" // desired match "Bruntwood Leeds Digital Festival ad"
"bbZRU3329BfXXvvAWwP_short-video.mp4" // desired match "short-video"
"zQZFnWVcRUbFNGyGdIP0_MGI-Artificial-Intelligence-Discussion-slides.pptx" // desired match "MGI-Artificial-Intelligence-Discussion-slides"
If it helps: I need to run this regex in JavaScript.
const filename = "bbZRU3329BfXXvvAWwP_short-video.mp4";
const match = filename.match(regex);
console.log(match); // "short-video"
Thank you!
Upvotes: 0
Views: 66
Reputation: 163277
For these example strings, you could split on a dot or an underscore, [._]. That gives you an array with 3 parts; the value you are looking for is the second part, [1]:
const strings = [
"MTxoZbRRUu9BfQLvAWwP_Bruntwood Leeds Digital Festival ad.pdf",
"bbZRU3329BfXXvvAWwP_short-video.mp4",
"zQZFnWVcRUbFNGyGdIP0_MGI-Artificial-Intelligence-Discussion-slides.pptx"
];
strings.forEach((s) => console.log(s.split(/[_.]/)[1]));
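One caveat with splitting on every dot and underscore: a name that itself contains an underscore or a dot would be cut short. Below is a minimal sketch that takes everything between the first underscore and the last dot instead. It assumes the random prefix never contains an underscore; `extractName` and the second test filename are made up for illustration.

```javascript
// Hypothetical helper: returns the text between the first "_" and the last ".".
// Assumption: the random prefix never contains an underscore.
function extractName(filename) {
  const start = filename.indexOf("_");
  if (start === -1) return null; // no prefix separator found
  const end = filename.lastIndexOf(".");
  // If there is no dot after the underscore, keep everything to the end.
  return end > start ? filename.slice(start + 1, end) : filename.slice(start + 1);
}

console.log(extractName("bbZRU3329BfXXvvAWwP_short-video.mp4")); // short-video
console.log(extractName("abc_my_file.v2.pdf")); // my_file.v2
```

Whether the last dot is the right cut-off depends on your filenames; for double extensions such as ".tar.gz" you would need a different rule.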
Upvotes: 0
Reputation: 48711
Don't use a regex generator unless it targets the regex flavor you will actually run, since syntax and features differ between flavors. What you are doing basically boils down to this:
_[^.]+
The only difference is that it also matches the preceding _ character, which you can strip afterwards in JS:
var text = `MTxoZbRRUu9BfQLvAWwP_Bruntwood Leeds Digital Festival ad.pdf
bbZRU3329BfXXvvAWwP_short-video.mp4
zQZFnWVcRUbFNGyGdIP0_MGI-Artificial-Intelligence-Discussion-slides`;
console.log(
text.match(/_[^.]+/g).map(v => v.slice(1)) // drop the leading "_"
);
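If you would rather not strip the underscore in a second step, a capture group can do it in the same pass. This is only a sketch, and it assumes an engine that has String.prototype.matchAll (ES2020):

```javascript
const text = `MTxoZbRRUu9BfQLvAWwP_Bruntwood Leeds Digital Festival ad.pdf
bbZRU3329BfXXvvAWwP_short-video.mp4
zQZFnWVcRUbFNGyGdIP0_MGI-Artificial-Intelligence-Discussion-slides.pptx`;

// Group 1 holds everything after "_" up to (not including) the next ".".
const names = Array.from(text.matchAll(/_([^.]+)/g), (m) => m[1]);
console.log(names);
```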
Upvotes: 1
Reputation: 14927
Given your examples, you could use a much simpler regex:
const regex = /_([^.]+)/;
const inputs = [
"MTxoZbRRUu9BfQLvAWwP_Bruntwood Leeds Digital Festival ad.pdf", // desired match "Bruntwood Leeds Digital Festival ad"
"bbZRU3329BfXXvvAWwP_short-video.mp4", // desired match "short-video"
"zQZFnWVcRUbFNGyGdIP0_MGI-Artificial-Intelligence-Discussion-slides.pptx" // desired match "MGI-Artificial-Intelligence-Discussion-slides"
];
for (const input of inputs) {
const match = input.match(regex);
console.log(match[1]);
}
Upvotes: 3
Reputation: 85767
"I used a regex generator"
But not for JavaScript regexes, it seems. Every tool and library has its own regex quirks. In particular, JS doesn't support possessive quantifiers like ++ (nor independent submatches in general, (?>...)).
Look-behind, (?<=...), was only added to JS in ES2018, so older engines reject it as well.
You could e.g. do this instead:
const strs = [
"MTxoZbRRUu9BfQLvAWwP_Bruntwood Leeds Digital Festival ad.pdf",
"bbZRU3329BfXXvvAWwP_short-video.mp4",
"zQZFnWVcRUbFNGyGdIP0_MGI-Artificial-Intelligence-Discussion-slides.pptx",
];
for (const str of strs) {
const m = /_([^.]+)\./.exec(str);
if (!m) {
console.log("no match: " + str);
continue;
}
console.log("match: " + m[1]);
}
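To see the failure described above without crashing the script, the pattern can be compiled at runtime inside a try/catch. A small sketch; `compiles` is a hypothetical helper name, not a built-in:

```javascript
// Hypothetical helper: reports whether the current JS engine accepts a pattern.
function compiles(source, flags = "") {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

// String.raw keeps the backslashes exactly as they appear in the regex literal.
console.log(compiles(String.raw`[A-Za-z]++[^\.\w][^\.]++|(?<=_)\w++(?=\.)`, "g")); // false
console.log(compiles(String.raw`_([^.]+)\.`)); // true
```

The generated pattern is rejected because of the ++ quantifiers alone, regardless of whether the engine knows look-behind.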
Upvotes: 2