Acanthus

Reputation: 303

How to use regex to match everything from the start until dot or dash?

I have a system that only accepts regular expressions for matching.

The regular expression needs to match the following:

File.f
File-1.f

In both cases it has to return the part before the . (or before the - in the second case), i.e. File.

Upvotes: 1

Views: 245

Answers (4)

onaclov2000

Reputation: 5841

I agree that Kimvais has a really solid answer (I can't vote yet, sorry).

I wrote mine up in Perl before I read their answer, and came up with this:

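use strict;
use warnings;

my $string1 = "John.f";
my $string2 = "Eric-1.f";

# Capture the leading run of letters/digits that is followed by a . or a -
# Only print $1 when the match succeeded, so a stale capture is never shown.
for my $string ($string1, $string2) {
    print $1 . "\n\n\n" if $string =~ m/^([0-9a-zA-Z]+)[.-]/;
}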

Basically it's along the same lines as Kimvais's, except that his will accept any characters before the . or -, which I'm not sure you want. Mine will only accept numbers or letters, followed by a . or a -.
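
To make that difference concrete, here is a small Python sketch (Python is my choice for illustration, the question doesn't name a language). The file name my_file-1.f and the helper prefix() are made up for the example; the two patterns are the ones discussed in this thread.

```python
import re

# Letters/digits only, then a dot or dash (the pattern from the Perl snippet).
ALNUM_PREFIX = re.compile(r"^([0-9a-zA-Z]+)[.-]")
# Negated class: anything up to the first dot or dash.
ANY_PREFIX = re.compile(r"^([^.-]+)")


def prefix(pattern, name):
    """Return the captured prefix, or None when the pattern does not match."""
    m = pattern.match(name)
    return m.group(1) if m else None


# "my_file-1.f" is a hypothetical name, chosen because it contains an underscore.
for name in ["File.f", "File-1.f", "my_file-1.f"]:
    print(name, prefix(ALNUM_PREFIX, name), prefix(ANY_PREFIX, name))
```

For the two names from the question both patterns give File. For the underscore name the letters/digits pattern does not match at all, while the negated class returns my_file.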

Good luck

Upvotes: 0

jason

Reputation: 241641

I don't know what language you're using, but they all work mostly the same. In C# we would do something like the following:

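using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

List<string> files = new List<string>() {
    "File.f",
    "File-1.f"
};
// Named group "name": one or more characters that are neither . nor -
Regex r = new Regex(@"^(?<name>[^\.\-]+)");
foreach(string file in files) {
    Match m = r.Match(file);
    Console.WriteLine(m.Groups["name"]);
}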

The named group allows you to easily extract the prefix that you are seeking. The above prints

File
File

on the console.
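
If your system turns out to be closer to Python than .NET, the same idea might look roughly like this. It's only a sketch: Python spells the named group (?P<name>...), and file_prefix() is a helper name I made up for the example.

```python
import re

# (?P<name>...) is Python's syntax for a named capture group.
pattern = re.compile(r"^(?P<name>[^.-]+)")


def file_prefix(filename):
    """Return the text before the first dot or dash, or None if there is none."""
    m = pattern.match(filename)
    return m.group("name") if m else None


prefixes = [file_prefix(f) for f in ["File.f", "File-1.f"]]
print(prefixes)  # ['File', 'File']
```

Note that a name starting with a dot or dash gives None, because the group needs at least one character.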

I strongly encourage you to pick up the book Mastering Regular Expressions. Every programmer should be comfortable with regular expressions, and Friedl's book is by far the best on the subject. It has chapters pertinent to Perl, Java, .NET and PHP, depending on your language choice.

Upvotes: 1

Kimvais

Reputation: 39548

^([^.-]+).*\.f$

First ^ means beginning of a line

() means a group - this is the part that is captured and returned as the first group (depending on your language it is $1, \1, groups()[0] or group(1))

[] means one from this set of characters

[^ means a set not containing these characters, i.e. it is "all characters except the ones I list", as opposed to [], which means "only the characters I list"

+ means that the previous item can be repeated one or more times.

. is 'any' single character

* means that the previous item can be repeated zero or more times.

\. means the character . (because . is special)

f is just the letter f (or word f, actually)

$ is the end of line.
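
Putting the pieces together, here is a quick way to try the pattern, sketched in Python (an assumption on my part, since the question doesn't say which engine the system uses). first_group() and the name File.txt are just illustrative.

```python
import re

# The pattern from the top of this answer, unchanged.
pattern = re.compile(r"^([^.-]+).*\.f$")


def first_group(name):
    """Return capture group 1, or None if the whole pattern does not match."""
    m = pattern.match(name)
    return m.group(1) if m else None


print(first_group("File.f"))    # File
print(first_group("File-1.f"))  # File
print(first_group("File.txt"))  # None, because the name does not end in .f
```

Because of the trailing \.f$ this version only matches names that end in .f, so drop that part if other extensions should be accepted too.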

Upvotes: 1

Lucero

Reputation: 60190

This should do:

^[^\.\-]+

(In English: Match has to start at the beginning of the string and consists of everything until either a . or a - is found.)
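
There is no capture group here, so the prefix is the whole match. A minimal sketch in Python, assuming your system hands back the full match (whole_match() is a made-up helper name):

```python
import re

# No group needed: the entire match is the wanted prefix.
pattern = re.compile(r"^[^.\-]+")


def whole_match(name):
    """Return the full match (group 0), or None when nothing matches."""
    m = pattern.search(name)
    return m.group(0) if m else None


print(whole_match("File.f"))    # File
print(whole_match("File-1.f"))  # File
print(whole_match("File"))      # File (no dot or dash, so the whole name matches)
```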

Upvotes: 5
