MatteS

Reputation: 1542

Regex to match the first file in a rar archive file set

How do I do this? I've seen a solution for Ruby that doesn't use a single regex, because Ruby doesn't support lookaround assertions. But is it possible in C#?

[Test]
public void RarArchiveFirstFileNameShouldMatch() {
    var regex = new Regex(@"\.(rar|001)$", RegexOptions.IgnoreCase | RegexOptions.Singleline);
    Assert.That(regex.IsMatch("filename.001"));
    Assert.That(regex.IsMatch("filename.rar"));
    Assert.That(regex.IsMatch("filename.part1.rar"));
    Assert.That(regex.IsMatch("filename.part01.rar"));
    Assert.That(regex.IsMatch("filenamepart44.rar"));
    Assert.That(regex.IsMatch("filename.004"), Is.False);
    Assert.That(regex.IsMatch("filename.057"), Is.False);
    Assert.That(regex.IsMatch("filename.r67"), Is.False);
    Assert.That(regex.IsMatch("filename.s89"), Is.False);
    Assert.That(regex.IsMatch("filename.part2.rar"), Is.False);
    Assert.That(regex.IsMatch("filename.part04.rar"), Is.False);
    Assert.That(regex.IsMatch("filename.part11.rar"), Is.False);
}

Upvotes: 3

Views: 725

Answers (2)

Mark Byers

Reputation: 838216

You can do it in one regular expression, both in C# and in Ruby, but why bother?

You haven't really defined exactly what you want - first you should document that. Once you've documented it, it's easy to turn that description into ordinary code. I think that it is more readable and maintainable this way:

/// <summary>
/// Returns true if a filename's extension is .001.
/// If the extension is .rar, checks whether there is a part number
/// immediately before the extension.
/// If there is no part number, returns true.
/// If there is a part number, returns true only if the part number is 1.
/// In all other cases, returns false.
/// </summary>
static bool isMainFile(string name)
{
    string extension = Path.GetExtension(name);
    if (extension == ".001")
        return true;
    if (extension != ".rar")
        return false;
    Match match = Regex.Match(name, @"\.part(\d+)\.rar$");
    if (!match.Success)
        return true;
    string partNumber = match.Groups[1].Value.TrimStart('0');
    return partNumber == "1";
}

I've left one regular expression in there as it's not too complex, and the alternative of fiddling with the Path functions seems clunky to me. Overall, I think the above code expresses the intention much more clearly than a regular expression does.
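For comparison, here is a rough sketch of what the Path-only alternative mentioned above might look like. It is an illustration rather than part of the original answer: the names `MainFileCheck` and `IsMainFileNoRegex` are made up, and it assumes the same case-sensitive rules as `isMainFile`.

```csharp
using System;
using System.IO;
using System.Linq;

static class MainFileCheck
{
    // Hypothetical regex-free counterpart of isMainFile (same rules, same case sensitivity).
    public static bool IsMainFileNoRegex(string name)
    {
        string extension = Path.GetExtension(name);
        if (extension == ".001")
            return true;
        if (extension != ".rar")
            return false;

        // Strip ".rar", then read the "inner" extension, e.g. ".part01" (or "" if there is none).
        string inner = Path.GetExtension(Path.GetFileNameWithoutExtension(name));
        const string marker = ".part";
        if (!inner.StartsWith(marker, StringComparison.Ordinal) || inner.Length == marker.Length)
            return true; // no ".partN" marker, so treat it as a plain .rar

        string digits = inner.Substring(marker.Length);
        if (!digits.All(char.IsDigit))
            return true; // ".partXYZ" is not a part number

        return digits.TrimStart('0') == "1";
    }

    static void Main()
    {
        foreach (var name in new[] { "filename.rar", "filename.part01.rar", "filename.part11.rar" })
            Console.WriteLine($"{name}: {IsMainFileNoRegex(name)}");
    }
}
```

The extra string slicing and digit checks are what the single `Regex.Match` call replaces.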

I do like it when you can cleanly solve a problem with an elegant regular expression, but I'm not sure that a single regular expression is the best way to solve this problem.

Upvotes: 2

Max Shawabkeh

Reputation: 38603

This should pass your tests:

    var regex = new Regex(@"(\.001|\.part0*1\.rar|^((?!part\d*\.rar$).)*\.rar)$", RegexOptions.IgnoreCase | RegexOptions.Singleline);
    Assert.That(regex.IsMatch("filename.001"));
    Assert.That(regex.IsMatch("filename.rar"));
    Assert.That(regex.IsMatch("filename.part1.rar"));
    Assert.That(regex.IsMatch("filename.part01.rar"));
    Assert.That(regex.IsMatch("filename.004"), Is.False);
    Assert.That(regex.IsMatch("filename.057"), Is.False);
    Assert.That(regex.IsMatch("filename.r67"), Is.False);
    Assert.That(regex.IsMatch("filename.s89"), Is.False);
    Assert.That(regex.IsMatch("filename.part2.rar"), Is.False);
    Assert.That(regex.IsMatch("filename.part04.rar"), Is.False);
    Assert.That(regex.IsMatch("filename.part11.rar"), Is.False);
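Note that the assertions above leave out `filenamepart44.rar` from the question. The pattern does not match that name, because its negative lookahead rejects any `part<digits>.rar` tail, with or without a leading dot. One possible variant uses a negative lookbehind instead, which .NET allows with variable-length patterns. This is a sketch under that assumption, not part of the original answer, and the names `RarFirstFile` and `IsFirstFile` are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class RarFirstFile
{
    // Hypothetical helper. Alternatives, in order:
    //   \.001$                  first volume of a numbered split set
    //   \.part0*1\.rar$         ".partN.rar" where N is 1, leading zeros allowed
    //   (?<!\.part\d+)\.rar$    ".rar" not directly preceded by a ".partN" marker
    static readonly Regex FirstFile = new Regex(
        @"(\.001|\.part0*1\.rar|(?<!\.part\d+)\.rar)$",
        RegexOptions.IgnoreCase);

    public static bool IsFirstFile(string name) => FirstFile.IsMatch(name);

    static void Main()
    {
        string[] names =
        {
            "filename.001", "filename.rar", "filename.part1.rar", "filename.part01.rar",
            "filenamepart44.rar", "filename.004", "filename.r67",
            "filename.part2.rar", "filename.part04.rar", "filename.part11.rar"
        };
        foreach (var name in names)
            Console.WriteLine($"{name}: {IsFirstFile(name)}");
    }
}
```

Because the lookbehind requires a literal dot before `part`, `filenamepart44.rar` is treated as an ordinary `.rar` file, while `filename.part11.rar` is still rejected.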

Upvotes: 5
