user222427

C# complicated regex

Hey guys, thanks for any help you can provide. I need a little bit of regex help that's far beyond my knowledge.

I have a listbox with a file name in it, for example 3123~101, which is a delimited file containing one line of text. I need a regex for everything after the last "\" and before the last "-" in the text file. The ending could contain a prefix, followed by ###{####-004587}.txt. The ~ formula is: the number after "{" plus the number after "~", minus 1 (for example, 0442 + 101 - 1 = 0542).

File name: 3123~101. Example 1: 3123|X:directory\Path\Directory|Pre0{0442-0500}.txt

Result: X:\directory\Path\Directory\Pre00542.txt

File name: 3123~101. Example 2: 3123|X:directory\Path\Directory|0{0442-0500}.txt

Result: X:\directory\Path\Directory\00542.txt

Upvotes: 1

Views: 568

Answers (4)

jerone

Reputation: 16871

According to your example, I've created the following regex:

\|(.:)(.*)\|(.*)\{\d{2}(\d{2})\-(\d{2}).*(\..*)

The result should be built as follows:

group1 + "\\" + group2 + "\\" + group3 + group5 + group4 + group6

If you aren't satisfied, you can always give it a spin yourself here.


EDIT:

After being reminded about named groups:

\|(?<drive>.:)(?<path>.*)\|(?<prefix>.*)\{\d{2}(?<number2>\d{2})\-(?<number1>\d{2}).*(?<extension>\..*)

drive + "\\" + path + "\\" + prefix + number1 + number2 + extension
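
For anyone who wants to try the named-group approach from C#, here is a minimal sketch. It assumes the `drive` group captures the drive letter together with its colon (so `drive + "\\"` gives `X:\`). The class and method names (`NamedGroupDemo`, `Rebuild`) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Assumption: "drive" captures the letter and the colon ("X:"),
    // so the remaining path starts right after the colon.
    private static readonly Regex LinePattern = new Regex(
        @"\|(?<drive>.:)(?<path>.*)\|(?<prefix>.*)\{\d{2}(?<number2>\d{2})\-(?<number1>\d{2}).*(?<extension>\..*)");

    // Rebuilds the target path from one delimited line.
    public static string Rebuild(string line)
    {
        Match m = LinePattern.Match(line);
        if (!m.Success)
            throw new ArgumentException("Line does not match the expected layout", "line");

        return m.Groups["drive"].Value + "\\" + m.Groups["path"].Value + "\\"
             + m.Groups["prefix"].Value
             + m.Groups["number1"].Value + m.Groups["number2"].Value
             + m.Groups["extension"].Value;
    }

    public static void Main()
    {
        // Prints: X:\directory\Path\Directory\Pre00542.txt
        Console.WriteLine(Rebuild(@"3123|X:directory\Path\Directory|Pre0{0442-0500}.txt"));
    }
}
```

Keep in mind that this pattern only reorders digits from inside the braces. It does not evaluate the ~ formula, so it reproduces the expected names for the two sample lines but will not follow a different ~ value.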

Upvotes: 3

Greg Bacon

Reputation: 139491

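// Requires: using System; using System.IO; using System.Text.RegularExpressions;
public static string AdjustPath(string filename, string line)
{
  // Offset taken from the digits after '~' in the file name (101 for "3123~101").
  int tilde = GetTilde(filename);

  // fields[0] = id, fields[1] = directory, fields[2] = file name pattern.
  string[] fields = Regex.Split(line, @"\|");

  // Insert the backslash missing after the drive ("X:directory" -> "X:\directory").
  var addbackslash = new MatchEvaluator(
    m => m.Groups[1].Value + "\\" + m.Groups[2].Value);
  string dir = Regex.Replace(fields[1], @"^([A-Z]:)([^\\])", addbackslash);

  // Replace "{first-last}" with first + tilde - 1, zero-padded to the width of first.
  var addtilde = new MatchEvaluator(
    m => (tilde + Int32.Parse(m.Groups[1].Value) - 1).
           ToString().
           PadLeft(m.Groups[1].Value.Length, '0'));

  return Path.Combine(dir, Regex.Replace(fields[2], @"\{(\d+)-.+}", addtilde));
}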

private static int GetTilde(string filename)
{
  Match m = Regex.Match(filename, @"^.+~(\d+)$");

  if (!m.Success)
    throw new ArgumentException("Invalid filename", "filename");

  return Int32.Parse(m.Groups[1].Value);
}

Call AdjustPath as in the following:

public static void Main(string[] args)
{
  Console.WriteLine(AdjustPath("3123~101", @"3123|X:directory\Path\Directory|Pre0{0442-0500}.txt"));
  Console.WriteLine(AdjustPath("3123~101", @"3123|X:directory\Path\Directory|0{0442-0500}.txt"));
}

Output:

X:\directory\Path\Directory\Pre00542.txt
X:\directory\Path\Directory\00542.txt

If instead you want to write the output to a file, use

public static void WriteAdjustedPaths(string inpath, string outpath)
{
  // Dispose both the reader and the writer, even if AdjustPath throws.
  using (var r = new StreamReader(inpath))
  using (var w = new StreamWriter(outpath))
  {
    string line;
    while ((line = r.ReadLine()) != null)
      w.WriteLine("{0}", AdjustPath(inpath, line));
  }
}

You might call it with

WriteAdjustedPaths("3123~101", "output.txt");

If you want a List<String> instead

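// Requires: using System.Collections.Generic;
public static List<String> AdjustedPaths(string inpath)
{
  var paths = new List<String>();

  // Dispose the reader once every line has been processed.
  using (var r = new StreamReader(inpath))
  {
    string line;
    while ((line = r.ReadLine()) != null)
      paths.Add(AdjustPath(inpath, line));
  }

  return paths;
}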

To avoid repeated logic, we should define WriteAdjustedPaths in terms of the new function:

public static void WriteAdjustedPaths(string inpath, string outpath)
{
  using (var w = new StreamWriter(outpath))
  {
    foreach (var p in AdjustedPaths(inpath))
      w.WriteLine("{0}", p);
  }
}

The syntax could be streamlined with LINQ. See C# File Handling.
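
As a rough idea of what a LINQ version might look like, here is a self-contained sketch. The class name `LinqPaths` is hypothetical, and `AdjustPath` here is a compact stand-in for the method above: it joins the pieces with a literal backslash instead of `Path.Combine` so the result is the same on any OS. `AdjustedPaths` takes the lines as a parameter so it can be fed either from memory or from `File.ReadLines`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class LinqPaths
{
    // Compact stand-in for AdjustPath, included so the sketch compiles on its own.
    public static string AdjustPath(string filename, string line)
    {
        Match t = Regex.Match(filename, @"~(\d+)$");
        if (!t.Success)
            throw new ArgumentException("Invalid filename", "filename");
        int tilde = int.Parse(t.Groups[1].Value);

        string[] fields = line.Split('|');

        // "X:directory" -> "X:\directory"
        string dir = Regex.Replace(fields[1], @"^([A-Z]:)([^\\])",
            m => m.Groups[1].Value + "\\" + m.Groups[2].Value);

        // "{0442-0500}" -> "0542" when tilde is 101
        string name = Regex.Replace(fields[2], @"\{(\d+)-.+\}",
            m => (tilde + int.Parse(m.Groups[1].Value) - 1)
                     .ToString()
                     .PadLeft(m.Groups[1].Value.Length, '0'));

        return dir + "\\" + name;
    }

    // One Select call replaces the explicit read loop.
    public static List<string> AdjustedPaths(string filename, IEnumerable<string> lines)
    {
        return lines.Select(line => AdjustPath(filename, line)).ToList();
    }

    // File.ReadLines enumerates the input lazily; File.WriteAllLines writes one path per line.
    public static void WriteAdjustedPaths(string inpath, string outpath)
    {
        File.WriteAllLines(outpath, AdjustedPaths(inpath, File.ReadLines(inpath)));
    }

    public static void Main()
    {
        var lines = new[] { @"3123|X:directory\Path\Directory|Pre0{0442-0500}.txt" };
        foreach (string p in AdjustedPaths("3123~101", lines))
            Console.WriteLine(p);
    }
}
```

`File.ReadLines` and the `IEnumerable<string>` overload of `File.WriteAllLines` need .NET 4.0 or later, so this does not help on older frameworks.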

Upvotes: 2

Martin Brown

Reputation: 25310

A slight variation on gbacon's answer that will also work in older versions of .NET:

    static void Main(string[] args)
    {
        Console.WriteLine(Adjust("3123~101", @"3123|X:directory\Path\Directory|Pre0{0442-0500}.txt"));
        Console.WriteLine(Adjust("3123~101", @"3123|X:directory\Path\Directory|0{0442-0500}.txt"));
    }

    private static string Adjust(string name, string file)
    {
        Regex nameParse = new Regex(@"\d*~(?<value>\d*)");
        Regex fileParse = new Regex(@"\d*\|(?<drive>[A-Za-z]):(?<path>[^\|]*)\|(?<prefix>[^{]*){(?<code>\d*)");

        Match nameMatch = nameParse.Match(name);
        Match fileMatch = fileParse.Match(file);

        int value = Convert.ToInt32(nameMatch.Groups["value"].Value);

        int code = Convert.ToInt32(fileMatch.Groups["code"].Value);
        code = code + value - 1;

        string drive = fileMatch.Groups["drive"].Value;
        string path = fileMatch.Groups["path"].Value;
        string prefix = fileMatch.Groups["prefix"].Value;

        string result = string.Format(@"{0}:\{1}\{2}{3:0000}.txt",
            drive, 
            path,
            prefix, 
            code);

        return result;
    }

Upvotes: 1

Anon.

Reputation: 59983

You don't seem to be very clear in your examples.

That said,

/.*\\(.*)-[^-]*$/

will capture all text between the last backslash and the last hyphen in whatever it's matched against.
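
In C# the surrounding `/` delimiters are dropped and the pattern goes into a verbatim string. A minimal sketch, with a hypothetical `LastSegmentDemo.Extract` helper that returns the captured text or null when nothing matches:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastSegmentDemo
{
    // Same pattern as above, without the /.../ delimiters that C# does not use.
    private static readonly Regex BetweenLastBackslashAndHyphen =
        new Regex(@".*\\(.*)-[^-]*$");

    // Returns the text between the last backslash and the last hyphen,
    // or null when the input does not contain a backslash followed later by a hyphen.
    public static string Extract(string input)
    {
        Match m = BetweenLastBackslashAndHyphen.Match(input);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Prints: Directory|Pre0{0442
        Console.WriteLine(Extract(@"3123|X:directory\Path\Directory|Pre0{0442-0500}.txt"));
    }
}
```

On the sample line from the question the capture still includes the `|` and the `{`, so it would need further trimming before it could be used as a file name.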

Upvotes: 0
