angularconsulting.au

Reputation: 28259

C# Sort files by natural number ordering in the name?

I have files in a directory named like this:

0-0.jpeg
0-1.jpeg
0-5.jpeg
0-9.jpeg
0-10.jpeg
0-12.jpeg

....

When I load the files:

FileInfo[] files = di.GetFiles();

They come back in the wrong order (they should be ordered as above):

0-0.jpeg
0-1.jpeg
0-10.jpeg
0-12.jpeg
0-5.jpeg
0-9.jpeg

How can I fix that?

I tried sorting them, but neither of these worked:

1) Array.Sort(files, (f1, f2) => f1.Name.CompareTo(f2.Name));

2) Array.Sort(files, (x, y) => StringComparer.OrdinalIgnoreCase.Compare(x.Name, y.Name)); 

Upvotes: 15

Views: 27761

Answers (7)

Pierre

Reputation: 9052

Time to get creative.

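// Requires System.Linq and System.Text.RegularExpressions.
// OrderBy returns an IOrderedEnumerable<FileInfo>, so ToArray() is needed to get a FileInfo[].
FileInfo[] files = di.GetFiles().OrderBy(file =>
    Regex.Replace(file.Name, @"\d+", match => match.Value.PadLeft(4, '0'))
).ToArray();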

Using Regex replace in the OrderBy Clause:

Regex.Replace(file.Name, @"\d+", match => match.Value.PadLeft(4, '0'))

So what this does is it pads each match of numeric values in the file name with a 0 character to a length of 4 characters:

0-0.jpeg     ->   0000-0000.jpeg
0-1.jpeg     ->   0000-0001.jpeg
0-5.jpeg     ->   0000-0005.jpeg
0-9.jpeg     ->   0000-0009.jpeg
0-10.jpeg    ->   0000-0010.jpeg
0-12.jpeg    ->   0000-0012.jpeg

This only happens in the OrderBy clause, it does not alter the original file name.
You will end up with the order you are looking for.

If your file names have dates in them, it would still work. Each run of digits is already at least 4 characters long, so the sort key is the same as the file name:

pic_20230124_1528.jpg -> pic_20230124_1528.jpg
pic_20230124_1601.jpg -> pic_20230124_1601.jpg
pic_20230305_0951.jpg -> pic_20230305_0951.jpg

Note that digit runs longer than the pad width are left unpadded, so pick a width at least as large as the longest number you expect in the names.

Upvotes: 31

rittergig

Reputation: 755

Here is my solution. I first parse the file path, then define the ordering rules.

For the following code you need the namespaces System.IO, System.Linq, and System.Text.RegularExpressions.

Regex parseFileNameForNaturalSortingRegex = new Regex(
    @"
        ^
        (?<DirectoryName>.*)
        (?<FullFileName>
            (?<FileNameBasePart>[^/\\]*)
            (?<FileNameNumericPart>\d+)?
            (?>
            \.
            (?<FileNameExtension>[^./\\]+)
            )?
        )
        $
    ",
    RegexOptions.IgnorePatternWhitespace | RegexOptions.RightToLeft
);

// DirectoryInfo.GetFiles() returns FileInfo[] (Directory.GetFiles returns plain path strings).
FileInfo[] filesOrdered = new DirectoryInfo(@"C:\my\source\directory").GetFiles()
    .Select(fi =>
    {
        Match match = parseFileNameForNaturalSortingRegex.Match(fi.FullName);
        return new
        {
            FileInfo = fi,
            DirectoryName = match.Groups["DirectoryName"].Value,
            FullFileName = match.Groups["FullFileName"].Value,
            BasePart = match.Groups["FileNameBasePart"].Value,
            NumericPart = match.Groups["FileNameNumericPart"].Success ? int.Parse(match.Groups["FileNameNumericPart"].Value) : -1,
            HasFileNameExtension = match.Groups["FileNameExtension"].Success,
            FileNameExtension = match.Groups["FileNameExtension"].Value
        };
    })
    .OrderBy(r => r.DirectoryName)
    .ThenBy(r => r.BasePart)
    .ThenBy(r => r.NumericPart)
    .ThenBy(r => r.HasFileNameExtension)
    .ThenBy(r => r.FileNameExtension)
    .Select(r => r.FileInfo)
    .ToArray();

Upvotes: 1

Almir

Reputation: 1186

Implement a Comparison method for your specific case and pass it to Array.Sort.

private int CompareByNumericName(FileInfo firstFile, FileInfo secondFile)
{
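    // Strip the extension and the leading "0-" from both file names
    // (e.g. "0-10.jpeg" -> "10"), then compare the remaining numbers.
    // Assumes every name has the form "<prefix>-<number>.<extension>" with the same prefix.
    string firstNumber = Path.GetFileNameWithoutExtension(firstFile.Name).Split('-')[1];
    string secondNumber = Path.GetFileNameWithoutExtension(secondFile.Name).Split('-')[1];

    int firstFileNumericName = Int32.Parse(firstNumber);
    int secondFileNumericName = Int32.Parse(secondNumber);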

    return firstFileNumericName.CompareTo(secondFileNumericName);
}

 

FileInfo[] files = di.GetFiles();
Array.Sort<FileInfo>(files, CompareByNumericName);

Upvotes: -1

L.B

Reputation: 116098

See the "CustomSort" function here.

List<string> list = new List<string>() { 
                    "0-5.jpeg",
                    "0-9.jpeg",
                    "0-0.jpeg",
                    "0-1.jpeg",
                    "0-10.jpeg",
                    "0-12.jpeg"};
list.CustomSort().ToList().ForEach(x => Console.WriteLine(x));

Its output:

0-0.jpeg
0-1.jpeg
0-5.jpeg
0-9.jpeg
0-10.jpeg
0-12.jpeg
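
The link to the original CustomSort did not survive here, so the snippet above will not compile on its own. As a rough stand-in (not the linked implementation), here is a minimal sketch of an extension method with the same name. It assumes you only care about ASCII digits, and the class name NaturalSortExtensions and the helper names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class NaturalSortExtensions
{
    // Hypothetical stand-in for the linked CustomSort: orders strings so that
    // runs of digits compare by numeric value instead of character by character.
    public static IEnumerable<string> CustomSort(this IEnumerable<string> source)
    {
        return source.OrderBy(s => s, Comparer<string>.Create(CompareNatural));
    }

    // Splits a string into alternating digit and non-digit chunks,
    // e.g. "0-10.jpeg" -> "0", "-", "10", ".jpeg".
    private static string[] Chunk(string s)
    {
        return Regex.Matches(s, "[0-9]+|[^0-9]+").Cast<Match>().Select(m => m.Value).ToArray();
    }

    private static bool IsNumber(string chunk)
    {
        return chunk[0] >= '0' && chunk[0] <= '9';
    }

    private static int CompareNatural(string x, string y)
    {
        string[] a = Chunk(x);
        string[] b = Chunk(y);

        for (int i = 0; i < Math.Min(a.Length, b.Length); i++)
        {
            int result;
            if (IsNumber(a[i]) && IsNumber(b[i]))
            {
                // Compare numbers without parsing them (no overflow on long digit runs):
                // after dropping leading zeros, the longer run is the larger number.
                string na = a[i].TrimStart('0');
                string nb = b[i].TrimStart('0');
                result = na.Length != nb.Length
                    ? na.Length.CompareTo(nb.Length)
                    : string.CompareOrdinal(na, nb);
            }
            else
            {
                result = string.Compare(a[i], b[i], StringComparison.OrdinalIgnoreCase);
            }

            if (result != 0) return result;
        }

        // All shared chunks are equal: the string with fewer chunks comes first.
        return a.Length.CompareTo(b.Length);
    }
}
```

For FileInfo objects you could sort the names this way and map back, or adapt the comparison to take file.Name.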

Upvotes: 7

sma6871

Reputation: 3298

To solve this problem, you can use the StrCmpLogicalW Windows API function.

For more details, see this article.
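
In case the linked article is not available, here is a minimal sketch of how StrCmpLogicalW is usually called through P/Invoke. The function is exported by shlwapi.dll, so this only works on Windows. The class name LogicalFileNameComparer is my own choice, and it assumes the array contains no null entries:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;

// Compares file names the way Windows Explorer orders them.
// Windows only: shlwapi.dll is not available on other platforms.
public sealed class LogicalFileNameComparer : IComparer<FileInfo>
{
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
    private static extern int StrCmpLogicalW(string psz1, string psz2);

    public int Compare(FileInfo x, FileInfo y)
    {
        // Assumes neither argument is null, which holds for arrays returned by GetFiles().
        return StrCmpLogicalW(x.Name, y.Name);
    }
}
```

Usage would then be Array.Sort(files, new LogicalFileNameComparer()); on the array returned by di.GetFiles().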

Upvotes: 7

Nicholas Carey

Reputation: 74177

Your filenames appear to be structured. If you just sort them, they sort as ordinary strings. You need to:

  1. Parse the file name into its constituent component parts.
  2. Convert the numeric segments to a numeric value.
  3. Compare that structure in the desired order to get the intended collation sequence.

Personally, I'd create a class that represented the structure implicit in the filename. Perhaps it should wrap the FileInfo. The constructor for that class should parse the filename into its constituent parts and instantiate the properties of the class appropriately.

The class should implement IComparable/IComparable<T> (or you could create an implementation of Comparer).

Sort your objects and they should then come out in the collating sequence you desire.

It looks like your file names are composed of 3 parts:

  • a high-order numeric value (let's call it 'hi'),
  • a low-order numeric value (let's call it 'lo'),
  • and an extension (let's call it 'ext')

So your class might look something like

public class MyFileInfoWrapper : IComparable<MyFileInfoWrapper>,IComparable
{
  public MyFileInfoWrapper( FileInfo fi )
  {
    // your implementation here
    throw new NotImplementedException() ;
  }

  public int    Hi         { get ; private set ; }
  public int    Lo         { get ; private set ; }
  public string Extension  { get ; private set ; }

  public FileInfo FileInfo { get ; private set ; }

  public int CompareTo( MyFileInfoWrapper other )
  {
    int cc ;
    if      ( other   == null     ) cc = +1 ; // by convention, any instance sorts after null
    else if ( this.Hi <  other.Hi ) cc = -1 ;
    else if ( this.Hi >  other.Hi ) cc = +1 ;
    else if ( this.Lo <  other.Lo ) cc = -1 ;
    else if ( this.Lo >  other.Lo ) cc = +1 ;
    else                            cc = string.Compare( this.Extension , other.Extension , StringComparison.InvariantCultureIgnoreCase ) ;
    return cc ;
  }

  public int CompareTo( object obj )
  {
    int cc ;
    if      ( obj == null              ) cc = +1 ; // by convention, any instance sorts after null
    else if ( obj is MyFileInfoWrapper ) cc = CompareTo( (MyFileInfoWrapper) obj ) ;
    else throw new ArgumentException("'obj' is not a 'MyFileInfoWrapper' type.", "obj") ;
    return cc ;
  }

}
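
The constructor above is left as an exercise. One way the parsing could look, as a sketch only: it assumes every name has the form hi-lo.ext (for example 0-10.jpeg), and the helper class HiLoFileName and its TryParse method are names I made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HiLoFileName
{
    // Assumed layout: "<hi>-<lo>.<ext>", e.g. "0-10.jpeg".
    private static readonly Regex Pattern =
        new Regex(@"^(?<hi>[0-9]+)-(?<lo>[0-9]+)\.(?<ext>[^.]+)$");

    // Returns false when the name does not follow the assumed layout
    // or a numeric part does not fit in an int.
    public static bool TryParse(string fileName, out int hi, out int lo, out string extension)
    {
        hi = 0;
        lo = 0;
        extension = null;

        Match m = Pattern.Match(fileName);
        if (!m.Success) return false;

        if (!int.TryParse(m.Groups["hi"].Value, out hi)) return false;
        if (!int.TryParse(m.Groups["lo"].Value, out lo)) return false;

        extension = m.Groups["ext"].Value;
        return true;
    }
}
```

The wrapper's constructor could call HiLoFileName.TryParse(fi.Name, ...) and throw an ArgumentException when it returns false, then assign Hi, Lo, Extension and FileInfo from the results.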

Upvotes: 0

rein

Reputation: 33445

Alphabetically, the "wrong" order is in fact correct. If you want it sorted numerically then you'll need to either:

  1. convert the filenames to a list of numbers and sort them
  2. name the files in such a way that alphabetic and numeric sorting are the same (0-001.jpeg and 0-030.jpeg)
  3. rely on the file creation time to sort (presuming the files were created in order).

See the answer to Sorting Directory.GetFiles() for an example of #3.

Upvotes: 10
