Reputation: 11
I have a macro which reads file names from a folder. The problem is that when file names are in series like A1,A2.....A200.pdf, as in this image:
then it reads in Excel as A1,A10,A100,A101.....A109,A11,A110.....A119,A20, as in this image:
How can I sort this so that the value in Excel comes as same as folder file names, or is there a way I can sort in Excel itself?
Upvotes: 1
Views: 1909
Reputation: 14373
The best way is to change the file names in your Excel list to have leading zeroes. Instead of A19, refer to the file as A019 and it will sort correctly. Convert the file names using this formula in a helper column.
=Left($A2, 1) & Right("000" & Mid($A2, 2, 3), 3)
Note that the 3 zeroes and string lengths of 3 are all related to each other. To create fixed length numbers of 4 digits, just use 4 zeroes and increase both string lengths to 4.
Copy the formula down from row 2 to the end. Copy the helper column, paste Values
in place and, when everything is perfect, replace the original column with the helper.
In order to accommodate a fixed number of digits following the number the above formula may be tweaked. The formula below will accommodate 4 extra characters which follow the number, for example ".pdf" (including the period).
=Left($A2, 1) & Right("000" & Mid($A2, 2, 7), 7)
Upvotes: 0
Reputation: 494
This is happening ofcourse of because different sorting algorithm in both these cases (Windows Explorer and Excel) Refer to this article if you want to understand.
To solve your problem, one of the ways is to pull out only the numeric part of file names in a different cell (say column B) and then sort based on those numbers.
If I can assume that the pattern of the files names is AXXX.pdf i.e. one letter A, then number, and 4 characters for file extension. You can use this function
=VALUE(MID(A1,2,LEN(A1)-5))
This works by pulling out some number of characters from in between the string. As per assumption, the number starts from 2nd place that's why the second parameter is 2. Then to decide, how many characters you pull, you know that all the characters except 'A' (1 char) and '.pdf' (4 chars) make the number. So, take the lenght of the whole name and reduce 5 characters. You get your number part which you can sort.
Upvotes: 1
Reputation: 33672
Another option, you can use the RegEx
object to extract the Numeric digits "captured" inside the file name.
Option Explicit
Sub SortFileNames()
Dim i As Long
With Sheets("Sheet1") ' replaces "Sheet1| with your sheet's name
For i = 1 To .Cells(.Rows.Count, "A").End(xlUp).Row
.Range("B" & i).Value = RetractNumberwithRegex(.Range("A" & i)) ' call the Regex function
Next i
End With
End Sub
'========================================================================
Function RetractNumberwithRegex(Rng As Range) As String
' function uses the Regex object to substract the Numeric values inside
Dim Reg1 As Object
Dim Matches As Object
Set Reg1 = CreateObject("vbscript.regexp")
With Reg1
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "[0-9]{1,20}" ' any size numeric string (up to 20 digits length)
End With
Set Matches = Reg1.Execute(Rng.Value2)
If Matches.Count <> 0 Then
RetractNumberwithRegex = Matches.Item(0)
End If
End Function
Upvotes: 1
Reputation: 81
You can sort this in Excel with a helper column. Create a new column and calculate the length of your filenames in that "=LEN(A1)". Then use two-level sort to sort your filenames. Data -> Sort: Use length in the first level and the filenames in the second level.
Upvotes: 2