Reputation: 1
I have been a lurker at stackoverflow.com for many years (great site and users here), but never had the need to ask a question. Now the time has come :-) Let me begin:
OS: x64 Windows 8.0 to Windows 10 (15063.14) (the issue exists since years, but I have never pursued it fully yet, so we can exclude that it is specific to a specific Windows version)
FS: NTFS
Issue: 2 files with the same (long) name in the same directory and I cannot figure out how this is even possible. This happens to me since years whenever I manually upgrade my Email client. The main .EXE file of it (MailClient.exe) is never asking for replacement if copying the new one over to the same directory. Instead they are both placed there, with the exact same long name.
The issue has nothing to do with a specific directory, I can copy around both .EXE files to freshly created directories on the NTFS drive without issues (also getting no "overwrite" question there).
Let me show you:
C:\temp\2>dir
Volume in drive C is SSD 840 Pro
Volume Serial Number is 0C6D-D489
Directory of C:\temp\2
13.04.2017 02:29 <DIR> .
13.04.2017 02:29 <DIR> ..
21.10.2016 17:10 24.742.760 MailClient.exe
27.12.2016 03:26 24.911.872 MailCliеnt.exe
2 File(s) 49.654.632 bytes
2 Dir(s) 78.503.038.976 bytes free
However, if doing a dir /x, this comes up:
C:\temp\2>dir /x
Volume in drive C is SSD 840 Pro
Volume Serial Number is 0C6D-D489
Directory of C:\temp\2
13.04.2017 02:29 <DIR> .
13.04.2017 02:29 <DIR> ..
21.10.2016 17:10 24.742.760 MAILCL~2.EXE MailClient.exe
27.12.2016 03:26 24.911.872 MAILCL~1.EXE MailCliеnt.exe
2 File(s) 49.654.632 bytes
2 Dir(s) 78.503.038.976 bytes free
So they obviously have a different 8.3 name, OK, but the exact same long name. Here is another screenshot of the situation. Both files show the same location within the Windows "properties" dialog (right click) too. Unfortunately I am not allowed to post images just yet (it seems) - just tried. So you will have to take my word.
I cannot figure out how this is possible and this is bugging me ;) As soon as I rename both files for example to 1.exe, Windows starts telling me that there is already a file with that name in the same directory. So it obviously has something to do with the filename, but they are both exactly identical, no extra spaces, nothing, as you can see from the DIR command.
I´ve also tried to rename them and re-wrote the exact wording "MailCient.exe" manually for both, to make sure the characters are EXCACTLY the same, Windows still won´t complain, they both go there once again under the same name. However, renaming them to "Mail.exe" and "Mail.exe" will NOT work, then Windows is saying that another file with that name already exists. However, naming them both back to "MailClient.exe" is just absolutely fine, no complains by Windows with that.
Another fun fact about this, if I dir for mailclient.exe directly, this happens:
C:\temp\2>dir mailclient.exe
Volume in drive C is SSD 840 Pro
Volume Serial Number is 0C6D-D489
Directory of C:\temp\2
21.10.2016 17:10 24.742.760 MailClient.exe
1 File(s) 24.742.760 bytes
0 Dir(s) 78.501.998.592 bytes free
However, if looking for *.exe, this happens:
C:\temp\2>dir *.exe
Volume in drive C is SSD 840 Pro
Volume Serial Number is 0C6D-D489
Directory of C:\temp\2
21.10.2016 17:10 24.742.760 MailClient.exe
27.12.2016 03:26 24.911.872 MailCliеnt.exe
2 File(s) 49.654.632 bytes
0 Dir(s) 78.501.990.400 bytes free
This yields also interesting results:
C:\temp\2>ren mailclient.exe *.bak
C:\temp\2>dir
Volume in drive C is SSD 840 Pro
Volume Serial Number is 0C6D-D489
Directory of C:\temp\2
13.04.2017 02:50 <DIR> .
13.04.2017 02:50 <DIR> ..
21.10.2016 17:10 24.742.760 MailClient.bak
27.12.2016 03:26 24.911.872 MailCliеnt.exe
2 File(s) 49.654.632 bytes
2 Dir(s) 78.501.990.400 bytes free
And back:
C:\temp\2>ren mailclient.bak MailClient.exe
C:\temp\2>dir
Volume in drive C is SSD 840 Pro
Volume Serial Number is 0C6D-D489
Directory of C:\temp\2
13.04.2017 02:51 <DIR> .
13.04.2017 02:51 <DIR> ..
21.10.2016 17:10 24.742.760 MailClient.exe
27.12.2016 03:26 24.911.872 MailCliеnt.exe
2 File(s) 49.654.632 bytes
2 Dir(s) 78.501.982.208 bytes free
I´ve also checked permissions on the files and took ownership, it changes nothing. Additionally I´ve cleared the NTFS Journal and even the transaction log + run chkdsk, which reveals no errors either.
Any ideas on this mysterious situation? What am I missing?
Thanks so much:)
UPDATE #1:
I´ve just tried this: going to Windows explorer and renaming both files after each other by truncating their names. So I first renamed the first "MailClient.exe" to "MailClien.exe", then the seconds "MailClient.exe" to "MailClien.exe". Again, no message by Windows that they have the same name, it just renamed both fine. I then continued to "MailClie.exe". Worked. However, as soon as I tried to renamed both to "MailCli.exe", Windows complained and told me that there is already another file with that name. Trying to rename both back from there to "MailClient.exe" also does not work, just for one of them, because then Windows says (and right so too) that a file with that name already exists. So it seems to come down to the "e" possibly having another ANSI-character in both filenames? I, however, wouldn´t know of another one for "e", or am I missing something?
Upvotes: 0
Views: 2090
Reputation: 30113
Harry Johnston is right: one of the filenames contains a Unicode character that just looks the same as an ANSI character.
Read Naming Files, Paths, and Namespaces:
On newer file systems, such as NTFS, exFAT, UDFS, and FAT32, Windows stores the long file names on disk in Unicode, which means that the original long file name is always preserved. This is true even if a long file name contains extended characters, regardless of the code page that is active during a disk read or write operation.
Use the following PowerShell script 43381802b.ps1
to detect and show non-ANSI file names (see different calls below):
param( [string[]]$Path = '.',
[switch]$Cpp, ### list any non-ANSI character in file names like a C++ literal
### i.e. a prefix \u followed by a four digit Unicode code point
[switch]$All ### list all files including pure ANSI-encoded file names
)
Set-StrictMode -Version latest
$strArr = Get-ChildItem -path $Path
$arrDiff = @()
for ($i=0; $i -lt $strArr.Count; $i++) {
$strDiff = 'ANSI'
$strName = ''
$auxName = $strArr[$i].Name
for ( $k=0; $k -lt $auxName.Length; $k++ ) {
if ( [int][char]$auxName[$k] -gt 255 ) {
$strDiff = 'UCS2'
$strName += '\u{0:X4}' -f [int][char]$auxName[$k]
} else {
$strName += $auxName[$k]
}
}
if ( $All.IsPresent -or $strDiff -eq 'UCS2' ) {
$strArr[$i] | Add-Member NoteProperty Code $strDiff
$strArr[$i] | Add-Member NoteProperty CppName $strName
$arrDiff += $strArr[$i]
}
}
if ( $Cpp.IsPresent ) {
$arrDiff | Select-Object -Property Code, Mode, LastWriteTime, Length, CppName | ft
} else {
$arrDiff | Select-Object -Property Code, Mode, LastWriteTime, Length, Name | ft
}
Output:
PS D:\PShell> .\SO\43381802b.ps1 'C:\testC\43381802'
Code Mode LastWriteTime Length Name
---- ---- ------------- ------ ----
UCS2 -a---- 02/05/2017 11:47:53 317 MailCliеnt.txt
UCS2 -a---- 02/05/2017 11:49:04 317 МailClient.txt
UCS2 -a---- 02/05/2017 11:50:16 399 МailCliеnt.txt
PS D:\PShell> .\SO\43381802b.ps1 'C:\testC\43381802' -Cpp
Code Mode LastWriteTime Length CppName
---- ---- ------------- ------ -------
UCS2 -a---- 02/05/2017 11:47:53 317 MailCli\u0435nt.txt
UCS2 -a---- 02/05/2017 11:49:04 317 \u041CailClient.txt
UCS2 -a---- 02/05/2017 11:50:16 399 \u041CailCli\u0435nt.txt
PS D:\PShell> .\SO\43381802b.ps1 'C:\testC\43381802' -Cpp -All
Code Mode LastWriteTime Length CppName
---- ---- ------------- ------ -------
ANSI -a---- 02/05/2017 11:44:05 235 MailClient.txt
UCS2 -a---- 02/05/2017 11:47:53 317 MailCli\u0435nt.txt
UCS2 -a---- 02/05/2017 11:49:04 317 \u041CailClient.txt
UCS2 -a---- 02/05/2017 11:50:16 399 \u041CailCli\u0435nt.txt
Use the following 43381802a.ps1
script to get more info about non-ANSI characters (see the first call bellow) and their position in file names (see the latter call bellow with -Detail
switch):
param( [string[]] $strArr = @('ΗGreek', 'НCyril', 'HLatin'),
[switch]$Detail )
Set-StrictMode -Version latest
$auxArr = @()
if ( ( Get-Command -Name Get-CharInfo -ErrorAction SilentlyContinue ) -and
( -not $Detail.IsPresent ) ) {
$auxArr = $strArr | Get-CharInfo |
Where-Object { [int]$_.Codepoint.Replace('U+', '0x') -ge 128 }
} else {
foreach ($strStr in $strArr) {
for ($i = 0; $i -lt $strStr.Length; $i++ ) {
if ( [int][char]$strStr[$i] -ge 128 ) {
$auxArr += [PSCustomObject] @{
Char = $strStr[$i]
CodePoint = 'U+{0:x4}' -f [int][char]$strStr[$i]
Category = $i + 1 ### 1-based index
Description = $strStr ### string itself
}
}
}
}
}
$auxArr
Output:
PS D:\PShell> .\SO\43381802a.ps1 ( Get-childitem -path 'C:\testC\43381802' ).Name
Char CodePoint Category Description
---- --------- -------- -----------
е U+0435 LowercaseLetter Cyrillic Small Letter Ie
М U+041C UppercaseLetter Cyrillic Capital Letter Em
М U+041C UppercaseLetter Cyrillic Capital Letter Em
е U+0435 LowercaseLetter Cyrillic Small Letter Ie
PS D:\PShell> .\SO\43381802a.ps1 ( Get-childitem -path 'C:\testC\43381802' ).Name -detail
Char CodePoint Category Description
---- --------- -------- -----------
е U+0435 8 MailCliеnt.txt
М U+041c 1 МailClient.txt
М U+041c 1 МailCliеnt.txt
е U+0435 8 МailCliеnt.txt
Tested on files:
==> dir /-C /X /A-D C:\testC\43381802\
Volume in drive C has no label.
Volume Serial Number is …
Directory of C:\testC\43381802
02/05/2017 11:44 235 MAILCL~1.TXT MailClient.txt
02/05/2017 11:47 317 MAILCL~2.TXT MailCliеnt.txt
02/05/2017 11:49 317 AILCLI~1.TXT МailClient.txt
02/05/2017 11:50 399 AILCLI~2.TXT МailCliеnt.txt
4 File(s) 1268 bytes
0 Dir(s) 69914857472 bytes free
==>
Upvotes: 1