Reputation: 397

Using batch, how to write unicode into a file?

I want to drag and drop folders or files onto a batch file and have it write the paths of all directories and files, including those in subfolders (recursively), into a text file.

@echo off
REM chcp 1250
REM chcp 65001

if [%1]==[] goto :eof
:loop
  echo %1 >> aText.txt
  for /f "tokens=* delims=" %%a in ('dir %1  /s /b') do (
    echo %%a >> aText.txt
  )
shift
if not [%1]==[] goto loop

aText.txt

@pause

This works fine, except that it doesn't support Unicode filenames. It also doesn't work if I save the .bat file itself as UTF-8 or Unicode. I have looked at this question: "Unicode characters in Windows command line - how?"

But that didn't solve it. My guess is that chcp makes it possible to use Unicode inside the batch file itself, but not in the file the batch file creates. How do I get the Unicode filenames written into the file it creates?

EDIT:

To rephrase my question more precisely: I want the Unicode output to be readable by a browser (mostly Chrome). What I have now is this:

@echo off
chcp 65001

if [%1]==[] goto :eof
:loop
  echo %1 > aText.txt
  for /f "tokens=* delims=" %%a in ('dir %1  /s /b') do (
echo   ^<br^>^<img src='%%a'^> >> aText.txt
REM    echo %%a >> aText.txt
  )
shift
if not [%1]==[] goto loop

aText.txt

@pause

When I open the result in Notepad, it shows the Unicode names correctly (just as MC ND describes in the answer). This gives me:

D:\Downloads\unicodes 
  <br><img src='D:\Downloads\unicodes\sdsdsd.html'> 
  <br><img src='D:\Downloads\unicodes\ŽŽŽŽŽ.png'> 
  <br><img src='D:\Downloads\unicodes\中文.png'> 
  <br><img src='D:\Downloads\unicodes\文言.png'> 
  <br><img src='D:\Downloads\unicodes\日本語.png'> 
  <br><img src='D:\Downloads\unicodes\日本語.txt'> 
  <br><img src='D:\Downloads\unicodes\粵語.png'> 
  <br><img src='D:\Downloads\unicodes\한국어.png'>

However, when I open it in Chrome, I get:

D:\Downloads\unicodes 
  <br><img src='D:\Downloads\unicodes\sdsdsd - Kopie.txt'> 
  <br><img src='D:\Downloads\unicodes\sdsdsd.html'> 
  <br><img src='D:\Downloads\unicodes\Å½Å½Å½Å½Å½.png'> 
  <br><img src='D:\Downloads\unicodes\ä¸æ–‡.png'> 
  <br><img src='D:\Downloads\unicodes\æ–‡è¨€.png'> 
  <br><img src='D:\Downloads\unicodes\æ—¥æœ¬èªž.png'> 
  <br><img src='D:\Downloads\unicodes\æ—¥æœ¬èªž.txt'> 
  <br><img src='D:\Downloads\unicodes\ç²µèªž.png'> 
  <br><img src='D:\Downloads\unicodes\í•œêµì–´.png'>

Obviously, when I rename the .txt file to .html, I just get a bunch of broken images, even for the PNG files.

When I manually open the .txt file in Notepad and re-save it under a different name, without changing the encoding setting (UTF-8), everything works as I want. But I need to get rid of this manual saving step.

With npocmaka's cmd /u suggestion I was getting output with spaces in between each character. Unfortunately, after some aimless experimenting I can no longer reproduce that. Instead, with this:

@echo off
chcp 65001

cmd /u /c for /f "tokens=* delims=" %%a in ('dir %1 /s /b') do ( echo %%a >> aText.txt )

aText.txt

I get

D:\Downloads>(echo D:\Downloads\unicodes\sdsdsd.html   ) 
D:\Downloads\unicodes\sdsdsd.html  

D:\Downloads>(echo D:\Downloads\unicodes\ŽŽŽŽŽ.png   ) 
D:\Downloads\unicodes\ŽŽŽŽŽ.png  

D:\Downloads>(echo D:\Downloads\unicodes\中文.png   ) 
D:\Downloads\unicodes\中文.png  

D:\Downloads>(echo D:\Downloads\unicodes\文言.png   ) 
D:\Downloads\unicodes\文言.png  

D:\Downloads>(echo D:\Downloads\unicodes\日本語.png   ) 
D:\Downloads\unicodes\日本語.png  

D:\Downloads>(echo D:\Downloads\unicodes\日本語.txt   ) 
D:\Downloads\unicodes\日本語.txt  

D:\Downloads>(echo D:\Downloads\unicodes\粵語.png   ) 
D:\Downloads\unicodes\粵語.png  

D:\Downloads>(echo D:\Downloads\unicodes\한국어.png   ) 
D:\Downloads\unicodes\한국어.png

The doubled output lines, despite the echo off, seem odd to me in themselves. At any rate, Notepad shows the Unicode filenames, but Chrome will not even open the .txt file, and after renaming the extension to .html it shows "garbage" like this:

D:\Downloads>(echo D:\Downloads\unicodes\sdsdsd.html ) D:\Downloads\unicodes\sdsdsd.html D:\Downloads>(echo D:\Downloads\unicodes\}}}}}.png ) D:\Downloads\unicodes\}}}}}.png D:\Downloads>(echo D:\Downloads\unicodes\-N‡e.png ) D:\Downloads\unicodes\-N‡e.png D:\Downloads>(echo D:\Downloads\unicodes\‡eŠ.png ) D:\Downloads\unicodes\‡eŠ.png D:\Downloads>(echo D:\Downloads\unicodes\åe,gžŠ.png ) D:\Downloads\unicodes\åe,gžŠ.png D:\Downloads>(echo D:\Downloads\unicodes\åe,gžŠ.txt ) D:\Downloads\unicodes\åe,gžŠ.txt D:\Downloads>(echo D:\Downloads\unicodes\µ|žŠ.png ) D:\Downloads\unicodes\µ|žŠ.png D:\Downloads>(echo D:\Downloads\unicodes\\Õm´Å.png ) D:\Downloads\unicodes\\Õm´Å.png

which is not what I need...

Upvotes: 1

Answers (2)

Chef Pharaoh

Reputation: 2406

I had this problem with certain wmic commands that want to write their output to the file as Unicode characters. Here is how I resolved it:

echo %%a |more>> aText.txt

This also works on WinPE, for those who may be interested.

Upvotes: 2

MC ND

Reputation: 70951

Test setup: a directory with a file containing Unicode characters in its filename (∏∏∏∏.txt).

With code page 850, the dir command shows the correct filename, but redirecting its output to a file just generates an ANSI file that shows ????.txt in both type and Notepad.

With code page 65001, the dir command shows the correct filename, and redirection to a file generates a UTF-8 file, which type displays correctly under code page 65001 and as "garbage" under code page 850. Notepad shows the correct values.

With cmd /u (Unicode), under code page 850 or 65001, the dir command shows the correct information, but redirection generates a Unicode file (UTF-16, two bytes per character). The type command displays "spaces" between characters under any code page. Notepad handles the file without problems.
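The three cases above can be reproduced with a short batch sketch. The output file names (list_cp850.txt and so on) are made up for this example, the exact byte counts depend on the names in the directory, and the script leaves the console on code page 65001 when it finishes.

```batch
@echo off
rem Sketch: list the current directory three ways and compare the files.
rem The output file names below are hypothetical, chosen for this example.

rem 1) Code page 850: names outside the code page may come out as "?"
chcp 850 >nul
dir /b > list_cp850.txt

rem 2) Code page 65001: redirected output is written as UTF-8
chcp 65001 >nul
dir /b > list_utf8.txt

rem 3) cmd /u: output of internal commands such as dir is written as UTF-16
cmd /u /c "dir /b > list_utf16.txt"

rem Show the size of each file; the UTF-16 one should be the largest
for %%f in (list_cp850.txt list_utf8.txt list_utf16.txt) do echo %%f: %%~zf bytes
```

Opening the three files in Notepad, in type, and in a browser should show the differences described above, since each reader makes its own guess about the encoding.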

Solution? There is no simple one. Each program, system, and display understands different things. Determine what the final destination of the information will be, and make sure that all the elements involved, regardless of how the data looks at intermediate stages, allow you to generate the desired output.

To answer your question: to get Unicode characters into the file, npocmaka's comment gives you what you need. Start a new cmd instance with /u as a parameter to obtain a Unicode command line.
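For the browser case in the question's edit, one way to apply the "determine the final output" advice is to have the file declare its own encoding instead of leaving the browser to guess. The following is a minimal sketch, not a tested solution. It assumes that output redirected under code page 65001 is UTF-8 (as described above) and that Chrome honors a meta charset declaration in a local file. The output name aText.html is chosen here for illustration.

```batch
@echo off
rem Sketch only: write a UTF-8 HTML listing that declares its own encoding.
rem Assumption: redirected output under code page 65001 is UTF-8.
chcp 65001 >nul

rem Nothing was dropped onto the batch file
if "%~1"=="" goto :eof

rem Start the file with an encoding declaration for the browser
> aText.html echo ^<meta charset="utf-8"^>

:loop
rem Append one img tag per file or directory found under the dropped item
for /f "delims=" %%a in ('dir "%~1" /s /b') do (
  >> aText.html echo ^<br^>^<img src="%%a"^>
)
shift
if not "%~1"=="" goto loop

rem Open the result in the default browser
start "" aText.html
```

Re-saving the file in Notepad probably helps for a similar reason: older Notepad versions write a byte order mark when saving as UTF-8, which also tells the browser which encoding to use.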

Upvotes: 0
