Reputation: 3245
I don't get the point of having both encoding and fileencoding in Vim.
To my knowledge, a file is just an array of bytes. When we create a text file, we create an array of characters (or symbols), encode this character array with encoding X into an array of bytes, and save the byte array to disk. When a text editor reads the file, it decodes the byte array with encoding X to reconstruct the original character array, and displays each character with a glyph according to the font. Only one encoding is involved in this process.
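To make that mental model concrete, here is a minimal Python sketch of the single-encoding round trip. It has nothing to do with Vim itself; the sample text and the choice of UTF-8 as "encoding X" are assumptions made only for illustration.

```python
# Sketch of the single-encoding model: characters -> bytes on disk -> characters.
text = "na\u00efve caf\u00e9"   # the "character array" (10 characters)
ENCODING = "utf-8"              # "encoding X", chosen only for illustration

stored = text.encode(ENCODING)       # the byte array that would be saved to disk
restored = stored.decode(ENCODING)   # what an editor reconstructs when reading

print(len(text), len(stored))   # character count vs. byte count
print(restored == text)         # the round trip is lossless
```

The two non-ASCII characters take two bytes each in UTF-8, so the byte array is longer than the character array even though the text is the same.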
From the question "In VIM set encoding and fileencoding utf-8", which refers to the Vim wiki about working with Unicode:
encoding sets how vim shall represent characters internally. Utf-8 is necessary for most flavors of Unicode.
fileencoding sets the encoding for a particular file (local to buffer)
"How vim shall represent characters internally" vs "encoding for a particular file"... does this resemble Unicode vs UTF-8? If so, why should a user bother with the former?
Any hint?
Upvotes: 4
Views: 313
Reputation: 522522
I'll preface this by saying that I'm not a vim expert by any means.
I think the flaw in your thinking is here:
When a text editor reads the file, it decodes the byte array with encoding X to reconstruct the original character array, and displays each character with a glyph according to the font.
The thing is, vim is not responsible for rendering the glyph here. vim reads bytes from a file, stores them internally and sends bytes to the terminal which renders the glyph using a font. vim itself never touches fonts and hence never really needs to understand "characters". It only needs to work with bytes internally which it moves back and forth between files, internal buffers and the terminal.
Hence, there are three possible different byte storages involved:
fileencoding
encoding
termencoding
vim will convert between those as necessary. It could read from a Shift-JIS encoded file, store the data internally as UTF-16 and send/receive I/O to/from the terminal in UTF-8. I am not sure why you'd want to change the internal byte handling of vim (again, not an expert), but in any case, you can alter that setting if you want to.
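As a rough illustration of that three-way conversion, here is a Python sketch that mirrors the example above. The constants are hypothetical stand-ins for the three options, not anything Vim exposes, and the sketch makes no claim about how Vim implements the conversion internally.

```python
# Hypothetical stand-ins for the three byte storages; Vim is not involved here.
FILE_ENCODING = "shift_jis"      # plays the role of fileencoding
INTERNAL_ENCODING = "utf-16-le"  # plays the role of encoding
TERM_ENCODING = "utf-8"          # plays the role of termencoding


def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode a byte string from one encoding into another."""
    return data.decode(source).encode(target)


file_bytes = "日本語".encode(FILE_ENCODING)  # bytes as they sit on disk

# Reading: file encoding -> internal encoding
buffer_bytes = convert(file_bytes, FILE_ENCODING, INTERNAL_ENCODING)
# Displaying: internal encoding -> terminal encoding
term_bytes = convert(buffer_bytes, INTERNAL_ENCODING, TERM_ENCODING)
# Writing: internal encoding -> file encoding
saved_bytes = convert(buffer_bytes, INTERNAL_ENCODING, FILE_ENCODING)

print(file_bytes != buffer_bytes)  # different bytes, same text
print(saved_bytes == file_bytes)   # writing restores the original file bytes
```

The same text exists as three different byte sequences, and each boundary (file, buffer, terminal) needs its own conversion step.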
Hypothesising follows: If you set encoding to a Unicode encoding, you can safely handle any character you may encounter. However, in some circumstances those Unicode encodings may be too large to fit comfortably into memory on very limited systems, so in that case you may want to use a more specialised encoding if you know what you're doing.
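To put rough numbers on the memory question, this Python sketch compares how many bytes the same text occupies in a few encodings. The sample string and the set of encodings are assumptions picked for illustration; it says nothing about Vim's actual memory use.

```python
# Byte size of the same ASCII-only text in several encodings (illustrative only).
sample = "hello world"  # 11 characters, all ASCII

sizes = {
    name: len(sample.encode(name))
    for name in ("latin-1", "utf-8", "utf-16-le", "utf-32-le")
}

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

For ASCII-only text, UTF-8 is no larger than a single-byte encoding, while the fixed-width Unicode encodings double or quadruple the size, so any memory concern depends heavily on which Unicode encoding is meant.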
Upvotes: 5
Reputation: 172718
You're right; most programs have a fixed internal encoding (speaking of C datatypes, that's either char, which then mostly uses the underlying locale and may not be able to represent all characters, or holds UTF-8; or wchar_t (wide characters), which can represent the Unicode range). The choice is mainly driven by the programming language and the available APIs (as having to convert back and forth is tedious and inefficient).
Vim, because it supports a large variety of platforms (starting with the old Amiga, where development started) and is geared towards programmers and highly advanced users, allows you to configure the internal representation.
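- When reading a file, Vim detects its encoding via 'fileencodings', or you explicitly specify it.
- The internal representation is set by 'encoding'. With utf-8, you're on the safe side.
- Input from and output to the terminal use 'termencoding'.

As you can see, though it can be confusing to the beginner, you actually have all the power available to you!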
Upvotes: 6