Reputation: 1499
Context: I want to right-justify part of my prompt. In doing so, my current approach is to compute the length of both the left and right components of it and fill in the middle component with spaces.
Problem: Coping with %G (see prompt expansion) when the string possibly contains Unicode (for example git status). Possibly the actual problem is that I don't understand %G correctly. Its use was suggested in an answer in another thread as the way to signal to zsh that characters will be output, which may be the source of my confusion. The following snippet illustrates the problem:
strlen() {
  FOO=$1
  local invisible='%([BSUbfksu]|([FB]|){*})' # (1)
  LEN=${#${(S%%)FOO//$~invisible/}}
  echo $LEN
}
local blob="%{↓%G%}"
echo $blob $(strlen $blob) # (2) Unexpectedly gives 0
local blob="↓"
echo $blob $(strlen $blob) # (3) Gives the wanted output of 1
# but then this result would tell us to not use %G for unicode
The strlen function comes from this tentative explanation of counting the user-visible length of a string. Unfortunately, it gave no clear, complete explanation of the invisible pattern at # (1); any extra references or explanation on this would also be welcome.
Question: When should I really use %G? Or should I just ditch it, as suggested by the above snippet?
Upvotes: 5
Views: 2764
Reputation: 2688
The following function computes the length of a string in the same way it's done during prompt expansion. It handles all inputs correctly unlike other solutions.
# Usage: prompt-length TEXT [COLUMNS]
#
# If you run `print -P TEXT`, how many characters will be printed
# on the last line?
#
# Or, equivalently, if you set PROMPT=TEXT with prompt_subst
# option unset, on which column will the cursor be?
#
# The second argument specifies terminal width. Defaults to the
# real terminal width.
#
# Assumes that `%{%}` and `%G` don't lie.
#
# Examples:
#
# prompt-length '' => 0
# prompt-length 'abc' => 3
# prompt-length $'abc\nxy' => 2
# prompt-length '❎' => 2
# prompt-length $'\t' => 8
# prompt-length $'\u274E' => 2
# prompt-length '%F{red}abc' => 3
# prompt-length $'%{a\b%Gb%}' => 1
# prompt-length '%D' => 8
# prompt-length '%1(l..ab)' => 2
# prompt-length '%(!.a.)' => 1 if root, 0 if not
function prompt-length() {
  emulate -L zsh
  local COLUMNS=${2:-$COLUMNS}
  local -i x y=${#1} m
  if (( y )); then
    while (( ${${(%):-$1%$y(l.1.0)}[-1]} )); do
      x=y
      (( y *= 2 ))
    done
    while (( y > x + 1 )); do
      (( m = x + (y - x) / 2 ))
      (( ${${(%):-$1%$m(l.x.y)}[-1]} = m ))
    done
  fi
  echo $x
}
This function comes from Powerlevel10k ZSH theme where it's used to implement multi-line right prompt and responsive current directory truncation (demo). More info: Multi-line prompt: The missing ingredient.
Upvotes: 1
Reputation: 18329
You do not have to take any additional steps when using Unicode characters instead of plain ASCII. Current versions of zsh fully support Unicode characters and can handle them correctly. So even if a character is encoded by multiple bytes, zsh will still know that it is only a single character.
%{...%} and %G
%{...%} is used to indicate to zsh that the string inside does not change the cursor position. This is useful, for example, if you want to add escape sequences such as those used for setting colors:
print -P '%{\e[31m%}terminal red%{\e[0m%}'
print -P '%{\e[38;2;0;127;255m%}#007FFF%{\e[0m%}'
Without %{...%}, zsh would have to assume that each character of the escape sequence moves the cursor one position to the right.
Using %G inside %{...%} (or %1{...%}) tells zsh to assume that a single character will be output. This is for counting purposes only; it will not move the cursor on its own.
According to the ZSH Manual:
This is useful when outputting characters that otherwise cannot be correctly handled by the shell, such as the alternate character set on some terminals.
As zsh is able to handle Unicode characters, it is unnecessary there (although not necessarily wrong).
strlen "%{↓%G%}":
The unexpected result of 0 is due to the fact that strlen really only tries to remove any null-length prompt sequences (like %B or %F{red}) instead of actually measuring the printed length of the resulting string (which is probably impossible anyway). In many cases this works well enough, but it fails spectacularly in the case of "%{↓%G%}", which is actually equivalent to "↓" in the context of zsh prompts.
In order to find these null-length prompt sequences, strlen matches its input against this pattern:
invisible='%([BSUbfksu]|([FB]|){*})'
This also contains the sub-pattern %{*}, which will match %{…%}. Then
LEN=${#${(S%%)FOO//$~invisible/}}
just removes any matching substring from FOO before counting the characters.
On top of that, it does not actually handle %G in any way and just removes it together with the surrounding %{...%}.
As the whole string "%{↓%G%}" matches the pattern, it will be completely removed, resulting in the unexpected character count of 0.
BTW: This does not mean that you should not use strlen (I have been using something derived from it for quite some time in my prompt). But you should be aware of some limitations:
- It does not handle %G (obviously).
- It does not handle numeric arguments to %{...%}, like %3{...%}.
- It does not handle numeric arguments to % for foreground and background colors, like %1F (instead of %F{1} or %F{red}).
- It does not handle nested %{...%}, or really any } inside %{...%}. (This is for example important when intending to use %D{string} for date formatting, as the length of the format string string would have to match the length of the resulting date without using %{...%} around it.)
Lastly, there was a bug in the original definition and it should be:
local invisible='%([BSUbfksu]|([FK]|){*})'
The second B should be a K, as it is intended to match the prompt escape for background colors (%B starts boldface mode).
Upvotes: 3