Reputation: 183
Are there any libraries or packages in Go that emulate what vis(3) and unvis(3) do on BSD systems? I'm trying to do something that requires a representation of strings that contain special characters such as whitespace.
Upvotes: 0
Views: 665
No, not exactly. But if you are looking for URL encoding, you can do all of it with the net/url package:
see: Encode / decode URLs
and: Is there any example and usage of url.QueryEscape ? for golang
sample code:
fmt.Println(url.QueryEscape("https://stackoverflow.com/questions/tagged/go test\r \r\n"))
output:
https%3A%2F%2Fstackoverflow.com%2Fquestions%2Ftagged%2Fgo+test%0D+%0D%0A
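To get the original string back (the unvis direction), url.QueryUnescape reverses url.QueryEscape. A minimal sketch; the roundTrip helper is a made-up name for illustration only:

```go
package main

import (
	"fmt"
	"net/url"
)

// roundTrip (hypothetical helper) escapes s with url.QueryEscape and
// decodes the result again with url.QueryUnescape.
func roundTrip(s string) (encoded, decoded string, err error) {
	encoded = url.QueryEscape(s)
	decoded, err = url.QueryUnescape(encoded)
	return encoded, decoded, err
}

func main() {
	enc, dec, err := roundTrip("go test\r \r\n")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(enc)        // go+test%0D+%0D%0A
	fmt.Printf("%q\n", dec) // "go test\r \r\n"
}
```

Note that this encoding is meant for URL query strings, so a space becomes "+" rather than a backslash escape as vis(3) would produce.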
Or write your own.
In Go, a string is in effect a read-only slice of bytes, and string literals are UTF-8 encoded.
You can get the bytes like this:
str := "UTF-8"
bytes := []byte(str) // string to slice
fmt.Println(str, bytes) // UTF-8 [85 84 70 45 56]
or convert bytes to string like this:
s := string([]byte{85, 84, 70, 45, 56, 32, 0xc2, 0xb5}) // slice to string
fmt.Println(s) // UTF-8 µ
0xC2 0xB5 is UTF-8 (hex) for Character 'MICRO SIGN' (U+00B5) see: http://www.fileformat.info/info/unicode/char/00b5/index.htm
also you may get bytes like this:
for i := 0; i < len(s); i++ {
	fmt.Printf("%d: %d, ", i, s[i])
}
// 0: 85, 1: 84, 2: 70, 3: 45, 4: 56, 5: 32, 6: 194, 7: 181,
or in compact Hex format:
fmt.Printf("% x\n", s) // 55 54 46 2d 38 20 c2 b5
and get runes (Unicode codepoints) like this:
for i, v := range s {
	fmt.Printf("%d: %v, ", i, v)
}
// 0: 85, 1: 84, 2: 70, 3: 45, 4: 56, 5: 32, 6: 181,
see: What is a rune?
and convert rune to string:
r := rune(181)
fmt.Printf("%#U\n", r) // U+00B5 'µ'
st := "this is UTF-8: " + string(r)
fmt.Println(st) // this is UTF-8: µ
convert slice of runes to string:
rs := []rune{181, 181, 181, 181}
sr := string(rs)
fmt.Println(sr) // µµµµ
convert string to slice of runes:
br := []rune(sr)
fmt.Println(br) //[181 181 181 181]
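Because bytes and runes differ for non-ASCII text, it can help to check both counts before writing your own encoder. A small sketch using the unicode/utf8 package; the counts helper is a made-up name for illustration:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts (hypothetical helper) returns the byte length and the rune
// count of s.
func counts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	b, r := counts("UTF-8 µ")
	fmt.Println(b, r) // 8 7

	// A lone 0xC2 byte is an incomplete UTF-8 sequence.
	fmt.Println(utf8.ValidString("\xc2")) // false
}
```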
The %q (quoted) verb escapes non-printable byte sequences in a string so the output is unambiguous; with the + flag (%+q) it also escapes non-ASCII characters:
fmt.Printf("%+q \n", "Hello, 世界") // "Hello, \u4e16\u754c"
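To see the difference between the two forms side by side, here is a small sketch; quoteForms is a made-up helper name, and the input string is only an illustration:

```go
package main

import "fmt"

// quoteForms (hypothetical helper) formats s with %q and with %+q.
func quoteForms(s string) (plain, ascii string) {
	return fmt.Sprintf("%q", s), fmt.Sprintf("%+q", s)
}

func main() {
	plain, ascii := quoteForms("tab\there, 世界\x00")
	fmt.Println(plain) // "tab\there, 世界\x00"
	fmt.Println(ascii) // "tab\there, \u4e16\u754c\x00"
}
```

Both forms escape the tab and the NUL byte; only %+q also escapes the printable non-ASCII runes.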
unicode.IsSpace reports whether the rune is a space character as defined by Unicode's White Space property; in the Latin-1 space this is '\t', '\n', '\v', '\f', '\r', ' ', U+0085 (NEL), U+00A0 (NBSP). Sample code:
package main

import (
	"bytes"
	"fmt"
	"unicode"
)

func main() {
	var buf bytes.Buffer
	s := "\u4e16\u754c \u0020\r\n 世界"
	for _, r := range s {
		if unicode.IsSpace(r) {
			// Write whitespace as a \uXXXX escape.
			fmt.Fprintf(&buf, "\\u%04x", r)
		} else {
			// Copy every other rune unchanged.
			buf.WriteRune(r)
		}
	}
	st := buf.String()
	fmt.Println(st)
}
output:
世界\u0020\u0020\u000d\u000a\u0020世界
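The loop above only encodes. For a vis/unvis-style pair the encoding also has to be reversible, which means the backslash itself must be escaped. A rough sketch under that assumption; encode and decode are made-up names, and this is not the actual vis(3) format:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// encode (hypothetical) writes whitespace as \uXXXX and doubles backslashes.
// Input is assumed to be valid UTF-8.
func encode(s string) string {
	var b strings.Builder
	for _, r := range s {
		switch {
		case r == '\\':
			b.WriteString(`\\`)
		case unicode.IsSpace(r):
			fmt.Fprintf(&b, "\\u%04x", r)
		default:
			b.WriteRune(r)
		}
	}
	return b.String()
}

// decode (hypothetical) reverses encode and rejects unknown escapes.
func decode(s string) (string, error) {
	var b strings.Builder
	for i := 0; i < len(s); {
		if s[i] != '\\' {
			b.WriteByte(s[i]) // plain byte, including UTF-8 continuation bytes
			i++
			continue
		}
		switch {
		case i+1 < len(s) && s[i+1] == '\\':
			b.WriteByte('\\')
			i += 2
		case i+5 < len(s) && s[i+1] == 'u':
			n, err := strconv.ParseUint(s[i+2:i+6], 16, 32)
			if err != nil {
				return "", err
			}
			b.WriteRune(rune(n))
			i += 6
		default:
			return "", fmt.Errorf("invalid escape at byte %d", i)
		}
	}
	return b.String(), nil
}

func main() {
	in := "a b\\u0020\t世界"
	enc := encode(in)
	dec, err := decode(enc)
	fmt.Println(enc)            // a\u0020b\\u0020\u0009世界
	fmt.Println(dec == in, err) // true <nil>
}
```

Doubling the backslash is what keeps a literal "\u0020" in the input from being confused with an escaped space on the way back.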
You can find more functions in the unicode/utf8, unicode, strconv and strings packages:
https://golang.org/pkg/unicode/utf8/
https://golang.org/pkg/unicode/
https://golang.org/pkg/strings/
https://golang.org/pkg/strconv/
https://blog.golang.org/strings
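Probably the closest thing in the standard library to a vis/unvis pair is strconv.Quote (or strconv.QuoteToASCII) together with strconv.Unquote. A minimal sketch; quoteRoundTrip is a made-up helper name:

```go
package main

import (
	"fmt"
	"strconv"
)

// quoteRoundTrip (hypothetical helper) quotes s as an ASCII-only Go string
// literal and reports whether strconv.Unquote recovers the original.
func quoteRoundTrip(s string) (quoted string, ok bool) {
	quoted = strconv.QuoteToASCII(s)
	back, err := strconv.Unquote(quoted)
	return quoted, err == nil && back == s
}

func main() {
	in := "tab\t bell\a nul\x00 世界"
	fmt.Println(strconv.Quote(in)) // "tab\t bell\a nul\x00 世界"

	quoted, ok := quoteRoundTrip(in)
	fmt.Println(quoted) // "tab\t bell\a nul\x00 \u4e16\u754c"
	fmt.Println(ok)     // true
}
```

Keep in mind that these functions leave the plain space character as-is, so if spaces must be escaped too you still need a custom loop like the unicode.IsSpace one.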
Upvotes: 1