zexed640

Reputation: 167

How to print a UTF-8 string in C

I have this function:

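#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

void read_request(int fd) {
        size_t size = 50, pos = 0;
        ssize_t b;
        char *buffer = malloc(size);

        if (buffer == NULL) {
                perror("malloc() error");
                exit(EXIT_FAILURE);
        }

        /* Read until the blank line ("\r\n\r\n") that ends the HTTP headers. */
        while (pos < 4 || memcmp(buffer + pos - 4, "\r\n\r\n", 4) != 0) {
                if (pos == size) {
                        char *grown = realloc(buffer, size * 2);
                        if (grown == NULL) {
                                perror("realloc() error");
                                exit(EXIT_FAILURE);
                        }
                        buffer = grown;
                        size *= 2;
                }

                if ((b = read(fd, buffer + pos, size - pos)) == -1) {
                        perror("read() error");
                        exit(EXIT_FAILURE);
                }
                if (b == 0)     /* peer closed the connection early */
                        break;

                pos += (size_t)b;
        }

        fwrite(buffer, 1, pos, stdout);
        free(buffer);
}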

It reads the HTTP headers from a browser request and prints them. It works fine until I put Cyrillic characters into the URL, for example: http://127.0.0.1/тест. All ASCII characters are printed as usual, but тест is printed as hex values:

GET /%D1%82%D0%B5%D1%81%D1%82 HTTP/1.1
Host: 127.0.0.1
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive
Upgrade-Insecure-Requests: 1

How can I print it as normal, readable text?

Upvotes: 0

Views: 449

Answers (1)

Neal Burns

Reputation: 849

%D1%82%D0%B5%D1%81%D1%82 is URL-encoded (percent-encoded). The web browser URL-encodes the Unicode string, as the bytes of its UTF-8 form, before sending it to the server.
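
As a rough sketch of undoing that encoding in C, the helper below turns each %XX escape back into the byte it stands for. The name url_decode is hypothetical, not something from the question or from a standard library. It assumes the destination buffer is at least as large as the source string (including the terminator) and copies malformed escapes through unchanged.

```c
#include <ctype.h>
#include <stddef.h>

/* Convert one hex digit to its value; the caller has already checked isxdigit(). */
static int hex_value(unsigned char c) {
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/*
 * Percent-decode src into dst (hypothetical helper, not from the question).
 * dst must have room for strlen(src) + 1 bytes; decoding never grows the text.
 * Malformed escapes such as "%zz" are copied through unchanged.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t url_decode(const char *src, char *dst) {
    size_t n = 0;

    while (*src != '\0') {
        if (src[0] == '%' && isxdigit((unsigned char)src[1])
                          && isxdigit((unsigned char)src[2])) {
            dst[n++] = (char)(hex_value((unsigned char)src[1]) * 16
                            + hex_value((unsigned char)src[2]));
            src += 3;
        } else {
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Calling it on the request path from the question, /%D1%82%D0%B5%D1%81%D1%82, yields 9 bytes: the slash followed by the UTF-8 encoding of тест. Those bytes can be passed to fwrite() just like the original buffer. Since %00 decodes to a NUL byte, it is safer to rely on the returned length than on strlen() when the input is untrusted.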

I'm not sure what you'll see once you URL-decode it. Whether the result displays correctly might depend on whether your terminal handles Unicode.
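
One way to get a hint about the terminal side on a POSIX system is to ask the C library which character encoding the current locale uses. This is only a sketch: locale_is_utf8 is a made-up name, the exact spelling "UTF-8" is an assumption that holds on glibc but may differ on other platforms, and the locale reflects environment settings such as LANG, not what the terminal emulator actually renders.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/*
 * Returns 1 if the locale selected by the environment reports UTF-8 as its
 * character encoding, 0 otherwise. Hypothetical helper for illustration.
 */
int locale_is_utf8(void) {
    /* Adopt the character-type locale from the environment (LANG, LC_ALL, ...). */
    setlocale(LC_CTYPE, "");

    /* Assumption: the platform spells the codeset exactly "UTF-8". */
    return strcmp(nl_langinfo(CODESET), "UTF-8") == 0;
}
```

If this returns 0, decoded Cyrillic bytes will probably not display as intended, and printing the percent-encoded form may be the more readable fallback.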

Upvotes: 2
