Reputation: 717
I want to print some Unicode characters, from u'\u1000' up to u'\u1099'. This doesn't work:
for i in range(1000,1100):
    s=unicode('u'+str(i))
    print i,s
Upvotes: 16
Views: 26059
Reputation: 99
I didn't see an answer above that uses str.isprintable(). There is also no direct answer to the question in the title, "How can I print all unicode characters", so I'll give it a try.
First, some Unicode characters are non-printable, so you can't print all of them "as is"; for those you can only print some representation of the character. Here is a short snippet that prints every printable character in the Basic Multilingual Plane (U+0000 to U+FFFF) and, for the non-printable ones, their "\u...." representation:
for i in range(0x0000, 1 + 0xffff, 1):
    if str.isprintable(chr(i)):
        print(chr(i))
    else:
        print("Non-printable character: '\\u" + format(i, '04x') + "'")
Note: You can remove the prefix "Non-printable character:" above if you don't need it. The code was tested with Python 3.8.7.
Second, what counts as "printable" also depends on your output device: a console may support UTF-8, only ASCII, or only the characters of a particular code page. You may therefore need to encode and decode your characters so that the output device can handle them, or, if applicable, change the encoding of the output device itself (see sys.stdout.encoding, codecs.getwriter, PYTHONIOENCODING, etc.). But that is a different topic.
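As a rough illustration of the encode-and-decode step mentioned above, here is a minimal sketch. The helper name to_printable is hypothetical, and falling back to ASCII when the stream reports no encoding is an assumption, not something the standard library does for you.

```python
import sys


def to_printable(text, encoding=None):
    """Return text with characters the target encoding cannot represent
    replaced by backslash escapes (hypothetical helper)."""
    # Assumption: use the encoding reported by stdout, else fall back to ASCII.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # errors="backslashreplace" turns unencodable characters into \x.., \u....
    # or \U........ escapes instead of raising UnicodeEncodeError.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)
```

With this, print(to_printable(chr(0x1000))) should not raise on an ASCII-only console; it prints the escape sequence instead of the character.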
Upvotes: 3
Reputation: 13120
One might appreciate this php-cli version:
It uses HTML entities and UTF-8 decoding.
Recent versions of xterm and other terminals support Unicode characters pretty nicely :)
php -r 'for ($x = 0; $x < 255000; $x++) {echo html_entity_decode("&#".$x.";",ENT_NOQUOTES,"UTF-8");}'
Upvotes: -2
Reputation: 43
I stumbled across this rather old post and played a bit ...
Here you find the Unicode blocks:
https://en.wikipedia.org/wiki/Unicode_block
And here is a script that prints some of the blocks:
#!/usr/bin/env python3
ranges = list()
# Just some example ranges ...
# Plane 0 0000–ffff - Basic Multilingual Plane
ranges.append((0x0000, 0x001f, 'ASCII (Controls)'))
ranges.append((0x0020, 0x007f, 'ASCII'))
ranges.append((0x0100, 0x017f, 'Latin Extended-A'))
ranges.append((0x0180, 0x024f, 'Latin Extended-B'))
ranges.append((0x0250, 0x02af, 'IPA Extensions'))
ranges.append((0x0370, 0x03FF, 'Greek'))
ranges.append((0x4e00, 0x9fff, 'CJK Unified Ideographs'))
# Plane 1 10000–1ffff - Supplementary Multilingual Plane
ranges.append((0x1f600, 0x1f64f, 'Emoticons'))
ranges.append((0x17000, 0x187ff, 'Tangut'))
for r in ranges:
    # print the header of each range
    print(f'{r[0]:x} - {r[1]:x} {r[2]}')
    j = 1
    # r[1] is the last code point of the block, so include it
    for i in range(r[0], r[1] + 1):
        # break the line roughly every 80 characters
        if j % 80 == 0:
            print('')
        j += 1
        print(chr(i), end='')
    print('\n')
Upvotes: 3
Reputation: 1143
(Python 3) The following will give you the characters corresponding to an arbitrary unicode range
start_code, stop_code = '4E00', '9FFF' # (CJK Unified Ideographs)
start_idx, stop_idx = [int(code, 16) for code in (start_code, stop_code)] # from hexadecimal to unicode code point
characters = []
for unicode_idx in range(start_idx, stop_idx+1):
    characters.append(chr(unicode_idx))
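Not every code point in an arbitrary range has a character assigned to it. A possible variant, sketched here with a hypothetical helper name (assigned_chars), skips unassigned code points and surrogates using unicodedata.category. Which code points count as assigned depends on the Unicode version bundled with your Python build.

```python
import unicodedata


def assigned_chars(first, last):
    """Return the characters from code point first to last (inclusive),
    skipping unassigned code points (Cn) and surrogates (Cs)."""
    return [
        chr(cp)
        for cp in range(first, last + 1)
        if unicodedata.category(chr(cp)) not in ("Cn", "Cs")
    ]
```

For example, assigned_chars(0x0370, 0x03FF) returns the Greek block without its reserved gaps such as U+03A2.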
Upvotes: 3
Reputation: 1057
In Python 3, use chr instead of unichr to avoid a NameError (unichr does not exist there):
for i in range(1000, 1100):
    print(i, chr(i))
Upvotes: 0
Reputation: 196
You'll want to use the unichr() builtin function:
for i in range(1000,1100):
    print i, unichr(i)
Note that in Python 3, just chr() will suffice.
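One detail worth keeping in mind: the digits in u'\u1000' are hexadecimal, so range(1000, 1100) covers U+03E8 to U+044B rather than U+1000 to U+1099. A minimal Python 3 sketch for the range in the question follows; the helper name unicode_range is made up for illustration.

```python
def unicode_range(first, last):
    """Return the characters from code point first to last, inclusive."""
    return [chr(cp) for cp in range(first, last + 1)]


# Hex literals match the \uXXXX notation used in the question.
chars = unicode_range(0x1000, 0x1099)
```

Printing them is then a matter of looping, e.g. for c in chars: print(f"U+{ord(c):04X}", c).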
Upvotes: 18
Reputation: 208545
Try the following:
for i in range(1000, 1100):
    print i, unichr(i)
Upvotes: 7
Reputation: 160005
unichr is the function you are looking for - it takes a number and returns the Unicode character for that code point.
for i in range(1000, 1100):
    print i, unichr(i)
Upvotes: 6
Reputation: 838696
Use unichr:
s = unichr(i)
From the documentation:
unichr(i)
Return the Unicode string of one character whose Unicode code is the integer i. For example, unichr(97) returns the string u'a'.
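If the same script has to run under both Python 2 and Python 3, one common approach (a sketch, not something the documentation quoted above prescribes) is to alias unichr to chr when the former is missing:

```python
try:
    unichr  # Python 2: the builtin exists, nothing to do
except NameError:
    unichr = chr  # Python 3: chr() already accepts any Unicode code point

s = unichr(97)
```

After this, unichr(i) can be called the same way on either version.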
Upvotes: 12