Handling Internationalized Domain Names (IDN) in Python

by @jehiah on 2010-02-08 11:45UTC

Internationalized Domain Names (IDNs) are domain names that contain non-ASCII characters. It's important to know how to handle these in an application because the domain names themselves can only contain ASCII. Punycode is the encoding that turns the Unicode characters into ASCII characters in a domain name.
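To make the Punycode step concrete, here is a minimal sketch that encodes a single label by hand. It assumes Python 3 and uses the standard library's `punycode` codec; the variable names are only for illustration.

```python
# Punycode works on one label at a time, not on a whole domain name.
label = "⊕"

# Encode the Unicode label to its ASCII Punycode form.
encoded = label.encode("punycode").decode("ascii")

# An encoded label is marked with the "xn--" prefix in a domain name.
ace_label = "xn--" + encoded

# Reverse the process: strip the prefix and decode the rest.
decoded = ace_label[len("xn--"):].encode("ascii").decode("punycode")

print(ace_label)
print(decoded)
```

Doing this label by label, and handling the prefix and the dots yourself, gets tedious quickly.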

Thankfully, Python has an IDNA codec built in that makes this easy.

[python]
>>> a="xn--keh.ws"
>>> a.decode("idna")
u"\u2295.ws"
>>> print a.decode("idna")
⊕.ws
>>> u"⊕.ws".encode("idna")
"xn--keh.ws"
[/python]
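The session above is Python 2, where byte strings have a `decode` method. In Python 3, `str` has no `decode`, so the same round trip needs an explicit step through bytes. The following is a minimal sketch under that assumption; the helper names `to_ascii` and `to_unicode` are hypothetical, not part of any library. Note that the built-in `idna` codec follows the older RFC 3490 rules.

```python
def to_ascii(domain):
    """Encode a Unicode domain name to its ASCII (xn--) form."""
    # str.encode("idna") returns bytes; decode them to get a plain str.
    return domain.encode("idna").decode("ascii")


def to_unicode(domain):
    """Decode an ASCII (xn--) domain name back to Unicode."""
    # The idna codec decodes from bytes, so go through ASCII bytes first.
    return domain.encode("ascii").decode("idna")


print(to_ascii("⊕.ws"))
print(to_unicode("xn--keh.ws"))
```

Both helpers raise `UnicodeError` (or a subclass) on input the codec cannot handle, so an application taking domain names from users would probably want to catch that.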
© 2023 - Jehiah Czebotar