Monday, December 12, 2011

How to convert the character encoding of a text file

Character encodings are supposed to make life easier, but sometimes I think they just make it more miserable. So a tool for changing the encoding of text files can be really useful. The libc-bin package contains a command, iconv, that does exactly that. Its usage is pretty simple. For example:

$ iconv --from-code=ISO-8859-1 --to-code=UTF-8 iso.txt -o utf.txt


converts a file encoded in ISO-8859-1 (iso.txt) to UTF-8 (utf.txt). For more information, see man iconv.
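
To see what the conversion does to the bytes, here is a small sketch you can try. The sample content ("café") is made up for illustration, and the example uses the short options -f and -t with output redirection instead of -o, which should behave the same way as the command above.

```shell
# Create a tiny ISO-8859-1 sample file: "café" followed by a newline.
# \351 is the octal escape for the single byte 0xE9 (é in ISO-8859-1).
printf 'caf\351\n' > iso.txt

# Same conversion as above, using the short options and writing to stdout.
iconv -f ISO-8859-1 -t UTF-8 iso.txt > utf.txt

# Compare the sizes: the input is 5 bytes, the output is 6 bytes,
# because é is encoded as two bytes (0xC3 0xA9) in UTF-8.
wc -c < iso.txt
wc -c < utf.txt

# Dump the converted file byte by byte to see the two-byte sequence.
od -An -tx1 utf.txt
```

If you are not sure what an encoding is called, iconv -l prints the names it knows.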

Some other useful programs from the Debian repos:
  • recode – character set conversion utility
  • cstocs – recoding utility and Czech sorter
  • enca – detect and convert encoding of text files
