KISS

Keep It Simple Stupid

Autojump and UnicodeDecodeError in Arch Linux


If you must know, there is an awesome little utility that makes jumping between directories in the terminal much faster. It’s called autojump and works with both bash and zsh. Basically, it learns which directories you spend the most time in and suggests them when you type the first letters and press Tab. An extremely useful tool indeed. You should definitely check it out if you haven’t yet.

A few days ago, it stopped working, spitting out the following exception every time I tried to run it:

Traceback (most recent call last):
  File "/usr/bin/autojump", line 449, in <module>
    if not shell_utility(): sys.exit(1)
  File "/usr/bin/autojump", line 387, in shell_utility
    db = Database(DB_FILE)
  File "/usr/bin/autojump", line 74, in __init__
    self.load()
  File "/usr/bin/autojump", line 113, in load
    for line in f.readlines():
  File "/usr/lib/python3.2/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 3265: ordinal not in range(128)
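
The failure itself is easy to reproduce outside autojump. Here is a minimal sketch, assuming a made-up path with Cyrillic letters (the path and variable names are hypothetical, not taken from my data file): the UTF-8 bytes of the path are decoded with the ASCII codec, which is what Python falls back to when the locale is C.

```python
# Hypothetical path; any non-ASCII character triggers the same failure.
path = "/home/user/Документы"
raw = path.encode("utf-8")  # the bytes as they would be stored on disk

try:
    raw.decode("ascii")  # what happens when the default codec is ASCII
    message, position = None, None
except UnicodeDecodeError as exc:
    message, position = str(exc), exc.start

print(message)              # names the offending byte and its position
print(raw.decode("utf-8"))  # decoding as UTF-8 succeeds
```

The first byte of a Cyrillic letter in UTF-8 is 0xd0 or 0xd1, which matches the byte reported in the traceback.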

Hmm, weird. Skimming through the package manager’s logs didn’t reveal any recent updates to either autojump or python. I converted position 3265 to hex:

$ bc -ql
obase=16
3265
CC1

Then I used xxd to check what was at that offset in the ~/.local/share/autojump file. That was exactly where the first Russian letters appeared. autojump had never had any problems with paths containing Russian letters before. What had happened?
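
The same inspection can be sketched in Python instead of bc and xxd. This is only an illustration: bytes_around is a hypothetical helper, and the throwaway file with its weight-tab-path line layout is an assumption made for the demo, not a copy of my real data file.

```python
import os
import tempfile


def bytes_around(path, offset, width=8):
    """Return the raw bytes in [offset - width, offset + width) of a file."""
    start = max(0, offset - width)
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(offset - start + width)


# Same conversion as bc with obase=16
print(format(3265, "X"))

# Demo on a throwaway file (hypothetical content)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write("10.0\t/home/user/Документы\n".encode("utf-8"))
    demo_path = tmp.name

# Offset 16 is where the first Cyrillic letter starts in this demo line
chunk = bytes_around(demo_path, 16, width=4)
print(" ".join("{:02x}".format(b) for b in chunk))
os.remove(demo_path)
```

With the real file, you would pass its path and the position from the error message instead of the demo values.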

I ran zsh, and autojump worked fine there! Just to be sure, I checked that python was the same version and that the data file with the jump history was the same one. I ran env to check the environment variables, and the last-but-one line was about the LC_CTYPE variable. It struck me at that moment that the day before there had been an update to Arch Linux’s main /etc/rc.conf file, which had removed most of its settings. I know this because I keep /etc/ in git and check what changed there after every update.

So, the problem had to be with the locale. echo $LANG printed C, which was wrong. Apparently, that update had removed the line that set the system locale and hadn’t set it in any other way. Quick check: export LANG=en_US.UTF-8, and autojump worked again!
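
Why does the locale matter to a Python script at all? When open() is called without an encoding argument, Python picks the encoding from the locale, so with LANG=C it ends up with ASCII. Below is a minimal sketch of reading a file in a way that doesn't depend on the locale. It is not autojump's actual code; load_lines and the demo file are hypothetical.

```python
import locale
import os
import tempfile


def load_lines(path, encoding="utf-8"):
    """Read text lines with an explicit encoding instead of the locale default."""
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising
    with open(path, "r", encoding=encoding, errors="replace") as f:
        return [line.rstrip("\n") for line in f]


# The encoding open() falls back to when none is given
print(locale.getpreferredencoding(False))

# Demo on a throwaway file (hypothetical content)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write("/home/user/Документы\n".encode("utf-8"))
    demo_path = tmp.name

lines = load_lines(demo_path)
os.remove(demo_path)
print(lines)
```

Fixing the locale is still the right thing to do system-wide, but a script that names its encoding explicitly would have survived the change.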

According to the Arch wiki, the system locale should be set in the /etc/locale.conf file:

$ cat /etc/locale.conf
# Set default locale
LANG="en_US.UTF-8"

# Keep the default sort order (e.g. files starting with a '.'
# should appear at the start of a directory listing)
LC_COLLATE="C"

# Set the short date to YYYY-MM-DD (test with "date +%c")
LC_TIME="en_DK.UTF-8"
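
To see which value each locale category ends up with under this config, here is a small sketch of the usual POSIX lookup order: LC_ALL first, then the category's own variable, then LANG. effective_locale is a hypothetical helper written for illustration, and the final fallback to "C" is an assumption about typical systems.

```python
def effective_locale(category, env):
    """Resolve a locale category: LC_ALL, then the category itself, then LANG."""
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # unset and empty variables are skipped
            return value
    return "C"  # assumed default when nothing is set


# The values from the locale.conf above
conf = {"LANG": "en_US.UTF-8", "LC_COLLATE": "C", "LC_TIME": "en_DK.UTF-8"}

for category in ("LC_CTYPE", "LC_COLLATE", "LC_TIME"):
    print(category, effective_locale(category, conf))
```

LC_CTYPE, the category that decides the character encoding, is not set explicitly here, so it inherits en_US.UTF-8 from LANG.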
