Mon, 28 Oct 2013
Keeping Humans in their Place

Computers can be pretty frustrating, even when you think you understand them fairly well. This understanding might make things worse in some ways, as you'll go the extra mile, persevere a bit longer, do the extra debugging and perhaps end up no better off (except even more frustrated).

What's brought this about? Well, over and above the usual nitpicks :

Mozilla Thunderbird IMAP Issues

My domain email stopped working a few days ago. Initially I thought it was just an email dry spell, but some more concerned digging showed a problem connecting to my IMAP server (dovecot).

Some potential complexity here ... IMAP itself, but especially the SSL layered over it (imaps). So a fair amount of anxiety about what might have been broken - server update? expired certificate? problem certificate? or a problem on the client computer, or client mail application?
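One way to rule the certificate in or out without involving the mail client at all is a quick script. This is only a rough sketch, not what I actually ran: it assumes the server listens for imaps on the standard port 993, and the hostname in the usage comment is a placeholder. The function names are my own.

```python
import socket
import ssl
import time


def days_left(not_after: str, now: float) -> float:
    """Days between 'now' (epoch seconds) and a certificate's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def check_imaps_cert(host: str, port: int = 993, timeout: float = 10.0) -> str:
    """Do a verified TLS handshake with an imaps server and summarise the result."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
    except ssl.SSLCertVerificationError as err:
        # An expired or otherwise bad certificate ends up here
        return f"certificate rejected: {err.verify_message}"
    remaining = days_left(cert["notAfter"], time.time())
    return f"certificate accepted, {remaining:.0f} days until expiry"


# Usage (placeholder hostname, needs network access):
# print(check_imaps_cert("mail.example.org"))
```

If this reports an accepted certificate with plenty of days left, the server side looks less likely to be the culprit and attention can move to the client.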

Suspicion settled on the client application, Mozilla Thunderbird, and I went through a slightly painful process of regressing some major releases and finding that version 23.0 broke things.

Somehow I had managed to get through v23.0, 24.0 and 24.0.1 via the automatic updates with a working mail capability. At least until last week. I am not sure how!

Posted some notes and asked for comment on mozillaZine, and then ended up logging a bug. Should have expected this, but I was then tasked with finding the nightly regression point, a potentially painful process. "Luckily", being on holiday has meant I have had some time to do this ...

Mozregression didn't seem to work well for me, not finding any break point, so I took the manual route of downloading some releases close to the last version that worked for me (release 22.0) and seeing where it failed :

2013/05/2013-05-24-00-40-21-comm-aurora/ ----- BAD
2013/05/2013-05-23-00-40-20-comm-aurora/ ----- BAD
...
2013/05/2013-05-20-00-40-04-comm-aurora/ ----- BAD
...
2013/05/2013-05-16-00-40-19-comm-aurora/ ----- BAD
...
2013/05/2013-05-14-00-40-02-comm-aurora/ ----- BAD
2013/05/2013-05-13-00-40-21-comm-aurora/ ----- OK
2013/05/2013-05-12-00-40-18-comm-aurora/ ----- OK
...
2013/05/2013-05-06-00-40-01-comm-aurora/ ----- OK
...
2013/05/2013-05-02-00-40-01-comm-aurora/ ----- OK

So, IMAP to my domain broken with the 2013-05-14 build. Let's see how things go.
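For the record, the manual hunt above is just a bisection over an ordered list of builds. A minimal sketch of the idea, assuming the builds are sorted oldest to newest, the first is known good, the last is known bad, and the behaviour flips exactly once. The daily build list and the imap_fails function are stand-ins for illustration: in reality each "test" meant downloading a build and trying it by hand, and not every date necessarily has a build.

```python
from typing import Callable, Sequence


def first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad build.

    Assumes builds[0] is good, builds[-1] is bad, and there is a single
    good-to-bad switch somewhere in between.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] good, builds[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]


# Illustrative list of build dates between the known-good and known-bad ends
builds = [f"2013-05-{day:02d}" for day in range(2, 25)]


def imap_fails(build: str) -> bool:
    """Stand-in for installing the build and checking IMAP by hand."""
    return build >= "2013-05-14"


print(first_bad(builds, imap_fails))  # 2013-05-14
```

With 23 candidate builds this needs only four or five manual tests rather than one per day, which matters when each test is a download and a mail check.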

  • Bug : 930878
    IMAP with SSL/TLS,normal password fails to retrieve mail after v22.0

One always wonders ... it's probably my fault somewhere. Still Diggin' :-)

Software RAID Failure

Did I mention holiday? A couple of days ago I got an email with subject line :

Fail event on /dev/md/2:shuttle

That's a disk failure with a RAID mirror I have in a system (where I normally stage the blog). Something to look forward to fixing when I get home. Hopefully the remaining disk stays well, always a slight concern with something like this.
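The same information as the email can be read from /proc/mdstat, where a degraded mirror shows an underscore in its status brackets (for example [U_] instead of [UU]). A small sketch of pulling that out, assuming the usual mdstat layout. The sample text is made up for illustration and is not copied from my system.

```python
import re
from typing import List

# Made-up sample in the usual /proc/mdstat layout
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md2 : active raid1 sda3[0] sdb3[1](F)
      483985344 blocks super 1.2 [2/1] [U_]

md1 : active raid1 sda2[0] sdb2[1]
      3903488 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text: str) -> List[str]:
    """Return the names of md arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)  # remember which array we are inside
            continue
        status = re.search(r"\[([U_]+)\]\s*$", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))  # ['md2']

# On a real Linux box:
# with open("/proc/mdstat") as f:
#     print(degraded_arrays(f.read()))
```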

On top of this issue, I have SMART complaining on another system about "unreadable sectors" but this is something I've been monitoring for a couple of months, the number not increasing for now. RAID is not a backup, but it helps mitigate hardware failures.

A quick follow-up to this. The 500GB 2.5" SATA disk I was going to use as a replacement might not be healthy itself. I did a quick smartctl health check on it and it spat out some warnings :

==> WARNING: These drives may corrupt large files,
see the following web pages for details:
http://knowledge.seagate.com/articles/en_US/FAQ/215451en
http://forums.seagate.com/t5/Momentus-XT-Momentus-Momentus/Momentus-XT-corrupting-large-files-Linux/td-p/109008
http://superuser.com/questions/313447/seagate-momentus-xt-corrupting-files-linux-and-mac

I didn't know smartmontools did this. What a great feature. So, looks like I need to flash the Seagate firmware.

Update

Disk firmware updated, replaced in RAID and syncing the mirror ...

Laptop Random Hibernations

I'm trying Debian Testing (Jessie) on my ThinkPad X220 and it's generally been fine. In fact, in many ways it's the best and fastest version yet (and the laptop's pretty good as well).

However, I've had it decide to hibernate itself when I'm not looking. This wouldn't be so bad except it has a problem resuming (libgcrypt message, similar to bug 724275), so this turns into a hard reset. As usual, there are a number of places I could look to solve this (initramfs, acpi, uswsusp etc.) and I'll see if I can find some time and do some debugging. Chasing this sort of issue is particularly tough because of the need for rebooting/hibernating to test things.

I was going to follow up on a post on the Debian Users web forum but it looks like my account has been "deactivated" manually by an admin and I can't re-activate or re-register (username in use!). A large bit of friction having to send a mail to the admins about it and a bit of a crappy policy if you ask me ...

So ...

Maybe I have too many computers, and too many computer related activities going on. I'm juggling different virtual machines running different versions of Debian, doing different things and occasionally thinking about synchronisation. Silly things such as whether to run the development system VM on KVM or switch to VirtualBox? If I use both, best ways to sync them up? Converting raw KVM disk to a VDI etc.

No wonder the odds increase that I end up in pain sometimes. The aim is always to get things sorted and arranged in such a way that I can actually do some work, or something worthwhile. Not spend all day fixing or configuring things before managing any of that!