CentOS 6.4: NSS, MD5 Certificates, and Authentication Problems: UPDATE

While tinkering with a kickstart script to get a basic workstation installed just the way I want, I decided to revisit the authentication issue related to certificates with an MD5 signature. Thankfully, there is a workaround that re-enables MD5 support in the nss package, and it worked for me: add ‘export NSS_HASH_ALG_SUPPORT=+MD5’ to /etc/sysconfig/init and reboot. Thanks go to @NewLifeMark and this blog posting.
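
Since this started with a kickstart script, here is a rough sketch of how the line could be added in a repeatable way. The function name enable_nss_md5 is my own invention for illustration; only the export line and the default path come from the workaround above.

```shell
#!/bin/sh
# Append the NSS MD5 workaround to an init config file, but only once.
# The first argument is the file to edit; it defaults to the path from
# the workaround. Pass a scratch file to try this out safely.
enable_nss_md5() {
    init_file="${1:-/etc/sysconfig/init}"
    line='export NSS_HASH_ALG_SUPPORT=+MD5'
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if grep -qxF "$line" "$init_file" 2>/dev/null; then
        echo "already present in $init_file"
    else
        printf '%s\n' "$line" >> "$init_file"
        echo "added to $init_file"
    fi
}
```

Calling enable_nss_md5 with no argument as root edits /etc/sysconfig/init; a reboot is still needed afterwards.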

CentOS 6.4: NSS, MD5 Certificates, and Authentication Problems

CentOS just released 6.4, so I decided to update the office server and, as it was the end of the day, left it at that. Bad form on my end: I came in the next day to find I was unable to connect to the Samba server. Checking the logs, I saw that Samba was unable to connect to LDAP via TLS. I also quickly discovered that the server itself was unable to get user information. My first guess was to check the LDAP server, but it was running fine.

My second guess was to try ldapsearch from the office server, and that gave me my first hint. ldapsearch was not accepting my Certificate Authority and was giving a bad signature algorithm error. Researching that led me to the following bug report: Unable to authenticate to legacy LDAP server due to “not secure” certificate signature.
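
A quick way to confirm that a certificate really is MD5-signed is to look at the signature algorithm that openssl prints. This is only a sketch: the helper name uses_md5_signature and the file name ca.pem are placeholders of mine, not anything from the bug report.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the certificate text on stdin reports an
# MD5-based signature, e.g. "Signature Algorithm: md5WithRSAEncryption".
uses_md5_signature() {
    grep -qi 'Signature Algorithm: *md5'
}

# Usage (ca.pem is a placeholder for your CA certificate):
#   openssl x509 -in ca.pem -noout -text | uses_md5_signature && echo "MD5-signed"
```

If the check succeeds for your CA or server certificate, you are most likely hitting the same problem.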

It became clear that the root issue was the decision at Mozilla, who develop the NSS (Network Security Services) library, to stop accepting certificates signed with MD5 hashes. I then set out to downgrade NSS to the pre-6.4 release. These are the packages I downgraded to get my system working again: nss nss-sysinit nss-tools nss-util nss-pam-ldapd mod_nss.
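
For reference, the downgrade can be sketched as a single yum call over that package list. The DRY_RUN switch and the function name are my own additions so the command can be previewed first, and this assumes the older packages are still available in an enabled repository.

```shell
#!/bin/sh
# Packages listed in the post
NSS_PKGS="nss nss-sysinit nss-tools nss-util nss-pam-ldapd mod_nss"

# Downgrade the NSS packages. If DRY_RUN is set to a non-empty value,
# the command is only printed instead of being executed.
downgrade_nss() {
    run=${DRY_RUN:+echo}
    # $NSS_PKGS is left unquoted on purpose so each package is its own argument
    $run yum downgrade $NSS_PKGS
}
```

Running DRY_RUN=1 first shows exactly what would be passed to yum before anything changes.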

Now some may ask, why not just upgrade your certificates to a newer and thus more secure signing algorithm? Well, to anyone who has ever worked with OpenSSL and created certs for a multitude of servers, it’s not a trivial task. I’ll put it on my todo list, but for now it’s at the bottom.

Thanks to the following blog post by Matt Micene, which was the break in this case.

Moving from DKMS Nvidia to latest driver

The DKMS version of the Nvidia driver is a great convenience, yet it usually means you won’t have the latest version, and when the kernel updates, that can be a problem. Like today.

We run CentOS 6 on our workstations, and I decided it was time to update packages. I hoped DKMS wouldn’t let me down, but it did. So I booted into runlevel 3 and removed dkms and the DKMS Nvidia driver to make way for the latest version. It’s normally a straightforward process, yet the Nvidia installer can mess up the Xorg installation, as it did for me. Somehow libwfb.so was missing, so the Nvidia driver installed its own version.

This doesn’t work, as you’ll get a message about the unknown symbol PictureScreenPrivateIndex. It turns out that for installations that don’t have libwfb, Nvidia brings its own, yet since I was supposed to have it, this led to trouble. The quick fix is to reinstall the right package. Use yum whatprovides */libwfb.so to find that xorg-x11-server-Xorg is the correct package, and yum reinstall restores libwfb. Install the Nvidia driver via the script as usual and you should be good to go.
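
The two yum steps can be sketched like this. The wrapper function and the DRY_RUN switch are my own additions for previewing the commands; the package name is the one that whatprovides reported on my system, so check its output on yours.

```shell
#!/bin/sh
# Look up the package that owns libwfb.so, then reinstall it.
# If DRY_RUN is set to a non-empty value, the commands are only printed.
restore_libwfb() {
    run=${DRY_RUN:+echo}
    # Quote the glob so the shell passes it to yum untouched
    $run yum whatprovides '*/libwfb.so'
    $run yum reinstall xorg-x11-server-Xorg
}
```

Run it from runlevel 3 before starting the Nvidia installer again, so the installer finds the stock libwfb.so in place.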

Lustre: LVM Metadata Snafu

It was supposed to be a simple memory upgrade for our Lustre nodes, but of course they had something else in mind. I’ve got my suspicions as to how it happened, but the issue was that one of the OSTs wasn’t mounting. Checking, I found that LVM was showing all but one of the OSTs; the missing one’s LVM metadata wasn’t registering for some reason. Well, I’ve seen this before: just use the backup metadata to relabel the device. Not so fast…

The pvcreate command wasn’t working because it thought there was an existing partition table, and reading up on LVM shows that when working with raw disk devices, there cannot be a partition table. The pvcreate manpage does provide the answer, though: use dd to write zeros to the first sector, thus clearing the table. It works. I was able to relabel the device, but again, not so fast…
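
Roughly, the dd step looks like the sketch below. It is destructive, so triple-check the device name before running anything like it. The function name is mine, and the pvcreate and vgcfgrestore lines in the comments use placeholders in angle brackets rather than the real values from that night.

```shell
#!/bin/sh
# Overwrite the first 512-byte sector of a device (or file) with zeros.
# conv=notrunc only matters for regular files: it stops dd from
# truncating them, which makes the function safe to try on a scratch file.
zero_first_sector() {
    dd if=/dev/zero of="$1" bs=512 count=1 conv=notrunc 2>/dev/null
}

# Afterwards the label can be restored from the LVM backup metadata,
# along the lines of the pvcreate manpage (values in <> are placeholders):
#   pvcreate --uuid <pv-uuid> --restorefile /etc/lvm/backup/<vgname> /dev/<device>
#   vgcfgrestore <vgname>
```

The UUID passed to pvcreate has to be the one recorded for that physical volume in the backup file, which is exactly where a mix-up between two devices can creep in.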

Not only was one device not labeled, but it turned out that another device had its label swapped. So when I thought I had it fixed and tried mounting the Lustre OSTs, one of them was not reconnecting and the clients seemed to be oblivious that it was there. Checking /proc/fs/lustre/devices to see what IDs the mounted OSTs had told me what had happened. I think that since the swapped OSTs were being mounted on their failover nodes, the clients were unsure how to reconnect, and thus the issue. Once I swapped the LVM metadata for those two devices, the clients were able to reconnect and everything came back online. This made for a very long night.
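
To pull just the OST entries out of that file, something like the sketch below should do. It assumes the usual column layout (index, state, type, name, UUID, refcount) and that OSTs show up with the type obdfilter; verify both against your Lustre version. The function name is my own.

```shell
#!/bin/sh
# Print index, name and UUID for every OST in a Lustre device listing.
# The first argument is the file to read; it defaults to the proc file.
list_osts() {
    awk '$3 == "obdfilter" { print $1, $4, $5 }' "${1:-/proc/fs/lustre/devices}"
}
```

Comparing that output on each OSS against the logical volume each OST is supposed to live on makes a swapped label stand out.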

Upgrade the power grid

Hearing that people are still without power has become a regular thing on the news. If not from these occasional severe storms, then from the yearly snow storms and hurricanes. How many times do the power lines have to go down for people to realize that maybe it might be time to do something about it?

On CNN they had a spokesperson from a utility company, and their response to the question of burying the lines was that it just couldn’t be done. Well, I call BS on that. It just proves that it’s too easy to fall into maintaining the status quo. This would make a great infrastructure project to update and upgrade our electric grid: prevent things like cascading blackouts, improve efficiencies so we lose less electricity during transmission, and protect it from the dangers of Mother Nature.

These problems are simply engineering problems. We just have to decide whether to fix them or simply patch them up.