Sometimes, updating the firmware on Mellanox InfiniBand cards changes the MAC/hardware address. This usually happens if the IB card is an OEM part, i.e. made by Mellanox but sold under a different company's name.
When the MAC gets changed, the network interface will not come up. The fix is to update the HWADDR field in /etc/sysconfig/network-scripts/ifcfg-ib0 and /etc/sysconfig/network-scripts/ifcfg-ib1. Use "ip link list" to display the new MAC.
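Here is a minimal Python sketch of the HWADDR edit, in case there are many nodes to fix. The function name `update_hwaddr` and the sample addresses are made up for illustration; it only rewrites text, so reading and writing the actual ifcfg file (as root) is left to the caller.

```python
import re


def update_hwaddr(ifcfg_text, new_hwaddr):
    """Return ifcfg_text with its HWADDR= line set to new_hwaddr.

    If the text has no HWADDR line, one is appended at the end.
    """
    new_line = "HWADDR=" + new_hwaddr
    pattern = re.compile(r"^HWADDR=.*$", re.MULTILINE)
    if pattern.search(ifcfg_text):
        # A function replacement avoids any backslash handling in new_line.
        return pattern.sub(lambda m: new_line, ifcfg_text)
    sep = "" if ifcfg_text.endswith("\n") or not ifcfg_text else "\n"
    return ifcfg_text + sep + new_line + "\n"


# Placeholder addresses, shortened for readability (not real IB addresses).
old = "DEVICE=ib0\nHWADDR=80:00:00:48:fe:80\nONBOOT=yes\n"
print(update_hwaddr(old, "80:00:00:49:fe:80"))
```

The new address would come from the "ip link list" output for ib0 or ib1.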
How-to's and technical news about Linux and open computing, with a sprinkling of Python.
2014-12-29
2014-12-16
RHEL 6.4 kernel 2.6.32-358.23.2, Mellanox OFED 2.1-1.0.6, and Lustre client 2.5.0
I am planning some upgrades for the cluster that I manage. As part of the updates, it would be good to have MVAPICH2 with GDR (GPU-Direct RDMA -- yes, that's an acronym of an acronym). MVAPICH2-GDR, which is provided only as binary RPMs, only supports Mellanox OFED 2.1.
Now, our cluster runs RHEL6.4, but with most non-kernel and non-glibc packages updated to whatever is in RHEL6.5. The plan is to update everything to whatever is in RHEL6.6, except for the kernel, leaving that at 2.6.32-358.23.2, which is the last RHEL6.4 kernel update. The reason for staying with that kernel version is Lustre.
We have a Terascala Lustre filesystem appliance. The latest release of TeraOS uses Lustre 2.5.0. Upgrading the server is pretty straightforward, according to the Terascala engineers. Updating the client is a bit trickier. Currently, the Lustre support matrix says that Lustre 2.5.0 is supported only on RHEL6.4.
The plan of attack is this:
- Update a base node with all RHEL packages, leaving the kernel at 2.6.32-358.23.2
- Upgrade Mellanox OFED from 1.9 to 2.1
- Build lustre-client-2.5.0 and upgrade the Lustre client packages
Updating the base node is straightforward. Just use "yum update", after commenting out the exclusions in /etc/yum.conf. If you had updated the redhat-release-server-6Server package, which defines which RHEL release you have, you can downgrade it. (See RHEL Knowledgebase, subscription required.) First, install the last (as of 2014-12-15) RHEL6.4 kernel, and then do the downgrade:
# yum install kernel-2.6.32-358.23.2.el6
# reboot
# yum downgrade redhat-release-server-6Server
Check with "cat /etc/redhat-release".
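If you want to script that check across nodes, here is a small Python sketch. The function name `rhel_release` is hypothetical, and it assumes the usual "... release 6.4 (Santiago)" wording of the file.

```python
import re


def rhel_release(text):
    """Extract the 'major.minor' version from /etc/redhat-release contents.

    Returns None if no 'release X.Y' string is found.
    """
    match = re.search(r"release\s+(\d+\.\d+)", text)
    return match.group(1) if match else None


sample = "Red Hat Enterprise Linux Server release 6.4 (Santiago)\n"
print(rhel_release(sample))
```

On a real node, pass in the contents of /etc/redhat-release and compare the result against "6.4".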
Next, install Mellanox OFED 2.1-1.0.6. You can install it directly using the provided installation script, or if you are paranoid like me, you can use the provided script to build RPMs against the exact kernel update you have installed.
Get the tarball directly from Mellanox. Extract, and make new RPMs:
# tar xf MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64.tgz
# cd MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64
# ./mlnx_add_kernel_support.sh -m .
...
# cp /tmp/MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64-ext.tgz .
# tar xf MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64-ext.tgz
# cd MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64-ext
# ./mlnxofedinstall
# reboot
Strictly speaking, the reboot is unnecessary: you can stop and restart a couple of services and the new OFED will load.
Next, for Lustre. Get the SRPM from Intel (who bought Whamcloud). You will notice that it is for kernel 2.6.32-358.18.1. Not mentioned is the fact that by default, it uses the generic OFED that Red Hat rolls into its distribution. To use the Mellanox OFED, a slightly different installation method must be used.
# rpm -ivh lustre-client-2.5.0-2.6.32_358.18.1.el6.x86_64.src.rpm
# cd ~/rpmbuild/SOURCES
# cp lustre-2.5.0.tar.gz ~/tmp
# cd ~/tmp
# tar xf lustre-2.5.0.tar.gz
# cd lustre-2.5.0
# ./configure --disable-server --with-o2ib=/usr/src/ofa_kernel/default
# make rpms
# cd ~/rpmbuild/RPMS/x86_64
# yum install lustre-client-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm \
lustre-client-modules-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm \
lustre-client-tests-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm \
lustre-iokit-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm
To make the lustre module load at boot, I have a kludge: in /etc/init.d/netfs, right after the line
STRING=$"Checking network-attached filesystems"
add
modprobe lustre
Reboot, and then check:
# lsmod | grep lustre
lustre 921744 0
lov 516461 1 lustre
mdc 199005 1 lustre
ptlrpc 1295397 6 mgc,lustre,lov,osc,mdc,fid
obdclass 1128062 41 mgc,lustre,lov,osc,mdc,fid,ptlrpc
lnet 343705 4 lustre,ko2iblnd,ptlrpc,obdclass
lvfs 16582 8 mgc,lustre,lov,osc,mdc,fid,ptlrpc,obdclass
libcfs 491320 11 mgc,lustre,lov,osc,mdc,fid,ko2iblnd,ptlrpc,obdclass,lnet,lvfs
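The same check can be scripted. This is only a sketch: the function names are made up, and the default list of required modules is my assumption based on the rows in the listing above. It expects the full, unfiltered output of lsmod as text.

```python
def loaded_modules(lsmod_text):
    """Return the set of module names found in `lsmod` output."""
    names = set()
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module Size Used by" header row.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def missing_modules(lsmod_text, required=("lustre", "lnet", "ptlrpc")):
    """Return a sorted list of required modules that are not loaded."""
    return sorted(set(required) - loaded_modules(lsmod_text))


sample = (
    "Module Size Used by\n"
    "lustre 921744 0\n"
    "lnet 343705 4 lustre,ko2iblnd,ptlrpc,obdclass\n"
)
print(missing_modules(sample))
```

An empty list means every module in `required` showed up as a loaded module.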
2014-11-06
Mounting a HTC One on Ubuntu 14.04 Trusty; Re-flashing the ROM
Ever since I unlocked my HTC One (M7), I have not had a voicemail app, because the phone was tied to a specific provider. Now, I'm trying to figure out how to re-flash the device to generic.
The first thing was to try to connect it to my Linux machine. However, upon plugging the phone into the USB port, an error appeared: "Unable to find matching udev device".
Apparently, this is fairly common and happens with other devices. The error boils down to a bug in Ubuntu's default Media Transfer Protocol (MTP) library. The fix is to install a later version.
First, add a new repository:
$ sudo add-apt-repository ppa:webupd8team/unstable
Then, update the package index and install the mtpfs package.
$ sudo apt-get update
$ sudo apt-get install mtpfs
UPDATE: HTC has a free bootloader unlocking utility. You just have to sign up for their free developer program. Then, follow the instructions here: http://www.htcdev.com/bootloader/unlock-instructions
The one thing they do not mention is that you have to have root privileges for the fastboot utility to work, so:
$ sudo ./fastboot oem get_identifier_token
Then, they email you a file for unlocking. Next, re-flash the phone with a vendor-appropriate ROM from here: http://www.htcdev.com/devcenter/downloads
For AT&T (my phone's original ROM) and T-Mobile (my current provider), there is no binary image for flashing. I didn't bother looking to see if you could compile the source. Instead, I went with the Android Ice Cold Project (AICP). I discovered it via this step-by-step article. They also have a download of the full Google Apps.
2014-10-15
Another SSL vulnerability - The POODLE Attack
From the Mozilla Security Blog:
"SSL version 3.0 is no longer secure. Browsers and websites need to turn off SSLv3 and use more modern security protocols as soon as possible, in order to avoid compromising users’ private information."
Under RHEL 6.5 with Apache httpd, edit /etc/httpd/conf.d/ssl.conf and make sure the protocol line disables both SSLv2 and SSLv3:
SSLProtocol all -SSLv2 -SSLv3
or you can just specify TLS only:
SSLProtocol +TLSv1 +TLSv1.1 +TLSv1.2
Ars Technica has a good explanation.
Scott Helme has a good run down on how to fix this issue, for various servers and browsers.
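To see why the two SSLProtocol lines above come out the same, here is a rough Python model of how the directive is read left to right. It is a simplification, not mod_ssl itself: the function name is made up, and treating "all" as every protocol in `ALL_PROTOCOLS` is an assumption that depends on your httpd version.

```python
# Assumed meaning of "all"; check the documentation for your httpd version.
ALL_PROTOCOLS = frozenset(["SSLv2", "SSLv3", "TLSv1", "TLSv1.1", "TLSv1.2"])


def enabled_protocols(directive):
    """Return the set of protocols left enabled by an SSLProtocol line."""
    words = directive.split()
    if not words or words[0].lower() != "sslprotocol":
        raise ValueError("not an SSLProtocol directive: %r" % directive)
    enabled = set()
    for word in words[1:]:
        action = word[0] if word[0] in "+-" else ""
        name = word.lstrip("+-")
        group = set(ALL_PROTOCOLS) if name.lower() == "all" else {name}
        if action == "-":
            enabled -= group   # "-X" removes X
        elif action == "+":
            enabled |= group   # "+X" adds X
        else:
            enabled = group    # a bare word replaces what came before
    return enabled


print(sorted(enabled_protocols("SSLProtocol all -SSLv2 -SSLv3")))
print(sorted(enabled_protocols("SSLProtocol +TLSv1 +TLSv1.1 +TLSv1.2")))
```

In this model both lines leave only the TLS versions enabled, which is the point of the fix.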
2014-09-01
Python's with statement
Old habits die hard. I learned a long time ago (Python 1.x) this pattern for opening and operating on files:
try:
    f = open("filename.txt", "r")
    try:
        for l in f:
            print(l)
    finally:
        # Close the file even if reading fails partway through.
        f.close()
except IOError as e:
    print("I/O error({0}): {1}".format(e.errno, e.strerror))
Since Python 2.6, the with statement closes the file automatically, even if the body raises an exception:
with open("filename.txt", "r") as f:
    for l in f:
        print(l)
The with statement works with some other classes, too.
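As a sketch of what "other classes" means: any object with `__enter__` and `__exit__` methods can be used in a with statement. The `Resource` class below is a made-up example; `threading.Lock` is a standard library class that supports the same protocol.

```python
import threading


class Resource(object):
    """Hypothetical resource that records when it is opened and closed."""

    def __init__(self):
        self.log = []

    def __enter__(self):
        self.log.append("open")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and also when the body raises.
        self.log.append("close")
        return False  # do not swallow the exception


def demo():
    """Raise inside the with block and return what the resource logged."""
    res = Resource()
    try:
        with res:
            raise RuntimeError("boom")
    except RuntimeError:
        pass
    return res.log


# A lock is acquired on entry and released on exit.
lock = threading.Lock()
with lock:
    held_inside = lock.locked()
held_after = lock.locked()
```

`demo()` shows that the cleanup in `__exit__` still runs when the body raises, which is the same guarantee the try/finally pattern gives.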
PS Blogger really needs a code block style.
2014-08-27
A Git branching model
Vincent Driessen has a clear write-up on a git branching model that works for his team.
2014-07-24
Ganglia
Word to the wise: do not enable the multiplecpu multicpu module. It doesn't get disabled even if you append ".disabled" to the file name. Now, I have 265 CPU metrics.