2018-10-31

Facebook's new suite of open source Linux kernel components and tools

Facebook has just announced a set of Linux kernel components and tools that are useful for shared servers, which may include HPC servers. I could see oomd and cgroup2, in particular, being useful. Oomd moves out-of-memory handling into userspace and tries to take corrective action before an OOM occurs in the kernel. Cgroup2 seems to be a successor to cgroups, which allows controlling the amount of system resources assigned to groups of workloads.

2018-08-31

XFS group and project quotas

XFS supports a quota system that can handle quotas by user, group, or project. The project quota system is meant for setting quotas on directory hierarchies, e.g. you may want to set quotas on users' home directories, but allow them a different quota in some shared directory.

What the documentation and the man page for xfs_quota(8) do not mention is that group quotas and project quotas are mutually exclusive, i.e. if you turn on group quotas on a filesystem, you cannot turn on project quotas, and vice versa.

Here are my simple scripts for setting up XFS project quotas to set quotas on users' home directories: https://github.com/prehensilecode/xfs-project-quotas
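
As a rough illustration of what such a setup involves (this is a sketch, not the contents of the scripts linked above), the Python below builds the entries for /etc/projects and /etc/projid and the matching xfs_quota commands for a list of users. The function name, the starting project ID of 1000, the 10g hard limit, and the /home mount point are all assumptions made for the example. It only generates text and does not run anything; the filesystem would also need to be mounted with project quotas enabled (the prjquota mount option).

```python
from pathlib import PurePosixPath


def project_quota_plan(users, home_root="/home", mount_point="/home",
                       first_id=1000, hard_limit="10g"):
    """Return (projects_lines, projid_lines, commands) for per-user project quotas.

    All defaults are hypothetical values chosen for illustration.
    """
    projects_lines = []   # /etc/projects format:  <project id>:<directory>
    projid_lines = []     # /etc/projid format:    <project name>:<project id>
    commands = []         # xfs_quota invocations, as argument lists
    for offset, user in enumerate(sorted(users)):
        proj_id = first_id + offset
        home = PurePosixPath(home_root) / user
        projects_lines.append(f"{proj_id}:{home}")
        projid_lines.append(f"{user}:{proj_id}")
        # "project -s" marks the directory tree as belonging to the project.
        commands.append(["xfs_quota", "-x", "-c",
                         f"project -s {user}", mount_point])
        # "limit -p" sets the block hard limit for the project.
        commands.append(["xfs_quota", "-x", "-c",
                         f"limit -p bhard={hard_limit} {user}", mount_point])
    return projects_lines, projid_lines, commands


if __name__ == "__main__":
    projects, projid, cmds = project_quota_plan(["alice", "bob"])
    print("\n".join(projects))
    print("\n".join(projid))
    for cmd in cmds:
        print(" ".join(cmd))
```

Returning the commands as argument lists keeps the sketch side-effect free; they could be reviewed first and then passed to subprocess.run by someone with root access.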

2018-07-03

Docker containers for high performance computing

I have started to see more science application authors/groups provide their applications as Docker images. Having only a passing acquaintance with Docker (and with containers in general), I found this article at The New Stack useful: Containers for High Performance Computing (Joab Jackson).

The vendors of the tools I use -- Bright Cluster Manager, Univa Grid Engine -- have incorporated support for containers. It is good to read some independent information about the role of containers in HPC.

Christian Kniep of Docker points out some issues with the interaction of HPC and Docker, and has come up with a preliminary solution (a proxy for Docker Engine) to address them. HPC commonly makes use of specific hardware (e.g. GPUs, InfiniBand), and this runs counter to Docker's hardware-agnostic approach. Also, HPC workflows may rely on shared resources (e.g. I/O to a shared filesystem).

2018-06-15

The usefulness of awk in the present day

This anecdote illustrates exactly why scientists should invest some time in learning some of the standard tools of Unix/Linux.

https://news.ycombinator.com/item?id=17324122

2018-04-10

How to install the autoconf archive

This is a quick and dirty way to install the autoconf archive, “a collection of more than 500 macros for GNU Autoconf that have been contributed as free software.”

Either clone the git repository or download a tarball. Decide on a place for the macros: they will all reside in a single directory. Let’s say /usr/local/share/aclocal. Then, just copy all the .m4 files from the m4 subdirectory of the repo into /usr/local/share/aclocal. Next, set the environment variable ACLOCAL_PATH=/usr/local/share/aclocal in either a system-wide or user-specific login script. Like other PATH-like environment variables, the value is a colon-delimited list.
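
The copy step and the ACLOCAL_PATH value can be sketched in Python as below. This is only an illustration of the steps above: the function names are made up for the example, and the repository location is whatever directory you cloned or unpacked into.

```python
import os
import shutil
from pathlib import Path


def install_m4_macros(repo_dir, dest_dir):
    """Copy every .m4 file from <repo_dir>/m4 into dest_dir; return the count."""
    src = Path(repo_dir) / "m4"
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    count = 0
    for macro in sorted(src.glob("*.m4")):
        shutil.copy2(macro, dest / macro.name)
        count += 1
    return count


def extend_aclocal_path(dest_dir, current=None):
    """Return a colon-delimited ACLOCAL_PATH value containing dest_dir once."""
    if current is None:
        current = os.environ.get("ACLOCAL_PATH", "")
    # Drop empty components so an unset variable does not yield a leading colon.
    parts = [p for p in current.split(":") if p]
    if str(dest_dir) not in parts:
        parts.append(str(dest_dir))
    return ":".join(parts)
```

For example, extend_aclocal_path("/usr/local/share/aclocal", "") gives the value to export from the login script, and calling it with an existing value appends the new directory rather than replacing what was there.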

2018-03-29

Deficiency in Tumblr's two-factor authentication (2FA) implementation

This blog is mirrored, using an IFTTT applet (f.k.a. recipe), to http://linuxfollies.tumblr.com/. Two-factor authentication on a Tumblr account supports two methods: an app code generator (e.g. Google Authenticator, Authy, Duo Mobile), and SMS. Notably, it does not generate a list of one-time backup codes like most services do.

Backup codes are necessary in case the device is not accessible, e.g. lost or stolen, particularly if you are abroad without your usual SIM (perhaps it is also stolen) which means that SMS would not reach you.

SMS is not recommended for routine two-factor use because SMS can be hijacked. The National Institute of Standards and Technology (NIST) does not recommend SMS for two-factor authentication. See also: The Verge, and Schneier. As such, I normally do not enable SMS as a second factor.

Getting to the point, I got a new phone yesterday. I spent a couple of hours the night before making sure I had backup codes and/or a secondary method for the 2nd factor. All went well, but I had no Tumblr backup codes. Nor did I set SMS as an auth method.

Tumblr's recovery process requires that you have a photo of your face on your Tumblr account (avatar, etc.). You then take a picture of yourself holding a piece of paper with something particular written on it, and send it to them together with the URL of the picture already on Tumblr.

So, linuxfollies.tumblr.com is now no longer under my control. It will, however, keep getting mirrors of posts here as long as IFTTT remains up.

2018-01-16

Python multiprocessing ignores cgroups

At work, I noticed a Python job that was causing a large overload condition. The job requested 16 CPU cores, and was running gensim.models.ldamulticore requesting 15 workers. However, the load on that server was many times higher than 16. This is despite a cgroups cpuset restricting the number of cores for that process to 16.

It turns out that gensim.models.ldamulticore uses Python multiprocessing. That module decides how many worker processes to run based on the number of CPU cores read directly from /proc/cpuinfo. This completely bypasses the limitations imposed by cgroups.

There is currently an open enhancement request to add a new function to multiprocessing for requesting the number of usable CPU cores rather than the total number of CPU cores.
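
Until something like that exists, a workaround on Linux is to ask for the process's CPU affinity instead of the machine's total core count. The sketch below compares the two; usable_cpu_count is a name made up for this example, not a function in the standard library. Note that os.sched_getaffinity reflects cpuset and taskset restrictions, but not CPU time quotas.

```python
import os


def usable_cpu_count():
    """Number of CPUs this process may run on, falling back to the total."""
    try:
        # Linux-only: reflects cpuset cgroups and taskset affinity masks.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: best effort.
        return os.cpu_count() or 1


if __name__ == "__main__":
    print("total CPUs: ", os.cpu_count())
    print("usable CPUs:", usable_cpu_count())
```

The result can then be passed explicitly wherever a worker count is accepted, e.g. multiprocessing.Pool(processes=usable_cpu_count()), rather than relying on the default.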