Showing posts with label redhat. Show all posts


Migrating LDAP server to a different machine and changing to OLC and from bdb to mdb (lmdb)

At our site, we have LDAP (openldap 2.4) running on one server. It uses the old slapd.conf configuration, and the Berkeley DB (bdb) backend.

As part of planning for the future, I want to move this LDAP server to a different machine. I wanted to also migrate to using on-line configuration (OLC), where the static slapd.conf file is replaced with the cn=config online LDAP "directory". This allows configuration changes to be made at runtime without restarting slapd.

I also wanted to change from using the Berkeley DB backend to the Lightning Memory-Mapped DB (LMDB; known as just "mdb" in the configs). LMDB is what OpenLDAP recommends, as it is quick (the database is memory-mapped) and easier to manage (fewer tuning options, no db files to mess with). From here, I will refer to this as "mdb" per the slapd.conf line.

After doing the migration once, leaving the backend as bdb, I found out it was easier to do all three things at once: migrate to a different server, convert from slapd.conf to OLC, and change backend to mdb.

This is a multi-stage process, but nothing too strenuous.
  1. Dump the directory data to an LDIF:  slapcat -n 1 -l n1.ldif
  2. Copy n1.ldif to new machine
  3. Copy slapd.conf to new machine
  4. Edit new slapd.conf on new machine: change the line "database bdb" to "database mdb"
    1. Remove any bdb-specific options, e.g. cachesize, idlcachesize, dbconfig
  5. Import n1.ldif: slapadd -f /etc/openldap/slapd.conf -l n1.ldif
  6. Convert slapd.conf to OLC: slaptest -f slapd.conf -F slapd.d ; chown -R ldap:ldap slapd.d
  7. Move slapd.conf out of the way: mv /etc/openldap/slapd.conf /etc/openldap/slapd.conf.old
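
Step 4 above can be scripted. This is only a minimal sketch: the function name and the list of directives it comments out are my assumptions, so check them against what your own slapd.conf actually contains. It works on a copy and is demonstrated here on a throwaway sample file.

```shell
# Sketch: rewrite a copy of slapd.conf for the mdb backend.
# The directive list (cachesize, idlcachesize, dbconfig) is an assumption;
# adjust it to whatever bdb-only settings your file has.
convert_to_mdb() {
    # $1 = input slapd.conf, $2 = output file
    sed -E \
        -e 's/^database[[:space:]]+bdb[[:space:]]*$/database mdb/' \
        -e 's/^(cachesize|idlcachesize|dbconfig)([[:space:]].*)?$/# bdb-only, disabled: &/' \
        "$1" > "$2"
}

# Demonstration on a made-up sample file.
workdir=$(mktemp -d)
cat > "$workdir/slapd.conf" <<'EOF'
database bdb
suffix "dc=example,dc=com"
directory /var/lib/ldap
cachesize 10000
index objectClass eq
EOF
convert_to_mdb "$workdir/slapd.conf" "$workdir/slapd.conf.mdb"
cat "$workdir/slapd.conf.mdb"
```

The bdb-only lines are commented out rather than deleted so they are easy to compare later. slapd-mdb(5) also documents a maxsize directive for the map size; it is worth checking whether the default suits your directory before running the import.
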
Other complications:
  • You will probably need to generate a new SSL certificate for the new server
  • That may mean signing the new cert with your own (or established) CA
    • Or, you can set all your client nodes to not require the cert: in /etc/openldap/ldap.conf add “TLS_REQCERT never”. Fix up the /etc/sssd/sssd.conf file similarly: add ldap_tls_reqcert = never
  • Fix up your /etc/sssd/sssd.conf
Note that it is pretty easy to back things out and start from scratch. To restore the new server to a "blank slate" condition, delete everything in /etc/openldap/slapd.d/

     service slapd stop
     cd /etc/openldap/slapd.d
     rm -rf *

CAUTION: This process seems to create the n0 db with an olcRootDN of “cn=config” where it should be “cn=admin,dc=example,dc=com” (or whatever your LDAP rootDN should be; for Bright-configured clusters, that would be “cn=root,dc=cm,dc=cluster”). I.e. you need to have:

dn: olcDatabase={0}config,cn=config
olcRootDN: cn=root,dc=cm,dc=cluster

but olcRootDN is set to cn=config instead.

To fix it, I dumped n0 and n1, deleted /etc/openldap/slapd.d, and “restored” from the dumped n0 and n1. Basically, emulating a restore from backup.

  • Dump {0} to n0.ldif
  • Shut down slapd
  • Modify n0.ldif to have the needed olcRootDN (as above)
  • Move away the old /etc/openldap/slapd.d/ directory: mv /etc/openldap/slapd.d /etc/openldap/slapd.d.BAK
  • Create a new slapd.d directory: mkdir /etc/openldap/slapd.d 
  • Add the dumped n0.ldif as the new config: slapadd -n 0 -F /etc/openldap/slapd.d -l n0.ldif
  • Fix permissions: chown -R ldap:ldap /etc/openldap/slapd.d ; chmod 700 /etc/openldap/slapd.d
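
The "modify n0.ldif" step can be done by hand in an editor, or with something like the sketch below. The function name and the sample LDIF are made up for illustration, and it assumes the olcRootDN line is not wrapped. It only touches olcRootDN inside the olcDatabase={0}config entry and leaves every other entry alone.

```shell
# Sketch: rewrite olcRootDN in the {0}config entry of a dumped config LDIF.
# Assumes the {0}config entry already has an olcRootDN line; if it does not,
# the file is copied unchanged.
set_config_rootdn() {
    # $1 = dumped config LDIF, $2 = wanted rootDN, $3 = output file
    awk -v rootdn="$2" '
        /^dn: / { in_cfg = ($0 == "dn: olcDatabase={0}config,cn=config") }
        /^$/    { in_cfg = 0 }            # a blank line ends the entry
        in_cfg && /^olcRootDN: / { print "olcRootDN: " rootdn; next }
        { print }
    ' "$1" > "$3"
}

# Demonstration on a made-up, heavily shortened dump.
ldifdir=$(mktemp -d)
cat > "$ldifdir/n0.ldif" <<'EOF'
dn: cn=config
objectClass: olcGlobal
cn: config

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcRootDN: cn=config

dn: olcDatabase={1}mdb,cn=config
objectClass: olcMdbConfig
olcDatabase: {1}mdb
olcRootDN: cn=Manager,dc=example,dc=com
EOF
set_config_rootdn "$ldifdir/n0.ldif" "cn=admin,dc=example,dc=com" "$ldifdir/n0.fixed.ldif"
grep '^olcRootDN' "$ldifdir/n0.fixed.ldif"
```

Writing to a separate output file keeps the original dump around in case the edit needs to be redone.
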


Mellanox Infiniband network cards on Linux

Sometimes, when one updates the firmware for Mellanox Infiniband cards, the MAC/hardware address gets changed. This usually happens if the IB card is OEM, i.e. made by Mellanox but stamped with a different company's name.

When the MAC gets changed, the network interface will not come up. The fix is to update the HWADDR field in /etc/sysconfig/network-scripts/ifcfg-ib0 and /etc/sysconfig/network-scripts/ifcfg-ib1. Use "ip link list" to display the new MAC.
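
If there are many nodes to fix, the HWADDR edit can be scripted. A minimal sketch follows; the function name and the addresses in the sample are invented, and it is shown against a temporary file rather than the real ifcfg-ib0.

```shell
# Sketch: point HWADDR in an ifcfg file at a new hardware address.
set_hwaddr() {
    # $1 = ifcfg file, $2 = new hardware address
    tmp=$(mktemp)
    # Write through cat so the original file keeps its owner and mode.
    sed -E "s/^HWADDR=.*/HWADDR=$2/" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demonstration on a made-up sample; both addresses are invented.
ifcfg=$(mktemp)
cat > "$ifcfg" <<'EOF'
DEVICE=ib0
TYPE=InfiniBand
HWADDR=80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:0a:bc:01
ONBOOT=yes
EOF
new_mac=80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:0a:bc:99
set_hwaddr "$ifcfg" "$new_mac"
cat "$ifcfg"
```

On a real node the new address could be read from /sys/class/net/ib0/address instead of being typed in, e.g. set_hwaddr /etc/sysconfig/network-scripts/ifcfg-ib0 "$(cat /sys/class/net/ib0/address)".
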


RHEL 6.4 kernel 2.6.32-358.23.2, Mellanox OFED 2.1-1.0.6, and Lustre client 2.5.0

I am planning some upgrades for the cluster that I manage. As part of the updates, it would be good to have MVAPICH2 with GDR (GPU-Direct RDMA -- yes, that's an acronym of an acronym). MVAPICH2-GDR, which is provided only as binary RPMs, only supports Mellanox OFED 2.1.

Now, our cluster runs RHEL6.4, but with most non-kernel and non-glibc packages updated to whatever is in RHEL6.5. The plan is to update everything to whatever is in RHEL6.6, except for the kernel, leaving that at 2.6.32-358.23.2, which is the last RHEL6.4 kernel update. The reason for staying with that kernel version is Lustre.

We have a Terascala Lustre filesystem appliance. The latest release of TeraOS uses Lustre 2.5.0. Upgrading the server is pretty straightforward, according to the Terascala engineers. Updating the client is a bit trickier. Currently, the Lustre support matrix says that Lustre 2.5.0 is supported only on RHEL6.4.

The plan of attack is this:

  1. Update a base node with all RHEL packages, leaving the kernel at 2.6.32-358.23.2
  2. Upgrade Mellanox OFED from 1.9 to 2.1
  3. Build lustre-client-2.5.0 and upgrade the Lustre client packages

Updating the base node is straightforward. Just use "yum update", after commenting out the exclusions in /etc/yum.conf. If you had updated the redhat-release-server-6Server package, which defines which RHEL release you have, you can downgrade it. (See RHEL Knowledgebase, subscription required.) First, install the last (as of 2014-12-15) RHEL6.4 kernel, and then do the downgrade:
# yum install kernel-2.6.32-358.23.2.el6
# reboot
# yum downgrade redhat-release-server-6Server

Check with "cat /etc/redhat-release".

Next, install Mellanox OFED 2.1-1.0.6. You can install it directly using the provided installation script, or if you are paranoid like me, you can use the provided script to build RPMs against the exact kernel update you have installed.

Get the tarball directly from Mellanox. Extract, and make new RPMs:
# tar xf MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64.tgz
# cd MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64
# ./ -m .
# cp /tmp/MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64-ext.tgz .
# tar xf MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64-ext.tgz
# cd MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64-ext
# ./mlnxofedinstall
# reboot

Strictly speaking, the reboot is unnecessary: you can stop and restart a couple of services and the new OFED will load.

Next, for Lustre. Get the SRPM from Intel (who bought WhamCloud). You will notice that it is for kernel 2.6.32-358.18.1. Not mentioned is the fact that by default, it uses the generic OFED that RedHat rolls into its distribution. To use the Mellanox OFED, a slightly different installation method must be used.

# rpm -ivh lustre-client-2.5.0-2.6.32_358.18.1.el6.x86_64.src.rpm
# cd ~/rpmbuild/SOURCES
# cp lustre-2.5.0.tar.gz ~/tmp
# cd ~/tmp
# tar xf lustre-2.5.0.tar.gz
# cd lustre-2.5.0
# ./configure --disable-server --with-o2ib=/usr/src/ofa_kernel/default
# make rpms
# cd ~/rpmbuild/RPMS/x86_64
# yum install lustre-client-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm \
lustre-client-modules-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm \
lustre-client-tests-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm

To make the lustre module load at boot, I have a kludge: in /etc/init.d/netfs, right after the line
STRING=$"Checking network-attached filesystems"
add the line
modprobe lustre

Reboot, and then check:
# lsmod | grep lustre
lustre                921744  0
lov                   516461  1 lustre
mdc                   199005  1 lustre
ptlrpc               1295397  6 mgc,lustre,lov,osc,mdc,fid
obdclass             1128062  41 mgc,lustre,lov,osc,mdc,fid,ptlrpc
lnet                  343705  4 lustre,ko2iblnd,ptlrpc,obdclass
lvfs                   16582  8 mgc,lustre,lov,osc,mdc,fid,ptlrpc,obdclass
libcfs                491320  11 mgc,lustre,lov,osc,mdc,fid,ko2iblnd,ptlrpc,obdclass,lnet,lvfs
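
For checking a batch of nodes, eyeballing lsmod output gets old. Here is a small sketch that reads lsmod-style text and reports which of the named modules are absent. The function name is made up, and the sample text is a few lines copied from the output above plus a deliberately fake module name.

```shell
# Sketch: report which of the named kernel modules are missing.
missing_modules() {
    # Reads lsmod-style text on stdin; prints each name from "$@" that
    # does not appear in the first column.
    loaded=$(awk '{ print $1 }')
    for mod in "$@"; do
        printf '%s\n' "$loaded" | grep -qx "$mod" || printf '%s\n' "$mod"
    done
}

# Demonstration on a few lines taken from the lsmod output above.
sample='lustre                921744  0
lov                   516461  1 lustre
lnet                  343705  4 lustre,ko2iblnd,ptlrpc,obdclass'
missing=$(printf '%s\n' "$sample" | missing_modules lustre lnet not_a_real_module)
echo "missing: ${missing:-none}"
```

On a live node the input would come from lsmod itself, e.g. lsmod | missing_modules lustre lnet. Note that it matches only the first column, so a module that shows up only in another module's "used by" list does not count as loaded.
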


Limiting logins under SSSD

Under SSSD, you can pretty easily limit logins to specific users or groups. The syntax is different from that of /etc/security/access.conf, and is actually easier. Red Hat has some documentation (may require login). There is also a man page for sssd.conf(5).

Under your domain, add some lines to configure "simple" access control:
access_provider = simple 
simple_allow_users = topbanana 
simple_allow_groups = bunchofbananas,wheel
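
To double-check that the lines ended up under the right domain section (and not under [sssd] or another domain), something like the following sketch can read a setting back out. The function name and the domain name LDAP are assumptions for illustration; the sample file is made up around the three lines above.

```shell
# Sketch: print the value of one option from a [domain/NAME] section.
domain_option() {
    # $1 = sssd.conf path, $2 = domain name, $3 = option name
    awk -v section="[domain/$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section); next }
        in_section && index($0, "=") > 0 {
            name  = substr($0, 1, index($0, "=") - 1)
            value = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim both sides
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) print value
        }
    ' "$1"
}

# Demonstration on a made-up sssd.conf; the domain name is hypothetical.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[sssd]
domains = LDAP
services = nss, pam

[domain/LDAP]
id_provider = ldap
access_provider = simple
simple_allow_users = topbanana
simple_allow_groups = bunchofbananas,wheel
EOF
domain_option "$conf" LDAP access_provider
domain_option "$conf" LDAP simple_allow_groups
```

This only reads the file; sssd still needs a restart to pick up any change.
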


root cron jobs and /etc/security/access.conf

On RHEL6, if your root cron jobs do not run, check your /var/log/secure file for lines that look like:
crontab: pam_access(crond:account): access denied for user `root' from `cron'
You may also see the following message when, as root, you type "crontab -e":
Permission denied
You (root) are not allowed to access to (crontab) because of pam configuration.

If there are any like that, check /etc/security/access.conf -- you need to allow root access via cron and crond by adding the following line:
+ : root : cron crond 
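
Where the line goes in the file matters, since pam_access uses the first rule that matches. The sketch below is a deliberately simplified checker, not a reimplementation of pam_access: it only looks at literal root/ALL in the users field and cron/crond/ALL in the origins field, and it skips any rule that uses EXCEPT. The function name and the sample rules are made up.

```shell
# Sketch: print "+" or "-" for the first access.conf rule that would apply
# to root coming from cron. Simplified: no groups, netgroups, or EXCEPT.
first_root_cron_rule() {
    # $1 = path to an access.conf-style file
    awk -F: '
        /^[ \t]*(#|$)/ { next }          # skip comments and blank lines
        /EXCEPT/       { next }          # not handled by this sketch
        NF >= 3 {
            perm = $1; gsub(/[ \t]/, "", perm)
            users = " " $2 " ";   gsub(/\t/, " ", users)
            origins = " " $3 " "; gsub(/\t/, " ", origins)
            if (users ~ / (root|ALL) / && origins ~ / (cron|crond|ALL) /) {
                print perm
                exit
            }
        }
    ' "$1"
}

# Demonstration: a made-up file without, then with, the extra line.
broken=$(mktemp); fixed=$(mktemp)
printf '%s\n' '+ : topbanana : ALL' '- : ALL : ALL' > "$broken"
printf '%s\n' '+ : topbanana : ALL' '+ : root : cron crond' '- : ALL : ALL' > "$fixed"
echo "before: $(first_root_cron_rule "$broken")"
echo "after:  $(first_root_cron_rule "$fixed")"
```

In the first file the catch-all deny is the first rule that covers root from cron; in the second, the added allow line comes before it.
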


More on SSSD - getting finger(1) and command completion of ~ to work

I noticed after getting SSSD up and running that the finger(1) command no longer worked, and neither did command completion of ~username. In the first case, finger(1) never found any users, even when I used the exact username. In the second case, if I typed cd ~d<tab> at the command line, it would not expand to a list of possibilities. However, cd ~david worked just fine.

Turns out, there needs to be one setting in /etc/sssd/sssd.conf:

    enumerate = True

That allows a local precache to be created so that finger(1) can iterate over user info to find a matching record.

Then, restart the sssd service.


Own-horn-tooting: python-pbs

Hah! I'm in the changelog for the python-pbs package for Fedora/RedHat. Frankly, I don't even remember doing this, and I can't find any correspondence in my gmail about it, either.

The python-pbs package is a Python wrapper around libtorque, the library that underlies the Torque resource manager.


More Puppet and SELinux

Remember my previous post about Puppet and SELinux? Well, it turns out it wasn't complete. The policy file was missing a couple of policies. This happened because I didn't completely start from scratch at each iteration of testing, and at some point, I turned SELinux to permissive, so client certificates were being signed with no problem.

In moving to our production server, there were error messages on the client side:

err: Could not request certificate: Error 400 on SERVER: Permission denied - /var/lib/puppet/ssl/ca/serial
Exiting; failed to retrieve certificate and waitforcert is disabled

On the production puppet master, there were AVC denials like:

type=1400 audit(1328213559.254:21031): avc:  denied  { remove_name } for  pid=5901 comm="ruby" name="serial.tmp" dev=dm-2 ino=131791 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=system_u:object_r:puppet_var_lib_t:s0 tclass=dir

with corresponding items in /var/log/messages (why not in /var/log/audit/audit.log? I have no idea):

puppet-master[13193]: Could not rename /var/lib/puppet/ssl/ca/serial to /var/lib/puppet/ssl/ca/serial.tmp: Permission denied - /var/lib/puppet/ssl/ca/serial.tmp or /var/lib/puppet/ssl/ca/serial

(Still unsolved mystery: on the production server, ausearch did not show any AVC denials; the denials were logged to /var/log/messages. I did not try "semodule -DB" to disable all dontaudits.)

On the test system, there were also denials like:

type=AVC msg=audit(1328221549.372:27539363): avc:  denied  { unlink } for  pid=29452 comm="ruby" name="serial.tmp" dev=dm-2 ino=134565 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:puppet_var_lib_t:s0 tclass=file

When a certificate signing request (CSR) comes in to the puppet master from a client, a file /var/lib/puppet/ssl/ca/serial.tmp is created. At the end of the signing process, that file is moved to serial. I think it just does a cp and rm. (My suspicion is based on the unlink permission that it needs.)
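
To see how a denial maps onto a line in the policy file, here is a toy sketch that turns AVC denial lines into allow rules. The real tool for this is audit2allow; the function below is a made-up illustration that only understands the single-line format shown above, and the sample input is the two denials quoted in this post.

```shell
# Sketch: turn AVC denial lines into allow rules (toy version of what
# audit2allow does; it only handles the one-line format shown above).
avc_to_allow() {
    # Reads log text on stdin; prints one allow rule per denial line.
    sed -n -E 's/.*denied +\{ ([^}]*) \}.* scontext=[^:]*:[^:]*:([^:]*):.* tcontext=[^:]*:[^:]*:([^:]*):.* tclass=([a-z0-9_]+).*/allow \2 \3:\4 { \1 };/p'
}

# Demonstration on the two denials quoted in this post.
avclog=$(mktemp)
cat > "$avclog" <<'EOF'
type=1400 audit(1328213559.254:21031): avc:  denied  { remove_name } for  pid=5901 comm="ruby" name="serial.tmp" dev=dm-2 ino=131791 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=system_u:object_r:puppet_var_lib_t:s0 tclass=dir
type=AVC msg=audit(1328221549.372:27539363): avc:  denied  { unlink } for  pid=29452 comm="ruby" name="serial.tmp" dev=dm-2 ino=134565 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:puppet_var_lib_t:s0 tclass=file
EOF
rules=$(avc_to_allow < "$avclog")
printf '%s\n' "$rules"
```

Each rule names the source type, the target type, the object class, and the denied permission, which is exactly what has to be added to the allow lines (and to the require block) of the module.
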

In any case, here is an updated policy file. Note the version number compared to the previous one.
module puppet_passenger 1.15;

require {
        type httpd_t;
        type httpd_passenger_helper_t;
        type port_t;
        type puppet_var_lib_t;
        type puppet_var_run_t;
        type puppet_log_t;
        type proc_net_t;
        type init_t;
        type user_devpts_t;
        class dir { write getattr read create search add_name remove_name rename unlink rmdir };
        class file { write append relabelfrom getattr setattr read relabelto create open rename unlink };
        class udp_socket name_bind;
}

#============= httpd_passenger_helper_t ==============
allow httpd_passenger_helper_t httpd_t:dir { getattr search };
allow httpd_passenger_helper_t httpd_t:file { read open };

#============= httpd_t ==============
#!!!! This avc can be allowed using the boolean 'allow_ypbind'

allow httpd_t port_t:udp_socket name_bind;

allow httpd_t proc_net_t:file { read getattr open };

allow httpd_t puppet_var_lib_t:dir { write read create add_name remove_name rename unlink rmdir };
allow httpd_t puppet_var_lib_t:file { relabelfrom relabelto create write append rename unlink };

allow httpd_t puppet_var_run_t:dir { getattr search };

allow httpd_t puppet_log_t:file { getattr setattr };

allow httpd_passenger_helper_t init_t:file { read };
allow httpd_passenger_helper_t init_t:dir { getattr search };


Puppet, Apache, mod_passenger, and SELinux

At work, we are currently deploying Puppet with Apache on Red Hat Enterprise Linux 6 to replace our cfengine on RHEL4/5 setup.

We install Puppet direct from Puppetlabs, and mod_passenger from Stealthy Monkeys.

There are quite a few issues with directory permissions and SELinux. The directory permission issues are fairly easy to diagnose because the httpd log files and the error messages that httpd sends back generally tell you what permissions it expected.

SELinux is a different kettle of fish. After doing ausearch and using audit2allow, plus a little bit of pruning, this seems to be a minimal set of permissions that allow puppet to run under Passenger and Apache (the following is a .te file):

module puppet_passenger 1.5;

require {
        type httpd_t;
        type httpd_passenger_helper_t;
        type port_t;
        type puppet_var_lib_t;
        type proc_net_t;
        class dir { write getattr read create search add_name };
        class file { write append relabelfrom getattr read relabelto create open };
        class udp_socket name_bind;
}

#============= httpd_passenger_helper_t ==============
allow httpd_passenger_helper_t httpd_t:dir { getattr search };
allow httpd_passenger_helper_t httpd_t:file { read open };

#============= httpd_t ==============
#!!!! This avc can be allowed using the boolean 'allow_ypbind'

allow httpd_t port_t:udp_socket name_bind;

allow httpd_t proc_net_t:file { read getattr open };

allow httpd_t puppet_var_lib_t:dir { write read create add_name };
allow httpd_t puppet_var_lib_t:file { relabelfrom relabelto create write append };

To install these changes:
# mkdir -p /usr/share/selinux/packages/puppet_passenger/
# cp puppet_passenger.te /usr/share/selinux/packages/puppet_passenger
# cd /usr/share/selinux/packages/puppet_passenger
# checkmodule -M -m -o puppet_passenger.mod puppet_passenger.te
# semodule_package -o puppet_passenger.pp -m puppet_passenger.mod
# semodule -i puppet_passenger.pp

And if you ever want to remove the permissions, just do:
# semodule -r puppet_passenger