Check Point: Best Practice Kernel Parameters for ClusterXL Stability

This article shows how to configure the best-practice kernel parameters that help keep ClusterXL stable.

It is recommended to set all of the values below on your cluster. See also sk92723 on cluster flapping prevention.

Make sure the kernel parameter changes are made on both cluster members!

To test the values first (they take effect immediately but will not survive a reboot):

fw ctl set int fwha_freeze_state_machine_timeout 200
fw ctl set int fwha_policy_update_timeout_factor 3
fw ctl set int fwha_pnote_timeout_mechanism_monitor_cpu 1
fw ctl set int fwha_pnote_timeout_mechanism_cpu_load_limit 80
fw ctl set int fwha_if_connectivity_tolerance 3
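
After running the commands above, it can help to read the values back on each member before moving on. The following is a minimal sketch, not an official Check Point tool: check_params is a hypothetical helper name, and it assumes that "fw ctl get int" prints the value as the last field of its output (for example "fwha_if_connectivity_tolerance = 3").

```shell
#!/bin/bash
# Parameters and values taken from the "fw ctl set int" commands in this article.
PARAMS="fwha_freeze_state_machine_timeout=200
fwha_policy_update_timeout_factor=3
fwha_pnote_timeout_mechanism_monitor_cpu=1
fwha_pnote_timeout_mechanism_cpu_load_limit=80
fwha_if_connectivity_tolerance=3"

# check_params (hypothetical helper): reads each parameter back with
# "fw ctl get int" and reports whether the running value matches.
# Returns 0 if every value matches, 1 otherwise.
check_params() {
    rc=0
    for pair in $PARAMS; do
        name=${pair%%=*}    # text before the "="
        want=${pair#*=}     # text after the "="
        # Assumption: the value is the last field of the command's output.
        got=$(fw ctl get int "$name" 2>/dev/null | awk '{print $NF}')
        if [ "$got" = "$want" ]; then
            echo "OK       $name = $got"
        else
            echo "MISMATCH $name: expected $want, got '${got:-unknown}'"
            rc=1
        fi
    done
    return $rc
}

# Usage (Expert mode, on each cluster member):
#   check_params
```

Running it on both members is a quick way to confirm that neither was missed.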

Once you are happy with the results, add the values to the $FWDIR/boot/modules/fwkern.conf file so that the parameters persist after a reboot.

If fwkern.conf doesn't exist, create it with the "touch" command:

[Expert@fw-trinity:0] # touch $FWDIR/boot/modules/fwkern.conf

Using the vi editor, add the lines below to the fwkern.conf file. Write each entry as parameter=value with no spaces around the equals sign, and do not add comments, since the file does not support either:

fwha_freeze_state_machine_timeout=200
fwha_policy_update_timeout_factor=3
fwha_pnote_timeout_mechanism_monitor_cpu=1
fwha_pnote_timeout_mechanism_cpu_load_limit=80
fwha_if_connectivity_tolerance=3
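
If you would rather script this step than edit the file by hand, the sketch below adds or updates one entry at a time without creating duplicates. persist_param is a hypothetical helper, not a Check Point utility. It writes each entry as name=value with no spaces, which to my knowledge is the form Check Point's documentation expects for this file, and it replaces any existing line for the same parameter.

```shell
#!/bin/bash
# persist_param (hypothetical helper): set one kernel parameter in a
# fwkern.conf-style file, replacing any existing entry for that parameter.
# Usage: persist_param <file> <name> <value>
persist_param() {
    file=$1; name=$2; value=$3
    tmp="${file}.new.$$"
    touch "$file"    # create the file if it does not exist yet
    # Keep every line except an existing entry for this parameter
    # (matched with or without spaces before the "=").
    grep -v "^${name}[[:space:]]*=" "$file" > "$tmp" || true
    # Append the entry in name=value form, with no spaces.
    printf '%s=%s\n' "$name" "$value" >> "$tmp"
    # Write back through the original file so its ownership and mode are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage (Expert mode, on each cluster member; back up the file first):
#   CONF="$FWDIR/boot/modules/fwkern.conf"
#   cp -p "$CONF" "$CONF.bak"
#   persist_param "$CONF" fwha_freeze_state_machine_timeout 200
#   persist_param "$CONF" fwha_policy_update_timeout_factor 3
#   persist_param "$CONF" fwha_pnote_timeout_mechanism_monitor_cpu 1
#   persist_param "$CONF" fwha_pnote_timeout_mechanism_cpu_load_limit 80
#   persist_param "$CONF" fwha_if_connectivity_tolerance 3
```

Because the helper rewrites an existing entry instead of appending a second one, it is safe to run again if a value needs changing later.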
