CPU at 100% – How do I connect to the box ?

Did you know you can set up a priority queue on your Check Point R77.30 appliance so that, should the device become unresponsive due to high CPU load, you can still connect to it?

Have you ever had to run into a DC and pull a power cord because you could not connect to a box? Well, now you can set up priority queues so you won't have to do that.

Priority Queues are a mechanism intended to prioritize part of the traffic when the Security Gateway has to drop packets because it is stressed (the CPU is fully utilized). In R77.20 and lower versions, when the CPU became fully utilized, part of the traffic was dropped regardless of traffic type. As a result, control connections (described below) were dropped, which had a serious negative impact (e.g., no SSH connectivity). In addition, a few “heavy” connections could cause high CPU load on the Security Gateway and create issues for all other connections. R77.30, however, “protects” the CPU cores on which the Firewall is running.

To set this up, follow these instructions:

Instructions:

To check the current mode on Security Gateway:
[Expert@HostName]# fw ctl multik get_mode

To fully enable the Firewall Priority Queues on Security Gateway:

Note: In a cluster environment, this procedure must be performed on all members of the cluster.
1. Run in Expert mode:
[Expert@HostName]# fw ctl multik set_mode 9

2. Reboot (in a cluster, this might cause a fail-over).

There are 3 modes (see the chart below), and you can switch easily between them.

[Chart: the three Firewall Priority Queues modes]
The Firewall Priority Queues feature is now fully enabled; however, that does not mean it is always on. When is it on? It kicks in only under extreme conditions, such as when the CPU is overloaded. The queues themselves are already predefined. See the chart below:

[Chart: the predefined priority queues]

You can also use this feature to monitor the heavy connections (those that consume the most CPU resources) without interrupting the normal operation of the Firewall, by using the same command with mode 1: fw ctl multik set_mode 1
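
Putting it together, here is a minimal Expert-mode sketch (choose one of the two set_mode lines; modes 1 and 9 are the values referenced in this post, and per the procedure above a reboot is needed after changing the mode):

[Expert@HostName]# fw ctl multik get_mode       # check the currently configured mode
[Expert@HostName]# fw ctl multik set_mode 1     # monitoring only: watch heavy connections without affecting traffic
[Expert@HostName]# fw ctl multik set_mode 9     # fully enable Firewall Priority Queues
[Expert@HostName]# reboot                       # apply the change (in a cluster this may cause a fail-over)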

To learn more specifics, check out sk105762.

Happy uptime !!

References:

sk105762
sk105261
sk52421

 

Setting the Cluster_id from the command line on Check Point

This is the replacement for MAC magic as of R77.30; if you are using an older version, you still have to use MAC magic. We now use the Cluster ID. The reason we set a Cluster ID is that putting two or more clusters on the same subnet creates a problem: the Cluster Control Protocol (CCP) packets that are sent between members of the same cluster reach the neighboring cluster (connected to the same network) and “confuse” it.

So changing this number ensures communication with the correct cluster members.

Most of the time you set this while running the First Time Wizard in the WebUI; however, should you need to change it, or to establish the Cluster ID from the command line, it is a simple procedure.

This must be done on both cluster members.

  1. Log in and enter Expert mode.
  2. View the current setting, type: cphaconf cluster_id get
  3. Set the new value, type: cphaconf cluster_id set <number> (1-252)
  4. Reboot the appliance.
  • It is very important that you do not use 253 or 254. These are the default cluster IDs of the device, so if you are putting another cluster on the wire and you change the ID to, say, 254, you haven’t achieved anything, because that is the default, and you will still get errors in the log.
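
For example, a minimal Expert-mode sketch using a hypothetical Cluster ID of 10 (pick any value in the allowed range, and run the same commands on every cluster member):

[Expert@HostName]# cphaconf cluster_id get      # show the currently configured Cluster Global ID
[Expert@HostName]# cphaconf cluster_id set 10   # 10 is just an example value; avoid 253 and 254
[Expert@HostName]# reboot                       # reboot the appliance, as per step 4 above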

This command sets the value of the Cluster Global ID permanently: the configured value is automatically and immediately inserted into the $FW_BOOT_DIR/ha_boot.conf file.
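
If you want to double-check that the value was persisted, a quick sketch (the exact layout of the file may differ, but the ID you configured should appear in it):

[Expert@HostName]# cat $FW_BOOT_DIR/ha_boot.conf   # the configured Cluster Global ID should be listed here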

You can also do this in Gaia when you first install the box. Here is a screenshot from the First Time Wizard.

[Screenshot: setting the Cluster Global ID in the Gaia First Time Wizard]

References:

sk25977

No Downtime Check Point Cluster Upgrades

Ever wonder how to do a no downtime upgrade on a Check Point Cluster?

Many of my customers have asked me for best practices for upgrading their clusters. Over the years I have shown (and used) the method of failing over back and forth between members, relying on state synchronization to effectively create a no-downtime scenario; however, there was no real guarantee of uptime, or that a connection wouldn’t be dropped.

Well, now in R77 you can do what is called a Full Connectivity Upgrade. Basically, it means:

• Connection failover is guaranteed.
• There is always at least one active cluster member that handles the traffic.
• Connections are synchronized between cluster members running different Check Point software versions.

Sound good ???

Here are the nine simple steps (a condensed command-line recap follows the numbered list):

Before you upgrade:
• Make sure that the cluster has 2 members, where one of them is the Active member and the other member is in Standby.
• Get the sync interface IP address and the cluster member ID of the Active cluster member.

From the command line in Expert mode, type: cphaprob stat
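
The output looks roughly like this (the exact layout varies by version, and the member numbers and addresses below are hypothetical); the Number column gives you the member ID and the Unique Address column gives you the sync interface IP you need to record:

Cluster Mode:   High Availability (Active Up)

Number     Unique Address  Assigned Load   State

1 (local)  10.0.0.1        100%            Active
2          10.0.0.2        0%              Standby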

To upgrade the cluster:

  1. In SmartDashboard:
    – In the Gateway Cluster General Properties window, change the Cluster version to the upgraded one.
    – In the Install Policy window, go to the Installation Mode area, select Install on each selected gateway independently, and clear the option “For Gateway Clusters install on all the members, if it fails do not install at all”.
  2. Install the security policy on the cluster.
    (Note – The policy successfully installs on the Ready cluster member and fails to install on the Active cluster member. This is expected, ignore the warning.)
  3. On the Active cluster member, run: cphaprob stat
    (Make sure the state is Active or Active Attention, and record the Sync IP and the Member ID of the cluster member.)
  4. On the upgraded cluster member, run these commands:
    cphaprob stat
    (Make sure that the cluster member is in Ready state.)
    cphacu start
    (The Connectivity Upgrade runs. When it finishes, the member’s state is Ready for Failover.)
    cphacu stat
    (Make sure that the Active cluster member handles the traffic.)
  5. On the Active cluster member, run these commands:
    cphaprob stat
    (Make sure the local member is in Active or Active Attention state, and the upgraded member is in Down state.)
    cpstop
    (The connections fail over to the upgraded cluster member.)
  6. On the upgraded cluster member, run: cphaprob stat
    (Make sure that it is now in the Active state.)
  7. On the new upgraded cluster member, run: cphacu stat
    (Make sure it handles the traffic.)
  8. Upgrade the former Active cluster member.
    (Make sure to reboot it after the upgrade.)
  9. Install Policy.
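
To recap the command-line portion of the steps above in one place, here is a condensed sketch (the prompts “Active” and “Upgraded” are just labels for the two members):

On the Active (not yet upgraded) member:
[Expert@Active]# cphaprob stat        # confirm Active or Active Attention; note the Sync IP and Member ID

On the upgraded member:
[Expert@Upgraded]# cphaprob stat      # should report Ready
[Expert@Upgraded]# cphacu start       # run the Connectivity Upgrade; state becomes Ready for Failover
[Expert@Upgraded]# cphacu stat        # confirm the Active member still handles the traffic

Back on the Active member:
[Expert@Active]# cphaprob stat        # local member Active, upgraded member Down
[Expert@Active]# cpstop               # connections fail over to the upgraded member

On the upgraded member again:
[Expert@Upgraded]# cphaprob stat      # should now be Active
[Expert@Upgraded]# cphacu stat        # confirm it now handles the traffic

Then upgrade and reboot the former Active member, and install policy.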

Note: After the cluster upgrade is complete, the Cluster Control Protocol works in broadcast mode. I recommend this mode; however, if you were running in multicast mode and wish to return to that state, run the following command on all cluster members: cphaconf set_ccp multicast

That is all it takes to complete a successful Full Connectivity Upgrade of an HA cluster. If, however, you have more than a two-member cluster, have a VSX cluster, or would like more information, check out the CP_CU_BestPractices.pdf guide, which covers all of this.


References:

Connectivity Upgrades Best Practices R77 (CP_CU_BestPractices.pdf)