An EXOS switch offers two ways to look at CPU utilization. The top command shows the real-time utilization of the EXOS processes and refreshes every second. In comparison, the show cpu-monitoring command breaks utilization down per process over 5-, 10-, and 30-second and 1-, 5-, 30-, and 60-minute intervals and lists the processes in alphabetical order.
- How to check CPU Utilization in EXOS
Option 1. Use the 'top' command to check real-time utilization.
The CPU line in the output of the top command contains the following fields:
· usr: user cpu time (or) % CPU time spent in user space
· sys: system cpu time (or) % CPU time spent in kernel space
· nic: user nice cpu time (or) % CPU time spent on low priority processes
· idle: idle cpu time (or) % CPU time spent idle
· io: I/O wait time (or) % CPU time spent waiting for I/O to complete
· irq: hardware irq (or) % CPU time spent servicing/handling hardware interrupts
· sirq: software irq (or) % CPU time spent servicing/handling software interrupts
The Load average line reports the system load averaged over the last 1, 5, and 15 minutes.
top
Switch# top
Mem: 391088K used, 589604K free, 716K shrd, 16888K buff, 120060K cached
CPU: 15.4% usr 2.2% sys 0.0% nic 82.2% idle 0.0% io 0.0% irq 0.0% sirq
Load average: 4.18 4.23 4.19 2/274 4199
PID PPID USER STAT RSS %RSS CPU %CPU COMMAND
1949 1 root S< 89492 9.0 0 14.8 ./hal
11681 1 root S 23672 2.4 0 0.3 ./expy -d -m exos.httpd
1955 1 root S 5480 0.5 0 0.3 ./fdb
1975 1 root S 4568 0.4 0 0.3 ./dot1ag
2060 1 root S 4960 0.5 0 0.3 ./pim
5804 5802 root S 8232 0.8 0 0.3 /exos/bin/hiveagent_pr { "upgradeVersion": "", "status": 0, "infor
1653 1 root S 5632 0.5 0 0.3 /exos/bin/epm -t 40 -f /exos/config/epmrc -d /exos/config/epmdprc
4188 3727 root R 1968 0.2 0 0.3 top -d 3
21691 2 root IW 0 0.0 0 0.3 [kworker/0:2]
1953 1 root S 6340 0.6 0 0.0 ./vlan
1939 1 root S 36640 3.7 0 0.0 ./cliMaster
5796 2019 root S 16128 1.6 0 0.0 /exos/bin/expy -m exos.apps.iqagent -v 2
2007 1 root S 16000 1.6 0 0.0 ./policy
1945 1 root S 6676 0.6 0 0.0 ./aaa -t random
1993 1 root S 5256 0.5 0 0.0 ./exsshd
2056 1 root S 4376 0.4 0 0.0 ./ospf
1941 1 root S 7924 0.8 0 0.0 ./snmpMaster
Press Ctrl + C or q to exit from the top command's monitoring screen.
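When top output is captured to a text file, the summary lines can be reduced to numbers for trending. The following is a minimal Python sketch, assuming the CPU and Load average lines look like the sample above; the function name parse_top_summary is hypothetical and not part of EXOS.

```python
import re

def parse_top_summary(text):
    """Extract the CPU breakdown and load averages from top summary lines.

    Hypothetical helper; it assumes the line layout shown in the sample output.
    """
    summary = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CPU:"):
            # Fields look like "15.4% usr"; capture the number and its label.
            for value, label in re.findall(r"([\d.]+)%\s+(\w+)", line):
                summary[label] = float(value)
        elif line.startswith("Load average:"):
            # The first three fields are the 1, 5, and 15 minute averages.
            fields = line.split(":", 1)[1].split()
            summary["load"] = tuple(float(f) for f in fields[:3])
    return summary

sample = """\
Mem: 391088K used, 589604K free, 716K shrd, 16888K buff, 120060K cached
CPU: 15.4% usr 2.2% sys 0.0% nic 82.2% idle 0.0% io 0.0% irq 0.0% sirq
Load average: 4.18 4.23 4.19 2/274 4199
"""

summary = parse_top_summary(sample)
# Everything that is not idle time counts as busy.
busy = round(100.0 - summary["idle"], 1)
print(summary["usr"], summary["idle"], busy, summary["load"])
```

Subtracting the idle field from 100 gives a single busy figure that is easier to chart than the individual fields.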
Option 2. Use 'show cpu-monitoring' to check utilization over intervals of seconds and minutes.
By default, CPU monitoring is enabled and occurs every 5 seconds. The default CPU threshold value is 90%.
Depending on the software version running on your switch or your switch model, additional or different CPU and process information might be displayed.
The show cpu-monitoring command is helpful for understanding the behavior of a process over an extended period of time. The following information appears in a tabular format:
· Card: The location (MSM A or MSM B) where the process is running on a modular switch.
· Process: The name of the process.
· Range of time (5 seconds, 10 seconds, and so forth): The CPU utilization history of the process or the system. The CPU utilization history goes back only 1 hour.
· Total User/System CPU Usage: The amount of time, recorded in seconds, that the process spends occupying CPU resources. The values are cumulative, meaning that they are displayed for as long as the system is running. You can use this information for debugging purposes to see where the process spends the most time: user context or system context.
show cpu-monitoring
Switch# show cpu-monitoring
CPU Utilization Statistics - Monitored every 5 seconds
-----------------------------------------------------------------------
Process      5    10    30     1     5    30     1   Max               Total
          secs  secs  secs   min  mins  mins  hour               User/System
          util  util  util  util  util  util  util  util           CPU Usage
           (%)   (%)   (%)   (%)   (%)   (%)   (%)   (%)              (secs)
-----------------------------------------------------------------------
System     0.0   0.0   0.0   0.0   0.0   0.0   0.0  84.1  17486.59  19182.42
aaa        0.1   0.0   0.0   0.2   0.1   0.1   0.1   1.6   1002.78    807.02
acl        0.0   0.1   0.0   0.1   0.1   0.1   0.1   1.9   4688.14   4188.75
bfd        0.0   0.0   0.0   0.0   0.0   0.0   0.0   1.0    934.45    766.22
bgp        0.0   0.0   0.0   0.0   0.0   0.0   0.0   5.6    280.48    112.71
brm        0.0   0.0   0.0   0.0   0.0   0.0   0.0   1.0    279.33    131.17
cfgmgr     0.0   0.0   0.0   0.0   0.2   0.1   0.2   4.8   2036.28    994.53
cli        0.0   0.0   0.0   0.2   0.0   0.0   0.0  33.5    558.08    256.61
devmgr     0.0   0.0   0.0   0.2   0.0   0.1   0.1   5.6   1844.89    434.18
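Captured show cpu-monitoring output can be scanned for processes whose utilization crossed a limit. The following Python sketch assumes the standalone-switch layout shown above (a process name followed by eight percentages and two cumulative second counters); modular switches add a Card column that this sketch does not handle. The names parse_cpu_monitoring and over_threshold are hypothetical.

```python
# Labels for the eight percentage columns, in the order they are displayed.
COLUMNS = ["5s", "10s", "30s", "1m", "5m", "30m", "1h", "max"]

def parse_cpu_monitoring(text):
    """Return {process: row} for lines holding a name plus ten numbers."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 11:
            continue  # banner, header, and separator lines
        try:
            numbers = [float(p) for p in parts[1:]]
        except ValueError:
            continue  # not a data row
        row = dict(zip(COLUMNS, numbers[:8]))
        row["user_secs"], row["system_secs"] = numbers[8], numbers[9]
        rows[parts[0]] = row
    return rows

def over_threshold(rows, column="max", threshold=90.0):
    """List processes whose value in `column` exceeds `threshold` percent."""
    return sorted(name for name, row in rows.items() if row[column] > threshold)

sample = """\
CPU Utilization Statistics - Monitored every 5 seconds
Process 5 10 30 1 5 30 1 Max Total
System 0.0 0.0 0.0 0.0 0.0 0.0 0.0 84.1 17486.59 19182.42
aaa 0.1 0.0 0.0 0.2 0.1 0.1 0.1 1.6 1002.78 807.02
cli 0.0 0.0 0.0 0.2 0.0 0.0 0.0 33.5 558.08 256.61
"""

rows = parse_cpu_monitoring(sample)
print(over_threshold(rows))                   # uses the 90% default
print(over_threshold(rows, "max", 30.0))      # a lower, illustrative limit
```

The default of 90.0 mirrors the default CPU threshold mentioned above; the 30.0 value is only for illustration.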
- Understand the processes
Use the following commands to see a description of each process.
show process description
Switch# show process description
Process Name Description
----------------------------------------------------------------------
aaa Authentication, Authorization, and Accounting Server
acl Access Control List Manager
bfd IETF Bidirectional Forwarding Detection
...snipped ...
Switch# show process description thttpd
Process Name Description
----------------------------------------------------------------------
thttpd HTTP Services
- High CPU utilization cases
Here are several high CPU utilization cases and resolutions.
Case 1. High CPU utilization with hal process after upgrading to EXOS 30.7.
EXOS 30.6 and earlier versions did not account for the CPU consumption of the hal process, so you will see higher reported CPU utilization after upgrading the switch to EXOS 30.7 or later.
* Resolution:
Upgrade to the latest patch of the recommended release. The following link provides the recommended EXOS and Switch Engine releases for each hardware platform.
▶ ExtremeXOS and Switch Engine Release Recommendations
Case 2. High CPU utilization with hal process in EXOS 22.x.
Even with the default configuration, CPU utilization can be above 20% in EXOS 22.x. On some lower-end switches in particular, this can cause peaks of over 90% when a lot of programming is happening on the switch. The code segment that handles the link scan and multicast reprogramming ran in kernel space in the earlier 21.x versions. Starting with 22.x, this segment runs in the hal process.
* Resolution:
Upgrade to the latest patch of the recommended release.
Case 3. High CPU utilization from VSM process.
* Symptoms:
High CPU utilization on the Backup node of a SummitStack. The following log message is generated:
<Warn:EPM.cpu> Slot-2: CPU utilization monitor: process vsm consumes 97 % CPU
* Cause:
TCP port 4001 (used for communication between MLAG peers) is open when it should not be.
A large amount of traffic is being sent to TCP port 4001, leading to high CPU utilization from the vsm process.
* Resolution:
Upgrade to a version of code that includes the fix for CR xos0052842.
Case 4. High CPU Utilization from Nodemgr process.
* Symptoms:
- High CPU utilization from Nodemgr process
- System uptime equal to or greater than 994 days
A message similar to the following log entry may be seen:
07/25/2014 11:50:24.18 MSM-B: CPU utilization monitor: process nodemgr consumes 79 % CPU
* Environment
EXOS versions 12.4.x prior to 12.4.4.8-patch1-1
EXOS versions 12.5.x prior to 12.5.3.7
EXOS versions 12.3.x
* Cause
The nodemgr process continuously consumes excessive CPU once the system uptime reaches around 994 days.
* Resolution
This issue has been resolved under CR xos0042592.
It has been fixed in the following EXOS releases:
12.4.4.8-patch1-1 and later
12.5.3.7 and later
A temporary workaround is to reboot the switch.
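The CPU utilization monitor messages quoted in Cases 3 and 4 share a common tail, so a collected log can be searched for them. The following Python sketch assumes the message wording matches the two samples above; the function name find_cpu_alerts is hypothetical.

```python
import re

# Matches "<node>: CPU utilization monitor: process <name> consumes <n> % CPU".
ALERT = re.compile(
    r"(?P<node>\S+): CPU utilization monitor: "
    r"process (?P<process>\S+) consumes (?P<percent>\d+) % CPU"
)

def find_cpu_alerts(log_text):
    """Return (node, process, percent) for each monitor message found."""
    alerts = []
    for line in log_text.splitlines():
        match = ALERT.search(line)
        if match:
            alerts.append(
                (match["node"], match["process"], int(match["percent"]))
            )
    return alerts

sample = """\
<Warn:EPM.cpu> Slot-2: CPU utilization monitor: process vsm consumes 97 % CPU
07/25/2014 11:50:24.18 MSM-B: CPU utilization monitor: process nodemgr consumes 79 % CPU
"""

for node, process, percent in find_cpu_alerts(sample):
    print(node, process, percent)
```

Grouping the results by process name makes it easier to tell which of the cases in this article a switch is hitting.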
Case 5. High CPU utilization from bcmRx process on sFlow-enabled switch.
* Symptoms
High CPU utilization of the bcmRx process on an sFlow-enabled switch.
* Environment
EXOS 12.4, BlackDiamond 8810
* Cause
sFlow is sending excessive traffic to the CPU.
* Resolution
Disable sFlow or configure a lower CPU sample limit, for example "configure sflow max-cpu-sample-limit 100".