- Useful commands for troubleshooting
show running resource-monitor
show running resource-monitor minute last 5
show running resource-monitor second last 30
show running resource-monitor ingress-backlogs
show session id 123456
request session-discard id 123456
show session packet-buffer-protection
less dp-log dp-monitor.log
less dp0-log dp-monitor.log
( severity eq 'high' ) or ( severity eq 'critical' )
( addr.src in '10.254.156.0/24' ) and ( bytes geq '1000000' )
( addr.src in '10.254.157.0/24' ) and ( app eq 'gre' )
Monitor -> Logs -> System
( description contains 'congestion' )
Monitor -> Logs -> Threats
( subtype eq 'flood' )
▶ Symptom
A number of factors can cause the dataplane CPUs to spike or continuously run high: a sudden increase in load when a new service or resource is implemented, or a gradual buildup over time as connected networks, segments, and hosts are added. These factors and more can increase the packet rate, packet buffer utilization, or the number of new connections per second until the dataplane CPU reaches a critical level. This article shows a couple of ways to verify the actual load on the dataplane and help determine the potential impact on your network.
The first step is to isolate where the performance issue is occurring:
• Data Plane (DP) CPU
• Packet Buffers
• Session
• Management Plane (MP)
- How to identify high dataplane CPU
When you see a performance issue, generate a tech support file while the issue is occurring. Look for the "---panio" string in the dp-monitor log for historic logs (this information is logged every 15 minutes) or run the show running resource-monitor command from the CLI to view DP resource usage.
less dp-log dp-monitor.log
If you are running the command on the PA-5200 series, use the following command:
less dp0-log dp-monitor.log
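Within less, you can jump straight to the 15-minute snapshots by searching for the marker string (standard less search, assuming your dp-monitor.log contains the marker described above):
/---panio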
show running resource-monitor
This command can be used to review dataplane CPU usage. Add a time operator to select the timeframe you would like to review.
- 'second' shows the last 60 seconds of CPU usage in per second increments
- 'minute' shows the last 60 minutes in minute increments and so on
- If no time operator is used, all views will be listed in one long output
> show running resource-monitor
> day Per-day monitoring statistics
> hour Per-hour monitoring statistics
> ingress-backlogs show statistics
> minute Per-minute monitoring statistics
> second Per-second monitoring statistics
> week Per-week monitoring statistics
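For example, to view only the most recent samples (the same variants listed in the useful commands at the top of this article):
> show running resource-monitor second last 30
> show running resource-monitor minute last 5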
The actual output will look similar to this but may differ slightly depending on the number of dataplanes and cores your system has.
> show running resource-monitor
DP dp0:
Resource monitoring sampling data (per second):
This first section shows CPU utilization by processes running on the dataplane
CPU load sampling by group:
flow_lookup : 0%
flow_fastpath : 0%
flow_slowpath : 0%
flow_forwarding : 0%
flow_mgmt : 0%
flow_ctrl : 1%
nac_result : 0%
flow_np : 0%
dfa_result : 0%
module_internal : 0%
aho_result : 0%
zip_result : 0%
pktlog_forwarding : 0%
lwm : 0%
flow_host : 1%
If many cores regularly surpass 75%, it warrants an investigation.
CPU load (%) during last 60 seconds:
core 0 1 2 3 4 5 6 7 8 9 10 11
0 39 47 38 46 73 81 73 81 82 88 83
0 39 46 38 45 73 82 73 82 82 89 82
0 39 46 39 45 73 81 73 81 83 88 83
0 38 52 37 50 71 81 71 80 82 98 82
0 40 46 39 45 74 82 74 82 83 89 83
0 40 50 39 49 74 85 74 84 83 90 83
0 39 46 38 45 72 81 72 81 81 88 81
0 42 52 41 50 75 86 75 86 84 91 84
0 42 76 79 85 75 100 75 100 100 100 85 >>> indicates actual high CPU
0 44 51 43 50 78 85 78 86 86 91 86
0 42 51 41 50 76 86 76 86 85 92 85
0 43 53 43 52 77 89 77 89 86 93 86
0 38 46 37 45 71 81 72 81 81 88 81
0 42 50 41 50 76 85 76 85 85 90 85
... cut for brevity
This section shows session table utilization, expressed as a percentage of the maximum number of active sessions supported by the platform. Any value above 80% warrants investigation.
Resource utilization (%) during last 60 seconds:
session:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
packet buffer:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
packet descriptor:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
packet descriptor (on-chip):
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
Things to check in the output:
1. Check the CPU load during the last 60 seconds. If any number is at or close to 100, then high CPU is likely the cause of the performance issue.
2. Check the "packet buffer" and "packet descriptor" sections. If any number is at or close to 100, then the issue is likely caused by running out of packet buffers.
[AnalysisMan] Observed 5-10 packet losses from time to time when packet descriptor utilization hit 100.
3. Check the session section. If any number is close to or above 80, then the performance issue is most likely session related.
- [AnalysisMan Case]
▶ Symptom
- Losing network connectivity.
- HA links went down and packet buffer congestion logs were generated.
- Packet descriptors (on-chip) and packet buffers fill up.
- Looking up the packet buffer abuser's session ID returns a bad key.
- The Packet Buffer Protection (PBP) was not effective.
show running resource-monitor minute last 30
admin@PA-3220(active)> show running resource-monitor minute last 30
packet descriptor (on-chip) (average):
5 5 5 5 5 5 5 5 5 5 5 5 5 13 5
5 5 5 10 5 5 5 5 5 5 5 19 10 9 5
packet descriptor (on-chip) (maximum):
6 6 6 8 6 6 6 10 6 6 7 6 6 100 10
6 6 6 100 7 6 8 6 7 6 6 100 100 100 6
admin@PA-3220(active)> show session id 3145733
Session 3145733
Bad Key: c2s: 'c2s'
Bad Key: s2c: 's2c'
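Because PBP appeared ineffective in this case, its state can also be checked with the show session packet-buffer-protection command from the useful commands list; the exact output depends on the PAN-OS version:
admin@PA-3220(active)> show session packet-buffer-protection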
▶ Root Cause
Bug ID: PAN-125534
(PA-5200 Series and PA-7000 Series firewalls only) Fixed an issue where firewalls experienced high packet descriptor (on-chip) usage during uploads to the WildFire Cloud or WF-500 appliance.
This bug was also triggered on PA-3220.
▶ Resolution
Upgrade to PAN-OS 9.0.8.
As a workaround, the WildFire file size limits were changed back to the default values.
---
Several commands are available that can help identify why certain measurements are above the threshold in the resource monitor:
show session info
Outputs how many sessions are supported, how many are currently active, and reports the packet rate, new connection establish rate, and throughput. If any of these parameters are exceptionally high, they can cause the dataplane CPU to rise.
> show session info
target-dp: *.dp0
--------------------------------------------------------------------------------
Number of sessions supported: 262142
Number of active sessions: 3 < If this figure rises to the level of the supported sessions,
Number of active TCP sessions: 0 the platform is reaching maximum capacity.
Number of active UDP sessions: 1
Number of active ICMP sessions: 0
Number of active BCAST sessions: 0
Number of active MCAST sessions: 0
Number of active predict sessions: 0
Session table utilization: 0%
Number of sessions created since bootup: 332481
Packet rate: 28/s < An unusual and sudden high packet rate could indicate a DoS attack.
Throughput: 13 kbps
New connection establish rate: 0 cps
--------------------------------------------------------------------------------
Session timeout
If timeout values are too aggressive or too relaxed, the system could run out of resources.
TCP default timeout: 3600 secs
TCP session timeout before SYN-ACK received: 5 secs
TCP session timeout before 3-way handshaking: 10 secs
TCP half-closed session timeout: 120 secs
TCP session timeout in TIME_WAIT: 15 secs
TCP session timeout for unverified RST: 30 secs
UDP default timeout: 30 secs
ICMP default timeout: 6 secs
other IP default timeout: 30 secs
Captive Portal session timeout: 30 secs
Session timeout in discard state:
TCP: 90 secs, UDP: 60 secs, other IP protocols: 60 secs
--------------------------------------------------------------------------------
Session accelerated aging: True
If accelerated aging has been turned off, some sessions may remain active in the session table
longer than needed and take up resources.
Accelerated aging threshold: 80% of utilization
Scaling factor: 2 X
--------------------------------------------------------------------------------
Session setup
If any of these have been turned off for troubleshooting and not turned back on, they can
consume additional resources to inspect malformed packets that would otherwise
be discarded.
TCP - reject non-SYN first packet: True
Hardware session offloading: True
IPv6 firewalling: True
Strict TCP/IP checksum: True
ICMP Unreachable Packet Rate: 200 pps
--------------------------------------------------------------------------------
Application trickling scan parameters:
Timeout to determine application trickling: 10 secs
Resource utilization threshold to start scan: 80%
Scan scaling factor over regular aging: 8
--------------------------------------------------------------------------------
Session behavior when resource limit is reached: drop
This indicates the behavior of the firewall when it has reached its extreme limit of
resource depletion.
--------------------------------------------------------------------------------
Pcap token bucket rate : 10485760
--------------------------------------------------------------------------------
Max pending queued mcast packets per session : 0
--------------------------------------------------------------------------------
debug dataplane pool statistics
Returns the status of all the buffers being used by the system.
• The number on the left indicates how much of the buffer is still available
• The number on the right indicates the total size
• If the number on the left drops to 0, the buffer is depleted
> debug dataplane pool statistics
Hardware Pools
[ 0] Packet Buffers : 57223/57344 0x8000000030c00000
[ 1] Work Queue Entries : 229315/229376 0x8000000037c00000
[ 2] Output Buffers : 1007/1024 0x800000000fc00000
[ 3] DFA Result : 3995/4000 0x8000000039800000
[ 4] Timer Buffers : 4096/4096 0x8000000039be8000
[ 5] PAN_FPA_LWM_POOL : 1024/1024 0x800000000fd00000
[ 6] ZIP Commands : 1023/1024 0x800000000fd40000
[ 7] PAN_FPA_BLAST_PO : 1024/1024 0x800000000ff40000
Software Pools
[ 0] software packet buffer 0 ( 512): 32767/32768 0x8000000043b2a680
[ 1] software packet buffer 1 ( 1024): 32768/32768 0x8000000044b4a780
[ 2] software packet buffer 2 ( 2048): 81920/81920 0x8000000046b6a880
[ 3] software packet buffer 3 (33280): 20480/20480 0x8000000050bba980
[ 4] software packet buffer 4 (66048): 304/304 0x80000000795cea80
[ 5] Shared Pool 24 ( 24): 560000/560000 0x800000007a8f6780
[ 6] Shared Pool 32 ( 32): 530000/530000 0x800000007b7eaa80
[ 7] Shared Pool 40 ( 40): 70000/70000 0x800000007ca1cf00
[ 8] Shared Pool 192 ( 192): 1269999/1270000 0x800000007cd0cf80
[ 9] Shared Pool 256 ( 256): 140000/140000 0x800000008ba70880
[10] ZIP Results ( 184): 1024/1024 0x80000000a1827300
[11] CTD AV Block ( 1024): 32/32 0x80000000be43d380
[12] Regex Results (11544): 8000/8000 0x80000000be466100
[13] SSH Handshake State ( 6512): 64/64 0x80000000c580c680
[14] SSH State ( 3248): 512/512 0x80000000c5872480
[15] TCP host connections ( 176): 15/16 0x80000000c5a08e80
show counter global filter delta yes
Returns, from the second time the command is issued onward, a snapshot of the global counters triggered during the timeframe between the current and previous iteration, and could expose unusual numbers of discarded packets.
> show counter global filter delta yes
Global counters:
Elapsed time since last sampling: 2.981 seconds
name value rate severity category aspect description
--------------------------------------------------------------------------------
pkt_recv 89 29 info packet pktproc Packets received
pkt_recv_zero 87 29 info packet pktproc Packets received from QoS 0
pkt_sent 3 1 info packet pktproc Packets transmitted
session_allocated 1 0 info session resource Sessions allocated
session_freed 1 0 info session resource Sessions freed
session_installed 1 0 info session resource Sessions installed
flow_arp_pkt_rcv 83 27 info flow arp ARP packets received
flow_host_pkt_rcv 2 0 info flow mgmt Packets received from control plane
flow_host_pkt_xmt 2 0 info flow mgmt Packets transmitted to control plane
flow_host_service_allow 2 0 info flow mgmt Device management session allowed
appid_ident_by_icmp 1 0 info appid pktproc Application identified by icmp type
dfa_sw 2 0 info dfa pktproc The total number of dfa match using software
aho_sw 1 0 info aho pktproc The total usage of software for AHO
ctd_pkt_slowpath 2 0 info ctd pktproc Packets processed by slowpath
pkt_flow_np 87 29 info packet resource Packets entered module flow stage np
pkt_flow_host 2 0 info packet resource Packets entered module flow stage host
--------------------------------------------------------------------------------
Total counters shown: 16
--------------------------------------------------------------------------------
Some common counters explained:
Counter | Description |
---|---|
pkt_recv | packet received |
session_allocated | session allocate |
flow_ipfrag_recv | ipfrag receive rate |
dfa_sw | dfa matching by sw |
dfa_fpga | dfa matching by fpga |
ctd_pkt_slowpath | pkt processed by ctd slowpath |
aho_sw | aho by sw |
aho_fpga | aho by fpga |
log_traffic_cnt | traffic log count |
flow_fwd_l3_ttl_zero | packets dropped: IP TTL |
fpga_pkt | pkt held by fpga request. Note: this is not a cumulative counter but represents how many packets are currently being held waiting on the FPGA. |
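When drops are suspected, the delta output can be narrowed with the command's severity filter so that only drop counters are displayed:
> show counter global filter delta yes severity drop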
show system statistics session
Displays aggregated packet rate and throughput values of all dataplanes in a live dashboard fashion, which refreshes automatically
System Statistics: ('q' to quit, 'h' for help)
Device is up : 0 day 2 hours 34 mins 57 sec
Packet rate : 32/s
Throughput : 92 Kbps
Total active sessions : 14
Active TCP sessions : 8
Active UDP sessions : 4
Active ICMP sessions : 2
Toggle to the 'top 20 applications' view by pressing 'a' and back by pressing 's'.
Top 20 Application Statistics: ('q' to quit, 'h' for help)
Virtual System: vsys1
application sessions packets bytes
-------------------------------- ---------- ------------ ------------
ssl 349 47051 53368023
firefox-update 4 32503 31679603
google-base 57 19879 17687367
ms-update 30 2513 2413399
tor 1 715 710161
web-browsing 60 1140 588357
ping 2920 5840 572320
windows-push-notifications 57 819 266737
apt-get 6 662 260149
dns 728 1484 185056
dhcp 124 248 86800
ms-spynet 3 71 45382
ocsp 9 215 33311
google-update 2 28 8624
ntp 45 90 8100
ipv6 2 64 5248
sip 5 5 2283
ldap 8 9 2241
insufficient-data 5 13 780
icmp 6 6 520
It is also important to note that the type of traffic and the complexity of the configuration add to CPU utilization, so look at the traffic pattern to understand which applications are potentially causing the high CPU utilization.
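The traffic log filters listed at the top of this article can then be used to correlate a heavy application with specific sources, for example:
( addr.src in '10.254.156.0/24' ) and ( bytes geq '1000000' )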
- Identify Sessions That Use an Excessive Percentage of the Packet Buffer
Monitor for Abusers in Real Time
The 'show running resource-monitor ingress-backlogs' CLI command can be used to identify packet buffer abusers at runtime:
show running resource-monitor ingress-backlogs
admin@PA-7050> show running resource-monitor ingress-backlogs
-- SLOT:s1, DP:dp1 -- USAGE - ATOMIC: 92% TOTAL: 93%
TOP SESSIONS:
SESS-ID PCT GRP-ID COUNT
6 92% 1 156
7 1732
SESSION DETAILS
SESS-ID PROTO SZONE SRC SPORT DST DPORT IGR-IF EGR-IF APP
6 6 trust 192.168.2.35 55653 10.1.8.89 80 ethernet1/21 ethernet1/22 undecided
The command displays a maximum of the top five sessions that each use 2% or more of the packet buffer.
The sample output above indicates that Session 6 is using 92% of the packet buffer with TCP packets (protocol 6) coming from source IP address 192.168.2.35.
• SESS-ID—Indicates the global session ID that is used in all other
show session commands. The global session ID is unique within the firewall.
• GRP-ID—Indicates an internal stage of processing packets.
The mapping of group ID to function is:
Group ID Group Name
0 flow_lookup
1 flow_fastpath
2 flow_slowpath (On older hardware models (ex: PA-3000, PA-5000, PA-7000 w/ old NPC))
3 flow_forwarding
4 flow_mgmt
5 flow_ctrl
6 nac_result
7 flow_np
8 dfa_result
9 module_internal
10 aho_result
11 zip_result
12 pktlog_forwarding
16-18 flow_slowpath (On newer hardware models (ex: PA-3200, PA-5200, PA-7000 w/ new NPC))
• COUNT—Indicates how many packets are in that GRP-ID for that session.
• APP—Indicates the App-ID extracted from the Session information, which can help you determine whether the traffic is legitimate. For example, if packets use a common TCP or UDP port but the CLI output indicates an APP of undecided, the packets are possibly attack traffic. The APP is
undecided when Application IP Decoders cannot get enough information to determine the application. An APP of unknown indicates that Application IP Decoders cannot determine the application; a session of unknown APP that uses a high percentage of the packet buffer is also suspicious.
ATOMIC queues of work must be executed by a single CPU core. Orderable queues of work may be distributed among multiple CPU cores. (The root/etymological meaning of 'atomic' is indivisible, or more literally, non-cuttable.)
To restrict the display output:
On a PA-7000 Series model only, you can limit output to a slot, a dataplane, or both. For example:
admin@PA-7050> show running resource-monitor ingress-backlogs slot s1
admin@PA-7050> show running resource-monitor ingress-backlogs slot s1 dp dp1
On PA-5000 Series, PA-5200 Series, and PA-7000 Series models only, you can limit output to a dataplane. For example:
admin@PA-5060> show running resource-monitor ingress-backlogs dp dp1
Use the command output to determine whether the source IP address using a high percentage of the packet buffer is sending legitimate traffic or attack traffic.
In the sample output above, a single-session attack is likely occurring. A single session (Session ID 6) is using 92% of the packet buffer for Slot 1, DP 1, and the application at that point is
undecided.
• If you determine a single user is sending an attack and the traffic is not offloaded, you can End a Single Session DoS Attack. At a minimum, you can Configure DoS Protection Against Flooding of New Sessions.
• On a hardware model that has a field-programmable gate array (FPGA), the firewall offloads traffic to the FPGA when possible to increase performance. If the traffic is offloaded to hardware, clearing the session does not help because then it is the software that must handle the barrage of packets. You should instead Discard a Session Without a Commit.
To see whether a session is offloaded or not, use the show session id <session-id> command:
Session 353706
c2s flow:
source: 10.24.65.6 [Trust_Lab]
dst: 134.141.79.85
proto: 6
sport: 824 dport: 834
state: ACTIVE type: FLOW
src user: unknown
dst user: unknown
s2c flow:
source: 134.141.79.85 [Trust_WAN_local]
dst: 10.24.65.6
proto: 6
sport: 834 dport: 824
state: ACTIVE type: FLOW
src user: unknown
dst user: unknown
qos node: ethernet1/20.12, qos member N/A Qid 0
start time : Wed Feb 19 21:49:50 2020
timeout : 15 sec
time to live : 6 sec
total byte count(c2s) : 13520
total byte count(s2c) : 440902
layer7 packet count(c2s) : 210
layer7 packet count(s2c) : 311
vsys : vsys1
application : ypserv
rule : lab_to_corp
service timeout override(index) : False
session to be logged at end : True
session in session ager : True
session updated by HA peer : False
layer7 processing : enabled
ctd version : 6
URL filtering enabled : True
URL category : any
session via prediction : True
use parent's policy : False
session via syn-cookies : False
session terminated on host : False
session traverses tunnel : True
session terminate tunnel : False
captive portal session : False
ingress interface : ethernet1/20.12
egress interface : tunnel.9
session QoS rule : N/A (class 4)
tracker stage firewall : TCP FIN
end-reason : tcp-fin
▶ Discard those sessions using the 'request session-discard' command:
request session-discard id 128232
admin@PA-3220(active)> request session-discard id 128232
session id: 128232 vsys: vsys1 szone: Trust_Lab src: 10.68.216.48 sport: 57745 dst: 10.204.108.89 dport: 2500 protocol_6 igr-if: ethernet1/20.12 egr-if: tunnel.24 app: veeam discarded by admin: admin timeout: 3600
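To confirm the session is now in the discard state, it can be re-checked with show session id (using the same session ID as the example above):
admin@PA-3220(active)> show session id 128232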
▶ Related posts:
* Reference URLs:
How to Interpret: show running resource-monitor
How to Troubleshoot High Dataplane CPU
How to troubleshoot high DP CPU because of IPSEC traffic
DotW: Dataplane Usage
High Dataplane CPU Caused From "too small" or "too large" Packets For Content Inspection
Why is the message "Dataplane under severe load" displayed in the System log?
How to interpret output of "debug dataplane pow performance" during troubleshooting high DP CPU