Calico network troubleshooting¶
This page lists general Calico network troubleshooting steps.
Confirm whether the IP pool addresses are exhausted¶
Confirm whether all blocks of the IP pool have already been allocated to nodes:
-
Calculate the total number of IPs
IPs = 2 ^ (32 - mask). If the mask is 20, the total number of IPs = 2^12 = 4096.
-
Count how many Blocks there are
View the BlockSize in the IP pool. The default is 26, so a Block has 2 ^ (32 - 26) = 64 addresses, and the number of Blocks = 4096 / 64 = 64.
-
Check whether Blocks are allocated
In Calico, each node should have at least one block. If the number of nodes is greater than the number of blocks (64 in this example), consider adding an IP pool or changing the size of the original IP pool; otherwise Pods on some nodes may be unreachable. See ippool.md.
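The arithmetic above can be checked with Python's standard `ipaddress` module. This is a minimal sketch: the pool CIDR `10.244.0.0/20`, the block size 26, and the node count 70 are illustrative assumptions, not values read from a cluster, and `block_capacity` is a hypothetical helper name.

```python
import ipaddress
from typing import Tuple


def block_capacity(pool_cidr: str, block_size: int = 26) -> Tuple[int, int, int]:
    """Return (total IPs in the pool, IPs per block, number of blocks)."""
    pool = ipaddress.ip_network(pool_cidr)
    if block_size < pool.prefixlen:
        raise ValueError("a block cannot be larger than the pool that contains it")
    total_ips = pool.num_addresses                          # 2 ^ (32 - mask) for IPv4
    ips_per_block = 2 ** (pool.max_prefixlen - block_size)  # 2 ^ (32 - BlockSize)
    return total_ips, ips_per_block, total_ips // ips_per_block


# Hypothetical values, for illustration only.
total, per_block, blocks = block_capacity("10.244.0.0/20", block_size=26)
node_count = 70
print(total, per_block, blocks)   # 4096 64 64
print(node_count > blocks)        # True: more nodes than blocks
```

If the last line prints `True`, some nodes cannot get a block of their own, which is the situation described above.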
Confirm that the tunnel interface is working properly¶
Take vxlan mode as an example:
-
View vxlan.calico status
$ ip a show vxlan.calico
6: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 66:07:01:14:06:90 brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.0/32 scope global vxlan.calico
       valid_lft forever preferred_lft forever
Note that its state should be UNKNOWN.
-
Check whether the tunnel IP is normal
Under normal circumstances, the tunnel IP address is within the IP range of the Block that belongs to this node. If it is not, communication problems are likely; the usual cause is that the pool has run out of addresses.
-
Check whether the routing is normal
The routes that calico delivers to each node are based on the nodes' block ranges, that is, each block corresponds to one static route:
$ ip r
default via 172.18.0.1 dev eth0
...
10.244.0.48/28 via 10.244.0.48 dev vxlan.calico onlink
...
There may be several routes like 10.244.0.48/28 via 10.244.0.48 dev vxlan.calico onlink, one for each block on each of the other nodes.
Confirm that this node has a static route whose destination covers the target Pod IP, such as 10.244.0.48/28. If there is none, route updates are not working.
If the route exists, check whether its next hop, 10.244.0.48 (the vxlan tunnel IP of the peer node), is on the same node as the Block that contains the target address. If it is not, communication problems may follow. If it is, check the local neighbor table for the next hop's entry.
Then go to the peer node and confirm that the MAC address of its vxlan tunnel IP matches the one recorded in the neighbor table (66:46:b3:ce:05:5d in this example). If it does not, the neighbor entry was learned incorrectly; delete that ARP cache entry so that it is learned again.
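The block and route checks above can be sketched in Python. This is only an illustration: it assumes the `ip r` output has been saved as a list of text lines in the shape shown earlier, the Pod IP `10.244.0.50` is a made-up example, and `ip_in_block` and `find_vxlan_route` are hypothetical helper names.

```python
import ipaddress
from typing import List, Optional, Tuple


def ip_in_block(ip: str, block_cidr: str) -> bool:
    """True if ip (for example a tunnel IP) falls inside the given block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(block_cidr)


def find_vxlan_route(pod_ip: str, route_lines: List[str],
                     dev: str = "vxlan.calico") -> Optional[Tuple[str, str]]:
    """Return (destination, next hop) of the most specific route via `dev`
    that covers pod_ip, or None if no such route exists."""
    target = ipaddress.ip_address(pod_ip)
    best = None
    for line in route_lines:
        fields = line.split()
        # Expected shape: <dest> via <next hop> dev <dev> ...
        if len(fields) < 5 or fields[1] != "via" or fields[3] != "dev" or fields[4] != dev:
            continue
        try:
            dest = ipaddress.ip_network(fields[0])
        except ValueError:
            continue  # destination is not a CIDR, e.g. "default"
        if target in dest and (best is None or dest.prefixlen > best[0].prefixlen):
            best = (dest, fields[2])
    return None if best is None else (str(best[0]), best[1])


routes = [
    "default via 172.18.0.1 dev eth0",
    "10.244.0.48/28 via 10.244.0.48 dev vxlan.calico onlink",
]
print(find_vxlan_route("10.244.0.50", routes))     # ('10.244.0.48/28', '10.244.0.48')
print(find_vxlan_route("10.244.1.5", routes))      # None: no route covers this Pod IP
print(ip_in_block("10.244.0.48", "10.244.0.48/28"))  # True: next hop lies in the block
```

A `None` result corresponds to the "route updates are not working" case; a next hop outside the block that contains the target address corresponds to the mismatch case.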
tcpdump packet capture confirmation¶
Use tcpdump to capture packets on the veth virtual interface, the tunnel interface, the host interface, or inside the Pod, and observe at which stage the packet is dropped.
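As a rough sketch of such a capture plan, the helper below only builds the tcpdump command lines; it does not run them. The veth name `cali0123456789a`, the Pod IP, the host interface `eth0`, and the function name `capture_plan` are illustrative assumptions. The host-interface filter assumes VXLAN uses UDP port 4789; adjust it if the cluster is configured differently. The inner Pod IP is encapsulated on the host interface, so the filter there matches the tunnel traffic rather than the Pod address.

```python
import shlex
from typing import List


def capture_plan(veth: str, pod_ip: str, host_nic: str = "eth0",
                 tunnel: str = "vxlan.calico", vxlan_port: int = 4789) -> List[str]:
    """Build one tcpdump command per capture point, from the Pod outwards."""
    return [
        # Pod side: traffic to or from the Pod on its veth interface.
        shlex.join(["tcpdump", "-i", veth, "-nn", "host", pod_ip]),
        # Tunnel interface: packets before encapsulation / after decapsulation.
        shlex.join(["tcpdump", "-i", tunnel, "-nn", "host", pod_ip]),
        # Host interface: encapsulated VXLAN traffic (assumed UDP port).
        shlex.join(["tcpdump", "-i", host_nic, "-nn", "udp", "port", str(vxlan_port)]),
    ]


for cmd in capture_plan("cali0123456789a", "10.244.0.50"):
    print(cmd)
```

Running the captures in this order on both nodes shows the last interface where the packet is still visible, which narrows down where it is dropped.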
iptables rule view¶
If routing and forwarding are normal, the packets may be filtered by abnormal iptables rules.
TODO: Calico iptables rule combing
If the steps above do not resolve the problem, preserve the environment, collect the logs, and contact R&D.
calicoctl¶
Resource Operations¶
- get
- create
- patch
- delete
- apply
- replace
ipam¶
-
View currently allocated blocks and their usage
$ calicoctl ipam show --show-blocks
+----------+---------------------+------------+------------+-------------------+
| GROUPING | CIDR                | IPS TOTAL  | IPS IN USE | IPS FREE          |
+----------+---------------------+------------+------------+-------------------+
| IP Pool  | 10.244.0.0/26       | 64         | 12 (19%)   | 52 (81%)          |
| Block    | 10.244.0.0/28       | 16         | 1 (6%)     | 15 (94%)          |
| Block    | 10.244.0.16/28      | 16         | 3 (19%)    | 13 (81%)          |
| Block    | 10.244.0.32/28      | 16         | 7 (44%)    | 9 (56%)           |
| Block    | 10.244.0.48/28      | 16         | 1 (6%)     | 15 (94%)          |
| IP Pool  | fd1a:29a5:aa4e::/48 | 1.2089e+24 | 0 (0%)     | 1.2089e+24 (100%) |
+----------+---------------------+------------+------------+-------------------+
-
Check if an IP is in use, for example with calicoctl ipam show --ip=<IP>
-
Release an IP
If an IP is confirmed to be no longer used by any Pod but Calico has not reclaimed it, it can be released manually, for example with calicoctl ipam release --ip=<IP>.
NOTE: Be sure to confirm that no Pod is using this IP.
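To spot blocks that are close to full, the text output of calicoctl ipam show --show-blocks can be parsed with a short script. The sketch below assumes the table layout shown in the sample output above; the layout may differ between calicoctl versions, and `parse_ipam_blocks` is a hypothetical helper name.

```python
import re
from typing import Dict, List

# Matches rows such as: | Block | 10.244.0.32/28 | 16 | 7 (44%) | 9 (56%) |
ROW = re.compile(
    r"^\|\s*(IP Pool|Block)\s*\|\s*(\S+)\s*\|\s*(\S+)\s*\|\s*(\d+)\s*\(\d+%\)\s*\|"
)


def parse_ipam_blocks(output: str) -> List[Dict[str, object]]:
    """Extract the Block rows (CIDR, total, used, free) from the table text."""
    blocks = []
    for line in output.splitlines():
        m = ROW.match(line.strip())
        if m and m.group(1) == "Block":
            total, used = int(m.group(3)), int(m.group(4))
            blocks.append({"cidr": m.group(2), "total": total,
                           "used": used, "free": total - used})
    return blocks


sample = """
| IP Pool  | 10.244.0.0/26       | 64         | 12 (19%)   | 52 (81%)          |
| Block    | 10.244.0.0/28       | 16         | 1 (6%)     | 15 (94%)          |
| Block    | 10.244.0.32/28      | 16         | 7 (44%)    | 9 (56%)           |
"""
blocks = parse_ipam_blocks(sample)
busiest = max(blocks, key=lambda b: b["used"])
print(busiest["cidr"], busiest["free"])   # 10.244.0.32/28 9
```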
node¶
-
View node status
-
Node status check
More command details: calicoctl --help