Contents
- 1 Kubernetes Troubleshooting Summary, Part 1
- 1.1 crictl images list
- 1.2 [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-cal
- 1.3 [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": ... connect: connection refused.
- 1.4 pause:3.6 error
- 1.5 [root@k8s-master01 ~]# kubectl get nodes
- 1.6 Error when writing a Deployment YAML
- 1.7 annotation error
- 1.8 spec.containers[0].name: Invalid value:
- 1.9 Pod cannot be deleted
- 1.10 Namespace cannot be deleted
- 1.11 plugin type="flannel" failed (add): failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
- 1.12 containerd configuration file /etc/containerd/config.toml
- 1.13 Images cannot be pulled
- 1.14 Control-plane (master) node cannot run Pods
- 1.15 error: no objects passed to apply
Kubernetes Troubleshooting Summary, Part 1
crictl images list
[root@master k8s_install]# crictl images list
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
E1010 17:19:18.816289 3832 remote_image.go:119] "ListImages with filter from image service failed" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory\"" filter="&ImageFilter{Image:&ImageSpec{Image:list,Annotations:map[string]string{},},}"
FATA[0000] listing images: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"
The error occurs because crictl is using its default endpoint list [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. Relying on these defaults is deprecated, so the endpoint has to be set explicitly to containerd.sock. The errors that follow simply mean that crictl tried dockershim.sock first and could not find it.
Fix: write the crictl configuration file
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 0
debug: false
pull-image-on-create: false
EOF
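To see which of the default endpoints actually exists on a node before writing /etc/crictl.yaml, a small helper along these lines can be used. This is a minimal sketch: the function name first_cri_socket is hypothetical, and the candidate paths are simply the ones listed in the crictl warning.

```shell
# Print the first argument that exists as a Unix socket, as a unix:// URL.
# Returns non-zero if none of the candidates exists.
first_cri_socket() {
  local candidate
  for candidate in "$@"; do
    if [ -S "$candidate" ]; then
      printf 'unix://%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage (candidate paths taken from the crictl warning):
#   first_cri_socket /var/run/dockershim.sock /run/containerd/containerd.sock \
#       /run/crio/crio.sock /var/run/cri-dockerd.sock
```

Whatever path the helper prints is the value that belongs in runtime-endpoint and image-endpoint; if it prints nothing, no supported runtime socket is present and the runtime itself needs checking first.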
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-cal
Permanent fix:
Add the following to /etc/sysctl.conf:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
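Run sysctl -p to apply the settings:
sysctl -p
If it reports a missing file:
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
the br_netfilter kernel module has not been loaded. Load it with the command below, confirm that it is loaded (for example with lsmod | grep br_netfilter), and run sysctl -p again.
# Load the kernel module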
modprobe br_netfilter
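modprobe only loads the module until the next reboot. One way to make both the module and the sysctl values survive a reboot is to use the drop-in directories read by systemd-modules-load and sysctl --system. The sketch below assumes those standard directories exist; the function name and the file name k8s.conf are illustrative choices, not something the original setup prescribes.

```shell
# Write the module list and the bridge sysctl settings into drop-in
# directories so they are re-applied on every boot.
# Both arguments default to the usual system paths; pass other directories
# to try the function out without touching /etc.
write_k8s_netfilter_conf() {
  local modules_dir="${1:-/etc/modules-load.d}"
  local sysctl_dir="${2:-/etc/sysctl.d}"

  # "k8s.conf" is a hypothetical file name; any name ending in .conf works.
  printf 'br_netfilter\n' > "$modules_dir/k8s.conf"

  cat > "$sysctl_dir/k8s.conf" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
}
```

After calling write_k8s_netfilter_conf as root, running modprobe br_netfilter followed by sysctl --system applies the settings immediately, without waiting for a reboot.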
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": ... connect: connection refused.
Cause 1: Docker's cgroup driver has not been changed to systemd
# Run the commands below to confirm the cause
docker info | grep Cgroup
Cgroup Driver: cgroupfs
Cgroup Version: 1
sudo cat /var/lib/kubelet/config.yaml | grep cgroup
cgroupDriver: systemd
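The two drivers differ here (cgroupfs for Docker, systemd for kubelet). Edit /etc/docker/daemon.json, for example with vim, so that it sets the systemd driver: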
{
"registry-mirrors": ["https://dpxn2pal.mirror.aliyuncs.com"],
"exec-opts": [ "native.cgroupdriver=systemd" ]
}
# Restart the Docker service
systemctl daemon-reload
systemctl restart docker
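The comparison between the two cgroup drivers can also be scripted. The following is a rough sketch that only parses text in the shape produced by the two commands used for diagnosis; the function names are hypothetical, and it assumes the output format stays as it appears in this section.

```shell
# Read `docker info` output on stdin and print the cgroup driver name.
docker_cgroup_driver() {
  awk -F': *' '/Cgroup Driver/ {print $2; exit}'
}

# Read a kubelet config.yaml on stdin and print its cgroupDriver value.
kubelet_cgroup_driver() {
  awk -F': *' '/^[[:space:]]*cgroupDriver/ {print $2; exit}'
}

# Succeed only when both driver names are non-empty and identical.
cgroup_drivers_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage:
#   d=$(docker info 2>/dev/null | docker_cgroup_driver)
#   k=$(sudo cat /var/lib/kubelet/config.yaml | kubelet_cgroup_driver)
#   cgroup_drivers_match "$d" "$k" || echo "mismatch: docker=$d kubelet=$k"
```

Running the usage lines again after restarting Docker is a quick way to confirm that the mismatch is gone before retrying kubeadm.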
Cause 2: the hostname does not resolve (for example, it is missing from /etc/hosts)
pause:3.6 error
...failed to resolve reference \"registry.k8s.io/pause:3.6\": failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts...
k8s-master containerd[xxxx]: time="20xx-xx-xxTxx:xx:xx.xxxxxxx+08:00"
level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:etcd-k8s-master,Uid:xxx,Namespace:kube-system,Attempt:0,} failed, error"
error="failed to get sandbox image \"registry.k8s.io/pause:3.6\": failed to pull image \"registry.k8s.io/pause:3.6\":
failed to pull and unpack image \"registry.k8s.io/pause:3.6\":
failed to resolve reference \"registry.k8s.io/pause:3.6\":
failed to do request:
Head \"https://us-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6\":
dial tcp xxx.xxx.xxx.xxx:443: connect: connection refused"
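The node cannot reach registry.k8s.io, so pull pause:3.6 from a mirror and retag it under the name containerd expects:
ctr -n k8s.io images pull -k registry.aliyuncs.com/google_containers/pause:3.6
ctr -n k8s.io images tag registry.aliyuncs.com/google_containers/pause:3.6 registry.k8s.io/pause:3.6
# The tag command above makes registry.aliyuncs.com/google_containers/pause:3.6 also available as registry.k8s.io/pause:3.6
kubeadm reset -f
Then rerun the command that initializes the master node. Run the commands above on the worker nodes as well.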
[root@k8s-master01 ~]# kubectl get nodes
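Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
This typically happens when ~/.kube/config is left over from an earlier cluster initialization and no longer matches the current cluster CA. Replace it with the current admin.conf and restart kubelet:
rm -rf ~/.kube/*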
cp /etc/kubernetes/admin.conf ~/.kube/config
service kubelet restart
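Whether a kubeconfig really carries a stale CA can be checked before deleting anything. The sketch below assumes the kubeconfig embeds its CA as certificate-authority-data (as kubeadm's admin.conf does) and that the cluster CA lives at kubeadm's default path /etc/kubernetes/pki/ca.crt; both function names are hypothetical.

```shell
# Print the decoded CA certificate embedded in a kubeconfig file.
kubeconfig_ca() {
  awk '/certificate-authority-data:/ {print $2; exit}' "$1" | base64 -d
}

# Succeed when the CA embedded in the kubeconfig ($1) is identical to the
# CA certificate file ($2).
kubeconfig_matches_ca() {
  [ "$(kubeconfig_ca "$1")" = "$(cat "$2")" ]
}

# Usage:
#   kubeconfig_matches_ca ~/.kube/config /etc/kubernetes/pki/ca.crt \
#     || echo "kubeconfig was issued by a different CA"
```

If the check fails, the kubeconfig was issued for another cluster CA, which is consistent with the x509 error above.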
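Error when writing a Deployment YAML
Error from server (BadRequest): error when creating "deploy.yml": Deployment in version "v1" cannot be handled as a Deployment: strict decoding error: unknown field "spec.selector.matchLables"
The message points at a misspelled field: matchLables instead of matchLabels. The manifest below also declares apiVersion: v1, but Deployment belongs to the apps/v1 API group.
Incorrect version:
apiVersion: v1
kind: Deployment
metadata:
  name: deploy01
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      server: nginx
  template:
Correct version: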
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy01
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      server: nginx
  template:
    metadata:
      labels:
        app: myapp
        server: nginx
    spec:
      containers:
        - name: nginx
          image: ikubernetes/myapp:v1
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 80
annotation error
[root@k8s-master01 k8s]# kubectl apply -f svc.yml
Warning: resource services/mynginx-svc is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
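The warning itself is harmless, since kubectl patches the missing annotation automatically. To avoid it, the annotation can also be written into the manifest:
apiVersion: v1
kind: Service
metadata:
  # Add this block: the resource is being modified rather than created, so the annotation is written explicitly
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: "mylaste modify"
  labels:
    app: mynginx
  name: mynginx-svc
  namespace: default
spec: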
spec.containers[0].name: Invalid value:
The Pod "pod-deme" is invalid: spec.containers[0].name: Invalid value: "configVolume": a lowercase RFC 1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name', or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')
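Container names may contain only lowercase letters, digits, and hyphens; the uppercase letter in configVolume triggers this error.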
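A name can be checked against the same rule before a manifest is applied. The helper below is a minimal sketch that reuses the regular expression quoted in the error message and adds the 63-character limit that applies to such labels; the function name is hypothetical.

```shell
# Succeed if the argument is a valid lowercase RFC 1123 label.
is_rfc1123_label() {
  local name="$1"
  # Labels are limited to 63 characters.
  [ "${#name}" -le 63 ] || return 1
  # LC_ALL=C keeps [a-z] from matching uppercase letters in some locales.
  printf '%s\n' "$name" | LC_ALL=C grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'
}

# Usage:
#   is_rfc1123_label configVolume  || echo "invalid container name"
#   is_rfc1123_label config-volume && echo "ok"
```

Renaming the container to something like config-volume satisfies the check.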
Pod cannot be deleted
Set the finalizers to null:
kubectl patch pod xxx -n xxx -p '{"metadata":{"finalizers":null}}'
Or force the deletion:
kubectl delete pod xxx -n xxx --force --grace-period=0
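Namespace cannot be deleted
In one terminal, start the API proxy:
kubectl proxy &
In another terminal, run the commands below to delete the namespace.
Replace kubernetes-dashboard with the namespace to delete.
jq must be installed first:
yum install -y jq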
NAMESPACE=kubernetes-dashboard
kubectl get namespace $NAMESPACE -o json |jq '.spec = {"finalizers":[]}' > temp.json
curl -k -H "Content-Type: application/json" -X PUT --data-binary @temp.json 127.0.0.1:8001/api/v1/namespaces/$NAMESPACE/finalize
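plugin type="flannel" failed (add): failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
Recreate cni0 and assign it the IP address named in the error message:
# Remove the existing bridge that carries the stale address
ip link delete cni0
ip link add cni0 type bridge
ip link set dev cni0 up
# Use the address from the error message on this node (10.244.1.1/24 in the message above)
ifconfig cni0 10.244.1.1/24
ifconfig cni0 mtu 1450 up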
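The address and MTU that flannel expects on a node can also be read from the environment file flannel writes, instead of being copied from the error text. This sketch assumes flannel's default file location /run/flannel/subnet.env and its usual keys (FLANNEL_SUBNET, FLANNEL_MTU); the function name is hypothetical.

```shell
# Print the value of one key from flannel's subnet.env file.
# $1 is the key name, $2 optionally overrides the file path.
flannel_env_value() {
  local key="$1"
  local env_file="${2:-/run/flannel/subnet.env}"
  sed -n "s/^${key}=//p" "$env_file"
}

# Usage:
#   flannel_env_value FLANNEL_SUBNET   # address to assign to cni0
#   flannel_env_value FLANNEL_MTU      # MTU to set on cni0
```

If the printed subnet differs from the address currently on cni0, the bridge is the stale side and is the one to recreate.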
containerd configuration file /etc/containerd/config.toml
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
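The sandbox_image value can be changed with a one-line edit rather than by hand. The following is a sketch under the assumption that config.toml already contains a sandbox_image line (the containerd 1.x CRI plugin layout) and that GNU sed is available; the function name is hypothetical.

```shell
# Replace the value of sandbox_image in a containerd config file.
# $1 is the new image reference, $2 optionally overrides the config path.
set_sandbox_image() {
  local image="$1"
  local config="${2:-/etc/containerd/config.toml}"
  # Keep the original indentation and replace only the quoted value.
  sed -i -E "s#^([[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*).*#\1\"${image}\"#" "$config"
}

# Usage:
#   set_sandbox_image registry.aliyuncs.com/google_containers/pause:3.9
#   systemctl restart containerd
```

containerd only reads the file at startup, so the restart shown in the usage comment is needed for the new value to take effect.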
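Images cannot be pulled
Method 1: pull from another registry
Set imagePullPolicy so that an image already present on the node is not pulled from the registry again:
image: xxxx
imagePullPolicy: IfNotPresent
List the images in containerd's k8s.io namespace:
ctr -n k8s.io image ls
The following Docker registry and mirror addresses were all usable at the time of writing: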
"https://hub.docker.com/",
"https://registry.docker-cn.com",
"https://ue05qxiu.mirror.aliyuncs.com",
"https://docker.mirrors.ustc.edu.cn/",
"https://hub-mirror.c.163.com/",
"https://reg-mirror.qiniu.com"
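IMAGE is the image and version needed; SOURCE is the mirror address. Every node must pull the image, unless the Pod is pinned to a specific node.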
IMAGE=nginx:1.21
SOURCE=ue05qxiu.mirror.aliyuncs.com
ctr -n k8s.io images pull $SOURCE/library/$IMAGE
ctr -n k8s.io images tag $SOURCE/library/$IMAGE docker.io/library/$IMAGE
ctr -n k8s.io i rm $SOURCE/library/$IMAGE
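Because the same three commands have to be repeated on every node and for every image, it can help to generate them first and review them before running anything. The helper below is only a sketch: the function name is hypothetical, and like the commands above it assumes an official image that lives under library/ on the mirror.

```shell
# Print the pull, tag, and cleanup commands for one image and one mirror.
# $1 is the image (for example nginx:1.21), $2 is the mirror host.
mirror_pull_commands() {
  local image="$1"
  local mirror="$2"
  local mirrored="${mirror}/library/${image}"
  printf 'ctr -n k8s.io images pull %s\n' "$mirrored"
  printf 'ctr -n k8s.io images tag %s docker.io/library/%s\n' "$mirrored" "$image"
  printf 'ctr -n k8s.io images rm %s\n' "$mirrored"
}

# Usage: review the output, then pipe it to a shell on each node.
#   mirror_pull_commands nginx:1.21 ue05qxiu.mirror.aliyuncs.com
#   mirror_pull_commands nginx:1.21 ue05qxiu.mirror.aliyuncs.com | sh
```

Printing instead of executing keeps the helper safe to run on a machine without containerd.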
Method 2: import a Docker image directly into the containerd namespace used by Kubernetes
IMAGE=nginx:1.21
docker save -o $IMAGE $IMAGE
ctr -n=k8s.io images import $IMAGE
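Control-plane (master) node cannot run Pods
Add a toleration for the control-plane taint to the Pod spec.
The taint key can differ between Kubernetes versions (older releases use node-role.kubernetes.io/master).
spec:
  tolerations:
    - effect: NoSchedule
      key: node-role.kubernetes.io/control-plane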
error: no objects passed to apply
The cause in this case was that the hostPath volume had no type:
volumes:
  - name: config-volume
    configMap:
      name: alertmanager
  - name: storage-volume
    hostPath:
      path: /data/alertmanager/
      type: Directory  # this line