K8s Troubleshooting Notes (Part 1)

crictl images list

[root@master k8s_install]# crictl images list
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
E1010 17:19:18.816289 3832 remote_image.go:119] "ListImages with filter from image service failed" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory\"" filter="&ImageFilter{Image:&ImageSpec{Image:list,Annotations:map[string]string{},},}"
FATA[0000] listing images: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"

The cause of the error above is that crictl falls back to its default endpoints [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. These defaults are deprecated, so containerd.sock must be set explicitly; the FATA line simply reports that the first default, dockershim.sock, does not exist.

Fix: update the crictl configuration file.

cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 0
debug: false
pull-image-on-create: false
EOF

[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables

Permanent fix:

Add the following to /etc/sysctl.conf:

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

Reload the settings with sysctl -p:

sysctl -p
If sysctl reports a missing file:

sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory

it means the br_netfilter module configured earlier is not loaded. Load it with the command below, then re-run sysctl -p to confirm:

# load the kernel module
modprobe br_netfilter
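modprobe only lasts until the next reboot. On systemd-based distros the module and the sysctl keys can be persisted with drop-in files; a sketch, assuming the standard /etc/modules-load.d and /etc/sysctl.d paths (run as root):

```shell
# Persist br_netfilter and the bridge sysctls across reboots.
# modprobe above only covers the current boot.
mkdir -p /etc/modules-load.d /etc/sysctl.d
printf 'br_netfilter\n' > /etc/modules-load.d/k8s.conf
printf '%s\n' \
  'net.bridge.bridge-nf-call-iptables = 1' \
  'net.bridge.bridge-nf-call-ip6tables = 1' > /etc/sysctl.d/k8s.conf
```

systemd-modules-load and systemd-sysctl apply these files automatically at boot.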

[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": connect: connection refused.

Cause 1: Docker's cgroup driver was not switched to systemd.
// run the commands below to confirm the cause

 docker info | grep Cgroup

 Cgroup Driver: cgroupfs
 Cgroup Version: 1

sudo cat /var/lib/kubelet/config.yaml | grep cgroup

cgroupDriver: systemd

Edit /etc/docker/daemon.json:

{
  "registry-mirrors": ["https://dpxn2pal.mirror.aliyuncs.com"],
  "exec-opts": [ "native.cgroupdriver=systemd" ]
}
// restart the Docker service
systemctl daemon-reload
systemctl restart docker
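The same driver mismatch can occur when the runtime is containerd rather than Docker. Assuming containerd 1.x with the CRI plugin, the equivalent setting lives in /etc/containerd/config.toml (restart containerd after changing it):

```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
```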

Cause 2: the node's hostname does not resolve.
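A quick check-and-fix sketch for the hostname case; NODE_IP is a placeholder for the node's real address:

```shell
# Ensure the node's hostname resolves via /etc/hosts; NODE_IP is a
# placeholder that must be replaced with this node's real IP.
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"
NODE_IP="${NODE_IP:-192.168.1.10}"
NODE_NAME="$(hostname)"
grep -qw "$NODE_NAME" "$HOSTS_FILE" || echo "$NODE_IP $NODE_NAME" >> "$HOSTS_FILE"
```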

pause:3.6 errors


k8s-master containerd[xxxx]: time="20xx-xx-xxTxx:xx:xx.xxxxxxx+08:00"
level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:etcd-k8s-master,Uid:xxx,Namespace:kube-system,Attempt:0,} failed, error" 
error="failed to get sandbox image \"registry.k8s.io/pause:3.6\": failed to pull image \"registry.k8s.io/pause:3.6\": 
failed to pull and unpack image \"registry.k8s.io/pause:3.6\": 
failed to resolve reference \"registry.k8s.io/pause:3.6\": 
failed to do request: 
Head \"https://us-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6\": 
dial tcp xxx.xxx.xxx.xxx:443: connect: connection refused"
ctr -n k8s.io images pull -k registry.aliyuncs.com/google_containers/pause:3.6
ctr -n k8s.io images tag registry.aliyuncs.com/google_containers/pause:3.6 registry.k8s.io/pause:3.6
# retag registry.aliyuncs.com/google_containers/pause:3.6 as registry.k8s.io/pause:3.6
kubeadm reset -f

Then re-run the kubeadm init command for the master node. Run the commands above on the worker nodes as well!

[root@k8s-master01 ~]# kubectl get nodes

Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

rm -rf ~/.kube/*
mkdir -p ~/.kube
cp /etc/kubernetes/admin.conf ~/.kube/config
systemctl restart kubelet

Errors when writing a Deployment YAML

Error from server (BadRequest): error when creating "deploy.yml": Deployment in version "v1" cannot be handled as a Deployment: strict decoding error: unknown field "spec.selector.matchLables"

The broken manifest (apiVersion should be apps/v1 for a Deployment, and the selector field is misspelled):

apiVersion: v1
kind: Deployment
metadata:
  name: deploy01
spec:
  replicas: 2
  selector:
    matchLables:   # typo: should be matchLabels
      app: myapp
      server: nginx
  template:

The correct version:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy01
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      server: nginx
  template:
    metadata:
      labels:
        app: myapp
        server: nginx
    spec:
      containers:
      - name: nginx
        image: ikubernetes/myapp:v1
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80

Annotation warning

[root@k8s-master01 k8s]# kubectl apply -f svc.yml
Warning: resource services/mynginx-svc is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
apiVersion: v1
kind: Service
metadata:
  annotations:  # add this block when modifying (not creating) a resource, to record the change note
    kubectl.kubernetes.io/last-applied-configuration: "mylaste modify"
  labels:
    app: mynginx
  name: mynginx-svc
  namespace: default
spec:

spec.containers[0].name: Invalid value:

The Pod "pod-deme" is invalid: spec.containers[0].name: Invalid value: "configVolume": a lowercase RFC 1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name',  or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')

Container names may contain only lowercase letters, digits, and hyphens; the uppercase letter in "configVolume" triggers this error.
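The validation regex quoted in the error message can be used to pre-check a name before applying the manifest; a small sketch:

```shell
# Check a candidate name against the RFC 1123 label regex quoted in the
# error: '[a-z0-9]([-a-z0-9]*[a-z0-9])?'
valid_name() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'
}
valid_name configVolume  && echo "configVolume: ok"  || echo "configVolume: invalid"
valid_name config-volume && echo "config-volume: ok" || echo "config-volume: invalid"
```

The first name fails because of the uppercase V; renaming it config-volume passes.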

Pod cannot be deleted

Set the finalizers to null:
kubectl patch pod xxx -n xxx -p '{"metadata":{"finalizers":null}}'

Or force-delete it:

$ kubectl delete pod xxx -n xxx --force --grace-period=0

Namespace cannot be deleted

In one terminal, start a local API proxy:
kubectl proxy &

In another terminal, run the commands below to delete the namespace.
Replace kubernetes-dashboard with the namespace to be deleted.
jq must be installed first:
yum install -y jq

NAMESPACE=kubernetes-dashboard 
kubectl get namespace $NAMESPACE -o json |jq '.spec = {"finalizers":[]}' > temp.json
curl -k -H "Content-Type: application/json" -X PUT --data-binary @temp.json 127.0.0.1:8001/api/v1/namespaces/$NAMESPACE/finalize 

plugin type="flannel" failed (add): failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24

Delete the stale cni0 bridge and recreate it with the address the error message expects (10.244.1.1/24 here; use whatever subnet your error reports):

ip link delete cni0
ip link add cni0 type bridge
ip link set dev cni0 up
ifconfig cni0 10.244.1.1/24
ifconfig cni0 mtu 1450 up

In the containerd configuration file /etc/containerd/config.toml, set:

sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"

Images cannot be pulled

Method 1: pull from another registry

Be sure to set imagePullPolicy so a locally present image is not pulled from the registry again:
image: xxxx
imagePullPolicy: IfNotPresent

List the images in the runtime's image store:
ctr -n k8s.io image ls

The following Docker registry/mirror addresses all work:
 "https://hub.docker.com/",
    "https://registry.docker-cn.com",
    "https://ue05qxiu.mirror.aliyuncs.com",
    "https://docker.mirrors.ustc.edu.cn/",
    "https://hub-mirror.c.163.com/",
    "https://reg-mirror.qiniu.com"

IMAGE is the image and tag you need; SOURCE is the mirror address. The image must be pulled on every node, unless the pod is pinned to a specific node.

IMAGE=nginx:1.21
SOURCE=ue05qxiu.mirror.aliyuncs.com
ctr  -n k8s.io  images pull  $SOURCE/library/$IMAGE
ctr -n k8s.io images tag $SOURCE/library/$IMAGE docker.io/library/$IMAGE
ctr -n k8s.io i rm $SOURCE/library/$IMAGE
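The three ctr commands above can be wrapped in a small helper. This is a sketch: the mirror host is an example from the list above, and if ctr is not installed the helper degrades to printing the commands it would run.

```shell
# Pull an image via a mirror and retag it under its canonical docker.io
# name. Without ctr installed, the helper only prints the commands.
pull_via_mirror() {
  image="$1"
  source="${2:-ue05qxiu.mirror.aliyuncs.com}"
  run="ctr"
  command -v ctr >/dev/null 2>&1 || run="echo ctr"
  $run -n k8s.io images pull "$source/library/$image" && \
  $run -n k8s.io images tag "$source/library/$image" "docker.io/library/$image" && \
  $run -n k8s.io images rm "$source/library/$image"
}
pull_via_mirror nginx:1.21
```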

Method 2: import a Docker image into k8s directly

IMAGE=nginx:1.21
docker save -o $IMAGE $IMAGE
ctr -n=k8s.io images import $IMAGE
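Note that docker save writes to a file, so using the tag as the file name produces an archive literally named "nginx:1.21". Replacing ':' and '/' gives a safer name; a sketch (the docker/ctr lines are shown commented for context):

```shell
# Build a filesystem-friendly archive name from the image tag.
IMAGE=nginx:1.21
FILE="$(printf '%s' "$IMAGE" | tr ':/' '__').tar"
echo "$FILE"    # nginx_1.21.tar
# docker save -o "$FILE" "$IMAGE"
# ctr -n k8s.io images import "$FILE"
```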

Pods cannot run on the master node

Add a toleration for the control-plane taint to the pod spec.
The configuration may differ between versions.

 spec:
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
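Clusters older than v1.24 taint the master with the key node-role.kubernetes.io/master instead, so tolerating both keys with operator: Exists covers either version; a sketch:

```yaml
spec:
  tolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule
  - key: node-role.kubernetes.io/master   # taint key used before v1.24
    operator: Exists
    effect: NoSchedule
```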

error: no objects passed to apply

The cause was a hostPath volume without a type:

      volumes:
      - name: config-volume
        configMap:
          name: alertmanager
      - name: storage-volume
        hostPath:
          path: /data/alertmanager/
          type: Directory  # this line