Deploying a Kubernetes Cluster on Proxmox VE

This post documents the process of deploying a test cluster on my own workstation. Reference documentation

Environment

  • HP ML350 workstation running Proxmox VE 8.2
  • MacBook Pro as the local working machine

Build an Ubuntu VM template

Basic preparation

The template uses Ubuntu 24.04 Server amd64; the ubuntu user has already been added to the sudo group.
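
The post assumes the template VM already exists. As a rough sketch of one way to build it from the official cloud image on the Proxmox host (VM ID 9000, storage local-lvm, and bridge vmbr0 are placeholder assumptions, not part of the original setup):

# Download the Ubuntu 24.04 cloud image and turn it into a cloud-init template
wget https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
qm create 9000 --name ubuntu-2404-template --memory 4096 --cores 2 --net0 virtio,bridge=vmbr0
qm importdisk 9000 noble-server-cloudimg-amd64.img local-lvm
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0
qm set 9000 --ide2 local-lvm:cloudinit --boot c --bootdisk scsi0 --serial0 socket
qm template 9000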

# Update the packages first
sudo apt -y update && sudo apt -y dist-upgrade

Install the container engine

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

sudo apt install containerd.io

Configuration

Settings of interest in /etc/containerd/config.toml (both are revisited in the cluster setup below):

disabled_plugins = ["cri"]   # packaged default; "cri" must be removed from this list before running kubeadm
SystemdCgroup = true         # the cgroup driver setting kubeadm expects

Disable swap

sudo swapoff -a
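
swapoff -a only lasts until the next reboot. One common way to make it permanent (a sketch; check your /etc/fstab layout before editing it) is to comment out the swap entry:

sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab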

Enable packet forwarding

sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF


sudo sysctl --system

Configure kernel modules

sudo tee /etc/modules-load.d/k8s.conf <<EOF
br_netfilter
EOF

sudo modprobe br_netfilter
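
Because the bridge sysctls only take effect once br_netfilter is loaded, it is worth re-applying and checking them afterwards:

lsmod | grep br_netfilter
sudo sysctl --system
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward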

Install the Kubernetes packages

sudo apt-get install -y apt-transport-https ca-certificates curl

curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
sudo chmod 644 /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo chmod 644 /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt install -y kubelet kubeadm kubectl
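
The upstream install guide also recommends pinning these packages so routine upgrades do not move them to an incompatible version:

sudo apt-mark hold kubelet kubeadm kubectl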

Set up the cluster

Clone the Ubuntu template built above three times for later use.
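
For example, on the Proxmox host (the VM IDs and names below are placeholders):

qm clone 9000 201 --name k8s-cp1 --full
qm clone 9000 202 --name k8s-worker1 --full
qm clone 9000 203 --name k8s-worker2 --full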

Container runtime

If you installed containerd from a package (for example an RPM or .deb), you may find that the CRI integration plugin is disabled by default. CRI support must be enabled for containerd to be used with a Kubernetes cluster. Make sure that cri does not appear in the disabled_plugins list in /etc/containerd/config.toml, and restart containerd if you change that file. Related documentation

  • Reset the containerd configuration

    containerd config default | sudo tee /etc/containerd/config.toml
  • Configure the systemd cgroup driver

    # Edit /etc/containerd/config.toml
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
    ...
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
  • Override the sandbox (pause) image. I am using kubeadm 1.30, which requires changing the default sandbox image; otherwise kubeadm prints a warning.

    # Edit /etc/containerd/config.toml
    [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "registry.k8s.io/pause:3.9"
  • Once you have updated this configuration file, you will likely need to restart containerd again (a quick verification sketch follows this list):

    sudo systemctl restart containerd
  • GFW (set up a proxy so containerd can pull images)

    #1 Edit /etc/environment
    PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
    HTTP_PROXY=http://172.17.90.17:7890
    HTTPS_PROXY=http://172.17.90.17:7890
    NO_PROXY=localhost,127.0.0.1,.cluster.local,10.244.0.0/16,10.96.0.0/12,172.17.0.0/16,172.16.0.0/16,172.18.0.0/16

    #2 Add a proxy to the containerd service
    sudo mkdir -p /etc/systemd/system/containerd.service.d/
    sudo tee /etc/systemd/system/containerd.service.d/proxy.conf <<EOF
    [Service]
    EnvironmentFile=/etc/environment
    EOF
    # Add a proxy for the sandbox image service
    sudo mkdir -p /etc/systemd/system/sandbox-image.service.d
    sudo tee /etc/systemd/system/sandbox-image.service.d/proxy.conf <<EOF
    [Service]
    EnvironmentFile=/etc/environment
    EOF

    #3 Restart the services
    sudo systemctl daemon-reload
    sudo systemctl restart containerd.service
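
After these containerd changes, it can help to confirm they actually took effect. One way, using standard containerd and procfs tooling (nothing here is specific to this setup):

# The cri plugin should be listed and no longer skipped
sudo ctr plugins ls | grep cri
# The proxy variables should be visible in the running containerd process
sudo cat /proc/$(pidof containerd)/environ | tr '\0' '\n' | grep -i proxy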

Initialize the control plane with kubeadm

sudo kubeadm init --control-plane-endpoint=172.17.0.220 --pod-network-cidr=10.244.0.0/16

...

# The output looks like this:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

kubeadm join 172.17.0.220:6443 --token cp7q2f.xxxx \
--discovery-token-ca-cert-hash sha256:xxxxxxxx \
--control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.17.0.220:6443 --token cp7q2f.xxxx \
--discovery-token-ca-cert-hash sha256:xxxxxxxx

Note down the token and the certificate hash shown above; they are needed later when joining nodes.
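
If the join command is lost or the token expires (kubeadm bootstrap tokens expire after 24 hours by default), a fresh one can be printed on the control-plane node:

sudo kubeadm token create --print-join-command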

Install the Calico network add-on

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/tigera-operator.yaml


# Optional: install calicoctl (as a kubectl plugin)
curl -L https://github.com/projectcalico/calico/releases/download/v3.25.0/calicoctl-linux-amd64 -o kubectl-calico
chmod +x kubectl-calico
sudo mv kubectl-calico /usr/local/bin/
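
Note that the Tigera operator manifest alone does not configure Calico. The upstream quickstart follows it by applying the custom resources, whose default pod CIDR (192.168.0.0/16) should be changed to match the --pod-network-cidr used above; a sketch:

curl -LO https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/custom-resources.yaml
sed -i 's|192.168.0.0/16|10.244.0.0/16|' custom-resources.yaml
kubectl create -f custom-resources.yaml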

Join the nodes

sudo kubeadm join --token <token> <control-plane-host>:<control-plane-port> --discovery-token-ca-cert-hash sha256:<hash>
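
Once the workers have joined, the result can be checked from any machine that has the admin kubeconfig:

kubectl get nodes -o wide
kubectl get pods -A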