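Building a Kubernetes Cluster over Tencent Cloud CCN (Cloud Connect Network)

Background:

For the network environment, see the earlier post on trying out Cloud Connect Network (云联网体验): there are two VPC networks, one in Shanghai and one in Beijing. The servers are laid out as listed in the basic plan below.

A word on why I use TencentOS Server 3.1 (TK4): CentOS 8 no longer gets long-term maintenance, and this is a chance to try Tencent Cloud's open-source TencentOS. Details are on the Tencent Cloud site: https://cloud.tencent.com/document/product/213/38027. Since it is compatible with CentOS 8, I follow the usual CentOS 8 procedure for building Kubernetes and see whether a cross-region cluster is feasible.

Basic plan:

Note: spreading the nodes across multiple zones also helps with high availability.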

IP            Hostname       Zone
10.10.2.8     sh-master-01   Shanghai Zone 2
10.10.2.10    sh-master-02   Shanghai Zone 2
10.10.5.4     sh-master-03   Shanghai Zone 5
10.10.4.7     sh-work-01     Shanghai Zone 4
10.10.4.14    sh-work-02     Shanghai Zone 4
10.10.12.9    bj-work-01     Beijing Zone 5

Create an internal (private network) load balancer (SLB) to serve as the VIP for the apiserver. I had always used the classic type; now only the application type is available.

System initialization

Note: steps 1-12 are run on all nodes.

1. Change the hostname

Note: only needed on machines whose hostname has not been set yet.

[root@VM-2-8-centos ~]# hostnamectl set-hostname sh-master-01
[root@VM-2-8-centos ~]# cat /etc/hostname
sh-master-01

Do the same on the other machines.

2. Disable swap

swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
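The sed line above comments out every line that mentions "swap", including lines that are already commented. A slightly stricter variant is sketched below; the function name `comment_swap_entries` is hypothetical, and it assumes a standard whitespace-separated fstab with the filesystem type in the third column.

```shell
# Comment out active swap entries in an fstab-style file, idempotently.
# Assumption: GNU sed (for -i and -E), as shipped on CentOS-family systems.
comment_swap_entries() {
    local fstab="$1"
    # Skip lines that are already comments; match only when the
    # third field (filesystem type) is exactly "swap".
    sed -i -E '/^[[:space:]]*#/! s/^([^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$fstab"
}
```

Usage would be `swapoff -a && comment_swap_entries /etc/fstab`; running it twice leaves the file unchanged.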

3. Disable SELinux

[root@sh-master-01 ~]# setenforce 0
setenforce: SELinux is disabled
[root@sh-master-01 ~]# sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/sysconfig/selinux
[root@sh-master-01 ~]# sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
[root@sh-master-01 ~]# sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/sysconfig/selinux
[root@sh-master-01 ~]# sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/selinux/config
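The four sed calls can be folded into one substitution that handles whichever mode is currently set. This is only a sketch: the helper name `set_selinux_disabled` is made up for illustration, and it assumes GNU sed.

```shell
# Force SELINUX=disabled in a SELinux config file, whatever the current mode.
# SELINUXTYPE= and comment lines are left untouched.
set_selinux_disabled() {
    local conf="$1"
    sed -i -E 's/^SELINUX=(enforcing|permissive|disabled)[[:space:]]*$/SELINUX=disabled/' "$conf"
}
```

For example: `set_selinux_disabled /etc/selinux/config`.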

4. Disable the firewall

systemctl disable --now firewalld
chkconfig firewalld off

Note: if neither firewalld nor iptables is installed, this step can be skipped.

5. Raise the open-file and related limits

cat > /etc/security/limits.conf <<EOF
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
* soft memlock unlimited
* hard memlock unlimited
EOF

TencentOS also appears to ship an 80-nofile.conf under the limits.d directory. Custom settings can all go in that directory, which avoids editing the main file.
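A drop-in file under limits.d could look like the sketch below. The file name `90-k8s.conf` and the function name are assumptions for illustration; the number 90 is chosen on the assumption that files are read in name order, so it would be applied after 80-nofile.conf.

```shell
# Write the limits as a drop-in instead of overwriting /etc/security/limits.conf.
# $1: target directory (defaults to /etc/security/limits.d). Prints the file path.
write_limits_dropin() {
    local dir="${1:-/etc/security/limits.d}"
    local file="$dir/90-k8s.conf"   # hypothetical file name
    cat > "$file" <<'EOF'
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
* soft memlock unlimited
* hard memlock unlimited
EOF
    echo "$file"
}
```

The new limits apply to sessions opened after the file is written.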

6. yum update

yum -y update
yum -y install gcc bc gcc-c++ ncurses ncurses-devel cmake elfutils-libelf-devel openssl openssl-devel flex* bison* autoconf automake zlib* libxml* libmcrypt* libtool-ltdl-devel* make pcre pcre-devel jemalloc-devel tcl libtool vim unzip wget lrzsz bash-completion ipvsadm ipset jq sysstat conntrack conntrack-tools libseccomp socat curl git psmisc nfs-utils tree net-tools crontabs iftop nload strace bind-utils tcpdump htop telnet lsof

I actually skipped this step here: I usually initialize my CVMs with the oneinstack script.

7. Load the ipvs kernel modules

The TencentOS kernel is 5.4.119.

:> /etc/modules-load.d/ipvs.conf
module=(
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
br_netfilter
  )
for kernel_module in ${module[@]};do
    /sbin/modinfo -F filename $kernel_module |& grep -qv ERROR && echo $kernel_module >> /etc/modules-load.d/ipvs.conf || :
done

systemctl daemon-reload
systemctl enable --now systemd-modules-load.service

Verify that the ipvs modules are loaded:

# lsmod | grep ip_vs
ip_vs_sh               16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  5
ip_vs                 151552  11 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack          114688  5 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE,ip_vs
nf_defrag_ipv6         20480  2 nf_conntrack,ip_vs
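The grep above only shows the ip_vs family, so br_netfilter is not covered by it. A small check that lists whichever of the six modules are missing might look like this; the function name `missing_modules` is hypothetical.

```shell
# Read an lsmod-style listing on stdin and print every required module
# that does not appear in it (one per line). Prints nothing if all are loaded.
missing_modules() {
    local required="ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack br_netfilter"
    local loaded m
    # The first column of each lsmod line is the module name.
    loaded="$(awk '{print $1}')"
    for m in $required; do
        printf '%s\n' "$loaded" | grep -qx "$m" || echo "$m"
    done
}
```

Typical use: `lsmod | missing_modules`.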

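8. Tune kernel parameters (not necessarily optimal; take what you need)

These are the defaults installed by the oneinstack initialization. I'm leaving them unchanged for now and will troubleshoot if problems show up.

cat /etc/sysctl.d/99-sysctl.conf

fs.file-max=1000000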
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_max_syn_backlog = 16384
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_syncookies = 1
#net.ipv4.tcp_tw_len = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.ip_local_port_range = 1024 65000
net.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_established = 3600

9. Install containerd

CentOS 8 moved from yum to dnf; look up the details yourself, the usage is much the same. Out of habit I added the Aliyun repo first:

dnf install dnf-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sudo yum update -y && sudo yum install -y containerd.io
containerd config default > /etc/containerd/config.toml
# Replace containerd's default sandbox image: edit /etc/containerd/config.toml

sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"

Restart containerd:

systemctl daemon-reload
systemctl restart containerd

This did not work: the repo does not match this OS release. What now?

Try the Tencent mirror instead, removing the Aliyun repo first:

rm -rf /etc/yum.repos.d/docker-ce.repo
yum clean all

https://mirrors.cloud.tencent.com/docker-ce/linux/centos/

dnf install dnf-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.cloud.tencent.com/docker-ce/linux/centos/docker-ce.repo
sudo yum update -y && sudo yum install -y containerd.io
containerd config default > /etc/containerd/config.toml

Replace containerd's default sandbox image by editing /etc/containerd/config.toml:

sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"

Restart containerd:

systemctl daemon-reload
systemctl restart containerd

Same result: the repo file does not adapt to this OS release on its own. What now? Edit it by hand?

That worked. I hope TencentOS will support the common yum repos out of the box, so that I don't have to convert them by hand.
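The exact manual edit is not shown in the text. One common way to make a CentOS repo file work on a compatible distribution, offered here purely as an assumption, is to pin the $releasever variable to the CentOS major version. The function name `pin_releasever` is hypothetical.

```shell
# Replace the literal $releasever variable in a .repo file with a fixed
# release number (default 8, since TencentOS 3.1 is CentOS 8 compatible).
pin_releasever() {
    local repo="$1" ver="${2:-8}"
    # \$ makes sed match a literal dollar sign.
    sed -i 's/\$releasever/'"$ver"'/g' "$repo"
}
```

For example: `pin_releasever /etc/yum.repos.d/docker-ce.repo 8`, followed by `yum clean all` and a retry of the install.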

containerd config default > /etc/containerd/config.toml

# Restart containerd
systemctl daemon-reload
systemctl restart containerd
systemctl status containerd

10. Configure the CRI client crictl

Note: crictl versions appear to be matched to Kubernetes versions.

VERSION="v1.22.0"
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
sudo tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin
rm -f crictl-$VERSION-linux-amd64.tar.gz

If the download stalls, fetch the file from GitHub on your desktop and upload it manually.

cat <<EOF > /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF

Verify that it works (this is also a chance to test a private registry):

crictl pull nginx:alpine
crictl rmi nginx:alpine
crictl images

In /etc/containerd/config.toml, I changed the endpoint under plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io" to an Aliyun registry accelerator address (any other accelerator works too), and added SystemdCgroup = true under plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options.

The endpoint is replaced with the Aliyun accelerator address: https://2lefsjdg.mirror.aliyuncs.com

Restart the containerd service and pull the image again to verify:

systemctl restart containerd.service
crictl pull nginx:alpine

OK
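The sandbox image and SystemdCgroup edits to config.toml can also be scripted. The sketch below assumes the generated file already contains a `sandbox_image = ...` line and a `SystemdCgroup = false` line (some containerd versions do not emit the latter, in which case it still has to be added by hand); the function name is hypothetical.

```shell
# Patch a containerd config.toml in place:
#   - point sandbox_image at a reachable pause image
#   - flip SystemdCgroup from false to true (only if the key exists)
patch_containerd_config() {
    local cfg="$1"
    local pause="${2:-registry.aliyuncs.com/google_containers/pause:3.2}"
    sed -i -E \
        -e "s#^([[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*).*#\1\"${pause}\"#" \
        -e 's#^([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*)false#\1true#' \
        "$cfg"
}
```

After `patch_containerd_config /etc/containerd/config.toml`, restart containerd as shown above.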

11. Install kubeadm (there is no CentOS 8 repo, so use the Aliyun CentOS 7 repo)

Note: why version 1.21.3? Because my production cluster also runs 1.21.3, which makes this a good place to test an upgrade later.

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Remove old versions, if any are installed:

yum remove kubeadm kubectl kubelet kubernetes-cni cri-tools socat

List all installable versions:

yum list --showduplicates kubeadm --disableexcludes=kubernetes

Install a specific version with:

yum -y install kubeadm-1.21.3 kubectl-1.21.3 kubelet-1.21.3

or install the latest stable version (1.22.4 at the time of writing):

#yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

Enable kubelet at boot:

systemctl enable kubelet.service

You can of course use the Tencent Cloud mirror here as well; it works the same way.

12. Modify the kubelet configuration

vi /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock"

Additional steps on the master nodes:

1. Install haproxy

Note: haproxy must be installed and configured on all three master nodes.

yum install haproxy

cat <<EOF >  /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events. This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.* /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode tcp
    log global
    option tcplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000

#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes
    bind *:8443 # listen on port 8443
    mode tcp
    default_backend kubernetes
#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend kubernetes # backend servers: connections arriving on port 8443 are forwarded to the apiservers on the three masters, which gives load balancing
    balance roundrobin
    server master1 10.10.2.8:6443 check maxconn 2000
    server master2 10.10.2.10:6443 check maxconn 2000
    server master3 10.10.5.4:6443 check maxconn 2000
EOF
systemctl enable haproxy && systemctl start haproxy && systemctl status haproxy

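Log in to the Tencent Cloud load balancer console at https://console.cloud.tencent.com/clb and create a TCP listener named k8s on port 6443. Bind the backend to port 8443 on the three master nodes; I left the weight at the default of 10.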
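The traffic path is therefore load balancer VIP port 6443, then haproxy port 8443 on a master, then an apiserver on port 6443. When another master is added later, the haproxy `server` lines have to be extended on every node. A small generator, sketched here with a hypothetical function name, keeps them consistent:

```shell
# Print haproxy "server" lines for a backend.
# $1: apiserver port, remaining args: master IPs in order.
gen_backend_servers() {
    local port="$1"
    shift
    local i=1 ip
    for ip in "$@"; do
        printf '    server master%d %s:%s check maxconn 2000\n' "$i" "$ip" "$port"
        i=$((i + 1))
    done
}
```

`gen_backend_servers 6443 10.10.2.8 10.10.2.10 10.10.5.4` reproduces the three server lines used in the configuration above.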

2. Generate the configuration file on sh-master-01

Note: sh-master-02 or sh-master-03 would work just as well.

kubeadm config print init-defaults > config.yaml

Edit the configuration file as follows:

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.10.2.8
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
  name: sh-master-01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs:
  - sh-master-01
  - sh-master-02
  - sh-master-03
  - sh-master.k8s.io
  - localhost
  - 127.0.0.1
  - 10.10.2.8
  - 10.10.2.10
  - 10.10.5.4
  - 10.10.2.4
  - xx.xx.xx.xx
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "10.10.2.4:6443"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.21.3
networking:
  dnsDomain: cluster.local
  serviceSubnet: 172.31.0.0/16
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr"
  strictARP: false
  syncPeriod: 15s
iptables:
  masqueradeAll: true
  masqueradeBit: 14
  minSyncPeriod: 0s
  syncPeriod: 30s

This adds the ipvs configuration, sets the service subnet, and points to an image registry inside China. xx.xx.xx.xx is an IP I reserved (reserving IPs at least makes it easier to add master nodes later).
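Because every node reaches the apiserver through controlPlaneEndpoint, that address has to appear in certSANs, as 10.10.2.4 does here. A quick pre-flight check for this is sketched below; the function name is hypothetical, and the parsing is a plain text match that assumes the endpoint is written as host:port on one line, not a real YAML parse.

```shell
# Succeed (exit 0) if the host part of controlPlaneEndpoint also appears
# as a list item (e.g. under certSANs) in the given kubeadm config file.
endpoint_in_sans() {
    local cfg="$1" host
    host="$(sed -n -E 's/^controlPlaneEndpoint:[[:space:]]*"?([^":]+):[0-9]+"?[[:space:]]*$/\1/p' "$cfg" | head -n 1)"
    [ -n "$host" ] || return 1
    # Strip the leading "- " from every YAML list item, then look for an exact match.
    sed -n -E 's/^[[:space:]]*-[[:space:]]+//p' "$cfg" | grep -qxF "$host"
}
```

For example: `endpoint_in_sans /root/config.yaml && echo "VIP is covered by the certificate"`.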

3. Initialize the cluster on sh-master-01 with kubeadm

kubeadm init --config /root/config.yaml

Note: the screenshot for this step does not match the command above, because I first tried to install cilium, failed, and fell back to calico.

The kernel tuning earlier left out net.ipv4.ip_forward. Note that sysctl -w is only temporary:

sysctl -w net.ipv4.ip_forward=1

To make it permanent, add it to the configuration file:

cat <<EOF > /etc/sysctl.d/99-sysctl.conf
fs.file-max=1000000
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_max_syn_backlog = 16384
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_syncookies = 1
#net.ipv4.tcp_tw_len = 1
net.ipv4.ip_forward = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.ip_local_port_range = 1024 65000
net.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_established = 3600
EOF

sysctl --system

Note: run this on all nodes.
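Rewriting the whole file just to add one key is easy to get wrong across six nodes. A sketch of an idempotent alternative follows: it sets a single key in a sysctl file, replacing it if present and appending it otherwise. The function name `ensure_sysctl` is hypothetical, and the key is matched as a regular expression, so dots match loosely.

```shell
# Ensure "key = value" is present exactly once in a sysctl config file.
# $1: key, $2: value, $3: file
ensure_sysctl() {
    local key="$1" value="$2" file="$3"
    if grep -q -E "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null; then
        # Key exists: rewrite the line with the new value.
        sed -i -E "s/^[[:space:]]*${key}[[:space:]]*=.*/${key} = ${value}/" "$file"
    else
        # Key missing: append it.
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}
```

For example: `ensure_sysctl net.ipv4.ip_forward 1 /etc/sysctl.d/99-sysctl.conf && sysctl --system`.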

Then rerun the initialization on sh-master-01:

kubeadm init --config /root/config.yaml

4. Join sh-master-02 and sh-master-03 to the cluster as control-plane nodes

On sh-master-01, set up kubectl access as instructed by the kubeadm init output:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Then join sh-master-02 and sh-master-03 following the kubeadm init output. First pack ca.*, sa.*, front-proxy-ca.* and etcd/ca* from /etc/kubernetes/pki on sh-master-01 and distribute them to /etc/kubernetes/pki on sh-master-02 and sh-master-03, then run:

kubeadm join 10.10.2.4:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:ccfd4e2b85a6a07fde8580422769c9e14113e8f05e95272e51cca2f13b0eb8c3 --control-plane

Then run the same commands as on sh-master-01:

mkdir -p $HOME/.kube
sudo \cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
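The certificate distribution step is described above only in words. It could be scripted roughly as follows; the function name and the archive path are assumptions, and the file names are the standard ones kubeadm generates for the patterns ca.*, sa.*, front-proxy-ca.* and etcd/ca*.

```shell
# Pack the shared CA material from a kubeadm pki directory into a tarball.
# $1: pki directory (normally /etc/kubernetes/pki)
# $2: output archive, given as an absolute path because the function cd's first
pack_pki() {
    local pki_dir="$1" out="$2"
    ( cd "$pki_dir" && tar -czf "$out" \
        ca.crt ca.key sa.key sa.pub \
        front-proxy-ca.crt front-proxy-ca.key \
        etcd/ca.crt etcd/ca.key )
}
```

On sh-master-01 this would be `pack_pki /etc/kubernetes/pki /root/pki.tar.gz`, followed by copying the archive to each new master (for example with scp) and extracting it there with `tar -xzf pki.tar.gz -C /etc/kubernetes/pki` before running kubeadm join.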

kubectl get nodes

Since no CNI network plugin is installed yet, all nodes are in the NotReady state.

Joining the worker nodes to the cluster

kubeadm join 10.10.2.4:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:ccfd4e2b85a6a07fde8580422769c9e14113e8f05e95272e51cca2f13b0eb8c3

Before going further, I purchased 1 Mbps of bandwidth in the CCN console; this is only a test, after all.

Installing the CNI network plugin

To start with, I'm running a plain calico setup (I couldn't get flannel or cilium working at first; get one running first, then study and optimize the others later).

sed -i -e "s?192.168.0.0/16?172.31.0.0/16?g" calico.yaml

kubectl apply -f calico.yaml
kubectl get pods -n kube-system -o wide

Note: I also added a secondary CIDR in the Tencent Cloud VPC console, wondering whether that would let the container networks in other regions interconnect as well. I haven't tested it yet; I just thought of it and added it.

A simple ping test:

1. Deploy two pods in the Shanghai region

cat<<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:alpine
        name: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox:1.28.4
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF

Both ended up running in the Shanghai region:

[root@sh-master-01 ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 14 14h 172.31.45.132 sh-work-01 <none> <none>
nginx-7fb7fd49b4-zrg77 1/1 Running 0 14h 172.31.45.131 sh-work-01 <none> <none>

2. Start a pod in the Beijing region using nodeSelector

I also want a pod running in the Beijing region. How? Taking a shortcut: label the node and schedule with nodeSelector.

kubectl label node bj-work-01  zone=beijing

cat nginx1.yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx1
  name: nginx1
spec:
  nodeSelector: # schedule the pod onto nodes labeled zone=beijing
    zone: "beijing"
  containers:
  - image: nginx
    name: nginx1
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

kubectl apply -f nginx1.yaml

[root@sh-master-01 ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 14 14h 172.31.45.132 sh-work-01 <none> <none>
nginx-7fb7fd49b4-zrg77 1/1 Running 0 14h 172.31.45.131 sh-work-01 <none> <none>
nginx1 1/1 Running 0 14h 172.31.89.194 bj-work-01 <none> <none>

3. Ping test

From sh-master-02, ping the Beijing pod and a Shanghai pod and compare the latency.
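To compare the two regions numerically rather than by eye, the average round-trip time can be pulled out of the ping summary line. This is a sketch that assumes the iputils summary format (`rtt min/avg/max/mdev = ...`); the function name is hypothetical.

```shell
# Read ping output on stdin and print the average RTT in milliseconds.
# Only the iputils summary line ("rtt min/avg/max/mdev = a/b/c/d ms") is handled.
ping_avg_ms() {
    # Splitting on "/" puts the avg value in the 5th field of that line.
    awk -F'/' '/^rtt/ {print $5}'
}
```

For example, using the pod IPs from the listing above: `ping -c 10 172.31.89.194 | ping_avg_ms` for the Beijing pod and `ping -c 10 172.31.45.131 | ping_avg_ms` for the Shanghai pod.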

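The results are roughly the same. The main goal was to verify that a Kubernetes cluster can be built across VPCs in different regions. I haven't yet worked out how to test network quality and the like, so treat this as a starting point. The cloud makes this much easier to a large degree: at the very least, BGP configuration and the like can mostly be skipped. If you need to build a cross-region Kubernetes cluster in the cloud, this may serve as a reference.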