Scenario 1: Deploying the same application to different hosts

A Kubernetes cluster usually has many nodes running containers. Pod anti-affinity can be used to spread the replicas of one application across different nodes, giving higher availability and avoiding the risk of every replica of an application landing on the same host.

1. Check node taints

[root@k8s-master01 ~]# kubectl describe node | grep Taint
Taints:             <none>
Taints:             <none>
Taints:             <none>
Taints:             <none>

2. Create a YAML file named podAntiAffinity01.yaml

[root@k8s-master01 Affinity]# vim podAntiAffinity01.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: must-be-diff-nodes
  name: must-be-diff-nodes
  namespace: kube-public
spec:
  replicas: 3
  selector:
    matchLabels:
      app: must-be-diff-nodes
  template:
    metadata:
      labels:
        app: must-be-diff-nodes
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - must-be-diff-nodes
            topologyKey: kubernetes.io/hostname
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/zq-demo/nginx:1.14.2
        imagePullPolicy: IfNotPresent
        name: must-be-diff-nodes

3. Deploy

[root@k8s-master01 Affinity]# kubectl create -f podAntiAffinity01.yaml

4. Check the Pod status; the Pods are scheduled onto different nodes

[root@k8s-master01 Affinity]# kubectl get po -n kube-public -owide
NAME                                 READY   STATUS    RESTARTS   AGE     IP              NODE           NOMINATED NODE   READINESS GATES
must-be-diff-nodes-bdbb64998-2l9j4   1/1     Running   0          4m19s   172.27.14.221   k8s-node02     <none>           <none>
must-be-diff-nodes-bdbb64998-lx444   1/1     Running   0          4m19s   172.17.125.19   k8s-node01     <none>           <none>
must-be-diff-nodes-bdbb64998-slntf   1/1     Running   0          4m19s   172.18.195.5    k8s-master03   <none>           <none>

5. Scale the Deployment to 6 replicas

[root@k8s-master01 Affinity]# kubectl scale deployment must-be-diff-nodes --replicas=6 -n kube-public
deployment.apps/must-be-diff-nodes scaled

6. Check the Pods again; one Pod is stuck in Pending

[root@k8s-master01 Affinity]# kubectl get po -n kube-public
NAME                                 READY   STATUS    RESTARTS   AGE
must-be-diff-nodes-bdbb64998-2l9j4   1/1     Running   0          7m34s
must-be-diff-nodes-bdbb64998-58ngr   0/1     Pending   0          102s
must-be-diff-nodes-bdbb64998-drkk2   1/1     Running   0          102s
must-be-diff-nodes-bdbb64998-lx444   1/1     Running   0          7m34s
must-be-diff-nodes-bdbb64998-slntf   1/1     Running   0          7m34s
must-be-diff-nodes-bdbb64998-t9zld   1/1     Running   0          102s

7. Inspect why the Pod is Pending: the rule is a hard (required) anti-affinity, so every replica must run on a different host. The cluster has only 5 schedulable nodes but the Deployment asks for 6 replicas, so the sixth Pod stays Pending.

[root@k8s-master01 Affinity]# kubectl describe po must-be-diff-nodes-bdbb64998-58ngr -n kube-public
...
...
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  85s   default-scheduler  0/5 nodes are available: 5 node(s) didn't match pod anti-affinity rules.
  Warning  FailedScheduling  53s   default-scheduler  0/5 nodes are available: 5 node(s) didn't match pod anti-affinity rules.
  Warning  FailedScheduling  7s    default-scheduler  0/5 nodes are available: 5 node(s) didn't match pod anti-affinity rules.
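If the extra replica should still run somewhere instead of staying Pending, the hard rule can be relaxed into a soft one. A minimal sketch (same labels as above, only the affinity section changes) using preferredDuringSchedulingIgnoredDuringExecution, which spreads Pods across hosts when possible but still schedules them when every host already runs one:

```yaml
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - must-be-diff-nodes
              topologyKey: kubernetes.io/hostname
```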

Scenario 2: Restricting Pods to worker nodes

Method 1: Use a nodeSelector to restrict Pods to worker nodes

1. Check node taints

[root@k8s-master01 ~]# kubectl describe node | grep Taint
Taints:             <none>
Taints:             <none>
Taints:             <none>
Taints:             <none>

2. Label the worker nodes

[root@k8s-master01 ~]# kubectl label node k8s-node01  k8s-node02  app=must-be-diff-nodes

3. Create a YAML file named podAntiAffinity02.yaml

[root@k8s-master01 Affinity]# vim podAntiAffinity02.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: must-be-diff-nodes
  name: must-be-diff-nodes
  namespace: kube-public
spec:
  replicas: 2
  selector:
    matchLabels:
      app: must-be-diff-nodes
  template:
    metadata:
      labels:
        app: must-be-diff-nodes
    spec:
      nodeSelector:
        app: must-be-diff-nodes
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/zq-demo/nginx:1.14.2
        imagePullPolicy: IfNotPresent
        name: must-be-diff-nodes

4. Deploy

[root@k8s-master01 Affinity]# kubectl create -f podAntiAffinity02.yaml
deployment.apps/must-be-diff-nodes created

5. Check the Pod status; the Pods are scheduled onto the labeled worker nodes

[root@k8s-master01 Affinity]# kubectl get po -n kube-public -owide
NAME                                 READY   STATUS    RESTARTS   AGE     IP              NODE           NOMINATED NODE   READINESS GATES
must-be-diff-nodes-bdbb64998-2l9j4   1/1     Running   0          4m19s   172.27.14.221   k8s-node02     <none>           <none>
must-be-diff-nodes-bdbb64998-lx444   1/1     Running   0          4m19s   172.17.125.19   k8s-node01     <none>           <none>
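For comparison, the nodeSelector above can also be written as a hard node affinity rule, which behaves the same here but supports richer operators (In, NotIn, Exists, and so on):

```yaml
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: app
                operator: In
                values:
                - must-be-diff-nodes
```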

Method 2: Restrict Pods to worker nodes with Pod anti-affinity over a custom topology domain

1. Check node taints

[root@k8s-master01 ~]# kubectl describe node | grep Taint
Taints:             <none>
Taints:             <none>
Taints:             <none>
Taints:             <none>

2. Label the master nodes with app=must-be-diff-nodes

[root@k8s-master01 ~]# kubectl label node  k8s-master01  k8s-master02  k8s-master03 app=must-be-diff-nodes

3. Create a YAML file named podAntiAffinity03.yaml

[root@k8s-master01 Affinity]# vim podAntiAffinity03.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: must-be-diff-nodes
  name: must-be-diff-nodes
  namespace: kube-public
spec:
  replicas: 2
  selector:
    matchLabels:
      app: must-be-diff-nodes
  template:
    metadata:
      labels:
        app: must-be-diff-nodes
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: NotIn
                values:
                - must-be-diff-nodes
            topologyKey: app
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/zq-demo/nginx:1.14.2
        imagePullPolicy: IfNotPresent
        name: must-be-diff-nodes

4. Deploy

[root@k8s-master01 Affinity]# kubectl create -f podAntiAffinity03.yaml
deployment.apps/must-be-diff-nodes created

5. Check the Pod status; the Pods are scheduled onto the worker nodes

[root@k8s-master01 Affinity]# kubectl get po -n kube-public -owide
NAME                                 READY   STATUS    RESTARTS   AGE   IP               NODE           NOMINATED NODE   READINESS GATES
must-be-diff-nodes-b594b4ff4-f54ch   1/1     Running   0          21s   172.17.125.14    k8s-node01     <none>           <none>
must-be-diff-nodes-b594b4ff4-p767f   1/1     Running   0          21s   172.27.14.202    k8s-node02     <none>           <none>

Method 3: Restrict Pods to worker nodes with node affinity

1. Check node taints

[root@k8s-master01 ~]# kubectl describe node | grep Taint
Taints:             <none>
Taints:             <none>
Taints:             <none>
Taints:             <none>

2. Label the master nodes with app=must-be-diff-nodes

[root@k8s-master01 ~]# kubectl label node  k8s-master01  k8s-master02  k8s-master03 app=must-be-diff-nodes

3. Create a YAML file named podAntiAffinity04.yaml

[root@k8s-master01 Affinity]# vim podAntiAffinity04.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: must-be-diff-nodes
  name: must-be-diff-nodes
  namespace: kube-public
spec:
  replicas: 2
  selector:
    matchLabels:
      app: must-be-diff-nodes
  template:
    metadata:
      labels:
        app: must-be-diff-nodes
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: app
                operator: NotIn
                values:
                - must-be-diff-nodes
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/zq-demo/nginx:1.14.2
        imagePullPolicy: IfNotPresent
        name: must-be-diff-nodes

4. Deploy

[root@k8s-master01 Affinity]# kubectl create -f podAntiAffinity04.yaml
deployment.apps/must-be-diff-nodes created

5. Check the Pod status; the Pods are scheduled onto the worker nodes

[root@k8s-master01 Affinity]# kubectl get po -n kube-public -owide
NAME                                 READY   STATUS    RESTARTS   AGE   IP               NODE           NOMINATED NODE   READINESS GATES
must-be-diff-nodes-b594b4ff4-f54ch   1/1     Running   0          21s   172.17.125.14    k8s-node01     <none>           <none>
must-be-diff-nodes-b594b4ff4-p767f   1/1     Running   0          21s   172.27.14.202    k8s-node02     <none>           <none>
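A variant of this method that needs no custom labels: select worker nodes by excluding the built-in control-plane role label (node-role.kubernetes.io/control-plane on recent Kubernetes versions; older clusters may use node-role.kubernetes.io/master instead):

```yaml
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: DoesNotExist
```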

Scenario 3: Preferring higher-spec servers

1. For this demo, label k8s-master01 and k8s-node01 with ssd=true, k8s-master01 additionally with gpu=true, and k8s-node02 with type=physical

[root@k8s-master01 study]# kubectl label node k8s-master01 k8s-node01 ssd=true
[root@k8s-master01 study]# kubectl label node k8s-master01  gpu=true
[root@k8s-master01 study]# kubectl label node k8s-node02 type=physical

2. Verify that the labels were applied

[root@k8s-master01 study]# kubectl get node --show-labels | grep ssd
k8s-master01   Ready    control-plane,master   8d    v1.23.14   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,gpu=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master01,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,ssd=true
k8s-node01     Ready    <none>                 8d    v1.23.14   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node01,kubernetes.io/os=linux,ssd=true

[root@k8s-master01 study]# kubectl get node --show-labels | grep gpu
k8s-master01   Ready    control-plane,master   8d    v1.23.14   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,gpu=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master01,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,ssd=true

3. Create a YAML file named podAntiAffinity05.yaml

[root@k8s-master01 study]# vim podAntiAffinity05.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prefer-ssd
  name: prefer-ssd
  namespace: kube-public
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prefer-ssd
  template:
    metadata:
      labels:
        app: prefer-ssd
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: ssd
                operator: In
                values:
                - "true"
              - key: gpu
                operator: NotIn
                values:
                - "true"
            weight: 100
          - preference:
              matchExpressions:
              - key: type
                operator: In
                values:
                - physical
            weight: 10
      containers:
      - env:
        - name: TZ
          value: Asia/Shanghai
        - name: LANG
          value: C.UTF-8
        image: registry.cn-beijing.aliyuncs.com/dotbalo/nginx:1.15.12-alpine
        imagePullPolicy: IfNotPresent
        name: prefer-ssd

4. Deploy

[root@k8s-master01 study]# kubectl create  -f podAntiAffinity05.yaml

5. Check the result; the Pod is scheduled onto k8s-node01, the only node that satisfies the weight-100 preference (ssd=true and no gpu=true)

[root@k8s-master01 study]# kubectl get po -owide -n kube-public
NAME                          READY   STATUS        RESTARTS   AGE     IP             NODE           NOMINATED NODE   READINESS GATES
prefer-ssd-7cc4cd68c9-rdx2f   1/1     Running       0          9s      172.17.125.3   k8s-node01     <none>           <none>

6. Remove the ssd=true label from k8s-node01 and recreate the Deployment

[root@k8s-master01 study]# kubectl label node k8s-node01 ssd-
[root@k8s-master01 study]# kubectl delete -f podAntiAffinity05.yaml
[root@k8s-master01 study]# kubectl create -f podAntiAffinity05.yaml

7. Check the result; the Pod is now scheduled onto k8s-node02. k8s-master01 carries gpu=true, which the weight-100 preference rules out (gpu NotIn "true"), so with ssd=true gone from k8s-node01 no node satisfies that preference, and the scheduler falls back to the weight-10 term, picking k8s-node02 (type=physical).

[root@k8s-master01 study]# kubectl get po -owide -n kube-public
NAME                          READY   STATUS        RESTARTS   AGE   IP              NODE           NOMINATED NODE   READINESS GATES
prefer-ssd-7cc4cd68c9-m26ml   1/1     Running       0          32s   172.27.14.197   k8s-node02     <none>           <none>

Scenario 4: Pinning replicas of the same application to fixed nodes

Companies sometimes run stateful services in a Kubernetes cluster. When the node hosting such a service goes down, we do not want the Pod to drift to another node automatically, because that could cause data loss, so the service has to run on fixed nodes. At the same time, we do not want multiple replicas to land on the same host. Combining a NodeSelector (NodeAffinity works too) with PodAntiAffinity satisfies both requirements.

1. Check node taints

[root@k8s-master01 ~]# kubectl describe node | grep Taint
Taints:             <none>
Taints:             <none>
Taints:             <none>
Taints:             <none>

2. Label the master nodes with app=must-be-diff-nodes

[root@k8s-master01 ~]# kubectl label node  k8s-master01  k8s-master02  k8s-master03 app=must-be-diff-nodes

3. Create a YAML file named podAntiAffinity06.yaml

[root@k8s-master01 Affinity]# vim podAntiAffinity06.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: must-be-diff-nodes
  name: must-be-diff-nodes
  namespace: kube-public
spec:
  replicas: 3
  selector:
    matchLabels:
      app: must-be-diff-nodes
  template:
    metadata:
      labels:
        app: must-be-diff-nodes
    spec:
      nodeSelector:
        app: must-be-diff-nodes
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - must-be-diff-nodes
            topologyKey: kubernetes.io/hostname
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/zq-demo/nginx:1.14.2
        imagePullPolicy: IfNotPresent
        name: must-be-diff-nodes

4. Deploy

[root@k8s-master01 Affinity]# kubectl create -f podAntiAffinity06.yaml
deployment.apps/must-be-diff-nodes created

5. Check the Pod status; the Pods are scheduled onto different master nodes

[root@k8s-master01 Affinity]# kubectl get po -n kube-public -owide
NAME                                  READY   STATUS    RESTARTS   AGE   IP               NODE           NOMINATED NODE   READINESS GATES
must-be-diff-nodes-776df5fc4c-l4nbk   1/1     Running   0          5s    172.18.195.16    k8s-master03   <none>           <none>
must-be-diff-nodes-776df5fc4c-mbjf8   1/1     Running   0          5s    172.25.244.216   k8s-master01   <none>           <none>
must-be-diff-nodes-776df5fc4c-x5s22   1/1     Running   0          5s    172.25.92.81     k8s-master02   <none>           <none>

Scenario 5: Deploying an application and its cache in the same domain

A very common architecture has the backend request a caching middleware instead of hitting the database directly, to speed up data loading. In practice the cache may itself be deployed in the Kubernetes cluster (as in the previous example), so Pod (anti-)affinity can be used to place the backend application and the caching middleware in the same domain but on different nodes, reducing network overhead.
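The manifest used below implements only the "different nodes" half with anti-affinity. The "same domain as the cache" half would additionally use Pod affinity toward the cache Pods; the following is a sketch that assumes the cache Pods carry a hypothetical app=cache label and the nodes carry the standard topology.kubernetes.io/zone label:

```yaml
      affinity:
        podAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - cache   # hypothetical label on the cache Pods
              topologyKey: topology.kubernetes.io/zone
```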

1. Check node taints

[root@k8s-master01 ~]# kubectl describe node | grep Taint
Taints:             <none>
Taints:             <none>
Taints:             <none>
Taints:             <none>

2. Create a YAML file named podAntiAffinity07.yaml

[root@k8s-master01 Affinity]# vim podAntiAffinity07.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: must-be-diff-nodes
  name: must-be-diff-nodes
  namespace: kube-public
spec:
  replicas: 3
  selector:
    matchLabels:
      app: must-be-diff-nodes
  template:
    metadata:
      labels:
        app: must-be-diff-nodes
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - must-be-diff-nodes
            topologyKey: kubernetes.io/hostname
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/zq-demo/nginx:1.14.2
        imagePullPolicy: IfNotPresent
        name: must-be-diff-nodes

3. Deploy

[root@k8s-master01 Affinity]# kubectl create -f podAntiAffinity07.yaml
deployment.apps/must-be-diff-nodes created

4. Check the Pod status; the Pods are scheduled onto different nodes

[root@k8s-master01 Affinity]# kubectl get po -n kube-public -owide
NAME                                 READY   STATUS    RESTARTS   AGE   IP               NODE           NOMINATED NODE   READINESS GATES
must-be-diff-nodes-b9df95fcd-4c2hn   1/1     Running   0          4s    172.25.244.217   k8s-master01   <none>           <none>
must-be-diff-nodes-b9df95fcd-4z59n   1/1     Running   0          4s    172.18.195.17    k8s-master03   <none>           <none>
must-be-diff-nodes-b9df95fcd-8h5bd   1/1     Running   0          4s    172.27.14.217    k8s-node02     <none>           <none>