I. Question 2: How to Avoid Wasting Special Resources

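Day018 - K8s Taints and Tolerations: Fine-Grained Isolation of Cluster Resources - How to Avoid Wasting Special Resources

In a Kubernetes cluster, the following strategies help use special resources efficiently and avoid waste: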

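1. Master node resource isolation (keep non-system Pods off)

Default taint: on kubeadm-built clusters, master (control-plane) nodes carry a NoSchedule taint by default, which keeps ordinary Pods off them. Older releases use node-role.kubernetes.io/master:NoSchedule; kubeadm added node-role.kubernetes.io/control-plane:NoSchedule in v1.24 and dropped the master key in v1.25.
Recommendations:
Keep the default: unless there is a real need, do not remove the taint from master nodes, so that they run only core components (such as the API Server and Scheduler).
Use tolerations sparingly: if monitoring or log-collection Pods must run on master nodes, add an explicit toleration: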

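```yaml
tolerations:
# Key used by current kubeadm releases
- key: "node-role.kubernetes.io/control-plane"
  operator: "Exists"
  effect: "NoSchedule"
# Legacy key used by older releases
- key: "node-role.kubernetes.io/master"
  operator: "Exists"
  effect: "NoSchedule"
```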
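As a sketch of how such a toleration fits into a full workload, the DaemonSet below runs one log-collection Pod on every node, including master nodes. The name `log-agent`, the `logging` namespace, and the image `example.com/log-agent:1.0` are hypothetical placeholders, not part of the original setup. Both taint keys are tolerated on the assumption that the cluster may run either an older or a newer kubeadm release.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent            # hypothetical name
  namespace: logging         # hypothetical namespace; must already exist
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent       # must match spec.selector.matchLabels
    spec:
      tolerations:
      # Tolerate the control-plane taint (current key)
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Exists"
        effect: "NoSchedule"
      # Tolerate the legacy master taint (older clusters)
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: log-agent
        image: example.com/log-agent:1.0   # placeholder image
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            memory: 128Mi
```

Keeping the requests small matters here: whatever the agent reserves on a master node is no longer available to the API Server and the other core components.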

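2. Fine-grained scheduling for special-resource nodes (e.g., GPU or high-performance nodes)

Step 1: Mark the nodes. Add a label and a taint to each special-resource node: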

```shell
# Add a label (e.g., to identify GPU nodes)
kubectl label node Node01 hardware-type=gpu
# Add a taint (only Pods that declare a matching toleration can be scheduled here)
kubectl taint node Node01 hardware-type=gpu:NoSchedule
```
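For orientation, this is roughly what the relevant parts of the Node object look like once both commands have succeeded, as an excerpt of what `kubectl get node Node01 -o yaml` would return. All other fields are omitted, and `Node01` is simply the placeholder name used in the commands.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: Node01               # placeholder node name from the commands above
  labels:
    hardware-type: gpu       # set by `kubectl label`; used by selectors and affinity
spec:
  taints:
  - key: hardware-type       # set by `kubectl taint`
    value: gpu
    effect: NoSchedule       # repels new Pods that lack a matching toleration
```

The label and the taint share a key here, but they are independent: the label attracts Pods through selectors or affinity, while the taint repels Pods without a toleration. `NoSchedule` does not evict Pods that are already running on the node.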

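Step 2: Declare the toleration and the resource request in the Pod

In each Pod that needs the special resource, configure a toleration and a resource request: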

```yaml
spec:
  tolerations:
  - key: "hardware-type"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  containers:
  - name: gpu-app
    image: nvidia/cuda:latest  # pin a concrete tag in practice
    resources:
      limits:
        nvidia.com/gpu: 1  # explicitly request a GPU
```

3. Optimize scheduling with affinity

Node affinity:

Force the Pod onto nodes that have the specific resource:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: hardware-type
          operator: In
          values:
          - gpu
```
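When a hard requirement is too strict, for example because the workload can still run without the special hardware, the soft form of node affinity is an option. The following is a sketch under that assumption; the weight of 80 is an arbitrary illustrative value.

```yaml
affinity:
  nodeAffinity:
    # Soft rule: the scheduler favors matching nodes but may fall back to others
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 80               # illustrative value; allowed range is 1-100
      preference:
        matchExpressions:
        - key: hardware-type
          operator: In
          values:
          - gpu
```

With the required form, the Pod stays Pending if no matching node has room; with the preferred form, it may land on any other node it is allowed to run on.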

Pod anti-affinity:

Keep Pods of the same application on different nodes, so that replicas do not pile up on a single node:

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: my-app
      topologyKey: "kubernetes.io/hostname"
```
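Shown in context, the anti-affinity rule belongs in the Pod template, and its `labelSelector` has to match the labels of the very Pods it should keep apart. A minimal sketch as a Deployment, where the replica count and the image `example.com/my-app:1.0` are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3                  # illustrative; each replica needs a node of its own
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app            # the label the anti-affinity rule selects on
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: my-app
            topologyKey: "kubernetes.io/hostname"   # at most one such Pod per node
      containers:
      - name: my-app
        image: example.com/my-app:1.0   # placeholder image
```

Because the rule is required, replicas beyond the number of eligible nodes stay Pending rather than share a node. The same applies to surge Pods created during a rolling update, so a spare node is worth planning for.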

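4. Resource monitoring and dynamic adjustment

Monitoring: use Prometheus + Grafana to track node and Pod resource utilization (such as GPU utilization and memory usage). GPU metrics are not collected out of the box and need an exporter such as NVIDIA's DCGM exporter.
Dynamic scaling: combine Cluster Autoscaler with the Horizontal Pod Autoscaler (HPA) to add or remove nodes and Pod replicas automatically according to load.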
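As an illustration of the Pod-level half of this, a HorizontalPodAutoscaler could look like the sketch below. The target Deployment `my-app`, the replica bounds, and the 70% CPU target are assumptions chosen for the example. CPU-based scaling also presumes that metrics-server (or another resource-metrics provider) is installed and that the containers declare CPU requests.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app               # hypothetical Deployment to scale
  minReplicas: 1               # illustrative bounds
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70 # percent of the Pods' CPU requests
```

Cluster Autoscaler complements this: when new replicas cannot be placed it can add nodes, and it can remove nodes that stay underused. Scaling on GPU utilization would need custom metrics, which this sketch does not cover.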

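5. Example scenario: GPU resource optimization

Problem: the GPU node (Node01) is running Pods that do not need a GPU, so the GPU sits idle.
Solution:
Taint Node01 with hardware-type=gpu:NoSchedule, as in step 1 of section 2.
Only Pods that declare a matching toleration can then be scheduled there; GPU workloads add that toleration together with resources.limits for nvidia.com/gpu.
Use node affinity so that GPU jobs are steered to Node01, because a toleration only permits placement on the tainted node and does not attract the Pod to it.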
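Put together, a GPU workload for this scenario might be declared as in the sketch below. It assumes the node carries the `hardware-type=gpu` label and taint from step 1 of section 2 and that the NVIDIA device plugin exposes `nvidia.com/gpu` on it; the Pod name is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-app                # hypothetical name
spec:
  tolerations:
  - key: "hardware-type"       # permits scheduling onto the tainted GPU node
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: hardware-type # restricts the Pod to labeled GPU nodes
            operator: In
            values:
            - gpu
  containers:
  - name: gpu-app
    image: nvidia/cuda:latest  # as in the earlier example; pin a concrete tag in practice
    resources:
      limits:
        nvidia.com/gpu: 1      # extended resources must be whole numbers
```

The three parts do different jobs: the taint keeps other Pods off the node, the toleration lets this Pod in, and the affinity rule keeps it from landing anywhere else.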
