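PowerShell Skills Series - Kubernetes Management Automation

Applies to PowerShell 7.0 and later

Kubernetes has become the de facto standard for container orchestration, and nearly all cloud-native applications run on K8s clusters. Although kubectl is the main command-line tool for day-to-day work, enterprise automation often requires folding Kubernetes operations into larger workflows: batch deployment of microservices, scheduled cluster health checks, automated Helm release management, and configuration sync across clusters. Typing kubectl commands by hand is error-prone, and it is neither repeatable nor auditable.

With its object pipeline and scripting capabilities, PowerShell works well as a glue language for K8s automation workflows. By calling the kubectl CLI and parsing its JSON output, PowerShell can wrap cluster resource management, application deployment, and health checks into structured script modules. Combined with the .NET Kubernetes client SDK, it can also interact with the API at a finer granularity.

This article walks through three practical scenarios for Kubernetes management automation with PowerShell: cluster connection and resource management, deployment automation, and an operations inspection toolkit. Each scenario comes with a script template that you can adapt to build your own K8s operations toolbox.

Cluster Connection and Resource Management

When you manage several Kubernetes clusters, switching contexts is routine. The script below wraps kubeconfig context switching, resource queries, and status summaries so that you can manage Pods, Deployments, and Services across clusters from PowerShell.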

# Helper functions for K8s cluster management

function Get-K8sContexts {
    <# List all available K8s contexts #>
    $Raw = kubectl config get-contexts -o name 2>$null
    if ($LASTEXITCODE -ne 0) {
        Write-Error "Unable to list K8s contexts; check that kubeconfig is configured"
        return @()
    }
    return $Raw | Where-Object { $_.Trim() }
}

function Switch-K8sContext {
    param([Parameter(Mandatory)][string]$ContextName)
    kubectl config use-context $ContextName 2>$null | Out-Null
    if ($LASTEXITCODE -eq 0) {
        Write-Host "Switched to context: $ContextName" -ForegroundColor Green
    } else {
        Write-Error "Failed to switch context: $ContextName"
    }
}

function Get-K8sResourceSummary {
    param(
        [string]$Namespace = 'default',
        [string]$Context
    )

    if ($Context) { Switch-K8sContext $Context }

    # Summarize Pod phases
    $Pods = kubectl get pods -n $Namespace -o json 2>$null |
        ConvertFrom-Json

    # Group-Object does not follow a dotted property path, so group with a script block
    $PodSummary = $Pods.items | Group-Object { $_.status.phase } |
        Select-Object @{N='Phase'; E={$_.Name}}, Count

    # Query Deployment status
    $Deployments = kubectl get deployments -n $Namespace -o json 2>$null |
        ConvertFrom-Json

    $DeployStatus = $Deployments.items | ForEach-Object {
        $Replicas = $_.status.replicas ?? 0
        $Ready    = $_.status.readyReplicas ?? 0
        $Updated  = $_.status.updatedReplicas ?? 0
        [PSCustomObject]@{
            Name      = $_.metadata.name
            Replicas  = $Replicas
            Ready     = $Ready
            Updated   = $Updated
            Available = $_.status.availableReplicas ?? 0
            Status    = if ($Ready -eq $Replicas -and $Replicas -gt 0) { 'Healthy' } else { 'Degraded' }
        }
    }

    # Query Service endpoints
    $Services = kubectl get services -n $Namespace -o json 2>$null |
        ConvertFrom-Json

    $SvcInfo = $Services.items | ForEach-Object {
        $Ports = ($_.spec.ports | ForEach-Object { "$($_.port):$($_.targetPort)/$($_.protocol)" }) -join ', '
        [PSCustomObject]@{
            Name  = $_.metadata.name
            Type  = $_.spec.type
            Ports = $Ports
            IP    = if ($_.spec.type -eq 'LoadBalancer') {
                # ingress is absent until the load balancer is provisioned, so avoid indexing into $null
                $Ingress = $_.status.loadBalancer.ingress | Select-Object -First 1
                $Ingress.ip ?? $Ingress.hostname ?? 'Pending'
            } else {
                $_.spec.clusterIP
            }
        }
    }

    Write-Host "`n=== Pod status summary (Namespace: $Namespace) ===" -ForegroundColor Cyan
    $PodSummary | Format-Table -AutoSize

    Write-Host "=== Deployment status ===" -ForegroundColor Cyan
    $DeployStatus | Format-Table -AutoSize

    Write-Host "=== Service list ===" -ForegroundColor Cyan
    $SvcInfo | Format-Table -AutoSize
}

# Usage example
Write-Host "Available contexts:" -ForegroundColor Yellow
Get-K8sContexts

# Switch to the production cluster and show a resource overview
Get-K8sResourceSummary -Namespace 'production' -Context 'prod-cluster'

Sample output:

Available contexts:
prod-cluster
staging-cluster
dev-cluster

Switched to context: prod-cluster

=== Pod status summary (Namespace: production) ===
Phase     Count
-----     -----
Running      24
Pending       2
Succeeded     5

=== Deployment status ===
Name            Replicas Ready Updated Available Status
----            -------- ----- ------- --------- ------
api-gateway            3     3       3         3 Healthy
user-service           2     2       2         2 Healthy
order-service          3     2       3         2 Degraded
payment-service        2     2       2         2 Healthy

=== Service list ===
Name            Type         Ports         IP
----            ----         -----         --
api-gateway     LoadBalancer 80:8080/TCP   203.0.113.50
user-service    ClusterIP    8080:8080/TCP 10.96.0.10
order-service   ClusterIP    8080:8080/TCP 10.96.0.20
payment-service ClusterIP    8443:443/TCP  10.96.0.30
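
Switching the current context rewrites shared kubeconfig state, which can surprise other shells or scheduled jobs that read the same file. As a minimal sketch of an alternative, the hypothetical helper Get-KubectlArgs below (not part of the scripts above) builds an argument list that passes kubectl's global --context and --namespace flags on each call, so nothing in kubeconfig is modified.

```powershell
# Hypothetical helper: build a kubectl argument list that targets a context
# explicitly instead of changing the current context in kubeconfig.
function Get-KubectlArgs {
    param(
        [Parameter(Mandatory)][string[]]$Command,
        [string]$Context,
        [string]$Namespace
    )

    $argList = [System.Collections.Generic.List[string]]::new()
    if ($Context)   { $argList.Add('--context');   $argList.Add($Context) }
    if ($Namespace) { $argList.Add('--namespace'); $argList.Add($Namespace) }
    $argList.AddRange($Command)

    # The leading comma keeps the array intact when it is returned
    return ,$argList.ToArray()
}

$podArgs = Get-KubectlArgs -Command 'get', 'pods', '-o', 'json' `
    -Context 'prod-cluster' -Namespace 'production'

# kubectl @podArgs | ConvertFrom-Json   # uncomment to run against a real cluster
$podArgs -join ' '
```

The joined string shows the command line that would be sent to kubectl; splatting the array (kubectl @podArgs) passes each element as a separate argument.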

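Deployment Automation

Running kubectl apply and kubectl rollout by hand is manageable for a handful of applications, but once the number of microservices grows into the dozens you need an automated deployment pipeline. The script below shows how to use PowerShell to generate a deployment manifest, perform a rolling update, monitor rollout status, and roll back quickly when something goes wrong.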

# K8s deployment automation helpers

function New-K8sDeploymentManifest {
    param(
        [Parameter(Mandatory)][string]$AppName,
        [Parameter(Mandatory)][string]$Image,
        [int]$Replicas = 2,
        [int]$Port = 8080,
        [hashtable]$Labels = @{},
        [string]$Namespace = 'default'
    )

    # Merge the mandatory app label with any caller-supplied labels
    $AllLabels = @{ app = $AppName } + $Labels

    $Manifest = @{
        apiVersion = 'apps/v1'
        kind       = 'Deployment'
        metadata   = @{
            name      = $AppName
            namespace = $Namespace
            labels    = $AllLabels
        }
        spec       = @{
            replicas = $Replicas
            selector = @{ matchLabels = @{ app = $AppName } }
            template = @{
                metadata = @{ labels = $AllLabels }
                spec     = @{
                    containers = @(
                        @{
                            name           = $AppName
                            image          = $Image
                            ports          = @(@{ containerPort = $Port })
                            resources      = @{
                                requests = @{ cpu = '100m'; memory = '128Mi' }
                                limits   = @{ cpu = '500m'; memory = '512Mi' }
                            }
                            readinessProbe = @{
                                httpGet             = @{ path = '/health'; port = $Port }
                                initialDelaySeconds = 5
                                periodSeconds       = 10
                            }
                            livenessProbe  = @{
                                httpGet             = @{ path = '/health'; port = $Port }
                                initialDelaySeconds = 15
                                periodSeconds       = 20
                            }
                        }
                    )
                }
            }
        }
    }

    return $Manifest
}

function Start-K8sRollingUpdate {
    param(
        [Parameter(Mandatory)][string]$AppName,
        [Parameter(Mandatory)][string]$NewImage,
        [string]$Namespace = 'default',
        [int]$TimeoutSeconds = 300
    )

    Write-Host "Starting rolling update: $AppName -> $NewImage" -ForegroundColor Cyan

    # Set the new image (assumes the container is named after the app)
    kubectl set image "deployment/$AppName" `
        "$AppName=$NewImage" -n $Namespace 2>$null | Out-Null

    if ($LASTEXITCODE -ne 0) {
        Write-Error "Failed to set image"
        return $false
    }

    # Wait for the rolling update to finish
    Write-Host "Waiting for the rolling update to finish (timeout: ${TimeoutSeconds}s)..." -ForegroundColor Yellow
    $Deadline = (Get-Date).AddSeconds($TimeoutSeconds)

    while ((Get-Date) -lt $Deadline) {
        kubectl rollout status "deployment/$AppName" `
            -n $Namespace --timeout=30s 2>&1 | Out-Null

        if ($LASTEXITCODE -eq 0) {
            Write-Host "Rolling update succeeded: $AppName" -ForegroundColor Green
            return $true
        }

        # Show the current Pod status
        $Pods = kubectl get pods -n $Namespace -l "app=$AppName" -o json 2>$null |
            ConvertFrom-Json

        $Pods.items | ForEach-Object {
            $Phase      = $_.status.phase
            $Containers = $_.status.containerStatuses
            $Ready      = ($Containers | Where-Object { $_.ready }).Count
            $Total      = $Containers.Count
            Write-Host "  Pod: $($_.metadata.name) | Phase: $Phase | Ready: $Ready/$Total"
        }

        Start-Sleep -Seconds 5
    }

    Write-Warning "Rolling update timed out, rolling back..."
    # Send rollback output to the host so this function returns only a boolean
    Undo-K8sRollout -AppName $AppName -Namespace $Namespace | Out-Host
    return $false
}

function Undo-K8sRollout {
    param(
        [Parameter(Mandatory)][string]$AppName,
        [string]$Namespace = 'default',
        [int]$Revision = 0
    )

    if ($Revision -eq 0) {
        Write-Host "Rolling back to the previous revision: $AppName" -ForegroundColor Yellow
        kubectl rollout undo "deployment/$AppName" -n $Namespace 2>$null
    } else {
        # ${Revision} is required: "$Revision:" would be parsed as a scope-qualified variable
        Write-Host "Rolling back to revision ${Revision}: $AppName" -ForegroundColor Yellow
        kubectl rollout undo "deployment/$AppName" -n $Namespace "--to-revision=$Revision" 2>$null
    }

    if ($LASTEXITCODE -eq 0) {
        Write-Host "Rollback succeeded" -ForegroundColor Green
        # Show the rollout history
        kubectl rollout history "deployment/$AppName" -n $Namespace
    } else {
        Write-Error "Rollback failed"
    }
}

# Generate the deployment manifest and apply it.
# kubectl accepts JSON manifests, so the hashtable is serialized with ConvertTo-Json.
$Manifest = New-K8sDeploymentManifest `
    -AppName 'web-frontend' `
    -Image 'registry.example.com/web-frontend:v2.3.0' `
    -Replicas 3 `
    -Port 8080 `
    -Labels @{ tier = 'frontend'; env = 'production' } `
    -Namespace 'production'

$ManifestPath = Join-Path ([System.IO.Path]::GetTempPath()) 'web-frontend-deployment.json'
$Manifest | ConvertTo-Json -Depth 10 | Set-Content $ManifestPath
kubectl apply -f $ManifestPath

# Run the rolling update
Start-K8sRollingUpdate `
    -AppName 'web-frontend' `
    -NewImage 'registry.example.com/web-frontend:v2.4.0' `
    -Namespace 'production' `
    -TimeoutSeconds 300

Sample output:

Starting rolling update: web-frontend -> registry.example.com/web-frontend:v2.4.0
Waiting for the rolling update to finish (timeout: 300s)...
  Pod: web-frontend-7d9b8f6c4d-abc12 | Phase: Running | Ready: 1/1
  Pod: web-frontend-7d9b8f6c4d-def34 | Phase: Running | Ready: 1/1
  Pod: web-frontend-8a2c3e7f5b-ghi56 | Phase: Pending | Ready: 0/1
  Pod: web-frontend-8a2c3e7f5b-jkl78 | Phase: Running | Ready: 1/1
  Pod: web-frontend-8a2c3e7f5b-mno90 | Phase: Running | Ready: 1/1
Rolling update succeeded: web-frontend
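
The script above relies on the exit code of kubectl rollout status. When the Deployment JSON is already at hand, completion can also be judged from its status fields. The sketch below uses a hypothetical function, Test-K8sRolloutComplete, whose checks roughly mirror what kubectl looks at; it is an illustration under that assumption, not a replacement for kubectl's own logic.

```powershell
# Hypothetical helper: decide from Deployment JSON whether a rollout has finished.
function Test-K8sRolloutComplete {
    param([Parameter(Mandatory)][string]$DeploymentJson)

    $d = $DeploymentJson | ConvertFrom-Json

    # Missing status fields are treated as zero; spec.replicas defaults to 1
    $desired   = $d.spec.replicas ?? 1
    $updated   = $d.status.updatedReplicas ?? 0
    $total     = $d.status.replicas ?? 0
    $available = $d.status.availableReplicas ?? 0
    $observed  = $d.status.observedGeneration ?? 0

    return ($observed -ge $d.metadata.generation) -and   # controller has seen the latest spec
           ($updated -eq $desired) -and                   # all replicas run the new template
           ($total -eq $updated) -and                     # no old replicas remain
           ($available -eq $updated)                      # every new replica is available
}

# Sample status: three updated replicas plus one old replica still terminating
$inProgress = @'
{
  "metadata": { "name": "web-frontend", "generation": 4 },
  "spec": { "replicas": 3 },
  "status": { "observedGeneration": 4, "replicas": 4, "updatedReplicas": 3, "availableReplicas": 3 }
}
'@

Test-K8sRolloutComplete -DeploymentJson $inProgress
```

For this sample the function returns False, because status.replicas (4) still exceeds updatedReplicas (3).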

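Operations Inspection Toolkit

Day-to-day operation of a Kubernetes cluster calls for regular checks of node health, resource headroom, unhealthy Pods, and warning events. The inspection script below produces a cluster health report in one pass and is suitable for scheduled tasks or CI/CD pipelines.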

# K8s cluster inspection toolkit

function Invoke-K8sClusterHealthCheck {
    param(
        [string]$Context,
        [string]$OutputPath = "./k8s-health-report-$(Get-Date -Format 'yyyyMMdd-HHmmss').txt"
    )

    if ($Context) { Switch-K8sContext $Context }

    $Report = [System.Text.StringBuilder]::new()
    $null = $Report.AppendLine("=" * 60)
    $null = $Report.AppendLine("K8s cluster health report - $(Get-Date -Format 'yyyy-MM-dd HH:mm:ss')")
    $null = $Report.AppendLine("=" * 60)

    # 1. Node status
    $null = $Report.AppendLine("`n--- Node status ---")
    $Nodes = kubectl get nodes -o json 2>$null | ConvertFrom-Json

    foreach ($Node in $Nodes.items) {
        $Name           = $Node.metadata.name
        $Conditions     = $Node.status.conditions
        $Ready          = ($Conditions | Where-Object { $_.type -eq 'Ready' }).status -eq 'True'
        $MemoryPressure = ($Conditions | Where-Object { $_.type -eq 'MemoryPressure' }).status -eq 'True'
        $DiskPressure   = ($Conditions | Where-Object { $_.type -eq 'DiskPressure' }).status -eq 'True'

        $StatusIcon = if ($Ready -and -not $MemoryPressure -and -not $DiskPressure) { 'OK' } else { 'WARN' }
        $null = $Report.AppendLine("[$StatusIcon] $Name | Ready: $Ready | MemoryPressure: $MemoryPressure | DiskPressure: $DiskPressure")

        # Allocatable node resources (capacity available to Pods, not current usage)
        $Allocatable = $Node.status.allocatable
        $null = $Report.AppendLine("  CPU: $($Allocatable.cpu) | Memory: $($Allocatable.memory)")
    }

    # 2. Scan for unhealthy Pods (all namespaces)
    $null = $Report.AppendLine("`n--- Unhealthy Pods ---")
    $AllPods = kubectl get pods -A -o json 2>$null | ConvertFrom-Json

    $UnhealthyPods = $AllPods.items | Where-Object {
        $_.status.phase -notin @('Running', 'Succeeded') -or
        ($_.status.containerStatuses | Where-Object { $_.restartCount -gt 5 }).Count -gt 0
    }

    if ($UnhealthyPods) {
        foreach ($Pod in $UnhealthyPods) {
            $Ns       = $Pod.metadata.namespace
            $Name     = $Pod.metadata.name
            $Phase    = $Pod.status.phase
            $Restarts = ($Pod.status.containerStatuses | Measure-Object -Property restartCount -Sum).Sum
            $null = $Report.AppendLine("  [ALERT] $Ns/$Name | Phase: $Phase | Restarts: $Restarts")
        }
    } else {
        $null = $Report.AppendLine("  All Pods are running normally")
    }

    # 3. Resource usage report (requires metrics-server)
    $null = $Report.AppendLine("`n--- Top 10 by memory usage ---")
    $TopPods = kubectl top pods -A --sort-by=memory --no-headers 2>$null

    if ($LASTEXITCODE -eq 0) {
        $Rank = 1
        $TopPods | Select-Object -First 10 | ForEach-Object {
            $null = $Report.AppendLine("  #$Rank $_")
            $Rank++
        }
    } else {
        $null = $Report.AppendLine("  metrics-server is not installed or unavailable; skipping resource usage")
    }

    # 4. Recent warning events
    $null = $Report.AppendLine("`n--- Recent Warning events ---")
    $Events = kubectl get events -A --field-selector type=Warning -o json 2>$null |
        ConvertFrom-Json

    # lastTimestamp can be empty on some events, so fall back to eventTime when sorting
    $RecentWarnings = $Events.items |
        Sort-Object { $_.lastTimestamp ?? $_.eventTime } -Descending |
        Select-Object -First 10

    foreach ($Evt in $RecentWarnings) {
        $Time     = $Evt.lastTimestamp ?? $Evt.eventTime
        $Ns       = $Evt.metadata.namespace
        $Msg      = $Evt.message
        $Involved = "$($Evt.involvedObject.kind)/$($Evt.involvedObject.name)"
        $null = $Report.AppendLine("  [$Time] $Ns/$Involved - $Msg")
    }

    # Write the report
    $ReportContent = $Report.ToString()
    $ReportContent | Set-Content $OutputPath -Encoding UTF8
    Write-Host $ReportContent
    Write-Host "`nReport saved to: $OutputPath" -ForegroundColor Green

    # Return a summary object for downstream automation
    # (WarningEvents counts at most the 10 most recent warnings)
    return [PSCustomObject]@{
        TotalNodes    = $Nodes.items.Count
        UnhealthyPods = $UnhealthyPods.Count
        WarningEvents = $RecentWarnings.Count
        ReportPath    = $OutputPath
    }
}

# Run the inspection
$HealthResult = Invoke-K8sClusterHealthCheck -Context 'prod-cluster'

# Raise an alert based on the inspection result
if ($HealthResult.UnhealthyPods -gt 0 -or $HealthResult.WarningEvents -gt 5) {
    $AlertMsg = "K8s inspection alert: UnhealthyPods=$($HealthResult.UnhealthyPods), WarningEvents=$($HealthResult.WarningEvents)"
    Write-Host $AlertMsg -ForegroundColor Red
    # Hook in DingTalk, Feishu, Slack, or another notification channel here
}

Sample output:

============================================================
K8s cluster health report - 2026-02-13 09:30:00
============================================================

--- Node status ---
[OK] k8s-node-01 | Ready: True | MemoryPressure: False | DiskPressure: False
  CPU: 8 | Memory: 32762308Ki
[OK] k8s-node-02 | Ready: True | MemoryPressure: False | DiskPressure: False
  CPU: 8 | Memory: 32762308Ki
[WARN] k8s-node-03 | Ready: True | MemoryPressure: True | DiskPressure: False
  CPU: 8 | Memory: 32762308Ki

--- Unhealthy Pods ---
  [ALERT] production/order-service-6b8d4f-x2k9l | Phase: Running | Restarts: 17
  [ALERT] staging/api-gateway-5c7a2e-m4n7p | Phase: Pending | Restarts: 0

--- Top 10 by memory usage ---
  #1 production   redis-cache-0        250m   512Mi
  #2 production   elasticsearch-0      350m   480Mi
  #3 monitoring   prometheus-0         200m   380Mi
  #4 production   order-service-7d9b   150m   256Mi
  #5 production   user-service-8a2c    80m    128Mi

--- Recent Warning events ---
  [2026-02-13T09:28:00Z] production/Pod/order-service-6b8d4f-x2k9l - Back-off restarting failed container
  [2026-02-13T09:25:00Z] staging/Pod/api-gateway-5c7a2e-m4n7p - Insufficient cpu (3) to schedule pod
  [2026-02-13T09:20:00Z] production/Node/k8s-node-03 - Node is experiencing memory pressure

Report saved to: ./k8s-health-report-20260213-093000.txt
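
A crash-looping Pod usually still reports the phase Running; CrashLoopBackOff appears as a container waiting reason instead. The sketch below shows one way to surface that reason alongside the phase. Select-K8sUnhealthyPod is a hypothetical helper, and it works on a JSON string so that it can be tried offline against a small hand-written sample rather than a live cluster.

```powershell
# Hypothetical helper: flag Pods from a `kubectl get pods -o json` document.
function Select-K8sUnhealthyPod {
    param(
        [Parameter(Mandatory)][string]$PodListJson,
        [int]$RestartThreshold = 5
    )

    foreach ($pod in ($PodListJson | ConvertFrom-Json).items) {
        $restarts = 0
        $reason   = $null
        # containerStatuses is missing for unscheduled Pods; foreach simply skips $null
        foreach ($cs in $pod.status.containerStatuses) {
            $restarts += $cs.restartCount
            # CrashLoopBackOff and similar states are waiting reasons, not Pod phases
            if ($cs.state.waiting.reason) { $reason = $cs.state.waiting.reason }
        }

        $phase = $pod.status.phase
        if (($phase -notin 'Running', 'Succeeded') -or ($restarts -gt $RestartThreshold)) {
            [PSCustomObject]@{
                Namespace = $pod.metadata.namespace
                Name      = $pod.metadata.name
                Phase     = $phase
                Reason    = $reason
                Restarts  = $restarts
            }
        }
    }
}

# Hand-written sample data, not output from a real cluster
$sample = @'
{ "items": [
  { "metadata": { "namespace": "production", "name": "api-1" },
    "status": { "phase": "Running",
      "containerStatuses": [ { "restartCount": 0, "state": { "running": {} } } ] } },
  { "metadata": { "namespace": "production", "name": "order-1" },
    "status": { "phase": "Running",
      "containerStatuses": [ { "restartCount": 17,
        "state": { "waiting": { "reason": "CrashLoopBackOff" } } } ] } },
  { "metadata": { "namespace": "staging", "name": "gw-1" },
    "status": { "phase": "Pending" } }
] }
'@

$unhealthy = @(Select-K8sUnhealthyPod -PodListJson $sample)
$unhealthy | Format-Table -AutoSize
```

Note that this sketch sums restarts across all containers in a Pod, which is a slightly different rule from the per-container check in the inspection script.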

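Notes

  1. kubectl prerequisite: All scripts depend on the kubectl command-line tool. Make sure it is installed and version-compatible with the target cluster before running them (kubectl is supported within one minor version, older or newer, of the cluster's API server). Use kubectl version --client to check the client version. The cluster must be reachable over the network and kubeconfig must be configured correctly.

  2. JSON output parsing: The scripts rely heavily on kubectl -o json combined with ConvertFrom-Json to parse K8s resources. When a cluster holds a very large number of resources (tens of thousands of Pods, for example), JSON deserialization can consume a lot of memory. In large clusters, narrow the query with a -l label selector or --field-selector.

  3. Rolling update timeout: The timeout passed to Start-K8sRollingUpdate should reflect how quickly the application starts. Slow-starting applications such as Java services may need 5 to 10 minutes to pass the readiness probe, whereas Go or Node.js applications are usually ready within 30 seconds. A timeout that is too short is reported as a failure and triggers an unnecessary rollback.

  4. metrics-server deployment: Resource usage statistics depend on the metrics-server component. Some managed offerings (GKE and AKS, for example) ship it by default, while EKS and self-managed clusters typically need it installed separately. If kubectl top returns an error in the inspection script, deploy the metrics-server manifest with kubectl apply -f first.

  5. Namespaces and access control: Get-K8sResourceSummary queries only the specified namespace by default. In production clusters with strict RBAC, a ServiceAccount may be authorized for only some namespaces. Create a dedicated ServiceAccount and ClusterRole for the inspection scripts with read-only permissions (get, list, watch), and avoid running automation under a highly privileged account.

  6. kubeconfig security: In multi-cluster environments, the kubeconfig file contains credentials (certificates or tokens) for each cluster. Never commit kubeconfig to a code repository. Distribute credentials through a secrets management solution (such as HashiCorp Vault or Azure Key Vault) and rotate ServiceAccount tokens regularly.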

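PowerShell Skills Series - Kubernetes Client Operations

Applies to PowerShell 7.0 and later (cross-platform)

In the Kubernetes ecosystem, kubectl is the most widely used command-line tool, but its output is plain text or a JSON string, which is awkward to use directly in complex automation. When you need to create resources dynamically in a CI/CD pipeline, query Pod status in bulk from an operations script, or build a custom Kubernetes monitoring dashboard, calling the Kubernetes API directly is more efficient and more reliable than repeatedly parsing kubectl output.

Because PowerShell 7 is cross-platform, it is a good fit for talking to the Kubernetes API. With the official Kubernetes .NET client library (the KubernetesClient NuGet package, whose types live in the k8s namespace), a PowerShell script can operate on the Kubernetes API directly and work with typed objects that flow through the pipeline. The library handles low-level details such as authentication and certificate validation and fits naturally into PowerShell's object model.

This article shows how to load and configure the Kubernetes .NET client, and then demonstrates PowerShell as a Kubernetes client in three practical scenarios: cluster status queries, bulk resource operations, and event monitoring.

Loading the Kubernetes Client Library

First, make the KubernetesClient NuGet package available and create a connection to the cluster. The client reads context information from ~/.kube/config automatically, so no authentication parameters need to be configured by hand.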

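# KubernetesClient ships as a NuGet package rather than a PowerShell Gallery module,
# so its assembly is loaded with Add-Type instead of Install-Module / using module.
# Assumption: $LibPath is a folder that already holds KubernetesClient.dll together with
# its dependencies, for example the `dotnet publish` output of a project that references
# the package.
$LibPath = Join-Path $HOME 'k8s-client-lib'
Add-Type -Path (Join-Path $LibPath 'KubernetesClient.dll')

# Option 1: connect using the default kubeconfig
$config = [k8s.KubernetesClientConfiguration]::BuildDefaultConfig()
$client = [k8s.Kubernetes]::new($config)

# Option 2: select a specific kubeconfig context (path first, then context name)
$config = [k8s.KubernetesClientConfiguration]::BuildConfigFromConfigFile(
    "$HOME/.kube/config", 'production-cluster'
)
$client = [k8s.Kubernetes]::new($config)

# Verify the connection by listing all namespaces.
# The List*Async helpers are C# extension methods, which PowerShell cannot call with
# instance syntax, so they are invoked through their static class. The class name
# follows recent library releases and is an assumption; adjust it to your version.
$namespaces = [k8s.CoreV1OperationsExtensions]::ListNamespaceAsync($client.CoreV1).Result
$namespaces.Items |
    Select-Object @{N='Name'; E={$_.Metadata.Name}}, @{N='Status'; E={$_.Status.Phase}} |
    Format-Table

Sample output: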

Name            Status
----            ------
default         Active
kube-system     Active
kube-public     Active
kube-node-lease Active
monitoring      Active
production      Active
staging         Active

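Scenario 1: Cluster Resource Status Inspection

In production, a quick view of how each kind of resource is doing is the foundation of daily operations. The script below wraps an inspection function that walks the Pods, Deployments, and Services in every namespace, summarizes their state, and emits a structured report as PowerShell objects.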

function Get-K8sClusterReport {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [k8s.Kubernetes]$Client,

        [Parameter()]
        [string[]]$Namespaces
    )

    # The List*Async helpers are C# extension methods; PowerShell cannot call them with
    # instance syntax, so they are invoked through their static classes. The class names
    # follow recent KubernetesClient releases and are an assumption; adjust as needed.
    $core = [k8s.CoreV1OperationsExtensions]
    $apps = [k8s.AppsV1OperationsExtensions]

    # If no namespaces were given, inspect all of them
    if (-not $Namespaces) {
        $nsList = $core::ListNamespaceAsync($Client.CoreV1).Result
        $Namespaces = $nsList.Items | ForEach-Object { $_.Metadata.Name }
    }

    $report = foreach ($ns in $Namespaces) {
        # All Pods in this namespace
        $pods = $core::ListNamespacedPodAsync($Client.CoreV1, $ns).Result.Items

        $totalPods   = $pods.Count
        $runningPods = ($pods | Where-Object { $_.Status.Phase -eq 'Running' }).Count
        $failedPods  = ($pods | Where-Object { $_.Status.Phase -eq 'Failed' }).Count
        $pendingPods = ($pods | Where-Object { $_.Status.Phase -eq 'Pending' }).Count

        # Deployment information
        $deployments = $apps::ListNamespacedDeploymentAsync($Client.AppsV1, $ns).Result.Items
        $totalDeployments = $deployments.Count

        # Count Deployments whose ready replicas differ from the total
        $unreadyDeployments = ($deployments | Where-Object {
            $_.Status.ReadyReplicas -ne $_.Status.Replicas
        }).Count

        # Service information
        $services = $core::ListNamespacedServiceAsync($Client.CoreV1, $ns).Result.Items

        [PSCustomObject]@{
            Namespace      = $ns
            TotalPods      = $totalPods
            RunningPods    = $runningPods
            PendingPods    = $pendingPods
            FailedPods     = $failedPods
            Deployments    = $totalDeployments
            UnreadyDeploys = $unreadyDeployments
            Services       = $services.Count
            # Share of Pods in the Running phase, as a percentage
            HealthScore    = if ($totalPods -gt 0) {
                [math]::Round(($runningPods / $totalPods) * 100, 1)
            } else { 100 }
        }
    }

    return $report
}

# Run the inspection
$config = [k8s.KubernetesClientConfiguration]::BuildDefaultConfig()
$k8s = [k8s.Kubernetes]::new($config)

$report = Get-K8sClusterReport -Client $k8s
$report | Sort-Object HealthScore | Format-Table -AutoSize

Sample output:

Namespace   TotalPods RunningPods PendingPods FailedPods Deployments UnreadyDeploys Services HealthScore
---------   --------- ----------- ----------- ---------- ----------- -------------- -------- -----------
staging            30          29           1          0           8              0       12        96.7
production        120         118           2          0          15              1       28        98.3
monitoring         18          18           0          0           5              0        8         100
kube-system        15          15           0          0           3              0        7         100
default             3           3           0          0           1              0        2         100

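Scenario 2: Bulk Label Management

In Kubernetes clusters shared by several environments and teams, labels are the basis for classifying and filtering resources and for applying policy. When labels need to be updated in bulk (to mark a maintenance window, change environment ownership, or add a cost-center label, for example), calling the Kubernetes API from PowerShell gets the job done efficiently. The script below queries the Deployments in a namespace, optionally filtered by a label selector, and patches their labels.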

function Update-K8sDeploymentLabels {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [k8s.Kubernetes]$Client,

        [Parameter(Mandatory)]
        [string]$Namespace,

        [Parameter(Mandatory)]
        [hashtable]$LabelsToAdd,

        [Parameter()]
        [string]$LabelSelector
    )

    # Extension methods are called through their static class (class name assumed from
    # recent KubernetesClient releases). PowerShell has no named-argument syntax for
    # method calls, so the optional parameters before labelSelector are passed as $null:
    # allowWatchBookmarks, continue, fieldSelector, labelSelector.
    $apps = [k8s.AppsV1OperationsExtensions]
    $deployments = $apps::ListNamespacedDeploymentAsync(
        $Client.AppsV1, $Namespace, $null, $null, $null, $LabelSelector
    ).Result.Items

    # A strategic merge patch merges the labels map, so only the new labels are sent
    $patchBody = @{
        metadata = @{
            labels = $LabelsToAdd
        }
    } | ConvertTo-Json -Depth 10

    $patch = [k8s.Models.V1Patch]::new(
        $patchBody,
        [k8s.Models.V1Patch+PatchType]::StrategicMergePatch
    )

    $newLabels = ($LabelsToAdd.Keys | ForEach-Object {
        "${_}=$($LabelsToAdd[$_])"
    }) -join ', '

    $results = foreach ($deploy in $deployments) {
        $name = $deploy.Metadata.Name

        try {
            $null = $apps::PatchNamespacedDeploymentAsync(
                $Client.AppsV1, $patch, $name, $Namespace
            ).GetAwaiter().GetResult()

            [PSCustomObject]@{
                Name      = $name
                Status    = 'Updated'
                NewLabels = $newLabels
            }
        }
        catch {
            [PSCustomObject]@{
                Name      = $name
                # GetBaseException unwraps the method-invocation wrapper
                Status    = "Failed: $($_.Exception.GetBaseException().Message)"
                NewLabels = 'N/A'
            }
        }
    }

    return $results
}

# Add cost labels to Deployments in the production namespace
$config = [k8s.KubernetesClientConfiguration]::BuildDefaultConfig()
$k8s = [k8s.Kubernetes]::new($config)

$updateResults = Update-K8sDeploymentLabels -Client $k8s `
    -Namespace 'production' `
    -LabelsToAdd @{
        'cost-center' = 'engineering'
        'maintenance' = '2025-Q4'
        'managed-by'  = 'powershell'
    } `
    -LabelSelector 'app-type=web'

$updateResults | Format-Table -AutoSize

Sample output:

Name             Status                                                      NewLabels
----             ------                                                      ---------
web-frontend     Updated                                                     cost-center=engineering, maintenance=2025-Q4, managed-by=powershell
web-api-gateway  Updated                                                     cost-center=engineering, maintenance=2025-Q4, managed-by=powershell
web-notification Updated                                                     cost-center=engineering, maintenance=2025-Q4, managed-by=powershell
web-user-service Failed: The Deployment "web-user-service" is being deleted N/A
web-payment      Updated                                                     cost-center=engineering, maintenance=2025-Q4, managed-by=powershell
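
The patch body in the script above is serialized with ConvertTo-Json -Depth 10. The label patch itself is shallow, but patches that reach into the Pod template are not, and the default depth of 2 silently flattens them. A small illustration with a made-up patch (the container name and environment variable are placeholders):

```powershell
# A patch that reaches into the Pod template is five levels deep
$patch = @{
    spec = @{
        template = @{
            spec = @{
                containers = @(
                    @{ name = 'web'; env = @(@{ name = 'MODE'; value = 'prod' }) }
                )
            }
        }
    }
}

# Depth 2 stops at spec.template and turns everything below it into a type-name string
$shallow = $patch | ConvertTo-Json -Depth 2 -WarningAction SilentlyContinue
$deep    = $patch | ConvertTo-Json -Depth 10

$isTruncated = $shallow -match 'System\.Collections\.Hashtable'
$roundTrip   = ($deep | ConvertFrom-Json).spec.template.spec.containers[0].env[0].value

"Truncated at depth 2: $isTruncated"
"Value recovered at depth 10: $roundTrip"
```

PowerShell 7 emits a warning when truncation happens; it is suppressed here only to keep the demonstration output short.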

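Scenario 3: Event Stream Monitoring

Kubernetes Events are an important source of information when troubleshooting a cluster. Unlike the one-off query of kubectl get events, the client's Watch mechanism makes it possible to subscribe to the event stream in real time. The script below takes the simpler route: it lists the events currently recorded in a namespace, classifies and counts them by type, and highlights the warnings.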

function Watch-K8sEvents {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [k8s.Kubernetes]$Client,

        [Parameter()]
        [string]$Namespace = 'default',

        # Not used by this snapshot implementation; kept for a watch-based variant
        [Parameter()]
        [int]$DurationSeconds = 60
    )

    $startTime = Get-Date
    $eventStats = @{
        Normal  = 0
        Warning = 0
        Total   = 0
    }
    $warningEvents = [System.Collections.Generic.List[object]]::new()

    Write-Host "Collecting events in namespace '$Namespace'..."
    Write-Host ('=' * 60)

    # List the events. The extension method is called through its static class
    # (class name assumed from recent KubernetesClient releases).
    $events = [k8s.CoreV1OperationsExtensions]::ListNamespacedEventAsync(
        $Client.CoreV1, $Namespace
    ).Result.Items

    foreach ($evt in $events) {
        $eventStats['Total']++

        $eventType = if ($evt.Type -eq 'Normal') { 'Normal' } else { 'Warning' }
        $eventStats[$eventType]++

        # Keep the details of Warning events
        if ($eventType -eq 'Warning') {
            $warningEvents.Add(
                [PSCustomObject]@{
                    Time    = $evt.LastTimestamp
                    Object  = "$($evt.InvolvedObject.Kind)/$($evt.InvolvedObject.Name)"
                    Reason  = $evt.Reason
                    Message = $evt.Message
                }
            )
        }
    }

    # Print the summary
    Write-Host "`nEvent summary:"
    Write-Host "  Total events: $($eventStats['Total'])"
    Write-Host "  Normal:  $($eventStats['Normal'])"
    Write-Host "  Warning: $($eventStats['Warning'])"
    Write-Host ('-' * 60)

    # Print Warning event details. Out-Host keeps the formatted table out of the
    # function's return value.
    if ($warningEvents.Count -gt 0) {
        Write-Host "`nWarning event details:"
        $warningEvents | Sort-Object Time -Descending | Select-Object -First 10 |
            Format-Table Time, Object, Reason, Message -Wrap | Out-Host
    }
    else {
        Write-Host "`nNo Warning events; the namespace looks healthy."
    }

    return [PSCustomObject]@{
        MonitoredAt   = $startTime
        Namespace     = $Namespace
        TotalEvents   = $eventStats['Total']
        NormalEvents  = $eventStats['Normal']
        WarningEvents = $eventStats['Warning']
        Warnings      = $warningEvents
    }
}

# Collect events from the production namespace
$config = [k8s.KubernetesClientConfiguration]::BuildDefaultConfig()
$k8s = [k8s.Kubernetes]::new($config)

$eventReport = Watch-K8sEvents -Client $k8s -Namespace 'production'

Sample output:

Collecting events in namespace 'production'...
============================================================

Event summary:
  Total events: 47
  Normal:  39
  Warning: 8
------------------------------------------------------------

Warning event details:

Time                 Object                           Reason           Message
----                 ------                           ------           -------
2025-10-15T06:42:11Z Pod/web-api-7d8f6c4b5-xk2mn      FailedScheduling 0/5 nodes are available...
2025-10-15T06:41:58Z Pod/web-api-7d8f6c4b5-xk2mn      InsufficientCPU  Node didn't have enough...
2025-10-15T06:40:33Z Pod/payment-worker-5c9b8d7f-n4r1 OOMKilled        Container payment-worker...
2025-10-15T06:39:15Z Ingress/api-ingress              BackendError     Error refreshing SSL cert...

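Notes

  1. Authentication lookup order: The Kubernetes .NET client looks for credentials in the KUBECONFIG environment variable, then the ~/.kube/config file, then the in-Pod ServiceAccount token. In CI/CD environments, specify the kubeconfig path explicitly to avoid connection failures caused by environment differences.

  2. Async methods and the Result property: Most of the client library's API consists of asynchronous methods that return a Task. In PowerShell you can read the .Result property to get the result synchronously, but if a script issues many concurrent requests, use [System.Threading.Tasks.Task]::WhenAll() to run them in parallel instead of blocking on each call in turn.

  3. API version compatibility: Different Kubernetes cluster versions support different API versions. Before using KubernetesClient, confirm that the client library version is compatible with the target cluster. For example, apps/v1 became a stable API group only in Kubernetes 1.9, so older clusters may support only apps/v1beta2.

  4. Rate limiting and server load: Bulk operations (such as walking every Pod in every namespace) can put noticeable load on the API server. Add a short delay inside loops (Start-Sleep -Milliseconds 200) and use labelSelector and fieldSelector to narrow query results and avoid unnecessary data transfer.

  5. JSON serialization depth: Kubernetes resource objects are deeply nested. When using ConvertTo-Json, always pass a large enough -Depth (10 or more is recommended); otherwise deep fields (environment variables or mounts in the container spec, for example) are truncated to the string System.Collections.Hashtable and the patch fails.

  6. Error handling and retries: Under heavy load the Kubernetes API may return 429 Too Many Requests or 503 Service Unavailable. Wrap critical operations in retry logic with exponential backoff (wait 1 second at first and double the wait each time), and distinguish retryable errors (network timeouts, 5xx) from non-retryable ones (403 forbidden, 404 not found) to avoid pointless retry loops.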
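
The retry advice in the last note can be sketched as a small wrapper. Invoke-WithRetry and its parameters are hypothetical names introduced for illustration; the IsRetryable script block is where a real script would inspect the error and refuse to retry 403 or 404 responses.

```powershell
# Hypothetical helper: retry a script block with exponential backoff.
function Invoke-WithRetry {
    param(
        [Parameter(Mandatory)][scriptblock]$Action,
        [int]$MaxAttempts = 4,
        [int]$InitialDelayMs = 1000,
        # Receives the error record; return $false to stop retrying (e.g. 403 or 404)
        [scriptblock]$IsRetryable = { param($err) $true }
    )

    $delay = $InitialDelayMs
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            return & $Action
        }
        catch {
            # Give up on the last attempt or when the error is not worth retrying
            if ($attempt -eq $MaxAttempts -or -not (& $IsRetryable $_)) { throw }
            Start-Sleep -Milliseconds $delay
            $delay *= 2   # double the wait before the next attempt
        }
    }
}

# Simulated call that fails twice with a retryable error, then succeeds
$script:calls = 0
$result = Invoke-WithRetry -InitialDelayMs 10 -Action {
    $script:calls++
    if ($script:calls -lt 3) { throw '503 Service Unavailable' }
    "ok after $script:calls attempts"
}
$result
```

Here the simulated action succeeds on its third call, so $result is "ok after 3 attempts". The short initial delay is only for the demonstration; the note above suggests starting at one second.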

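PowerShell Skills Series - Kubernetes Operations Management

Applies to PowerShell 7.0 and later; requires kubectl

Kubernetes has become the de facto standard for container orchestration. Whether you run your own clusters or use a managed service (AKS, EKS, GKE), daily operations involve talking to the Kubernetes API. kubectl is the official command-line tool, but PowerShell's pipeline, object handling, and scripting capabilities can make Kubernetes operations more efficient, especially for bulk operations, log aggregation, and automated inspection.

This article explains how to wrap kubectl commands in PowerShell for efficient Kubernetes operations management.

Preparing the kubectl Environment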

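# Check whether kubectl is available.
# Get-Command is used because calling a missing executable throws instead of
# setting $LASTEXITCODE.
if (Get-Command kubectl -ErrorAction SilentlyContinue) {
    kubectl version --client
} else {
    Write-Host "kubectl is not installed" -ForegroundColor Red
    # Download the installer script from the PowerShell Gallery
    Install-Script -Name install-kubectl -Scope CurrentUser -Force
}

# Show cluster information
kubectl cluster-info

# Show node status
kubectl get nodes -o wide

# kubeconfig location (works on Windows, Linux, and macOS)
$kubeConfig = Join-Path $HOME '.kube/config'

# To switch the default namespace of the current context, use kubens or:
# kubectl config set-context --current --namespace=production

# List all namespaces
kubectl get namespaces --no-headers |
    ForEach-Object { ($_ -split '\s+')[0] }

Sample output: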

Client Version: v1.30.0
Kubernetes control plane is running at https://k8s-api.example.com:6443

NAME      STATUS   ROLES           AGE   VERSION   INTERNAL-IP
node-01   Ready    control-plane   90d   v1.30.0   10.0.0.1
node-02   Ready    worker          90d   v1.30.0   10.0.0.2
node-03   Ready    worker          90d   v1.30.0   10.0.0.3

Parsing kubectl Output into PowerShell Objects

kubectl's text output can be parsed into structured objects in the pipeline:

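function Get-K8sPod {
    <#
    .SYNOPSIS
        Get Kubernetes Pod information
    #>
    param(
        [string]$Namespace = "default",
        [string]$LabelSelector,
        [string]$FieldSelector
    )

    # $args is an automatic variable, so the argument list gets its own name
    $kubectlArgs = @('get', 'pods', '-n', $Namespace, '-o', 'custom-columns=NAME:.metadata.name,STATUS:.status.phase,NODE:.spec.nodeName,RESTARTS:.status.containerStatuses[0].restartCount,AGE:.metadata.creationTimestamp,IP:.status.podIP')

    if ($LabelSelector) { $kubectlArgs += @('-l', $LabelSelector) }
    if ($FieldSelector) { $kubectlArgs += @('--field-selector', $FieldSelector) }

    $output = kubectl @kubectlArgs 2>$null
    if ($output.Count -le 1) { return @() }

    $output | Select-Object -Skip 1 | ForEach-Object {
        $parts = $_ -split '\s+'

        # The AGE column holds the creation timestamp; convert it to whole days
        $created = [datetime]::Parse(
            $parts[4],
            [cultureinfo]::InvariantCulture,
            [System.Globalization.DateTimeStyles]::AdjustToUniversal
        )
        $ageDays = [math]::Floor(([datetime]::UtcNow - $created).TotalDays)

        [PSCustomObject]@{
            Name     = $parts[0]
            Status   = $parts[1]
            Node     = $parts[2]
            # kubectl prints <none> when the Pod has no container status yet
            Restarts = if ($parts[3] -match '^\d+$') { [int]$parts[3] } else { 0 }
            Age      = "${ageDays}d"
            IP       = $parts[5]
        }
    }
}

# List all Pods
Get-K8sPod -Namespace "production" | Format-Table -AutoSize

# Filter by label
Get-K8sPod -Namespace "production" -LabelSelector "app=web" |
    Format-Table -AutoSize

# List Pods that are not in the Running phase
Get-K8sPod | Where-Object { $_.Status -ne 'Running' }

Sample output: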

Name                   Status  Node    Restarts Age IP
----                   ------  ----    -------- --- --
web-app-7d4f8b-x2k9l   Running node-02        0 5d  10.244.1.15
web-app-7d4f8b-m8n3p   Running node-03        0 5d  10.244.2.22
api-server-5c9a2d-q7j4 Running node-02        3 12d 10.244.1.10
redis-master-0         Running node-01        0 30d 10.244.0.5

Name                 Status  Node    Restarts Age IP
----                 ------  ----    -------- --- --
web-app-7d4f8b-x2k9l Running node-02        0 5d  10.244.1.15
web-app-7d4f8b-m8n3p Running node-03        0 5d  10.244.2.22
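
Splitting columns on whitespace works for simple cases, but it only sees the first container's restart count. As a sketch of the JSON route, the hypothetical function ConvertFrom-K8sPodJson below turns a kubectl get pods -o json document into the same kind of objects and sums restarts across all containers. The sample JSON is hand-written so that the function can be tried without a cluster.

```powershell
# Hypothetical helper: convert `kubectl get pods -o json` output into flat objects.
function ConvertFrom-K8sPodJson {
    param([Parameter(Mandatory)][string]$Json)

    foreach ($pod in ($Json | ConvertFrom-Json).items) {
        # Sum restarts over every container, not just the first one
        $restarts = 0
        foreach ($cs in $pod.status.containerStatuses) { $restarts += $cs.restartCount }

        [PSCustomObject]@{
            Name     = $pod.metadata.name
            Status   = $pod.status.phase
            Node     = $pod.spec.nodeName
            Restarts = $restarts
            IP       = $pod.status.podIP
        }
    }
}

# Hand-written sample data: one two-container Pod and one unscheduled Pod
$podJson = @'
{ "items": [
  { "metadata": { "name": "web-app-1" },
    "spec": { "nodeName": "node-02" },
    "status": { "phase": "Running", "podIP": "10.244.1.15",
      "containerStatuses": [ { "restartCount": 1 }, { "restartCount": 2 } ] } },
  { "metadata": { "name": "web-app-2" },
    "spec": {},
    "status": { "phase": "Pending" } }
] }
'@

$podInfo = @(ConvertFrom-K8sPodJson -Json $podJson)
$podInfo | Format-Table -AutoSize

# With a real cluster the input would come from kubectl, for example:
# ConvertFrom-K8sPodJson -Json ((kubectl get pods -n production -o json) -join "`n")
```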

Cluster Health Inspection

Use PowerShell to build an automated Kubernetes cluster health check:

function Get-K8sClusterHealth {
    <#
    .SYNOPSIS
        Kubernetes cluster health inspection
    #>
    param(
        [string[]]$Namespaces = @('default', 'production', 'monitoring')
    )

    Write-Host "========== Kubernetes Cluster Health Report ==========" -ForegroundColor Cyan
    Write-Host "Checked at: $(Get-Date -Format 'yyyy-MM-dd HH:mm:ss')`n"

    # 1. Node status
    Write-Host "--- Node status ---" -ForegroundColor Yellow
    $nodes = kubectl get nodes -o json | ConvertFrom-Json
    foreach ($node in $nodes.items) {
        $conditions = $node.status.conditions | Where-Object { $_.type -eq 'Ready' }
        $status = $conditions.status
        $cpuAlloc = $node.status.allocatable.cpu
        $memAlloc = $node.status.allocatable.memory

        $color = if ($status -eq 'True') { 'Green' } else { 'Red' }
        Write-Host "  $($node.metadata.name): " -NoNewline
        Write-Host $status -ForegroundColor $color -NoNewline
        Write-Host " (CPU: $cpuAlloc, Memory: $memAlloc)"
    }

    # 2. Pod health
    Write-Host "`n--- Pod status summary ---" -ForegroundColor Yellow
    foreach ($ns in $Namespaces) {
        $pods = kubectl get pods -n $ns -o json 2>$null | ConvertFrom-Json
        if (-not $pods.items) { continue }

        $running = ($pods.items | Where-Object { $_.status.phase -eq 'Running' }).Count
        $failed  = ($pods.items | Where-Object { $_.status.phase -eq 'Failed' }).Count
        $pending = ($pods.items | Where-Object { $_.status.phase -eq 'Pending' }).Count
        $total   = $pods.items.Count

        # Pods in which any container has restarted more than 5 times
        $highRestart = $pods.items | Where-Object {
            ($_.status.containerStatuses | Measure-Object -Property restartCount -Maximum).Maximum -gt 5
        }

        $summaryColor = if ($failed -gt 0) { 'Red' } elseif ($pending -gt 0) { 'Yellow' } else { 'Green' }
        Write-Host "  Namespace $ns : $total Pods" -NoNewline
        Write-Host " (Running: $running, Failed: $failed, Pending: $pending)" -ForegroundColor $summaryColor

        if ($highRestart) {
            Write-Host "    High-restart Pods:" -ForegroundColor Red
            $highRestart | ForEach-Object {
                $restarts = ($_.status.containerStatuses | Measure-Object -Property restartCount -Maximum).Maximum
                Write-Host "      $($_.metadata.name) - restarted $restarts times" -ForegroundColor Red
            }
        }
    }

    # 3. Resource usage (columns: NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%)
    Write-Host "`n--- Resource usage ---" -ForegroundColor Yellow
    $topNodes = kubectl top nodes 2>$null
    if ($topNodes) {
        $topNodes | Select-Object -Skip 1 | ForEach-Object {
            $parts = $_ -split '\s+'
            $cpuPct = $parts[2] -replace '%',''
            $memPct = $parts[4] -replace '%',''
            $color = if ([int]$cpuPct -gt 80 -or [int]$memPct -gt 80) {
                'Red'
            } elseif ([int]$cpuPct -gt 60 -or [int]$memPct -gt 60) {
                'Yellow'
            } else {
                'Green'
            }
            Write-Host "  $($_)" -ForegroundColor $color
        }
    } else {
        Write-Host "  Metrics Server is not installed; resource usage is unavailable" -ForegroundColor DarkGray
    }

    # 4. Events. --sort-by orders oldest first, so take the last lines for the most recent.
    Write-Host "`n--- Recent events (Warning) ---" -ForegroundColor Yellow
    $events = kubectl get events -A --field-selector type=Warning --sort-by='.lastTimestamp' --no-headers 2>$null
    if ($events) {
        $events | Select-Object -Last 5 | ForEach-Object { Write-Host "  $_" }
    } else {
        Write-Host "  No warning events" -ForegroundColor Green
    }
}

Get-K8sClusterHealth

Sample output:

========== Kubernetes Cluster Health Report ==========
Checked at: 2025-05-30 08:30:00

--- Node status ---
  node-01: True (CPU: 4, Memory: 16Gi)
  node-02: True (CPU: 8, Memory: 32Gi)
  node-03: True (CPU: 8, Memory: 32Gi)

--- Pod status summary ---
  Namespace default : 5 Pods (Running: 5, Failed: 0, Pending: 0)
  Namespace production : 12 Pods (Running: 11, Failed: 1, Pending: 0)
    High-restart Pods:
      api-server-5c9a2d-q7j4 - restarted 15 times
  Namespace monitoring : 3 Pods (Running: 3, Failed: 0, Pending: 0)

--- Resource usage ---
  node-01   1200m   30%   8192Mi    50%
  node-02   4500m   56%   28672Mi   87%
  node-03   2100m   26%   18432Mi   56%

--- Recent events (Warning) ---
  production   15m   Warning   Failed   pod/api-server-5c9a2d-q7j4   Failed to pull image
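
The colour thresholds in the resource usage step can also be expressed as data, which makes them easier to filter or alert on. The sketch below introduces a hypothetical function, ConvertFrom-KubectlTopNode, that parses kubectl top nodes lines (without the header) into objects; it assumes the usual column order NAME, CPU(cores), CPU%, MEMORY(bytes), MEMORY% and treats non-numeric percentages as unknown instead of failing on the integer conversion.

```powershell
# Hypothetical helper: parse `kubectl top nodes --no-headers` lines into objects.
function ConvertFrom-KubectlTopNode {
    param([Parameter(Mandatory)][string[]]$Line)

    foreach ($l in $Line) {
        $parts = $l.Trim() -split '\s+'
        if ($parts.Count -lt 5) { continue }   # skip blank or malformed lines

        $cpu = 0
        $mem = 0
        # TryParse returns $false for values such as <unknown>
        $cpuOk = [int]::TryParse(($parts[2] -replace '%', ''), [ref]$cpu)
        $memOk = [int]::TryParse(($parts[4] -replace '%', ''), [ref]$mem)

        $level = if (-not ($cpuOk -and $memOk)) { 'Unknown' }
                 elseif ($cpu -gt 80 -or $mem -gt 80) { 'Critical' }
                 elseif ($cpu -gt 60 -or $mem -gt 60) { 'Warning' }
                 else { 'OK' }

        [PSCustomObject]@{
            Name          = $parts[0]
            CpuPercent    = if ($cpuOk) { $cpu } else { $null }
            MemoryPercent = if ($memOk) { $mem } else { $null }
            Level         = $level
        }
    }
}

# Hand-written sample lines in the assumed column order
$sampleLines = @(
    'node-01   1200m       30%         8192Mi      50%'
    'node-02   4500m       56%         28672Mi     87%'
    'node-03   <unknown>   <unknown>   <unknown>   <unknown>'
)

$usage = @(ConvertFrom-KubectlTopNode -Line $sampleLines)
$usage | Format-Table -AutoSize
```

Against a live cluster the input would be the output of kubectl top nodes --no-headers, and the result can be filtered with Where-Object Level -ne 'OK'.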

Bulk Operations and Log Collection

# Restart Deployments in bulk
function Restart-K8sDeployment {
    param(
        [Parameter(Mandatory)]
        [string]$Namespace,

        [string]$LabelSelector
    )

    # Only pass -l when a selector was supplied
    $getArgs = @('get', 'deployments', '-n', $Namespace, '-o', 'json')
    if ($LabelSelector) { $getArgs += @('-l', $LabelSelector) }
    $deployments = kubectl @getArgs | ConvertFrom-Json

    foreach ($deploy in $deployments.items) {
        $name = $deploy.metadata.name
        Write-Host "Restarting Deployment: $name" -ForegroundColor Cyan
        kubectl rollout restart "deployment/$name" -n $Namespace

        # Wait for the rolling restart to finish
        kubectl rollout status "deployment/$name" -n $Namespace --timeout=300s
    }
}

# Collect logs from several Pods
function Get-K8sPodLogs {
    param(
        [string]$Namespace = "default",
        [string]$LabelSelector,
        [int]$Tail = 100
    )

    $getArgs = @('get', 'pods', '-n', $Namespace, '-o', 'json')
    if ($LabelSelector) { $getArgs += @('-l', $LabelSelector) }
    $pods = kubectl @getArgs | ConvertFrom-Json

    $allLogs = @()
    foreach ($pod in $pods.items) {
        $podName = $pod.metadata.name
        $logs = kubectl logs $podName -n $Namespace "--tail=$Tail" 2>$null

        foreach ($line in $logs) {
            # Keep only lines that look like errors or warnings
            if ($line -match 'ERROR|FATAL|WARN|Exception') {
                $allLogs += [PSCustomObject]@{
                    Pod     = $podName
                    Level   = if ($line -match 'ERROR|FATAL|Exception') { 'Error' } else { 'Warning' }
                    Message = $line.Trim()
                }
            }
        }
    }

    $allLogs | Sort-Object Level, Pod | Format-Table -AutoSize -Wrap
}

# Collect error logs from all web services
Get-K8sPodLogs -Namespace "production" -LabelSelector "app=web" -Tail 500

Sample output:

Restarting Deployment: web-app
deployment "web-app" successfully rolled out

Pod                    Level   Message
---                    -----   -------
api-server-5c9a2d-q7j4 Error   ERROR Failed to process request: timeout
web-app-7d4f8b-x2k9l   Error   ERROR [DB] Connection pool exhausted
web-app-7d4f8b-m8n3p   Warning WARN Cache miss rate exceeded 50%

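Notes

  1. JSON output parsing: Parsing kubectl's -o json output with ConvertFrom-Json yields PowerShell objects and is more reliable than parsing text.
  2. API server connection: Make sure kubeconfig is configured correctly and the API server of the target cluster is reachable.
  3. Namespace isolation: When working across namespaces, always pass -n to name the namespace explicitly and avoid acting on the wrong one.
  4. Resource quotas: In large clusters, avoid frequent queries of the kubectl get all kind, which can put pressure on the API server.
  5. Log volume: Production Pods can produce a lot of log data. Use --tail and --since to limit the range of a query.
  6. kubectl plugins: Plugins such as kubectx, kubens, and kubectl-tree make day-to-day work faster and are worth installing.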