GitOps in Practice

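GitOps is an operations methodology that treats Git as the single source of truth. Infrastructure and applications are managed through declarative configuration, which enables continuous deployment, auditable consistency across environments, and fast rollback.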

Core Principles

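Principle                       Description
Git as Single Source of Truth   All configuration is stored in Git
Self-Service                    Developers deploy on their own, without involving operations
Auditability                    Every change leaves a complete audit trail
Idempotent                      Applying the same configuration repeatedly yields the same result
Fast Rollback                   Roll back to any earlier version with a single command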
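
The Idempotent principle above can be illustrated with a small reconciliation sketch. The `reconcile` function and the dict-based state model are hypothetical simplifications for illustration, not the API of any GitOps tool.

```python
def reconcile(live: dict, desired: dict) -> dict:
    """Return the new live state after applying the desired state.

    Resources missing from `desired` are pruned and everything else is
    overwritten, so the result depends only on `desired`, not on how
    many times it is applied.
    """
    return {name: dict(spec) for name, spec in desired.items()}


# Hypothetical desired state (from Git) and drifted live state (in the cluster).
desired = {"myapp": {"replicas": 3, "image": "myapp:v1.2.3"}}
live = {
    "myapp": {"replicas": 1, "image": "myapp:v1.2.2"},
    "stale": {"replicas": 1},
}

once = reconcile(live, desired)
twice = reconcile(once, desired)  # applying again changes nothing
```

Because the outcome is a function of the declared state alone, a retry after a partial failure is safe.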

GitOps Workflow


┌─────────────────────────────────────────────────────────────────┐
│                        Developer                                │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Code Repository                             │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │  myapp/                                                 │    │
│  │  ├── src/                   ← application code          │    │
│  │  ├── k8s/                   ← Kubernetes manifests      │    │
│  │  │   ├── base/                                          │    │
│  │  │   │   ├── deployment.yaml                            │    │
│  │  │   │   ├── service.yaml                               │    │
│  │  │   │   └── ingress.yaml                               │    │
│  │  │   └── overlays/                                      │    │
│  │  │       ├── staging/                                   │    │
│  │  │       │   └── kustomization.yaml                     │    │
│  │  │       └── production/                                │    │
│  │  │           └── kustomization.yaml                     │    │
│  │  └── Jenkinsfile                                         │    │
│  └─────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    CI Pipeline                                  │
│  Code Commit → Build → Test → Build Image → Push to Registry   │
│                              │                                   │
│                              ▼                                   │
│                     GitOps Repository                           │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │  gitops-repo/                                           │    │
│  │  ├── apps/                                              │    │
│  │  │   └── myapp/                                         │    │
│  │  │       ├── staging.yaml     ← new image               │    │
│  │  │       └── production.yaml  ← new image (optional)    │    │
│  │  └── infrastructure/                                    │    │
│  └─────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    CD Tool (ArgoCD/Flux)                        │
│                              │                                   │
│         ┌────────────────────┼────────────────────┐             │
│         ▼                    ▼                    ▼             │
│  ┌─────────────┐      ┌─────────────┐      ┌─────────────┐       │
│  │   Staging   │      │  Pre-Prod   │      │ Production │       │
│  │   Cluster   │      │   Cluster   │      │   Cluster  │       │
│  └─────────────┘      └─────────────┘      └─────────────┘       │
└─────────────────────────────────────────────────────────────────┘
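
The hand-off from the CI pipeline to the GitOps repository in the workflow above usually amounts to rewriting an image tag in a manifest and committing the result. The sketch below shows only the rewrite step. The function name `set_image_tag` and the sample manifest are assumptions, and it handles only unquoted `image:` values.

```python
import re


def set_image_tag(manifest: str, image: str, new_tag: str) -> str:
    """Rewrite the tag on every unquoted `image: <image>:<tag>` line."""
    pattern = re.compile(
        rf"^([ \t]*(?:-[ \t]+)?image:[ \t]*){re.escape(image)}:\S+",
        re.MULTILINE,
    )
    # A function is used as the replacement so the tag is inserted literally.
    return pattern.sub(lambda m: f"{m.group(1)}{image}:{new_tag}", manifest)


# Hypothetical fragment of gitops-repo/apps/myapp/staging.yaml
staging = """\
containers:
  - name: myapp
    image: myapp:staging-latest
  - name: sidecar
    image: myapp-sidecar:1.0
"""

updated = set_image_tag(staging, "myapp", "v1.2.4")
```

In a real pipeline the CI job would commit `updated` to the GitOps repository, and the CD tool would pick up the change from there.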

GitOps Repository Design

Repository Separation Strategies

Strategy 1: The app repo contains the K8s configuration (recommended for small teams)


app-repo/
├── src/
├── Dockerfile
└── k8s/
    ├── base/
    └── overlays/

Strategy 2: A dedicated GitOps repo (recommended for large teams)


gitops-repo/
├── apps/
│   ├── myapp/
│   │   ├── base/
│   │   └── overlays/
│   └── otherapp/
├── infrastructure/
│   ├── networking/
│   ├── monitoring/
│   └── databases/
└── clusters/
    ├── staging/
    └── production/

Multi-Environment Management with Kustomize

Base Configuration


# k8s/base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10

Staging Overlay


# k8s/overlays/staging/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base

namespace: staging

commonLabels:
  environment: staging

replicas:
  - name: myapp
    count: 2

images:
  - name: myapp
    newTag: "staging-latest"

patches:
  - target:
      kind: Deployment
      name: myapp
    patch: |
      - op: replace
        path: /spec/replicas
        value: 2

Production Overlay


# k8s/overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base

namespace: production

commonLabels:
  environment: production

replicas:
  - name: myapp
    count: 5

images:
  - name: myapp
    newTag: "v1.2.3"

# base/deployment.yaml defines no spec.strategy, so a "replace" on its
# children would fail; "add" creates the whole strategy object instead.
patches:
  - target:
      kind: Deployment
      name: myapp
    patch: |
      - op: replace
        path: /spec/replicas
        value: 5
      - op: add
        path: /spec/strategy
        value:
          type: RollingUpdate
          rollingUpdate:
            maxUnavailable: 1
            maxSurge: 2
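
The inline patches in the overlays use JSON Patch (RFC 6902) operations. A `replace` succeeds only when the target path already exists, whereas `add` creates a missing object member. The sketch below is a deliberately tiny model of those two operations on nested dicts, written for illustration. It is not how Kustomize is implemented.

```python
import copy


def apply_patch(doc: dict, ops: list) -> dict:
    """Apply a small subset of RFC 6902: add/replace on object members."""
    result = copy.deepcopy(doc)
    for op in ops:
        if op["op"] not in ("add", "replace"):
            raise ValueError(f"unsupported op: {op['op']}")
        *parents, leaf = op["path"].lstrip("/").split("/")
        node = result
        for key in parents:
            node = node[key]  # KeyError if a parent object is missing
        if op["op"] == "replace" and leaf not in node:
            raise KeyError(f"replace target does not exist: {op['path']}")
        node[leaf] = op["value"]
    return result


base = {"spec": {"replicas": 3}}
patched = apply_patch(base, [
    {"op": "replace", "path": "/spec/replicas", "value": 5},
    {"op": "add", "path": "/spec/strategy",
     "value": {"rollingUpdate": {"maxUnavailable": 1, "maxSurge": 2}}},
])
```

Replacing `/spec/strategy/rollingUpdate/maxSurge` directly on `base` raises `KeyError` in this model, because `base` has no `strategy` object to descend into.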

Helm + GitOps

GitOps-Friendly Helm Release


# gitops/apps/myapp/production.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://charts.example.com
    chart: myapp
    targetRevision: "1.2.3"
    helm:
      valueFiles:
        - values-production.yaml
      parameters:
        - name: replicaCount
          value: "5"
        - name: image.tag
          value: "v1.2.3"
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
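
When an Application sets both `valueFiles` and `parameters`, the parameters take precedence over the values loaded from files. The sketch below models that layering with plain dicts. The contents of `values_production` are hypothetical, and values stay strings exactly as written in the manifest, whereas Helm itself applies its own type parsing.

```python
import copy


def set_path(values: dict, dotted: str, value) -> None:
    """Set a nested key such as 'image.tag' in place."""
    *parents, leaf = dotted.split(".")
    node = values
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def effective_values(file_values: dict, parameters: list) -> dict:
    """Layer name/value parameters on top of values loaded from a file."""
    merged = copy.deepcopy(file_values)
    for param in parameters:
        set_path(merged, param["name"], param["value"])
    return merged


# Hypothetical content of values-production.yaml
values_production = {
    "replicaCount": 3,
    "image": {"repository": "myapp", "tag": "v1.2.2"},
}
parameters = [
    {"name": "replicaCount", "value": "5"},
    {"name": "image.tag", "value": "v1.2.3"},
]

final = effective_values(values_production, parameters)
```

Keeping overrides in `parameters` small makes it easier to see in a PR diff exactly what differs from the values file.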

Automatic Image Updates

Renovate Bot


// renovate.json
{
  "packageRules": [
    {
      "matchDatasources": ["docker"],
      "groupName": "app-images",
      "labels": ["dependencies"]
    }
  ],
  "regexManagers": [
    {
      "fileMatch": ["^.*\\.yaml$"],
      "matchStrings": [
        "image:\\s*['\"]?(?<depName>myapp[^\\s:'\"]*):(?<currentValue>[^\\s'\"]+)['\"]?"
      ],
      "datasourceTemplate": "docker",
      "versioningTemplate": "docker"
    }
  ]
}
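
A match string of this kind can be checked locally before it is committed. The sketch below tests a pattern that captures the image name as `depName` and the tag as `currentValue`, the group names Renovate's regex managers look for. Python spells named groups `(?P<name>...)`, so the pattern is adapted accordingly. The sample lines are made up.

```python
import re

# Python uses (?P<name>...); Renovate's regex flavour uses (?<name>...).
IMAGE_RE = re.compile(
    r"image:\s*['\"]?(?P<depName>myapp[^\s:'\"]*):(?P<currentValue>[^\s'\"]+)['\"]?"
)


def extract(line: str):
    """Return (depName, currentValue) for a matching line, else None."""
    match = IMAGE_RE.search(line)
    if match is None:
        return None
    return match["depName"], match["currentValue"]


plain = extract("          image: myapp:latest")
quoted = extract('image: "myapp:v1.2.3"')
other = extract("image: nginx:1.25")
```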

Image Updater


# argocd-image-updater-config
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-image-updater-config
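data:
  log.level: debug
  # Registry settings are given inline as YAML; the entry below is an example
  registries.conf: |
    registries:
      - name: Docker Hub
        prefix: docker.io
        api_url: https://registry-1.docker.io
        default: true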

Environment Promotion

Triggering Production Deployment with a Git Tag


# Developer commits a change and pushes the feature branch
git commit -m "Update myapp"
git push origin feature-branch

# After the merge to main, Staging is deployed automatically
git checkout main
git merge feature-branch
git push origin main

# Pushing a release tag triggers the Production deployment
git tag v1.2.3
git push origin v1.2.3
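
The branch and tag convention above can be expressed as a small routing rule that a CI job evaluates for each pushed ref. This is a sketch under assumptions: the function name `target_environment` and the `vMAJOR.MINOR.PATCH` tag format are illustrative, inferred from the `v1.2.3` example.

```python
import re

# Release tags are assumed to look like v1.2.3
RELEASE_TAG = re.compile(r"^refs/tags/v\d+\.\d+\.\d+$")


def target_environment(ref: str):
    """Map a pushed Git ref to the environment it should deploy to."""
    if ref == "refs/heads/main":
        return "staging"
    if RELEASE_TAG.match(ref):
        return "production"
    return None  # feature branches and other tags deploy nowhere


deployments = {
    ref: target_environment(ref)
    for ref in (
        "refs/heads/feature-branch",
        "refs/heads/main",
        "refs/tags/v1.2.3",
        "refs/tags/nightly",
    )
}
```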

Promotion Pipeline


# promotion.yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: image-promotion
spec:
  params:
    - name: image
    - name: fromEnv
    - name: toEnv
  tasks:
    - name: update-production
      when:
        - input: $(params.toEnv)
          operator: in
          values: ["production"]
      taskRef:
        name: update-gitops
      params:
        - name: image
          value: $(params.image)
        - name: path
          value: apps/myapp/production/values.yaml

Security and Compliance

RBAC Isolation


# argocd-project.yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: myapp
  namespace: argocd
spec:
  description: MyApp Project
  sourceRepos:
    - https://github.com/example/*
    - https://charts.example.com
  destinations:
    - server: https://kubernetes.default.svc
      namespace: myapp-*
  namespaceResourceBlacklist:
    - group: ""
      kind: ResourceQuota
    - group: ""
      kind: LimitRange
  roles:
    - name: developer
      groups:
        - myapp-developers
      policies:
        - p, proj:myapp:developer, applications, *, myapp/*, allow
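
The `destinations` entries in the AppProject act as an allow-list with wildcard support, so `myapp-*` admits any namespace with that prefix. The sketch below approximates the check with Python's `fnmatchcase`. This is an illustration of the idea, and ArgoCD's own glob matching may differ in edge cases.

```python
from fnmatch import fnmatchcase


def destination_allowed(server: str, namespace: str, destinations: list) -> bool:
    """True if some destination pattern matches both server and namespace."""
    return any(
        fnmatchcase(server, dest["server"])
        and fnmatchcase(namespace, dest["namespace"])
        for dest in destinations
    )


# Mirrors the destinations list of the AppProject above.
destinations = [
    {"server": "https://kubernetes.default.svc", "namespace": "myapp-*"},
]

in_project = destination_allowed(
    "https://kubernetes.default.svc", "myapp-staging", destinations
)
outside = destination_allowed(
    "https://kubernetes.default.svc", "kube-system", destinations
)
```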

Audit Log

ArgoCD automatically records every sync and change operation:


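# View the application's deployment history
argocd app history myapp

# Show the diff between the live state and the desired state in Git
argocd app diff myapp

# Inspect audit records: ArgoCD emits Kubernetes events and API server logs
# (assumes the default "argocd" namespace and "argocd-server" deployment name)
kubectl get events -n argocd --sort-by=.lastTimestamp
kubectl logs -n argocd deployment/argocd-server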

Rollback Strategies

Rollback with ArgoCD


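# View history
argocd app history myapp

# Roll back to a specific history ID (automated sync must be disabled first)
argocd app rollback myapp <revision>

# Note: ArgoCD does not roll back automatically when a sync fails.
# selfHeal only reverts drift in the cluster back to the state in Git.
# Set in the Application spec:
syncPolicy:
  automated:
    selfHeal: true
    prune: true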

Git-Native Rollback


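# Revert the latest commit in the GitOps repository (history is preserved)
git revert HEAD
git push origin main

# Alternatively, use git reset. This rewrites history and discards the
# audit trail, so avoid it on shared or protected branches.
git reset --hard <previous-commit>
git push --force-with-lease origin main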
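
The difference between the two approaches matters for auditability. A revert appends a new commit that restores the earlier state, while a hard reset removes commits from the branch. The sketch below models a branch as a list of commits to make that visible. It is a conceptual model only and does not call Git.

```python
def revert_head(history: list) -> list:
    """Append a commit that restores the state before HEAD (like `git revert HEAD`)."""
    if len(history) < 2:
        raise ValueError("nothing to revert to")
    head, previous = history[-1], history[-2]
    undo = {"message": 'Revert "' + head["message"] + '"', "state": previous["state"]}
    return history + [undo]


def reset_hard(history: list, keep: int) -> list:
    """Drop every commit after the first `keep` (like `git reset --hard`)."""
    return history[:keep]


# Hypothetical two-commit history of the GitOps repository.
history = [
    {"message": "Deploy v1.2.2", "state": {"image": "myapp:v1.2.2"}},
    {"message": "Deploy v1.2.3", "state": {"image": "myapp:v1.2.3"}},
]

reverted = revert_head(history)   # three commits, bad deploy still visible
rewound = reset_hard(history, 1)  # one commit, bad deploy erased
```

Both end with the cluster on `myapp:v1.2.2`, but only the reverted history still records that `v1.2.3` was deployed and then withdrawn.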

Multi-Cluster Management

Cluster Registration


# Add a cluster to ArgoCD
argocd cluster add <context-name> \
  --name production \
  --system-namespace argocd

# List registered clusters
argocd cluster list

Deploying Applications Across Clusters


# Deploy to the production cluster (fragment: project and source omitted)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production
spec:
  destination:
    server: https://eks-production.example.com
    namespace: production
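
With several clusters, the Application manifests differ only in name and destination, so they can be generated from a cluster table instead of being copied by hand. The sketch below builds the corresponding dicts. The helper `application_for` and the staging server URL are hypothetical, and the `source` and `project` fields are left out, as in the fragment above.

```python
def application_for(app: str, env: str, server: str) -> dict:
    """Build the destination-specific part of an ArgoCD Application."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": f"{app}-{env}"},
        "spec": {"destination": {"server": server, "namespace": env}},
    }


# The staging URL is an assumed example; production matches the manifest above.
clusters = {
    "staging": "https://eks-staging.example.com",
    "production": "https://eks-production.example.com",
}

applications = [
    application_for("myapp", env, server) for env, server in clusters.items()
]
```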

Best Practices

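  • Repository separation: keep application code apart from K8s configuration and manage the GitOps repo independently
  • Small and frequent: avoid large changes; merge small, incremental commits
  • Kustomize Overlays: use an overlay per environment rather than Helm values
  • Immutable images: tag images with a SHA; do not use latest
  • Environment promotion: validate step by step from Dev → Staging → Production
  • Git permissions: require PR review before merging and forbid direct pushes
  • Self-healing: enable the selfHeal feature of ArgoCD/Flux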
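
The immutable-image rule lends itself to an automated check in CI. The sketch below accepts only digest-pinned references or tags that look like a Git commit SHA. The function name `is_immutable_ref` and the 7 to 40 hex character rule are assumptions, and a team that treats semantic version tags as immutable would want to relax it.

```python
import re


def is_immutable_ref(image: str) -> bool:
    """True for digest-pinned images or tags that look like a commit SHA."""
    if "@sha256:" in image:
        return True
    _, sep, tag = image.rpartition(":")
    # No colon means no tag; a "/" after the colon means it was a registry port.
    if not sep or "/" in tag:
        return False
    return re.fullmatch(r"[0-9a-f]{7,40}", tag) is not None


checks = {
    image: is_immutable_ref(image)
    for image in (
        "myapp:latest",
        "myapp:3f2a9c1",
        "myapp",
        "registry.example.com:5000/myapp",
    )
}
```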

Next Steps