parent
0b770340c8
commit
41115faa16
430
docs/https-and-certificate.md
Normal file
@ -0,0 +1,430 @@
# HTTPS Redirect & Certificate Issuance Flow Analysis

> **This document is for AI agents and developers.** It summarizes a complete setup for automatic HTTP→HTTPS redirects and automatic Let's Encrypt certificates on a K3s + Traefik v3 + cert-manager stack. Other projects can follow it directly to update their own CI/CD pipelines and K8s configuration.

---

## 0. Onboarding Guide for Other Projects (Quick Reference)

### 4 things to do

#### 1. Add a new file: `k8s/cert-manager-issuer.yaml`

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: airlabsv001@gmail.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: traefik
```

> If the cluster already has a ClusterIssuer with the same name (several projects sharing one cluster), this step can be skipped. Re-applying it is harmless either way, because `kubectl apply` is idempotent.

#### 2. Add a new file: `k8s/redirect-https-middleware.yaml`

```yaml
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: redirect-https
spec:
  redirectScheme:
    scheme: https
    permanent: true
```

#### 3. Modify `k8s/ingress.yaml`

Make sure it contains the following 3 annotations and the TLS configuration:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: your-ingress-name
  annotations:
    kubernetes.io/ingress.class: "traefik"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"  # ← triggers automatic certificate issuance
    traefik.ingress.kubernetes.io/router.middlewares: "default-redirect-https@kubernetescrd"  # ← HTTP→HTTPS redirect
spec:
  tls:
    - hosts:
        - your-domain-api.example.com  # ← change to your domain
        - your-domain.example.com
      secretName: your-project-tls     # ← Secret that stores the certificate; any name that does not clash with other projects
  rules:
    - host: your-domain-api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: your-backend-service
                port:
                  number: 8000
    - host: your-domain.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: your-web-service
                port:
                  number: 80
```
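
The value of the `router.middlewares` annotation follows the pattern `<namespace>-<name>@kubernetescrd`, and several middlewares can share one annotation as a comma-separated list. The sketch below only builds those strings. The helper names are hypothetical, and it assumes the Middleware was applied to the `default` namespace, which is what happens when the manifest sets no namespace and `kubectl apply` is run without `-n`.

```python
def middleware_ref(namespace: str, name: str) -> str:
    """Build the reference string for a Traefik Middleware defined as a Kubernetes CRD."""
    return f"{namespace}-{name}@kubernetescrd"


def middlewares_annotation(refs: list[str]) -> str:
    """Join several middleware references into a single annotation value."""
    return ",".join(refs)


# Assumption: the Middleware lives in the "default" namespace.
print(middleware_ref("default", "redirect-https"))
```

If the Middleware is deployed to another namespace, the prefix in the annotation has to change with it; a mismatch here is a common reason for a middleware not being picked up.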

#### 4. Modify the CI/CD pipeline (deploy.yaml)

In the `kubectl apply` deployment step, add these two lines **before ingress.yaml**:

```yaml
# Before:
kubectl apply -f k8s/backend-deployment.yaml
kubectl apply -f k8s/web-deployment.yaml
kubectl apply -f k8s/ingress.yaml

# After:
kubectl apply -f k8s/cert-manager-issuer.yaml          # ← new: registers the Let's Encrypt CA
kubectl apply -f k8s/redirect-https-middleware.yaml    # ← new: HTTP→HTTPS redirect middleware
kubectl apply -f k8s/backend-deployment.yaml
kubectl apply -f k8s/web-deployment.yaml
kubectl apply -f k8s/ingress.yaml
```

> **Order matters**: the cert-manager issuer and the middleware must be applied before the ingress. Otherwise the ingress references resources that do not exist yet, which can cause certificate issuance to fail or the redirect not to take effect.

### Cluster prerequisites (run once per server)

The commands below must be **run once by hand over SSH on each K8s master node**. They do not belong in CI/CD:

```bash
# 1. Confirm cert-manager is installed
kubectl get pods -n cert-manager
# If it is not, install it first: https://cert-manager.io/docs/installation/

# 2. Configure the global Traefik HTTP→HTTPS redirect
kubectl -n kube-system patch deployment traefik --type=json -p '[
  {"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--entryPoints.web.http.redirections.entryPoint.to=:443"},
  {"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--entryPoints.web.http.redirections.entryPoint.scheme=https"},
  {"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--entryPoints.web.http.redirections.entryPoint.permanent=true"}
]'
```

> **Key point**: `to=:443` must not be written as `to=websecure`. Inside Traefik the websecure entry point listens on port 8443, so `to=websecure` produces redirect URLs that contain `:8443`, which users cannot reach.

### Verification checklist

```bash
# HTTP redirect
curl -I http://your-domain
# Expected: 308 Permanent Redirect → https://your-domain

# Certificate is valid
curl -v https://your-domain 2>&1 | grep "issuer"
# Expected: issuer: ... Let's Encrypt ...

# Certificate status
kubectl get certificate -A
# Expected: Ready = True
```
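
The redirect check can also be scripted. The following is a minimal sketch using only the Python standard library; the function names are hypothetical. It accepts a response only if it is a permanent redirect to `https://` on the same host with no non-standard port, so a redirect to `:8443` is reported as a failure.

```python
import http.client
from urllib.parse import urlsplit


def is_https_redirect(status: int, location: str, host: str) -> bool:
    """True if the response permanently redirects to https://<host> on the default port."""
    if status not in (301, 308):
        return False
    target = urlsplit(location or "")
    return (
        target.scheme == "https"
        and target.hostname == host
        and target.port in (None, 443)
    )


def fetch_status_and_location(host: str, path: str = "/") -> tuple[int, str]:
    """Send one plain-HTTP HEAD request to port 80 without following redirects."""
    conn = http.client.HTTPConnection(host, 80, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location", "")
    finally:
        conn.close()
```

A typical use would be `is_https_redirect(*fetch_status_and_location(host), host)` for each domain listed in the Ingress.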

---

## 1. Automatic HTTP → HTTPS Redirect

### Problem

Users who visit over `http://` are not redirected to `https://` automatically.

### Root cause

For an Ingress with TLS configured, Traefik v3 (the Ingress Controller bundled with K3s) creates only an HTTPS route by default. No route handles the HTTP request, so nothing can issue a redirect.

### Fix

Add the HTTP→HTTPS redirect arguments globally on the Traefik Deployment. No per-Ingress configuration is needed, and every project in the cluster picks the redirect up automatically:

```
--entryPoints.web.http.redirections.entryPoint.to=:443
--entryPoints.web.http.redirections.entryPoint.scheme=https
--entryPoints.web.http.redirections.entryPoint.permanent=true
```

**Command** (run on the K8s master node):

```bash
kubectl -n kube-system patch deployment traefik --type=json -p '[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--entryPoints.web.http.redirections.entryPoint.to=:443"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--entryPoints.web.http.redirections.entryPoint.scheme=https"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--entryPoints.web.http.redirections.entryPoint.permanent=true"}
]'
```

> **Note**: use `to=:443`, not `to=websecure`. Inside Traefik the websecure entry point listens on port 8443, so `to=websecure` makes the redirect URL carry the `:8443` port and user requests fail. Writing `:443` ensures the redirect target is the standard HTTPS port.
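
One caveat with this patch: a JSON Patch `add` to `/args/-` always appends, so running the command a second time adds the same three arguments again. The sketch below builds a patch containing only the arguments that are still missing. It is an illustration, not part of the original procedure; the function name is hypothetical and it matches arguments by exact string.

```python
import json

REDIRECT_ARGS = [
    "--entryPoints.web.http.redirections.entryPoint.to=:443",
    "--entryPoints.web.http.redirections.entryPoint.scheme=https",
    "--entryPoints.web.http.redirections.entryPoint.permanent=true",
]


def build_redirect_patch(existing_args: list[str]) -> list[dict]:
    """Return JSON Patch 'add' operations only for redirect args not already present."""
    present = set(existing_args)
    return [
        {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": arg}
        for arg in REDIRECT_ARGS
        if arg not in present
    ]


# Hypothetical current args, used only to show the output shape.
patch = build_redirect_patch(["--entryPoints.web.address=:8000/tcp"])
print(json.dumps(patch, indent=2))
```

The current arguments can be read with `kubectl -n kube-system get deployment traefik -o jsonpath='{.spec.template.spec.containers[0].args}'`, and the printed JSON can then be passed to `kubectl patch --type=json -p`. An empty list means there is nothing left to patch.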

### Test environment status

Fixed ✅: `http://airflow-studio.test.airlabs.art` → 308 → `https://airflow-studio.test.airlabs.art`

### Production environment status

Not fixed ❌: the same `kubectl patch` command still needs to be run on the production K8s cluster.

---

## 2. SSL Certificate Issuance Flow

### Overall architecture

```
                User's browser
                       │
                  ┌────▼────┐
                  │   DNS   │  *.airlabs.art → cluster public IP
                  └────┬────┘
                       │
             ┌─────────▼──────────┐
             │   Traefik (K3s)    │  Ingress Controller
             │   Port 80 / 443    │
             └─────────┬──────────┘
                       │
           ┌───────────▼────────────┐
           │  Ingress resource      │  maps domains → Services
           │  + TLS secretName      │  where the certificate is stored
           │  + cert-manager annot. │  triggers automatic issuance
           └───────────┬────────────┘
                       │
           ┌───────────▼────────────┐
           │  cert-manager          │  watches Ingress changes
           │  (in-cluster Pod)      │  manages the certificate lifecycle
           └───────────┬────────────┘
                       │
           ┌───────────▼────────────┐
           │  Let's Encrypt         │  free certificate authority (CA)
           │  (external service)    │  validates domains via the ACME protocol
           └────────────────────────┘
```

### Detailed steps

#### Step 1: ClusterIssuer defines the CA configuration

File: `k8s/cert-manager-issuer.yaml`

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory  # Let's Encrypt production API
    email: airlabsv001@gmail.com                            # ACME account contact email
    privateKeySecretRef:
      name: letsencrypt-prod-key                            # stores the ACME account private key
    solvers:
      - http01:
          ingress:
            class: traefik                                  # use Traefik to complete validation
```

- `ClusterIssuer` is a cluster-scoped resource, so every namespace in the cluster can use it
- After the ACME account is registered, its private key is stored in the `letsencrypt-prod-key` Secret

#### Step 2: Ingress triggers certificate issuance

File: `k8s/ingress.yaml`

```yaml
metadata:
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"  # ← tells cert-manager which Issuer to use
spec:
  tls:
    - hosts:
        - airflow-studio-api.airlabs.art  # ← domains that need a certificate
        - airflow-studio.airlabs.art
      secretName: airflow-studio-tls      # ← the certificate is stored in this Secret
```

When cert-manager detects the `cert-manager.io/cluster-issuer` annotation on this Ingress, it automatically:
1. Creates a `Certificate` resource
2. Creates a `CertificateRequest` resource
3. Creates an `Order` resource
4. Creates `Challenge` resources (one per domain)

#### Step 3: HTTP-01 validation (the critical step)

cert-manager uses **HTTP-01 validation** to prove that you control the domain:

```
Let's Encrypt server                            Your cluster
      │                                               │
      │  1. Hands you a token                         │
      │ ────────────────────────────────────────────► │
      │                                               │
      │  2. Response is placed at http://<domain>/    │
      │     .well-known/acme-challenge/<token>        │
      │                                               │  cert-manager automatically creates
      │  3. Let's Encrypt fetches that URL to verify  │  a temporary Ingress route
      │ ────────────────────────────────────────────► │  to serve this path
      │                                               │
      │  4. Validation passes, certificate is issued  │
      │ ◄──────────────────────────────────────────── │
```

**Prerequisites for successful validation**:

| Condition | Notes |
|------|------|
| DNS resolves correctly | The domain must point to the cluster's public IP |
| Port 80 is open | Let's Encrypt performs HTTP-01 validation only over HTTP port 80 |
| Traefik is running | It must serve `/.well-known/acme-challenge/` requests |
| cert-manager is installed | cert-manager Pods must be running in the cluster |
| No firewall blocking | Security groups/firewalls must not block Let's Encrypt from reaching port 80 |
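
To make the URL and response in steps 2 and 3 concrete, here is a small sketch of what the ACME HTTP-01 challenge defines: the validation URL is always plain HTTP under `/.well-known/acme-challenge/`, and the expected body is the token joined by a dot to the thumbprint of the ACME account key. The function names, token and thumbprint values below are hypothetical placeholders; in practice cert-manager computes and serves all of this itself.

```python
def challenge_url(domain: str, token: str) -> str:
    """URL the ACME server fetches during HTTP-01 validation (plain HTTP, port 80)."""
    return f"http://{domain}/.well-known/acme-challenge/{token}"


def key_authorization(token: str, account_thumbprint: str) -> str:
    """Expected response body: the token and the account key thumbprint, joined by a dot."""
    return f"{token}.{account_thumbprint}"


# Placeholder values for illustration only.
print(challenge_url("airflow-studio.airlabs.art", "TOKEN123"))
print(key_authorization("TOKEN123", "THUMBPRINT456"))
```

When validation is stuck, requesting the challenge URL from outside the cluster is a quick way to tell whether the DNS, port 80 or firewall prerequisite is the one failing.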
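
#### Step 4: Certificate storage and use

After validation passes:
- cert-manager stores the certificate and private key in the Secret `airflow-studio-tls`
  - `tls.crt`: the certificate chain (server certificate + intermediate certificate)
  - `tls.key`: the private key
- Traefik reads this Secret automatically and uses it for the HTTPS handshake

#### Step 5: Automatic renewal

- Let's Encrypt certificates are valid for **90 days**
- cert-manager renews automatically **30 days** before expiry (`renewalTime`)
- Renewal follows the same process as the first issuance (HTTP-01 validation)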
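
The renewal arithmetic can be checked with a few lines. This sketch assumes the renewal point is one third of the certificate lifetime before expiry when no explicit `renewBefore` is configured, which yields the 30 days quoted above for a 90-day certificate. The function name is hypothetical, and the dates are the validity dates of the production certificate in Section 3, taken at midnight for simplicity.

```python
from datetime import datetime, timedelta
from typing import Optional


def renewal_time(not_before: datetime, not_after: datetime,
                 renew_before: Optional[timedelta] = None) -> datetime:
    """Return the moment renewal should start, i.e. renew_before ahead of expiry.

    Assumption: without an explicit renew_before, use one third of the lifetime.
    """
    if renew_before is None:
        renew_before = (not_after - not_before) / 3
    return not_after - renew_before


issued = datetime(2026, 4, 4)
expires = datetime(2026, 7, 3)
print((expires - issued).days)        # 90
print(renewal_time(issued, expires))  # 2026-06-03 00:00:00
```

The value cert-manager actually scheduled is shown as the renewal time in `kubectl describe certificate airflow-studio-tls`.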

---

## 3. Troubleshooting "Not Secure" HTTPS Warnings in Production

### Current production certificate status (checked from outside)

```
Subject: CN=airflow-studio-api.airlabs.art
Issuer:  C=US, O=Let's Encrypt, CN=R13
Valid:   2026-04-04 ~ 2026-07-03
SAN:     airflow-studio-api.airlabs.art, airflow-studio.airlabs.art
Chain:   complete (R13 → ISRG Root X1)
Verify:  return:1 (passed)
```

**The certificate itself is valid.** Verification with the openssl command line passes completely.

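### Possible reasons the browser shows "Not Secure"

#### Cause 1: production HTTP port 80 does not redirect to HTTPS (most likely)

```bash
# Test result
curl http://airflow-studio.airlabs.art/login
# → HTTP 200 (the page is returned directly, with no redirect!)
```

Production port 80 returns the page content directly (via nginx). When the address bar shows `http://`, the browser marks the page as "Not Secure". This is not a certificate problem: **users are simply never directed to HTTPS**.

**Fix**: run the same Traefik redirect patch command on the production cluster (see Section 1).
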
#### Cause 2: HSTS header not set

Even with the redirect in place, the first visit still goes over HTTP. Adding an HSTS header lets the browser remember to always use HTTPS.

Because Traefik terminates TLS here, set the header at the Ingress layer rather than in `web/nginx.conf`. Reference the middleware from the Ingress annotation; if the annotation already lists other middlewares, append this one to the same comma-separated value:

```yaml
# ingress.yaml annotation
traefik.ingress.kubernetes.io/router.middlewares: "default-hsts@kubernetescrd"
```

The annotation refers to a Middleware named `hsts` in the `default` namespace, which also has to be created:

```yaml
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: hsts
spec:
  headers:
    stsSeconds: 31536000
    stsIncludeSubdomains: true
    stsPreload: true
```
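
For reference, the three settings map onto the directives of the standard `Strict-Transport-Security` response header. The sketch below assembles such a header value from the same inputs. It illustrates the header format only and is not Traefik's own code; the function name is hypothetical.

```python
def hsts_header_value(sts_seconds: int, include_subdomains: bool, preload: bool) -> str:
    """Assemble a Strict-Transport-Security header value from its three directives."""
    parts = [f"max-age={sts_seconds}"]
    if include_subdomains:
        parts.append("includeSubDomains")
    if preload:
        parts.append("preload")
    return "; ".join(parts)


print(hsts_header_value(31536000, True, True))
# max-age=31536000; includeSubDomains; preload
print(31536000 == 365 * 24 * 60 * 60)  # True: max-age is one year in seconds
```

After the middleware is active, `curl -I https://<domain>` should show a `Strict-Transport-Security` header with these directives.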

#### Cause 3: Mixed content

The page is loaded over HTTPS, but some of its resources (images, API calls, JS) are loaded over HTTP.
- Frontend source checked: **no hard-coded `http://`** ✅
- Possible source: video/image URLs stored in the database that start with `http://`
- To investigate: open the browser's F12 → Console and look for "Mixed Content" warnings
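
Besides the browser console, a rendered page can be scanned offline for insecure subresources. This is a minimal sketch using the standard-library HTML parser; the class and function names are hypothetical. It looks only at `src` attributes and at `href` on `<link>` tags, so ordinary `<a href>` links are ignored because they do not trigger mixed-content warnings.

```python
from html.parser import HTMLParser


class InsecureResourceFinder(HTMLParser):
    """Collect http:// URLs from attributes that load subresources."""

    def __init__(self) -> None:
        super().__init__()
        self.insecure_urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            loads_resource = name == "src" or (tag == "link" and name == "href")
            if loads_resource and value and value.lower().startswith("http://"):
                self.insecure_urls.append(value)


def find_insecure_urls(html: str) -> list[str]:
    """Return every http:// subresource URL found in the given HTML."""
    finder = InsecureResourceFinder()
    finder.feed(html)
    return finder.insecure_urls


sample = (
    '<img src="http://cdn.example.com/a.png">'
    '<a href="http://example.com">x</a>'
    '<video src="https://cdn.example.com/v.mp4"></video>'
)
print(find_insecure_urls(sample))  # ['http://cdn.example.com/a.png']
```

Resources that JavaScript inserts at runtime, such as media URLs fetched from the database through the API, do not appear in static HTML, so the console check above is still needed for those.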

#### Cause 4: cert-manager not deployed to the production cluster

Production and test are **different K8s clusters**. Confirm that cert-manager is installed on the production cluster as well:

```bash
kubectl get pods -n cert-manager
```

If it is not installed, certificates are not issued automatically and Traefik falls back to a self-signed certificate, which browsers report as not secure.

---

## 4. Test vs Production Comparison Checklist

| Check | Test | Production | Command |
|--------|--------|--------|----------|
| cert-manager running | ✅ | ❓ To be confirmed | `kubectl get pods -n cert-manager` |
| ClusterIssuer exists | ✅ | ❓ To be confirmed | `kubectl get clusterissuer` |
| Certificate Ready | ✅ Ready | ❓ To be confirmed | `kubectl get certificate -A` |
| TLS Secret exists | ✅ | ❓ To be confirmed | `kubectl get secret airflow-studio-tls` |
| Certificate chain complete | ✅ Let's Encrypt | ✅ Let's Encrypt | `openssl s_client -connect <domain>:443` |
| HTTP→HTTPS redirect | ✅ 308 | ❌ Returns 200 | `curl -I http://<domain>` |
| Traefik redirect configuration | ✅ Configured | ❌ Not configured | `kubectl get deploy traefik -n kube-system -o yaml` |
| Port 80 reachable externally | ✅ | ✅ | `curl http://<domain>` |
| Port 443 reachable externally | ✅ | ✅ | `curl -k https://<domain>` |
| Frontend mixed content | ✅ None | ❓ To be confirmed | Browser F12 Console |

---

## 5. Production Fix Checklist

### Step 1: SSH to the production K8s master node

### Step 2: Check cert-manager
```bash
kubectl get pods -n cert-manager
kubectl get clusterissuer
kubectl get certificate -A
kubectl describe certificate airflow-studio-tls
```

### Step 3: If the certificate status is abnormal, delete it and re-issue
```bash
kubectl delete secret airflow-studio-tls
# cert-manager re-issues automatically (takes 1-3 minutes)
kubectl get certificate -A -w   # wait for Ready=True
```

### Step 4: Configure the global HTTP→HTTPS redirect
```bash
kubectl -n kube-system patch deployment traefik --type=json -p '[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--entryPoints.web.http.redirections.entryPoint.to=:443"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--entryPoints.web.http.redirections.entryPoint.scheme=https"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--entryPoints.web.http.redirections.entryPoint.permanent=true"}
]'
```

### Step 5: Verify
```bash
# HTTP redirect
curl -I http://airflow-studio.airlabs.art/login
# Expected: 308 → https://airflow-studio.airlabs.art/login

# HTTPS certificate
curl -v https://airflow-studio.airlabs.art/login 2>&1 | grep -E "SSL|subject|issuer"
```