日常听说 HTTPS 是加密协议,那现实中的 HTTPS 流量,是真的完全加密吗?
——答案是,不一定。原因嘛,抓个包就知道了。
我们用 curl 命令触发一下:
```bash
curl -v 'https://s-api.37.com.cn/api/xxx'
* Trying 106.53.109.63:443...
* Connected to s-api.37.com.cn (106.53.109.63) port 443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
* CAfile: /etc/ssl/cert.pem
* CApath: none
* (304) (OUT), TLS handshake, Client hello (1):
* (304) (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN: server accepted h2
* Server certificate:
* subject: CN=*.37.com.cn
* start date: Aug 24 00:00:00 2022 GMT
* expire date: Sep 11 23:59:59 2023 GMT
* subjectAltName: host "s-api.37.com.cn" matched cert's "*.37.com.cn"
* issuer: C=US; O=DigiCert, Inc.; CN=RapidSSL Global TLS RSA4096 SHA256 2022 CA1
* SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* h2h3 [:method: GET]
* h2h3 [:path: /api/xxx]
* h2h3 [:scheme: https]
* h2h3 [:authority: s-api.37.com.cn]
* h2h3 [user-agent: curl/7.85.0]
* h2h3 [accept: */*]
* Using Stream ID: 1 (easy handle 0x159813400)
> GET /api/xxx HTTP/2
> Host: s-api.37.com.cn
> user-agent: curl/7.85.0
> accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
< HTTP/2 404
< date: Wed, 18 Jan 2023 09:57:12 GMT
< content-type: text/html
< content-length: 150
<
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>openresty</center>
</body>
</html>
* Connection #0 to host s-api.37.com.cn left intact
```
Wireshark 使用过滤条件 `ip.addr == 106.53.109.63`,截图如下:
![https抓包分析](https://imlht.com/usr/uploads/2023/01/1226203107.png)
可以看到,HTTPS 并没有完全加密我的访问请求,因为 `Server Name` 依然是明文传输的。它发生在 HTTPS 传输过程中的 `Client Hello` 握手阶段,在 TCP 三次握手之后。
如果不知道什么是 `Client Hello`,可以参考网上的一张流程图:
![https流程图](https://imlht.com/usr/uploads/2023/01/3602849187.jpg)
这也解答了我之前用 curl 请求接口的疑惑——正常来说,我们用 http 协议,以下命令是可以访问的:
```bash
curl -v -H 'Host: s-api.37.com.cn' 'http://10.43.2.9/api/xxx'
```
但是你用了 https 协议,会报告证书校验失败。
```
curl -v -H 'Host: s-api.37.com.cn' 'https://10.43.2.9/api/xxx'
* Trying 10.43.2.9:443...
* Connected to 10.43.2.9 (10.43.2.9) port 443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
* CAfile: /etc/ssl/cert.pem
* CApath: none
* (304) (OUT), TLS handshake, Client hello (1):
* (304) (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS alert, unknown CA (560):
* SSL certificate problem: self signed certificate
* Closing connection 0
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS alert, unknown CA (560):
curl: (60) SSL certificate problem: self signed certificate
More details here: https://curl.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
```
意思是证书校验失败!因为 -H 参数指定了 HTTP 头部 Host 字段,作用于 7 层。
而 HTTPS 的握手阶段,只是完成了 TCP 的三次握手,抓包分析也可以发现,看不到域名,只有一个 IP 地址。
可以使用 `-k` 参数跳过证书校验的过程。
有没有更好的办法呢?
```bash
curl -vs --resolve 's-api.37.com.cn:443:10.43.2.9' 'https://s-api.37.com.cn/api/xxx'
```
可以使用 `--resolve` 参数,手工指定域名解析的 IP,就不会报证书校验失败了。
但是为什么要明文传输呢?那就得说到 SNI 了!
引用维基百科的描述,它用于服务端复用 IP 地址,提供不同域名的网站服务。
> 服务器名称指示(英语:Server Name Indication,缩写:SNI)是TLS的一个扩展协议[1],在该协议下,在握手过程开始时客户端告诉它正在连接的服务器要连接的主机名称。这允许服务器在相同的IP地址和TCP端口号上呈现多个证书,并且因此允许在相同的IP地址上提供多个安全(HTTPS)网站(或其他任何基于TLS的服务),而不需要所有这些站点使用相同的证书。它与HTTP/1.1基于名称的虚拟主机的概念相同,但是用于HTTPS。
而 TLS 1.3,也将 SNI 信息加密了。
---
> 文章来源于本人博客,发布于 2022-07-24,原文链接:[https://imlht.com/archives/394/](https://imlht.com/archives/394/)