Golang Memory Management: Escape Analysis

2023-07-04 08:24:16 Author: IguoChan

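Escape analysis means that the compiler, not the programmer, decides where memory is allocated. This article walks through several common escape scenarios in Golang in detail, for readers who want a reference.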

0. Introduction

Earlier articles analyzed Go's heap and stack memory. This one looks at escape analysis in Go.

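Anyone who has learned C knows that function parameters, local variables, and so on live on the C stack, and that the address of a local variable must not be returned (unless it is the address of a static local variable, a string constant, or dynamically allocated memory), because local variables are released together with the function's stack frame once the call returns. Memory that the programmer requests explicitly lives on the heap: it is allocated with functions such as malloc and must be released with functions such as free. The programmer manages it, and memory that is allocated but never released can cause a memory leak.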

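In Go, however, the programmer does not need to know whether data lives on the stack (the Go stack) or on the heap, because the compiler takes care of placing each allocation. In compiler optimization, escape analysis is a method for determining the dynamic scope of pointers. The Go compiler uses escape analysis to decide which variables are allocated on the stack and which on the heap, including memory allocated implicitly through new, make, and literals. Go's escape analysis maintains the following two invariants:

A pointer to a stack object must not be stored in the heap;
A pointer to a stack object must not outlive that object.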

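Escape analysis runs at compile time. Its results can be inspected with go build -gcflags="-m -m -l". Up to four -m flags can be given, and more -m flags produce more detailed output; two are usually enough. The -l flag disables inlining so that only the escape decisions are in view.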

1. Common escape scenarios

1.1 Returning a pointer to a local variable

package main

func Add(x, y int) *int {
    res := 0
    res = x + y
    return &res
}

func main() {
    Add(1, 2)
}

The escape analysis result is as follows:

$ go build -gcflags="-m -m -l" ./main.go
# command-line-arguments
./main.go:4:2: res escapes to heap:
./main.go:4:2: flow: ~r2 = &res:
./main.go:4:2: from &res (address-of) at ./main.go:6:9
./main.go:4:2: from return &res (return) at ./main.go:6:2
./main.go:4:2: moved to heap: res
note: module requires Go 1.18

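The result is clear: the function returns a pointer to the local variable res. When Add returns, its stack frame is destroyed while the reference lives on outside the function. If res stayed on the stack, dereferencing that address from the caller would access invalid memory, so the compiler's escape analysis allocates res on the heap instead.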
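To make the effect observable at run time, here is a minimal sketch that compares the pointer-returning function with a by-value variant. The names AddPtr, AddVal, allocs, and sink are assumptions introduced for illustration, and the //go:noinline directives stand in for the -l flag so that inlining does not hide the allocation. It uses testing.AllocsPerRun from the standard library to count heap allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// AddPtr mirrors Add from the text: it returns the address of a local
// variable, so res has to outlive the function's stack frame.
//
//go:noinline
func AddPtr(x, y int) *int {
	res := x + y
	return &res
}

// AddVal returns the result by value; nothing refers to res after the call.
//
//go:noinline
func AddVal(x, y int) int {
	res := x + y
	return res
}

// allocs reports the average number of heap allocations per call of f.
func allocs(f func()) float64 {
	return testing.AllocsPerRun(1000, f)
}

// sink is a package-level variable that keeps the results in use.
var sink int

func main() {
	fmt.Println("AddPtr allocs/call:", allocs(func() { sink = *AddPtr(1, 2) }))
	fmt.Println("AddVal allocs/call:", allocs(func() { sink = AddVal(1, 2) }))
}
```

One would expect the pointer variant to report one allocation per call and the value variant none, which matches the moved to heap: res line in the compiler output.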

1.2 Escape through interface (any) types

package main

import (
    "fmt"
)

func main() {
    str := "hello world"
    fmt.Printf("%v\n", str)
}

The escape analysis is as follows:

$ go build -gcflags="-m -m -l" ./main.go
# command-line-arguments
./main.go:9:13: str escapes to heap:
./main.go:9:13: flow: {storage for ... argument} = &{storage for str}:
./main.go:9:13: from str (spill) at ./main.go:9:13
./main.go:9:13: from ... argument (slice-literal-element) at ./main.go:9:12
./main.go:9:13: flow: {heap} = {storage for ... argument}:
./main.go:9:13: from ... argument (spill) at ./main.go:9:12
./main.go:9:13: from fmt.Printf("%v\n", ... argument...) (call parameter) at ./main.go:9:12
./main.go:9:12: ... argument does not escape
./main.go:9:13: str escapes to heap

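From this output you might conclude that str escapes to heap means str itself ended up on the heap. Yet, unlike the return-value case in the previous section, there is no explicit moved to heap line. So did str really escape to the heap?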

escapes to heap vs moved to heap

We can try the following code:

package main

import "fmt"

func main() {
    str := "hello world"
    str1 := "nihao!"
    fmt.Printf("%s\n", str)
    println(&str)
    println(&str1)
}

Its escape analysis is the same as above:

$ go build -gcflags="-m -m -l" ./main.go
# command-line-arguments
./main.go:8:13: str escapes to heap:
./main.go:8:13: flow: {storage for ... argument} = &{storage for str}:
./main.go:8:13: from str (spill) at ./main.go:8:13
./main.go:8:13: from ... argument (slice-literal-element) at ./main.go:8:12
./main.go:8:13: flow: {heap} = {storage for ... argument}:
./main.go:8:13: from ... argument (spill) at ./main.go:8:12
./main.go:8:13: from fmt.Printf("%s\n", ... argument...) (call parameter) at ./main.go:8:12
./main.go:8:12: ... argument does not escape
./main.go:8:13: str escapes to heap
note: module requires Go 1.18

However, the addresses of str and str1 are clearly adjacent. What is going on?

$ go run main.go
hello world
0xc00009af50
0xc00009af40

If we change line 8 of the code above from fmt.Printf("%s\n", str) to fmt.Printf("%p\n", &str), the escape analysis is as follows, with one additional line, moved to heap: str:

$ go build -gcflags="-m -m -l" ./main.go
# command-line-arguments
./main.go:6:2: str escapes to heap:
./main.go:6:2: flow: {storage for ... argument} = &str:
./main.go:6:2: from &str (address-of) at ./main.go:8:21
./main.go:6:2: from &str (interface-converted) at ./main.go:8:21
./main.go:6:2: from ... argument (slice-literal-element) at ./main.go:8:12
./main.go:6:2: flow: {heap} = {storage for ... argument}:
./main.go:6:2: from ... argument (spill) at ./main.go:8:12
./main.go:6:2: from fmt.Printf("%p\n", ... argument...) (call parameter) at ./main.go:8:12
./main.go:6:2: moved to heap: str
./main.go:8:12: ... argument does not escape
note: module requires Go 1.18

In the output of this run, str does appear to have been moved to the heap, since its address is now far from that of str1:

$ go run main.go
0xc00010a210
0xc00010a210
0xc000106f50

Consider the following explanation:

When the escape analysis says "b escapes to heap", it means that the values in b are written to the heap. So anything referenced by b must be in the heap also. b itself need not be.

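In other words: when escape analysis reports "b escapes to heap", the value stored in b has been written to the heap, so anything b references must be allocated on the heap, while b itself need not be. If b itself also escapes to the heap, escape analysis reports "&b escapes to heap" (in the outputs shown in this article, that case appears as a moved to heap line).

Because the contents of a string literal are stored in read-only memory, a slice demonstrates this behavior more clearly.

No escape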

package main

import (
    "reflect"
    "unsafe"
)

func main() {
    var i int
    i = 10
    println("&i", &i)
    b := []int{1, 2, 3, 4, 5}
    println("&b", &b) // address of the slice object b itself
    println("b", unsafe.Pointer((*reflect.SliceHeader)(unsafe.Pointer(&b)).Data)) // address of b's backing array
}

The escape analysis is:

$ go build -gcflags="-m -m -l" ./main.go
# command-line-arguments
./main.go:12:12: []int{...} does not escape
note: module requires Go 1.18

Printed output:

$ go run main.go
&i 0xc00009af20
&b 0xc00009af58
b 0xc00009af28

The analysis reports no escape, and the three addresses (&i, b, and &b) lie close together, so all of them are clearly on the stack.
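The listings in this section read the backing array address through reflect.SliceHeader, which newer Go documentation marks as deprecated. As a hedged alternative, assuming Go 1.20 or newer is available (the builds shown here report a module requirement of Go 1.18), unsafe.SliceData returns the same address without the cast. The helper name backingArray is an assumption introduced for illustration.

```go
package main

import "unsafe"

// backingArray returns a pointer to the first element of b's backing array.
// unsafe.SliceData requires Go 1.20 or newer.
func backingArray(b []int) *int {
	return unsafe.SliceData(b)
}

func main() {
	b := []int{1, 2, 3, 4, 5}
	println("&b", &b)             // address of the slice object b itself
	println("b", backingArray(b)) // address of b's backing array
}
```

For a non-empty slice the returned pointer equals &b[0], so the printed address can be compared with &b and &i in the same way as in the original listing.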

The slice's backing array escapes

Now add a print call from the fmt package:

package main

import (
    "fmt"
    "reflect"
    "unsafe"
)

func main() {
    var i int
    i = 10
    println("&i", &i)
    b := []int{1, 2, 3, 4, 5}
    println("&b", &b) // address of the slice object b itself
    println("b", unsafe.Pointer((*reflect.SliceHeader)(unsafe.Pointer(&b)).Data)) // address of b's backing array
    fmt.Println(b) // this line was added
}

The escape analysis is as follows:

$ go build -gcflags="-m -m -l" ./main.go
# command-line-arguments
./main.go:16:13: b escapes to heap:
./main.go:16:13: flow: {storage for ... argument} = &{storage for b}:
./main.go:16:13: from b (spill) at ./main.go:16:13
./main.go:16:13: from ... argument (slice-literal-element) at ./main.go:16:13
./main.go:16:13: flow: {heap} = {storage for ... argument}:
./main.go:16:13: from ... argument (spill) at ./main.go:16:13
./main.go:16:13: from fmt.Println(... argument...) (call parameter) at ./main.go:16:13
./main.go:13:12: []int{...} escapes to heap:
./main.go:13:12: flow: b = &{storage for []int{...}}:
./main.go:13:12: from []int{...} (spill) at ./main.go:13:12
./main.go:13:12: from b := []int{...} (assign) at ./main.go:13:4
./main.go:13:12: flow: {storage for b} = b:
./main.go:13:12: from b (interface-converted) at ./main.go:16:13
./main.go:13:12: []int{...} escapes to heap
./main.go:16:13: ... argument does not escape
./main.go:16:13: b escapes to heap
note: module requires Go 1.18

Now b escapes to heap appears. Check the printed output:

$ go run main.go
&i 0xc000106f38
&b 0xc000106f58
b 0xc000120030
[1 2 3 4 5]

The backing array of b has escaped, but b itself is still on the stack.

The slice object itself also escapes

package main

import (
    "fmt"
    "reflect"
    "unsafe"
)

func main() {
    var i int
    i = 10
    println("&i", &i)
    b := []int{1, 2, 3, 4, 5}
    println("&b", &b) // address of the slice object b itself
    println("b", unsafe.Pointer((*reflect.SliceHeader)(unsafe.Pointer(&b)).Data)) // address of b's backing array
    fmt.Println(&b) // this line was changed
}

As shown above, fmt.Println(b) is changed to fmt.Println(&b). The escape analysis is as follows:

$ go build -gcflags="-m -m -l" ./main.go
# command-line-arguments
./main.go:13:2: b escapes to heap:
./main.go:13:2: flow: {storage for ... argument} = &b:
./main.go:13:2: from &b (address-of) at ./main.go:16:14
./main.go:13:2: from &b (interface-converted) at ./main.go:16:14
./main.go:13:2: from ... argument (slice-literal-element) at ./main.go:16:13
./main.go:13:2: flow: {heap} = {storage for ... argument}:
./main.go:13:2: from ... argument (spill) at ./main.go:16:13
./main.go:13:2: from fmt.Println(... argument...) (call parameter) at ./main.go:16:13
./main.go:13:12: []int{...} escapes to heap:
./main.go:13:12: flow: b = &{storage for []int{...}}:
./main.go:13:12: from []int{...} (spill) at ./main.go:13:12
./main.go:13:12: from b := []int{...} (assign) at ./main.go:13:4
./main.go:13:2: moved to heap: b
./main.go:13:12: []int{...} escapes to heap
./main.go:16:13: ... argument does not escape
note: module requires Go 1.18

An extra line, moved to heap: b, appears. Now look at the printed addresses:

$ go run main.go
&i 0xc00006af48
&b 0xc00000c030
b 0xc00001a150
&[1 2 3 4 5]

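Not only the backing array but also the object b itself has escaped.

To summarize:

escapes to heap: the objects that this object points to escape to the heap;
moved to heap: the object itself is moved to the heap; by the rule that a pointer to a stack object must not be stored in the heap, the objects it points to necessarily escape to the heap as well.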
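The three slice cases above can also be compared by counting heap allocations instead of reading addresses. The following is a rough sketch, not the article's own method: the function names, the package-level sinks, and the allocs helper are all assumptions introduced for illustration, package-level variables stand in for the fmt calls as the place where values escape to, and //go:noinline plays the role of the -l flag.

```go
package main

import (
	"fmt"
	"testing"
)

var (
	sinkSlice []int  // retains a slice value, and with it the backing array
	sinkPtr   *[]int // retains a pointer to the slice object itself
)

// noEscape keeps both the slice object and its backing array local.
//
//go:noinline
func noEscape() int {
	b := []int{1, 2, 3, 4, 5}
	sum := 0
	for _, v := range b {
		sum += v
	}
	return sum
}

// backingArrayEscapes stores the slice value in a global, so the backing
// array must live on the heap while b itself can stay on the stack.
//
//go:noinline
func backingArrayEscapes() {
	b := []int{1, 2, 3, 4, 5}
	sinkSlice = b
}

// sliceObjectEscapes stores &b in a global, so b itself must move to the
// heap, and the backing array it points to has to follow.
//
//go:noinline
func sliceObjectEscapes() {
	b := []int{1, 2, 3, 4, 5}
	sinkPtr = &b
}

// allocs reports the average number of heap allocations per call of f.
func allocs(f func()) float64 {
	return testing.AllocsPerRun(1000, f)
}

func main() {
	fmt.Println("no escape:            ", allocs(func() { _ = noEscape() }))
	fmt.Println("backing array escapes:", allocs(backingArrayEscapes))
	fmt.Println("slice object escapes: ", allocs(sliceObjectEscapes))
}
```

If the summary holds, the counts should grow from case to case: none, then one allocation for the backing array, then a second one for the slice object.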

1.3 Requesting too much stack space

package main

import (
    "reflect"
    "unsafe"
)

func main() {
    var i int
    i = 10
    println("&i", &i)

    b := make([]int, 0)
    println("&b", &b) // address of the slice object b
    println("b", unsafe.Pointer((*reflect.SliceHeader)(unsafe.Pointer(&b)).Data))

    b1 := make([]byte, 65536)
    println("&b1", &b1) // address of the slice object b1
    println("b1", unsafe.Pointer((*reflect.SliceHeader)(unsafe.Pointer(&b1)).Data))

    var a [1024*1024*10]byte
    _ = a
}

The escape analysis shows that nothing escapes:

$ go build -gcflags="-m -m -l" ./main.go
# command-line-arguments
./main.go:13:11: make([]int, 0) does not escape
./main.go:17:12: make([]byte, 65536) does not escape
note: module requires Go 1.18

If the lengths of both the slice and the array are increased by 1, they escape.

b1 := make([]byte, 65537)
var a [1024*1024*10 + 1]byte

Escape analysis:

$ go build -gcflags="-m -m -l" ./main.go
# command-line-arguments
./main.go:21:6: a escapes to heap:
./main.go:21:6: flow: {heap} = &a:
./main.go:21:6: from a (too large for stack) at ./main.go:21:6
./main.go:17:12: make([]byte, 65537) escapes to heap:
./main.go:17:12: flow: {heap} = &{storage for make([]byte, 65537)}:
./main.go:17:12: from make([]byte, 65537) (too large for stack) at ./main.go:17:12
./main.go:21:6: moved to heap: a
./main.go:13:11: make([]int, 0) does not escape
./main.go:17:12: make([]byte, 65537) escapes to heap
note: module requires Go 1.18

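So with this toolchain the escape threshold is 65536 bytes = 64KB for the slice created with make and 1024*1024*10 bytes = 10MB for the declared array; anything larger escapes to the heap. More precisely, the 64KB limit applies to implicitly allocated objects (make, new, literals) and the 10MB limit to explicitly declared variables such as var a [N]byte.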
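The size rule can be written down as a small piece of arithmetic. This is only a sketch of the thresholds observed above, not the compiler's actual code: the constant and function names are hypothetical, and it assumes the smaller limit applies to implicit allocations such as make while the larger one applies to declared variables.

```go
package main

import "fmt"

// Hypothetical names; the values are the thresholds observed in the text.
const (
	implicitLimit = 64 << 10 // 65536 bytes, e.g. make([]byte, n)
	explicitLimit = 10 << 20 // 10485760 bytes, e.g. var a [n]byte
)

// staysOnStack reports whether an object of the given size in bytes is
// within the size limit. It models only the "too large for stack" rule;
// an object can still escape for any of the other reasons in this article.
func staysOnStack(size int64, implicit bool) bool {
	if implicit {
		return size <= implicitLimit
	}
	return size <= explicitLimit
}

func main() {
	fmt.Println(staysOnStack(65536, true))           // true:  make([]byte, 65536)
	fmt.Println(staysOnStack(65537, true))           // false: make([]byte, 65537)
	fmt.Println(staysOnStack(1024*1024*10, false))   // true:  var a [1024*1024*10]byte
	fmt.Println(staysOnStack(1024*1024*10+1, false)) // false: var a [1024*1024*10 + 1]byte
}
```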

1.4 Closure escape

package main

func intSeq() func() int {
    i := 0
    return func() int {
        i++
        return i
    }
}

func main() {
    a := intSeq()
    println(a())
    println(a())
    println(a())
    println(a())
    println(a())
    println(a())
}

The escape analysis is as follows; the local variable i captured by the closure escapes.

$ go build -gcflags="-m -m -l" ./main.go
# command-line-arguments
./main.go:4:2: intSeq capturing by ref: i (addr=false assign=true width=8)
./main.go:5:9: func literal escapes to heap:
./main.go:5:9: flow: ~r0 = &{storage for func literal}:
./main.go:5:9: from func literal (spill) at ./main.go:5:9
./main.go:5:9: from return func literal (return) at ./main.go:5:2
./main.go:4:2: i escapes to heap:
./main.go:4:2: flow: {storage for func literal} = &i:
./main.go:4:2: from i (captured by a closure) at ./main.go:6:3
./main.go:4:2: from i (reference) at ./main.go:6:3
./main.go:4:2: moved to heap: i
./main.go:5:9: func literal escapes to heap
note: module requires Go 1.18

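A function value is itself a pointer type, so the anonymous function escapes when it is used as a return value. The anonymous function uses the outer variable i, which has to stay alive until a is destroyed, so i escapes to the heap.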
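A short sketch of what this lifetime means in practice: each call to intSeq gets its own i, and each i keeps its state for as long as the returned closure is reachable. The second counter b is an addition for illustration.

```go
package main

import "fmt"

// intSeq is the function from the listing above: the returned closure
// captures i by reference, so i outlives the call to intSeq.
func intSeq() func() int {
	i := 0
	return func() int {
		i++
		return i
	}
}

func main() {
	a := intSeq()
	b := intSeq() // a second, independent i

	fmt.Println(a(), a(), a()) // 1 2 3
	fmt.Println(b())           // 1
	fmt.Println(a())           // 4
}
```

The counters do not interfere with each other, which would not be possible if i lived in the stack frame of a call that has already returned.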

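2. Summary

Memory that escapes to the heap can increase GC pressure, so in simple scenarios we can avoid escapes and keep more variables on the stack, which can improve performance. For example:

Do not blindly pass arguments by pointer, especially when the object is small: it reduces the amount of data copied, but it may cause the object to escape;
Analyze the specific code, and optimize based on the escape analysis results to improve performance.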
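As a hedged illustration of the first point, the sketch below passes a small struct by value, by a pointer that is only read, and by a pointer that the callee retains. The type Point and all function names are hypothetical, and testing.AllocsPerRun is used only as a convenient way to count heap allocations.

```go
package main

import (
	"fmt"
	"testing"
)

// Point is a small, hypothetical value type used only for illustration.
type Point struct{ X, Y int }

var (
	kept *Point // retains a pointer beyond the call
	sink int    // keeps results in use
)

//go:noinline
func sumByValue(p Point) int { return p.X + p.Y }

// sumByPointer only reads through p, so the caller's Point need not escape.
//
//go:noinline
func sumByPointer(p *Point) int { return p.X + p.Y }

// keep stores p in a global, so the caller's Point must move to the heap.
//
//go:noinline
func keep(p *Point) { kept = p }

// measure returns the average heap allocations per call for each style.
func measure() (byValue, byPointer, retained float64) {
	byValue = testing.AllocsPerRun(1000, func() {
		p := Point{X: 1, Y: 2}
		sink = sumByValue(p)
	})
	byPointer = testing.AllocsPerRun(1000, func() {
		p := Point{X: 1, Y: 2}
		sink = sumByPointer(&p)
	})
	retained = testing.AllocsPerRun(1000, func() {
		p := Point{X: 1, Y: 2}
		keep(&p)
	})
	return
}

func main() {
	v, p, r := measure()
	fmt.Println("by value:", v, "by pointer (read only):", p, "by pointer (retained):", r)
}
```

A pointer parameter does not force an allocation by itself; the cost appears once the callee lets the pointer outlive the call, which is why checking the -gcflags="-m" output for the specific code is more reliable than a blanket rule.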
