内存回收属于

[TOC]

non-generational non-compacting

It is a concurrent mark and sweep that uses a write barrier. It is non-generational and non-compacting. – mgc.go

Go 实现的垃圾回收器是无分代(对象没有代际之分)、不整理(回收过程中不对对象进行移动与整理)、并发(与用户代码并发执行)的三色标记清扫算法。

一些变量

// GC 阶段定义
const (
	_GCoff             = iota // GC not running; sweeping in background, write barrier disabled
	_GCmark                   // GC marking roots and workbufs: allocate black, write barrier ENABLED
	_GCmarktermination        // GC mark termination: allocate black, P's help GC, write barrier ENABLED
)

// 是垃圾收集器当前处于的阶段,可能处于 _GCoff、_GCmark 和 _GCmarktermination,Goroutine 在读取或者修改该阶段时需要保证原子性;
var gcphase uint32

// 布尔值,当垃圾收集处于标记阶段时,该变量会被置为 1,在这里辅助垃圾收集的用户程序和后台标记的任务可以将对象涂黑;
var gcBlackenEnabled uint32

//实现了垃圾收集的调步算法,它能够决定触发并行垃圾收集的时间和待处理的工作;
var gcController gcControllerState

// 触发垃圾收集的内存增长百分比,默认情况下为 100,即堆内存相比上次垃圾收集增长 100% 时应该触发 GC
// Initialized from $GOGC.  GOGC=off means no GC.
var gcpercent int32

// Holding worldsema grants an M the right to try to stop the world.
var worldsema uint32 = 1

gcMode

// gcMode indicates how concurrent a GC cycle should be.
type gcMode int

const (
   gcBackgroundMode gcMode = iota // concurrent GC and sweep 并发模式
   gcForceMode                    // stop-the-world GC now, concurrent sweep
   gcForceBlockMode               // stop-the-world GC now and STW sweep (forced by user)
)

gcController

var gcController gcControllerState



// 会根据全局处理器的个数以及垃圾收集的 CPU 利用率计算出 dedicatedMarkWorkersNeeded 和 fractionalUtilizationGoal 以决定不同模式的工作协程的数量。
gcControllerState.startCycle()
gcControllerState.enlistWorker()

// 返回当前P所绑定的 后台 Mark Worker Goroutine(如果可运行),
// 根据当前运行状态及策略设置处理器的 gcMarkWorkerMode 工作模式,然后唤醒
gcControllerState.findRunnableGCWorker() 

控制器全程参与并发回收任务,记录相关状态数据,动态调整运行策略,影响并发标记单元的工作模式和数量,平衡CPU资源占用。当回收结東时,参与 next gc回收國值设置,调整垃圾回收触发频率。

gcController实现GC步调控制器,它决定何时触发并发垃圾收集,以及在mutator辅助和后台标记中要做多少标记工作。

它使用反馈控制算法,根据每个周期的堆增长和 GC 的 CPU利用率调整来 memstats_gc.trigger触发。

gcMarkWorkerMode 标记工作模式

type gcMarkWorkerMode int

const (
   gcMarkWorkerDedicatedMode gcMarkWorkerMode = iota   	// 全力模式
   gcMarkWorkerFractionalMode							// 部分模式
   gcMarkWorkerIdleMode									// 空闲模式
)

三种不同模式的工作协程会相互协同保证垃圾收集的 CPU 利用率达到期望的阈值,在到达目标堆大小前完成标记任务。

writeBarrier

是一个包含写屏障状态的结构体,其中的 enabled 字段表示写屏障的开启与关闭;

// The compiler knows about this variable.
// If you change it, you must change builtin/runtime.go, too.
// If you change the first four bytes, you must also change the write
// barrier insertion code.
var writeBarrier struct {
	enabled bool    // 写屏障的开启与关闭标记;
	pad     [3]byte // compiler uses 32-bit load for "enabled" field
	needed  bool    // whether we need a write barrier for current GC phase
	cgo     bool    // whether we need a write barrier for a cgo check
	alignme uint64  // guarantee alignment so that compiler can use a 32 or 64-bit load
}

同步屏障(Barrier)是并行计算中的一种同步方法。对于一群进程或线程,程序中的一个同步屏障意味着任何线程/进程执行到此后必须等待,直到所有线程/进程都到达此点才可继续执行下文。

内存屏障(英语:Memory barrier)是一类同步屏障指令,它使得 CPU 或编译器在对内存进行操作的时候, 严格按照一定的顺序来执行, 也就是说在memory barrier 之前的指令和memory barrier之后的指令不会由于系统优化等原因而导致乱序,即在内存屏障前执行的操作一定会先于内存屏障后执行的操作。

大多数现代计算机为了提高性能而采取乱序执行,这使得内存屏障成为必须。

大多数处理器提供了内存屏障指令:

语义上,内存屏障之前的所有写操作都要写入内存;内存屏障之后的读操作都可以获得同步屏障之前的写操作的结果。因此,对于敏感的程序块,写操作之后、读操作之前可以插入内存屏障。

使用:

//go:nowritebarrier
func gcDrain(gcw *gcWork, flags gcDrainFlags) {
	if !writeBarrier.needed {
		throw("gcDrain phase incorrect")
	}
    ...
}

既不会在编译期被编译器进行调整,也不会在运行时被 CPU 的乱序执行所打乱

runtime.work

work 结构体有很多垃圾收集的相关字段,

例如:表示完成的垃圾收集循环的次数、当前循环时间和 CPU 的利用率、垃圾收集的模式等等

var work struct {
   full  lfstack          // lock-free list of full blocks workbuf
   empty lfstack          // lock-free list of empty blocks workbuf
   pad0  cpu.CacheLinePad // prevents false-sharing between full/empty and nproc/nwait

   wbufSpans struct {
      lock mutex
      // free is a list of spans dedicated to workbufs, but
      // that don't currently contain any workbufs.
      free mSpanList
      // busy is a list of all spans containing workbufs on
      // one of the workbuf lists.
      busy mSpanList
   }

   // Restore 64-bit alignment on 32-bit.
   _ uint32

   // bytesMarked is the number of bytes marked this cycle. This
   // includes bytes blackened in scanned objects, noscan objects
   // that go straight to black, and permagrey objects scanned by
   // markroot during the concurrent scan phase. This is updated
   // atomically during the cycle. Updates may be batched
   // arbitrarily, since the value is only read at the end of the
   // cycle.
   //
   // Because of benign races during marking, this number may not
   // be the exact number of marked bytes, but it should be very
   // close.
   //
   // Put this field here because it needs 64-bit atomic access
   // (and thus 8-byte alignment even on 32-bit architectures).
   bytesMarked uint64

   markrootNext uint32 // next markroot job
   markrootJobs uint32 // number of markroot jobs

   nproc  uint32
   tstart int64
   nwait  uint32
   ndone  uint32

   // Number of roots of various root types. Set by gcMarkRootPrepare.
   nFlushCacheRoots                               int
   nDataRoots, nBSSRoots, nSpanRoots, nStackRoots int

   // Each type of GC state transition is protected by a lock.
   // Since multiple threads can simultaneously detect the state
   // transition condition, any thread that detects a transition
   // condition must acquire the appropriate transition lock,
   // re-check the transition condition and return if it no
   // longer holds or perform the transition if it does.
   // Likewise, any transition must invalidate the transition
   // condition before releasing the lock. This ensures that each
   // transition is performed by exactly one thread and threads
   // that need the transition to happen block until it has
   // happened.
   //
   // startSema protects the transition from "off" to mark or
   // mark termination.
   startSema uint32
   // markDoneSema protects transitions from mark to mark termination.
   markDoneSema uint32

   bgMarkReady note   // signal background mark worker has started
   bgMarkDone  uint32 // cas to 1 when at a background mark completion point
   // Background mark completion signaling

   // mode is the concurrency mode of the current GC cycle.
   mode gcMode

   // userForced indicates the current GC cycle was forced by an
   // explicit user call.
   userForced bool

   // totaltime is the CPU nanoseconds spent in GC since the
   // program started if debug.gctrace > 0.
   totaltime int64

   // initialHeapLive is the value of memstats.heap_live at the
   // beginning of this GC cycle.
   initialHeapLive uint64

   // assistQueue is a queue of assists that are blocked because
   // there was neither enough credit to steal or enough work to
   // do.
   assistQueue struct {
      lock mutex
      q    gQueue
   }

   // sweepWaiters is a list of blocked goroutines to wake when
   // we transition from mark termination to sweep.
   sweepWaiters struct {
      lock mutex
      list gList
   }

   // cycles is the number of completed GC cycles, where a GC
   // cycle is sweep termination, mark, mark termination, and
   // sweep. This differs from memstats.numgc, which is
   // incremented at mark termination.
   cycles uint32

   // Timing/utilization stats for this cycle.
   stwprocs, maxprocs                 int32
   tSweepTerm, tMark, tMarkTerm, tEnd int64 // nanotime() of phase start

   pauseNS    int64 // total STW time this cycle
   pauseStart int64 // nanotime() of last STW

   // debug.gctrace heap sizes for this cycle.
   heap0, heap1, heap2, heapGoal uint64
}

gcTrigger

// A gcTrigger is a predicate for starting a GC cycle. Specifically,
// it is an exit condition for the _GCoff phase.
type gcTrigger struct {
   kind gcTriggerKind
   now  int64  // gcTriggerTime: current time
   n    uint32 // gcTriggerCycle: cycle number to start
}

// test reports whether the trigger condition is satisfied, meaning
// that the exit condition for the _GCoff phase has been met. The exit
// condition should be tested when allocating.
func (t gcTrigger) test() bool {
	if !memstats.enablegc || panicking != 0 || gcphase != _GCoff {
		return false
	}
	switch t.kind {
	case gcTriggerHeap: 
		return memstats.heap_live >= memstats.gc_trigger
	case gcTriggerTime:
		if gcpercent < 0 {
			return false
		}
		lastgc := int64(atomic.Load64(&memstats.last_gc_nanotime))
		return lastgc != 0 && t.now-lastgc > forcegcperiod
	case gcTriggerCycle:
		// t.n > work.cycles, but accounting for wraparound.
		return int32(t.n-work.cycles) > 0
	}
	return true
}

runtime.gcTrigger.test 方法决定是否需要触发垃圾收集,当满足触发垃圾收集的基本条件时 — 允许垃圾收集、程序没有崩溃并且没有处于垃圾收集循环,该方法会根据三种不同的方式触发进行不同的检查。

  1. gcTriggerHeap — 堆内存的分配达到达控制器计算的触发堆大小;
  2. gcTriggerTime — 如果一定时间内没有触发,就会触发新的循环,该出发条件由 runtime.forcegcperiod 变量控制,默认为 2 分钟;
  3. gcTriggerCycle — 如果当前没有开启垃圾收集,则触发新的循环;

GC trace

https://blog.csdn.net/weixin_45583158/article/details/100143593

https://www.ardanlabs.com/blog/2019/07/garbage-collection-in-go-part3-gcpacing.html

reference

https://github.com/golang/go/blob/master/src/runtime/mbarrier.go