[TOC]
It is a concurrent mark and sweep that uses a write barrier. It is non-generational and non-compacting. – mgc.go
Go 实现的垃圾回收器是无分代(对象没有代际之分)、不整理(回收过程中不对对象进行移动与整理)、并发(与用户代码并发执行)的三色标记清扫算法。
// GC 阶段定义
const (
_GCoff = iota // GC not running; sweeping in background, write barrier disabled
_GCmark // GC marking roots and workbufs: allocate black, write barrier ENABLED
_GCmarktermination // GC mark termination: allocate black, P's help GC, write barrier ENABLED
)
// 是垃圾收集器当前处于的阶段,可能处于 _GCoff、_GCmark 和 _GCmarktermination,Goroutine 在读取或者修改该阶段时需要保证原子性;
var gcphase uint32
// 布尔值,当垃圾收集处于标记阶段时,该变量会被置为 1,在这里辅助垃圾收集的用户程序和后台标记的任务可以将对象涂黑;
var gcBlackenEnabled uint32
//实现了垃圾收集的调步算法,它能够决定触发并行垃圾收集的时间和待处理的工作;
var gcController gcControllerState
// 触发垃圾收集的内存增长百分比,默认情况下为 100,即堆内存相比上次垃圾收集增长 100% 时应该触发 GC
// Initialized from $GOGC. GOGC=off means no GC.
var gcpercent int32
// Holding worldsema grants an M the right to try to stop the world.
var worldsema uint32 = 1
// gcMode indicates how concurrent a GC cycle should be.
type gcMode int
const (
gcBackgroundMode gcMode = iota // concurrent GC and sweep 并发模式
gcForceMode // stop-the-world GC now, concurrent sweep
gcForceBlockMode // stop-the-world GC now and STW sweep (forced by user)
)
var gcController gcControllerState
// 会根据全局处理器的个数以及垃圾收集的 CPU 利用率计算出 dedicatedMarkWorkersNeeded 和 fractionalUtilizationGoal 以决定不同模式的工作协程的数量。
gcControllerState.startCycle()
gcControllerState.enlistWorker()
// 返回当前P所绑定的 后台 Mark Worker Goroutine(如果可运行),
// 根据当前运行状态及策略设置处理器的 gcMarkWorkerMode 工作模式,然后唤醒
gcControllerState.findRunnableGCWorker()
控制器全程参与并发回收任务,记录相关状态数据,动态调整运行策略,影响并发标记单元的工作模式和数量,平衡CPU资源占用。当回收结東时,参与 next gc回收國值设置,调整垃圾回收触发频率。
gcController实现GC步调控制器,它决定何时触发并发垃圾收集,以及在mutator辅助和后台标记中要做多少标记工作。
它使用反馈控制算法,根据每个周期的堆增长和 GC 的 CPU利用率调整来 memstats_gc.trigger触发。
type gcMarkWorkerMode int
const (
gcMarkWorkerDedicatedMode gcMarkWorkerMode = iota // 全力模式
gcMarkWorkerFractionalMode // 部分模式
gcMarkWorkerIdleMode // 空闲模式
)
gcMarkWorkerDedicatedMode
— 处理器专门专属 mark worker用于标记对象,不会被调度器抢占,全力运行,直到并发标记任务结束。gcMarkWorkerFractionalMode
— 部分参与标记任务, 但可被抢占和调度,当垃圾收集的后台 CPU 使用率达不到预期时(默认为 25%),启动该类型的工作协程帮助垃圾收集达到利用率的目标;gcMarkWorkerIdleMode
— 当处理器没有可以执行的 Goroutine 时,它会运行垃圾收集的标记任务直到被抢占,仅在空闲时参与标记任务?三种不同模式的工作协程会相互协同保证垃圾收集的 CPU 利用率达到期望的阈值,在到达目标堆大小前完成标记任务。
是一个包含写屏障状态的结构体,其中的 enabled
字段表示写屏障的开启与关闭;
// The compiler knows about this variable.
// If you change it, you must change builtin/runtime.go, too.
// If you change the first four bytes, you must also change the write
// barrier insertion code.
var writeBarrier struct {
enabled bool // 写屏障的开启与关闭标记;
pad [3]byte // compiler uses 32-bit load for "enabled" field
needed bool // whether we need a write barrier for current GC phase
cgo bool // whether we need a write barrier for a cgo check
alignme uint64 // guarantee alignment so that compiler can use a 32 or 64-bit load
}
同步屏障(Barrier)是并行计算中的一种同步方法。对于一群进程或线程,程序中的一个同步屏障意味着任何线程/进程执行到此后必须等待,直到所有线程/进程都到达此点才可继续执行下文。
内存屏障
(英语:Memory barrier)是一类同步屏障
指令,它使得 CPU 或编译器在对内存进行操作的时候, 严格按照一定的顺序来执行, 也就是说在memory barrier 之前的指令和memory barrier之后的指令不会由于系统优化等原因而导致乱序,即在内存屏障前执行的操作一定会先于内存屏障后执行的操作。
大多数现代计算机为了提高性能而采取乱序执行,这使得内存屏障成为必须。
大多数处理器提供了内存屏障指令:
语义上,内存屏障之前的所有写操作都要写入内存;内存屏障之后的读操作都可以获得同步屏障之前的写操作的结果。因此,对于敏感的程序块,写操作之后、读操作之前可以插入内存屏障。
使用:
//go:nowritebarrier
func gcDrain(gcw *gcWork, flags gcDrainFlags) {
if !writeBarrier.needed {
throw("gcDrain phase incorrect")
}
...
}
既不会在编译期被编译器进行调整,也不会在运行时被 CPU 的乱序执行所打乱
work 结构体有很多垃圾收集的相关字段,
例如:表示完成的垃圾收集循环的次数、当前循环时间和 CPU 的利用率、垃圾收集的模式等等
var work struct {
full lfstack // lock-free list of full blocks workbuf
empty lfstack // lock-free list of empty blocks workbuf
pad0 cpu.CacheLinePad // prevents false-sharing between full/empty and nproc/nwait
wbufSpans struct {
lock mutex
// free is a list of spans dedicated to workbufs, but
// that don't currently contain any workbufs.
free mSpanList
// busy is a list of all spans containing workbufs on
// one of the workbuf lists.
busy mSpanList
}
// Restore 64-bit alignment on 32-bit.
_ uint32
// bytesMarked is the number of bytes marked this cycle. This
// includes bytes blackened in scanned objects, noscan objects
// that go straight to black, and permagrey objects scanned by
// markroot during the concurrent scan phase. This is updated
// atomically during the cycle. Updates may be batched
// arbitrarily, since the value is only read at the end of the
// cycle.
//
// Because of benign races during marking, this number may not
// be the exact number of marked bytes, but it should be very
// close.
//
// Put this field here because it needs 64-bit atomic access
// (and thus 8-byte alignment even on 32-bit architectures).
bytesMarked uint64
markrootNext uint32 // next markroot job
markrootJobs uint32 // number of markroot jobs
nproc uint32
tstart int64
nwait uint32
ndone uint32
// Number of roots of various root types. Set by gcMarkRootPrepare.
nFlushCacheRoots int
nDataRoots, nBSSRoots, nSpanRoots, nStackRoots int
// Each type of GC state transition is protected by a lock.
// Since multiple threads can simultaneously detect the state
// transition condition, any thread that detects a transition
// condition must acquire the appropriate transition lock,
// re-check the transition condition and return if it no
// longer holds or perform the transition if it does.
// Likewise, any transition must invalidate the transition
// condition before releasing the lock. This ensures that each
// transition is performed by exactly one thread and threads
// that need the transition to happen block until it has
// happened.
//
// startSema protects the transition from "off" to mark or
// mark termination.
startSema uint32
// markDoneSema protects transitions from mark to mark termination.
markDoneSema uint32
bgMarkReady note // signal background mark worker has started
bgMarkDone uint32 // cas to 1 when at a background mark completion point
// Background mark completion signaling
// mode is the concurrency mode of the current GC cycle.
mode gcMode
// userForced indicates the current GC cycle was forced by an
// explicit user call.
userForced bool
// totaltime is the CPU nanoseconds spent in GC since the
// program started if debug.gctrace > 0.
totaltime int64
// initialHeapLive is the value of memstats.heap_live at the
// beginning of this GC cycle.
initialHeapLive uint64
// assistQueue is a queue of assists that are blocked because
// there was neither enough credit to steal or enough work to
// do.
assistQueue struct {
lock mutex
q gQueue
}
// sweepWaiters is a list of blocked goroutines to wake when
// we transition from mark termination to sweep.
sweepWaiters struct {
lock mutex
list gList
}
// cycles is the number of completed GC cycles, where a GC
// cycle is sweep termination, mark, mark termination, and
// sweep. This differs from memstats.numgc, which is
// incremented at mark termination.
cycles uint32
// Timing/utilization stats for this cycle.
stwprocs, maxprocs int32
tSweepTerm, tMark, tMarkTerm, tEnd int64 // nanotime() of phase start
pauseNS int64 // total STW time this cycle
pauseStart int64 // nanotime() of last STW
// debug.gctrace heap sizes for this cycle.
heap0, heap1, heap2, heapGoal uint64
}
// A gcTrigger is a predicate for starting a GC cycle. Specifically,
// it is an exit condition for the _GCoff phase.
type gcTrigger struct {
kind gcTriggerKind
now int64 // gcTriggerTime: current time
n uint32 // gcTriggerCycle: cycle number to start
}
// test reports whether the trigger condition is satisfied, meaning
// that the exit condition for the _GCoff phase has been met. The exit
// condition should be tested when allocating.
func (t gcTrigger) test() bool {
if !memstats.enablegc || panicking != 0 || gcphase != _GCoff {
return false
}
switch t.kind {
case gcTriggerHeap:
return memstats.heap_live >= memstats.gc_trigger
case gcTriggerTime:
if gcpercent < 0 {
return false
}
lastgc := int64(atomic.Load64(&memstats.last_gc_nanotime))
return lastgc != 0 && t.now-lastgc > forcegcperiod
case gcTriggerCycle:
// t.n > work.cycles, but accounting for wraparound.
return int32(t.n-work.cycles) > 0
}
return true
}
runtime.gcTrigger.test
方法决定是否需要触发垃圾收集,当满足触发垃圾收集的基本条件时 — 允许垃圾收集、程序没有崩溃并且没有处于垃圾收集循环,该方法会根据三种不同的方式触发进行不同的检查。
gcTriggerHeap
— 堆内存的分配达到达控制器计算的触发堆大小;gcTriggerTime
— 如果一定时间内没有触发,就会触发新的循环,该出发条件由 runtime.forcegcperiod
变量控制,默认为 2 分钟;gcTriggerCycle
— 如果当前没有开启垃圾收集,则触发新的循环;https://blog.csdn.net/weixin_45583158/article/details/100143593
https://www.ardanlabs.com/blog/2019/07/garbage-collection-in-go-part3-gcpacing.html
https://github.com/golang/go/blob/master/src/runtime/mbarrier.go