Garbage Collection and Memory Optimization
JavaScript allocates memory in two places: the stack and the heap.
- Primitive values and the execution contexts (stack frames) created while code runs are stored in the stack space.
- Reference type data is stored in the heap space.
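The practical consequence of this split can be seen in how values are copied. The following is a minimal sketch; the variable names are made up for illustration, and real engines may optimize where values physically live.

```javascript
// Primitive values are copied by value: b gets its own copy of a.
let a = 1;
let b = a;
b = 2; // a is not affected

// Reference types live in the heap; the variable only holds a reference.
const objA = { count: 1 };
const objB = objA; // copies the reference, not the object
objB.count = 2;    // the single heap object is changed

console.log(a, b);       // primitives stayed independent
console.log(objA.count); // both names see the same heap object
```

Because `objA` and `objB` point at the same heap object, that object only becomes garbage once neither reference can reach it.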
However, some data is no longer needed after use, becoming garbage data. If not collected, memory usage will continue to increase.
Garbage collection is generally divided into two strategies: manual collection and automatic collection.
Languages like C/C++ use a manual collection strategy, where the code decides when to allocate and free memory. Languages like JavaScript, Java, and Python instead rely on a garbage collector to release memory automatically. This does not mean we can stop caring about memory management. In JS especially, automatic garbage collection makes it easy for developers to neglect it.
I. Stack Memory Collection
Let's first look at how data in the call stack is collected, starting from the execution context.
When a function execution context is created and pushed onto the stack, primitive data types are allocated in the stack, while reference type data is allocated in the heap.
At the same time, a pointer (ESP) records the current execution state by pointing to the execution context that is currently running, for example the context of function A.
When function A finishes executing, ESP moves down to the next execution context, B. This downward movement is what destroys execution context A. The data of A is still physically present in stack memory, but it is now invalid memory. The next time another function is called, this memory is simply overwritten with the execution context of the new function.
Therefore, when a function finishes execution, the JS engine destroys its execution context by moving ESP downward.
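The pointer movement described above can be imitated with a toy model. This is only an illustrative sketch: `ToyCallStack` and its fields are hypothetical names, and a real engine manages the stack in native memory rather than in a JavaScript array.

```javascript
// Toy model of a call stack: "destroying" a context is only a pointer move.
class ToyCallStack {
  constructor(size) {
    this.slots = new Array(size).fill(null);
    this.esp = -1; // index of the currently running execution context
  }
  push(context) {
    this.esp += 1;
    this.slots[this.esp] = context; // overwrites whatever stale data was there
  }
  pop() {
    this.esp -= 1; // nothing is erased, the slot just becomes invalid
  }
  current() {
    return this.esp >= 0 ? this.slots[this.esp] : null;
  }
}

const stack = new ToyCallStack(4);
stack.push({ fn: 'B' }); // B calls A
stack.push({ fn: 'A' });

stack.pop(); // A finishes, ESP moves down to B
const afterPop = stack.current().fn; // 'B'
const staleSlot = stack.slots[1].fn; // 'A' is still there, but invalid

stack.push({ fn: 'C' }); // the next call reuses the same slot
const overwritten = stack.slots[1].fn; // 'C'
```

No cleanup pass is needed, which is why stack memory is so cheap to reclaim compared with heap memory.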
II. Heap Memory Collection
After the execution context of a function is destroyed in the stack, what happens to the objects stored in the heap? This is where the garbage collector comes in.
2.1 Generational Collection and Main Process
In V8, the heap is divided into two regions: the young generation and the old generation.
- Young generation: stores short-lived objects -> minor garbage collector (Minor GC)
- Old generation: stores long-lived objects -> major garbage collector (Major GC)
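In Node.js, which embeds V8, the generational layout can be observed through the built-in `v8` module. The sketch below assumes a Node environment and assumes that the spaces are reported under the names `new_space` and `old_space`, which is how current Node versions name them.

```javascript
const v8 = require('v8');

// Each entry describes one region of the V8 heap.
const spaces = v8.getHeapSpaceStatistics();

for (const space of spaces) {
  // new_space is the young generation, old_space the old generation.
  if (space.space_name === 'new_space' || space.space_name === 'old_space') {
    const usedKB = Math.round(space.space_used_size / 1024);
    console.log(`${space.space_name}: ${usedKB} KB used`);
  }
}

const spaceNames = spaces.map((space) => space.space_name);
```

The exact numbers vary from run to run; the point is only that the two generations are tracked as separate regions.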
The garbage collector follows a unified execution process:
- Marking objects in the heap:
  - Active objects: objects still in use
  - Inactive objects: eligible for garbage collection
- Reclaiming the memory occupied by inactive objects
- Memory compaction: rearranging memory fragments to create contiguous space, so that larger blocks of continuous memory can be allocated
2.2 Minor Garbage Collector
The young generation is split into two halves: an object area and a free area. Newly created objects are allocated in the object area. When the object area is nearly full, a garbage collection is triggered:
Marking garbage in the object area -> copying live objects to the free area, arranging them in order so that the free area contains no fragments -> swapping the roles of the object area and the free area -> the two areas can be reused this way indefinitely
Because live objects are copied on every collection, the young generation is kept small and fills up quickly. The minor garbage collector therefore also adopts an object promotion strategy, which moves objects that are still alive after two rounds of garbage collection to the old generation.
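The copy-and-swap cycle and the promotion rule can be sketched with a toy model. Everything here (`ToyYoungGeneration`, the `survived` counter, the `refs` arrays) is a hypothetical illustration of the idea, not V8's actual implementation.

```javascript
// Toy young generation: object area, free area, and promotion after 2 survivals.
class ToyYoungGeneration {
  constructor() {
    this.objectArea = [];    // where new objects are allocated
    this.freeArea = [];      // empty until a collection runs
    this.oldGeneration = []; // destination for promoted objects
  }

  allocate(name) {
    const obj = { name, survived: 0, refs: [] };
    this.objectArea.push(obj);
    return obj;
  }

  collect(roots) {
    // 1. Mark everything reachable from the roots.
    const live = new Set();
    const pending = [...roots];
    while (pending.length > 0) {
      const obj = pending.pop();
      if (live.has(obj)) continue;
      live.add(obj);
      pending.push(...obj.refs);
    }
    // 2. Copy live objects out of the object area, in order.
    for (const obj of this.objectArea) {
      if (!live.has(obj)) continue; // garbage is simply left behind
      obj.survived += 1;
      if (obj.survived >= 2) {
        this.oldGeneration.push(obj); // promotion
      } else {
        this.freeArea.push(obj);
      }
    }
    // 3. Swap the roles of the two areas.
    this.objectArea = this.freeArea;
    this.freeArea = [];
  }
}

const young = new ToyYoungGeneration();
const kept = young.allocate('kept');
young.allocate('temp'); // never reachable from a root

young.collect([kept]);
const afterFirst = young.objectArea.map((o) => o.name); // only 'kept' was copied

young.collect([kept]);
const promoted = young.oldGeneration.map((o) => o.name); // 'kept' was promoted
```

Note that the cost of a collection depends on how many objects survive, not on how much garbage there is, which suits a region where most objects die young.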
2.3 Major Garbage Collector
The characteristics of objects in the old generation heap are:
- Large memory footprint
- Long survival time
The major garbage collector uses two algorithms:
- Mark-Sweep
- Mark-Compact
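There is also a more primitive garbage collection algorithm, used by some early JS engines: reference counting. An object is collected as soon as no references point to it. However, this algorithm cannot handle circular references, because objects that reference each other never reach a count of zero even when nothing else can reach them. The algorithms used by the major garbage collector are based on reachability instead.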
First, garbage data is marked. The marking phase starts with a set of root elements and recursively traverses these elements. During this traversal, elements that can be reached are called active objects, while elements that cannot be reached are considered garbage data.
Then, the garbage is cleared. This is the Mark-Sweep algorithm.
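A minimal sketch of marking and sweeping over a toy heap is shown below. The `refs` arrays and the `markSweep` function are assumptions made for illustration. The example also includes a pair of objects that reference each other, to show that reachability-based marking handles circular references.

```javascript
// Toy Mark-Sweep: heap is an array of slots, each object lists its references.
function markSweep(heap, roots) {
  // Mark phase: traverse everything reachable from the roots.
  const marked = new Set();
  const pending = [...roots];
  while (pending.length > 0) {
    const obj = pending.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    pending.push(...obj.refs);
  }
  // Sweep phase: unmarked slots are cleared in place, leaving holes.
  return heap.map((obj) => (marked.has(obj) ? obj : null));
}

const root = { name: 'root', refs: [] };
const kept = { name: 'kept', refs: [] };
const cycleA = { name: 'cycleA', refs: [] };
const cycleB = { name: 'cycleB', refs: [] };

root.refs.push(kept);
cycleA.refs.push(cycleB); // cycleA and cycleB reference each other,
cycleB.refs.push(cycleA); // but nothing reachable from root points at them

const heap = [root, cycleA, kept, cycleB];
const swept = markSweep(heap, [root]);
const sweptNames = swept.map((obj) => (obj ? obj.name : null));
// sweptNames is ['root', null, 'kept', null]
```

Under reference counting, `cycleA` and `cycleB` would each keep a count of one and never be freed. Here they are cleared because the traversal never reaches them.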
However, after multiple executions of the Mark-Sweep algorithm on a block of memory, a large number of non-contiguous memory fragments will be generated. Excessive fragmentation will prevent large objects from being allocated with sufficient contiguous memory. To solve this problem, another algorithm called Mark-Compact is introduced.
Mark-Compact marks recyclable objects in the same way, but instead of clearing them in place, it moves all live objects to one end of the memory and then clears everything beyond that boundary.
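The effect of compaction on fragmentation can be sketched on a toy array of slots, where `null` stands for a freed slot. Both helper functions are hypothetical and exist only for this illustration.

```javascript
// Move all live slots to one end; everything after them is free.
function compact(slots) {
  const live = slots.filter((slot) => slot !== null);
  const free = new Array(slots.length - live.length).fill(null);
  return live.concat(free);
}

// Length of the longest run of contiguous free slots.
function largestFreeRun(slots) {
  let best = 0;
  let current = 0;
  for (const slot of slots) {
    current = slot === null ? current + 1 : 0;
    best = Math.max(best, current);
  }
  return best;
}

const fragmented = ['a', null, 'b', null, null, 'c']; // state after Mark-Sweep
const compacted = compact(fragmented);

const before = largestFreeRun(fragmented); // 2
const after = largestFreeRun(compacted);   // 3
```

The total amount of free memory is the same in both arrays. Only after compaction is it contiguous, so an object needing three slots can be allocated.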
2.4 Summary of Garbage Collection
III. Efficiency Optimization of Garbage Collection
JavaScript runs on the main thread. Therefore, when executing the garbage collection algorithm, the currently executing JavaScript script needs to be paused until the garbage collection is complete before resuming script execution. This behavior is called Stop-The-World.
There are three main optimization strategies:
- The first strategy is parallel collection, where the garbage collector uses multiple auxiliary threads to perform garbage collection in parallel during a complete garbage collection process.
- The second strategy is incremental garbage collection, where the garbage collector divides the marking work into smaller chunks and interleaves them with different tasks on the main thread. When using incremental garbage collection, it is not necessary for the garbage collector to complete the entire garbage collection process at once. Each execution only performs a small part of the entire garbage collection process.
- The third strategy is concurrent collection, where the main thread can continue executing freely while the auxiliary threads perform garbage collection. The challenge lies in the read-write lock mechanism (not discussed here).
The major garbage collector adopts all of these strategies, while the minor garbage collector adopts some of them.
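The idea behind incremental marking can be sketched with a generator that marks a few objects and then hands control back. This is a simplified illustration: `incrementalMark` and its `budget` parameter are hypothetical. A real collector must also track changes the program makes to the heap between steps, which this toy ignores.

```javascript
// Marks at most `budget` objects per step, then yields to the caller.
function* incrementalMark(roots, budget) {
  const marked = new Set();
  const pending = [...roots];
  while (pending.length > 0) {
    let visited = 0;
    while (pending.length > 0 && visited < budget) {
      const obj = pending.pop();
      if (marked.has(obj)) continue;
      marked.add(obj);
      pending.push(...obj.refs);
      visited += 1;
    }
    yield marked.size; // pause here so other work can run
  }
  return marked;
}

// A chain of five objects: 0 -> 1 -> 2 -> 3 -> 4
const nodes = Array.from({ length: 5 }, (_, i) => ({ id: i, refs: [] }));
for (let i = 0; i < nodes.length - 1; i++) {
  nodes[i].refs.push(nodes[i + 1]);
}

const log = [];
const marker = incrementalMark([nodes[0]], 2);
let step = marker.next();
while (!step.done) {
  log.push(`marked ${step.value}`);
  log.push('ran app task'); // stands in for JavaScript running between steps
  step = marker.next();
}
const allMarked = step.value; // the final Set of marked objects
```

Instead of one long pause, the marking work is spread over three short steps with application work in between.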
IV. Memory Optimization
There are three common types of memory issues:
- Memory leaks, which cause the performance of the page to deteriorate over time. In JavaScript, memory leaks are mainly caused by data that is no longer needed (has no effect) but is still referenced by other objects.
- Memory bloat, which leads to consistently poor performance of the page.
- Frequent garbage collection, which causes delays or frequent pauses in the page.
4.1 Common Causes and Solutions for Memory Leaks
- Global variables: Avoid setting large global objects as much as possible.
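- Unreleased listeners and timers: Remove event listeners and clear timers once they are no longer needed.
- Persistent closures: A closure keeps the variables it captures alive for as long as the closure itself is reachable, so set variables to null to break the connection once they are no longer needed.
- DOM references: Retaining references to DOM nodes prevents GC from collecting them, even after the nodes are removed from the page.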
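For listeners and timers, one common pattern is to return a cleanup function from whatever sets them up. The sketch below uses Node's `EventEmitter` so that it runs outside a browser. `startPolling` and the `'data'` event name are made up for illustration. In a page, the equivalent calls would be `addEventListener` and `removeEventListener`.

```javascript
const { EventEmitter } = require('events');

// Sets up a listener and a timer, and returns a function that releases both.
function startPolling(emitter) {
  let ticks = 0;
  const onData = () => {
    ticks += 1; // the closure keeps `ticks` alive while it is registered
  };
  emitter.on('data', onData);
  const timer = setInterval(() => emitter.emit('data'), 1000);

  return function stop() {
    clearInterval(timer);        // the timer no longer holds its callback
    emitter.off('data', onData); // the emitter no longer holds the listener
  };
}

const emitter = new EventEmitter();
const stop = startPolling(emitter);
const listenersWhileRunning = emitter.listenerCount('data'); // 1

stop();
const listenersAfterStop = emitter.listenerCount('data'); // 0
```

Without the call to `stop()`, the emitter and the timer would keep `onData` and everything it captures reachable indefinitely.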
In addition to the above, use ES6's WeakMap/WeakSet more often, as they maintain weak references to objects.
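As a minimal sketch of the WeakMap idea, the example below attaches metadata to an object without keeping that object alive. The `tag` helper and the plain object standing in for a DOM node are assumptions made for illustration.

```javascript
// Keys of a WeakMap are held weakly: the map alone does not keep them alive.
const metadata = new WeakMap();

function tag(node, info) {
  metadata.set(node, info);
}

let node = { id: 'panel' }; // stand-in for a DOM node
tag(node, { clicks: 0 });

const hasEntry = metadata.has(node);  // true while the node is referenced
const info = metadata.get(node);      // the attached metadata object

// Dropping the last strong reference makes the node, and with it the
// WeakMap entry, eligible for collection. When that happens is up to
// the garbage collector and cannot be observed reliably from code.
node = null;
```

With a regular `Map`, the entry would keep both the node and its metadata reachable until it was deleted explicitly.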
4.2 Methods for Identifying Memory Leaks
- In the browser, record memory usage over time, for example with the Memory or Performance panel in Chrome DevTools, and check whether it keeps increasing.
- In a Node environment, heap snapshots can be captured with a tool such as heapdump and then analyzed.
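For a quick first check in Node, heap usage can also be sampled from code before reaching for snapshots. The sketch below uses only built-in APIs. `heapUsedMB` and `saveSnapshot` are hypothetical helper names, and `v8.writeHeapSnapshot` is mentioned here as a built-in alternative to heapdump, which the text above does not cover.

```javascript
const v8 = require('v8');

// Current heap usage of the process, in megabytes.
function heapUsedMB() {
  return process.memoryUsage().heapUsed / 1024 / 1024;
}

const before = heapUsedMB();

// Simulate data that stays referenced and therefore cannot be collected.
const retained = [];
for (let i = 0; i < 100000; i++) {
  retained.push({ index: i });
}

const after = heapUsedMB();
console.log(`heap before: ${before.toFixed(1)} MB, after: ${after.toFixed(1)} MB`);

// Writes a .heapsnapshot file that can be loaded into Chrome DevTools.
// Not called here, because it writes to disk and can take a while.
function saveSnapshot() {
  return v8.writeHeapSnapshot();
}
```

A single reading says little. A leak usually shows up as heap usage that keeps climbing across many samples even after garbage collection has run.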