Wednesday, 11 February 2009

Notes on the Java Memory Model and Garbage Collection

Understanding the Java Memory Model is key to understanding critical Java behaviour, including garbage collection, thread synchronization, volatile variables, and class and object management. The diagram below is the starting point for the items listed in this blog entry.



  • Class loading, into the Heap (shared between threads), is carried out in the following sequence:
    1. On start-up, the JVM loads the classes it requires to operate using the Bootstrap Loader. These classes are obtained from class files in rt.jar.
    2. The ExtClassLoader loads classes from the directories specified by the property java.ext.dirs.
    3. All other required classes are loaded by the AppClassLoader from locations specified in the CLASSPATH. These classes are loaded on demand, when an individual thread first needs them.
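The delegation order above can be observed from inside a running program. The following is a minimal sketch, not a definitive tool: the class name LoaderChain and the label "bootstrap" are made up for illustration, and the loader class names it prints differ between JVM versions (newer JVMs, for example, replace the extension loader with a platform loader).

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChain {
    // Walks from the given loader up through its parents. The bootstrap
    // loader is represented by null, so it is reported with a fixed label.
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        // Core classes such as String are loaded by the bootstrap loader,
        // which getClassLoader() reports as null.
        System.out.println("String loader: " + String.class.getClassLoader());
        // Application classes are loaded by the application loader,
        // whose parents lead back to the bootstrap loader.
        System.out.println(chain(LoaderChain.class.getClassLoader()));
    }
}
```

Each loader asks its parent first, which is why the chain is read from the application loader back towards the bootstrap loader.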
  • Garbage Collection aims to free heap memory used by objects that can't be reached from any root object.
  • Root objects are either:
    • Local variables on the stack
    • Method parameters on the stack
    • JNI native references
    • All classes loaded by the Bootstrap Loader
  • Note the different levels of reachability (see this Sun page for details), going from strongest to weakest:
    1. An object is strongly reachable if it can be reached by some thread without traversing any reference objects. A newly-created object is strongly reachable by the thread that created it.
    2. An object is softly reachable if it is not strongly reachable but can be reached by traversing a soft reference.
    3. An object is weakly reachable if it is neither strongly nor softly reachable but can be reached by traversing a weak reference. When the weak references to a weakly-reachable object are cleared, the object becomes eligible for finalization.
    4. An object is phantom reachable if it is neither strongly, softly, nor weakly reachable, it has been finalized, and some phantom reference refers to it.
    5. Finally, an object is unreachable, and therefore eligible for reclamation, when it is not reachable in any of the above ways.
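The reference types behind these reachability levels live in java.lang.ref. The sketch below (the class name ReferenceKinds and the method probe are illustrative, not from the original post) shows only the behaviour that is deterministic: soft and weak references return their referent while it is still strongly reachable, and a phantom reference never returns it. When a collector actually clears soft or weak references depends on the JVM and on memory pressure, so that part is deliberately left out.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.Arrays;

final class ReferenceKinds {
    // Returns {softSeesReferent, weakSeesReferent, phantomHidesReferent}.
    static boolean[] probe() {
        Object payload = new Object(); // ordinary (strong) reference

        SoftReference<Object> soft = new SoftReference<>(payload);
        WeakReference<Object> weak = new WeakReference<>(payload);
        // A phantom reference must be registered with a queue; the JVM
        // enqueues it once the referent has become phantom reachable.
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(payload, queue);

        // 'payload' is still strongly reachable here, so neither the soft
        // nor the weak reference can have been cleared.
        boolean softSees = soft.get() == payload;
        boolean weakSees = weak.get() == payload;
        // PhantomReference.get() always returns null by design.
        boolean phantomHides = phantom.get() == null;
        return new boolean[] { softSees, weakSees, phantomHides };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(probe())); // [true, true, true]
    }
}
```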
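  • Retained Set - for a given object, the set of objects that would be garbage collected if that object were collected (the object itself plus everything reachable only through it)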
  • Retained Size - the memory that would be released once the retained set was collected
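To make the two definitions concrete, here is a toy sketch that computes a retained set on a hand-written object graph. Everything in it is hypothetical: objects are plain strings, the graph and sizes in demoGraph and demoSizes are invented, and real heap analysers use far more efficient algorithms. The idea is simply "everything reachable from the roots, minus everything still reachable when the target is removed".

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RetainedSetSketch {
    // Objects reachable from the roots without ever passing through 'excluded'.
    static Set<String> reachable(Map<String, List<String>> edges, List<String> roots, String excluded) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        for (String r : roots) {
            if (!r.equals(excluded)) work.push(r);
        }
        while (!work.isEmpty()) {
            String node = work.pop();
            if (!seen.add(node)) continue; // already visited
            for (String next : edges.getOrDefault(node, Collections.<String>emptyList())) {
                if (!next.equals(excluded)) work.push(next);
            }
        }
        return seen;
    }

    // Retained set of 'target': objects that become unreachable once it is gone.
    static Set<String> retainedSet(Map<String, List<String>> edges, List<String> roots, String target) {
        Set<String> retained = new TreeSet<>(reachable(edges, roots, null));
        retained.removeAll(reachable(edges, roots, target));
        return retained;
    }

    // Retained size: the sum of the sizes of the objects in the retained set.
    static long retainedSize(Set<String> retained, Map<String, Long> sizes) {
        long total = 0;
        for (String name : retained) total += sizes.get(name);
        return total;
    }

    // Hypothetical graph: root -> A, C;  A -> B, C;  B -> D.
    static Map<String, List<String>> demoGraph() {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("root", Arrays.asList("A", "C"));
        edges.put("A", Arrays.asList("B", "C"));
        edges.put("B", Arrays.asList("D"));
        return edges;
    }

    // Hypothetical object sizes in bytes.
    static Map<String, Long> demoSizes() {
        Map<String, Long> sizes = new HashMap<>();
        sizes.put("root", 16L);
        sizes.put("A", 16L);
        sizes.put("B", 24L);
        sizes.put("C", 32L);
        sizes.put("D", 8L);
        return sizes;
    }

    public static void main(String[] args) {
        Set<String> retained = retainedSet(demoGraph(), Arrays.asList("root"), "A");
        System.out.println(retained);                            // [A, B, D]
        System.out.println(retainedSize(retained, demoSizes())); // 48
    }
}
```

C is not in the retained set of A because the root also refers to it directly, so it would survive even if A were collected.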
  • Garbage Collection is based on the generational heap model outlined in the diagram above (where the green boxes, Eden and Survivor 1 and 2, are the Young Generation):
    1. New objects are allocated in Eden space.
    2. Once Eden is "full", objects are garbage collected and survivors are placed into one of the survivor spaces. The vast majority of objects never make it into a survivor space because they die soon after creation.
    3. Once Eden or the occupied Survivor space is full, both are garbage collected and the live objects are copied into the other Survivor space.
    4. Objects in the Survivor spaces that survive several collections are copied into the tenured space.
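The spaces named above can be listed at run time through the standard java.lang.management API. This is only a sketch (the class name HeapPools is made up), and the pool names it prints depend on the JVM and on the collector in use, so they will not necessarily read "Eden", "Survivor" and "Tenured".

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

final class HeapPools {
    // Names of the memory pools that the JVM reports as part of the heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                // getUsage() reports the init, used, committed and max sizes in bytes.
                System.out.println(pool.getName() + ": " + pool.getUsage());
            }
        }
    }
}
```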
  • The proportion of the heap made available to each generation can be controlled on starting the JVM as follows:
    • -XX:NewRatio=3 sets the ratio between the Young and Tenured generations to 1 to 3.
    • -XX:SurvivorRatio=6 sets the ratio between each Survivor space and Eden to 1 to 6.
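As a worked example of what those two ratios imply, the sketch below does the arithmetic for a hypothetical 1024 MB heap. The class and method names are invented, and the figures are idealised: a real JVM rounds and aligns the space sizes, so the actual numbers will differ slightly.

```java
final class GenerationSizes {
    // NewRatio=N means Tenured is N times the Young generation,
    // so Young gets 1/(N+1) of the heap.
    static long youngSize(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    // SurvivorRatio=S means Eden is S times one Survivor space. With two
    // Survivor spaces, the Young generation is S+2 "shares" in total.
    static long survivorSize(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    // Eden is whatever remains of Young after the two Survivor spaces.
    static long edenSize(long young, int survivorRatio) {
        return young - 2 * survivorSize(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapMb = 1024; // hypothetical heap size in MB
        long young = youngSize(heapMb, 3);
        System.out.println("young=" + young + "MB tenured=" + (heapMb - young) + "MB");
        // young=256MB tenured=768MB
        System.out.println("eden=" + edenSize(young, 6) + "MB survivor=" + survivorSize(young, 6) + "MB each");
        // eden=192MB survivor=32MB each
    }
}
```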
  • Java 1.5 provides 4 garbage collectors to allow a trade-off between throughput and pause times:
    • The Default Collector
    • The Throughput Collector (Parallel GC)
    • The Concurrent Low Pause Collector (Concurrent GC)
    • The Incremental Low Pause Collector (not supported in future releases)
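Whichever collector is selected, the running JVM can report which collectors are active through the standard java.lang.management API. The sketch below is an illustration only (the class name ActiveCollectors is made up); the collector names it prints are JVM-specific and will not match the informal names in the list above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    // One line per collector: its name, how many collections it has run,
    // and the approximate accumulated collection time in milliseconds.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```

A typical generational setup reports two entries, one for the young generation and one for the tenured generation.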
  • To measure basic GC performance, pass the argument -verbose:gc on startup. As explained by Pete Freitag, the output [GC 325407K->83000K(776768K), 0.2300771 secs] means:
    • GC - Indicates that it was a minor collection (young generation). If it had said Full GC, it would have been a major collection (tenured generation).
    • 325407K - The combined size of live objects before garbage collection.
    • 83000K - The combined size of live objects after garbage collection.
    • (776768K) - the total available space, not counting the space in the permanent generation; this is the total heap minus one of the survivor spaces.
    • 0.2300771 secs - the time the garbage collection took
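For anyone wanting to process such lines programmatically, here is a small parsing sketch. It assumes exactly the line format quoted above; the class name GcLine and its fields are invented for illustration, and other JVM versions or logging flags produce different output that this pattern will reject.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcLine {
    // Matches e.g. "[GC 325407K->83000K(776768K), 0.2300771 secs]".
    private static final Pattern LINE = Pattern.compile(
            "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean full;   // true for "Full GC" (major collection)
    final long beforeK;   // live objects before the collection, in KB
    final long afterK;    // live objects after the collection, in KB
    final long totalK;    // total available space, in KB
    final double seconds; // duration of the collection

    private GcLine(boolean full, long beforeK, long afterK, long totalK, double seconds) {
        this.full = full;
        this.beforeK = beforeK;
        this.afterK = afterK;
        this.totalK = totalK;
        this.seconds = seconds;
    }

    static GcLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised GC line: " + line);
        }
        return new GcLine(
                m.group(1).equals("Full GC"),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)),
                Double.parseDouble(m.group(5)));
    }

    public static void main(String[] args) {
        GcLine g = parse("[GC 325407K->83000K(776768K), 0.2300771 secs]");
        String kind = g.full ? "major" : "minor";
        System.out.println(kind + " collection freed " + (g.beforeK - g.afterK) + "K");
        // minor collection freed 242407K
    }
}
```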

2 comments:

Unknown said...

Nice pic ... but what's the difference between the method area and the permanent generation?

Jason Harris said...

The permanent generation is where the JVM stores objects it needs to carry out generic activity. These objects are not reachable from standard Java applications. The Method area, on the other hand, is where the JVM stores the Class objects (which includes the methods on each class) loaded as a result of running the Java applications hosted by the JVM.