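Notes on Dalvik GC

Since I've already dug into it anyway, I might as well write down the questions I had and the answers I dug out of the Dalvik source code... zzz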

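The Dalvik GC algorithm

The code lives in vm/alloc/MarkSweep.cpp. Apart from keeping the mark bits in a separate bitmap rather than inside each object, there is nothing surprising here: it is a standard Mark and Sweep algorithm.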
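To make "Mark and Sweep with a side bitmap" concrete, here is a minimal sketch. It is not Dalvik's code: ToyObject, ToyHeap and the slot-index scheme are hypothetical names invented for illustration, and the only thing borrowed from the source is the idea that the mark bits sit in a table outside the objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy heap: objects are identified by their slot index.
struct ToyObject {
    std::vector<std::size_t> refs;  // indices of the objects this one points to
    bool allocated = true;
};

struct ToyHeap {
    std::vector<ToyObject> objects;
    std::vector<bool> markBits;  // side bitmap: one bit per slot, kept outside the objects

    std::size_t alloc() {
        objects.push_back(ToyObject{});
        return objects.size() - 1;
    }

    // Mark phase: set the bit of every object reachable from the roots.
    void mark(const std::vector<std::size_t>& roots) {
        markBits.assign(objects.size(), false);
        std::vector<std::size_t> stack(roots);
        while (!stack.empty()) {
            std::size_t i = stack.back();
            stack.pop_back();
            if (markBits[i]) continue;  // already visited
            markBits[i] = true;
            for (std::size_t r : objects[i].refs) stack.push_back(r);
        }
    }

    // Sweep phase: release every allocated slot whose bit is still clear.
    std::size_t sweep() {
        std::size_t freed = 0;
        for (std::size_t i = 0; i < objects.size(); ++i) {
            if (objects[i].allocated && !markBits[i]) {
                objects[i].allocated = false;
                objects[i].refs.clear();
                ++freed;
            }
        }
        return freed;
    }
};
```

In this sketch an object survives only if it is reachable from a root. An object that many garbage objects point to is still swept, while an object with a single reference from a root is kept, which is why the number of references plays no role.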

How does Dalvik lay out Java objects in memory?

According to vm/oo/Object.h:

    /*
     * Total object size; used when allocating storage on gc heap.  (For
     * interfaces and abstract classes this will be zero.)
     */
    size_t          objectSize;

In vm/alloc/Alloc.cpp, the line newObj = (Object*)dvmMalloc(clazz->objectSize, flags); inside dvmAllocObject tells us that, again with no surprises, an object is just a block of memory whose size was computed in advance.
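As a rough illustration of "one block of precomputed size", the sketch below allocates an instance the same way in miniature. ToyClass, ToyObjectHeader, PointInstance and toyAllocObject are hypothetical stand-ins, not Dalvik types; only the objectSize field mirrors the quoted declaration, and std::calloc stands in for dvmMalloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for ClassObject; only objectSize mirrors the quoted field.
struct ToyClass {
    std::size_t objectSize;  // total bytes per instance, computed ahead of time
};

// Hypothetical stand-in for the common Object header.
struct ToyObjectHeader {
    const ToyClass* clazz;
};

// One instance layout: the header followed by primitive fields, all in one block.
struct PointInstance {
    ToyObjectHeader header;
    int x;
    int y;
};

// Allocates a single zeroed block of clazz->objectSize bytes and stamps the class pointer.
ToyObjectHeader* toyAllocObject(const ToyClass* clazz) {
    void* mem = std::calloc(1, clazz->objectSize);
    if (mem == nullptr) return nullptr;
    ToyObjectHeader* obj = static_cast<ToyObjectHeader*>(mem);
    obj->clazz = clazz;
    return obj;
}
```

Because the primitive fields x and y live inside the same allocation as the header, there is no separate piece of memory for an individual field that could be released on its own.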

Does Dalvik GC distinguish between primitive types and references?

vm/oo/Object.h tells us that all of an object's reference-type fields are moved to the very beginning of the area that stores its instance fields:

    /* instance fields
     *
     * These describe the layout of the contents of a DataObject-compatible
     * Object.  Note that only the fields directly defined by this class
     * are listed in ifields;  fields defined by a superclass are listed
     * in the superclass's ClassObject.ifields.
     *
     * All instance fields that refer to objects are guaranteed to be
     * at the beginning of the field list.  ifieldRefCount specifies
     * the number of reference fields.
     */
    int             ifieldCount;
    int             ifieldRefCount; // number of fields that are object refs
    InstField*      ifields;

    /* bitmap of offsets of ifields */
    u4 refOffsets;

Add to that this function from vm/alloc/MarkSweep.cpp:

/*
 * Scans instance fields.
 */
static void scanFields(const Object *obj, GcMarkContext *ctx)
{
    assert(obj != NULL);
    assert(obj->clazz != NULL);
    assert(ctx != NULL);
    if (obj->clazz->refOffsets != CLASS_WALK_SUPER) {
        unsigned int refOffsets = obj->clazz->refOffsets;
        while (refOffsets != 0) {
            size_t rshift = CLZ(refOffsets);
            size_t offset = CLASS_OFFSET_FROM_CLZ(rshift);
            Object *ref = dvmGetFieldObject(obj, offset);
            markObject(ref, ctx);
            refOffsets &= ~(CLASS_HIGH_BIT >> rshift);
        }
    } else {
        for (ClassObject *clazz = obj->clazz;
             clazz != NULL;
             clazz = clazz->super) {
            InstField *field = clazz->ifields;
            for (int i = 0; i < clazz->ifieldRefCount; ++i, ++field) {
                void *addr = BYTE_OFFSET(obj, field->byteOffset);
                Object *ref = ((JValue *)addr)->l;
                markObject(ref, ctx);
            }
        }
    }
}

Yes, it tells us that during the mark phase Dalvik GC only marks Java objects reached through reference fields, and does not care about primitive types at all.
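The first branch of scanFields walks a bitmap from its highest bit downward, one bit per reference field. The sketch below reproduces that loop shape in isolation. The constants kFirstFieldOffset and kFieldAlignment, and the helper names, are assumptions made for this example; the real CLASS_OFFSET_FROM_CLZ and CLASS_HIGH_BIT definitions are not shown in the excerpt above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout constants for this sketch (not taken from the quoted source).
constexpr std::uint32_t kHighBit = 0x80000000u;
constexpr std::size_t kFirstFieldOffset = 8;  // assumed size of the object header
constexpr std::size_t kFieldAlignment = 4;    // assumed width of one field slot in bytes

// Portable count-leading-zeros; the caller must pass a non-zero value.
inline unsigned countLeadingZeros(std::uint32_t v) {
    unsigned n = 0;
    while ((v & kHighBit) == 0) {
        v <<= 1;
        ++n;
    }
    return n;
}

// Encodes a reference field's byte offset as one bit; the highest bit is the first slot.
inline std::uint32_t bitFromOffset(std::size_t byteOffset) {
    return kHighBit >> ((byteOffset - kFirstFieldOffset) / kFieldAlignment);
}

// Decodes the bitmap back into byte offsets, highest bit first, clearing each bit as it goes.
inline std::vector<std::size_t> referenceOffsets(std::uint32_t refOffsets) {
    std::vector<std::size_t> out;
    while (refOffsets != 0) {
        unsigned rshift = countLeadingZeros(refOffsets);
        out.push_back(rshift * kFieldAlignment + kFirstFieldOffset);
        refOffsets &= ~(kHighBit >> rshift);
    }
    return out;
}
```

Only offsets that have a bit set are ever visited, so a primitive field, which never gets a bit in this encoding, is simply invisible to the marker.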

What is the unit in which Dalvik GC frees memory?

dvmHeapSourceFreeList in vm/alloc/HeapSource.cpp tells us:

/*
 * Frees the first numPtrs objects in the ptrs list and returns the
 * amount of reclaimed storage. The list must contain addresses all in
 * the same mspace, and must be in increasing order. This implies that
 * there are no duplicates, and no entries are NULL.
 */
size_t dvmHeapSourceFreeList(size_t numPtrs, void **ptrs)
{

      ..........

            // mspace_merge_objects takes two allocated objects, and
            // if the second immediately follows the first, will merge
            // them, returning a larger object occupying the same
            // memory. This is a local operation, and doesn't require
            // dlmalloc to manipulate any freelists. It's pretty
            // inexpensive compared to free().

            // ptrs is an array of objects all in memory order, and if
            // client code has been allocating lots of short-lived
            // objects, this is likely to contain runs of objects all
            // now garbage, and thus highly amenable to this optimization.

            // Unroll the 0th iteration around the loop below,
            // countFree ptrs[0] and initializing merged.

      ........
}

From this we can see that Dalvik GC frees memory one whole Java object at a time, and it will even merge two adjacent garbage objects so they can be freed together.
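The merging idea described in those comments can be sketched without dlmalloc. In the example below, Chunk and coalesce are hypothetical names, and each chunk is described by a plain start address and size; a real allocator also has per-chunk bookkeeping that this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one dead object: start address and size in bytes.
struct Chunk {
    std::uintptr_t start;
    std::size_t size;
};

// Merges physically adjacent chunks from an address-sorted list.
// Each chunk in the result stands for a single free() call in this sketch.
inline std::vector<Chunk> coalesce(const std::vector<Chunk>& sortedGarbage) {
    std::vector<Chunk> merged;
    for (const Chunk& c : sortedGarbage) {
        if (!merged.empty() &&
            merged.back().start + merged.back().size == c.start) {
            merged.back().size += c.size;  // c directly follows the current run: extend it
        } else {
            merged.push_back(c);           // gap before c: start a new run
        }
    }
    return merged;
}
```

Whether or not runs get merged, the smallest thing that is ever handed back is a whole object; nothing in this scheme splits an object into individual fields.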

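As for WeakReference / SoftReference and PhantomReference

Android's JavaDoc already tells us that PhantomReference.get always returns null, while WeakReference.get and SoftReference.get can only return null once the object they originally pointed to is gone.

In other words, if you can still get a reference through one of these after your object has been collected, then Android / the Dalvik VM itself is already no longer in a normal state.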

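Conclusion

However I look at it, I cannot see how the Dalvik VM, under normal conditions (that is, behaving as designed and without bugs), would do any of the following:

  1. Automatically reclaim memory that currently has few references to it (Mark and Sweep has nothing to do with the number of references in the first place)
  2. Reclaim a single primitive integer instance variable, with that being a Dalvik feature rather than a bug (as shown above, Dalvik is designed to reclaim whole objects)
  3. Have a reference management strategy that causes an integer, which is not a reference type at all, to be reclaimed, or that lets you still obtain a reference to an object after it has been reclaimed.

Once again: if anyone can show me that my understanding of how the Dalvik VM's GC works is wrong, I would be very happy to hear it.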
