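Since I've already chased this down, I might as well write up the questions I had and the answers I dug out of the Dalvik source... zzz
The Dalvik GC algorithm
The code lives in vm/alloc/MarkSweep.cpp. Apart from keeping the mark bits in a bitmap instead of alongside each object, there's nothing surprising here: it's a standard Mark and Sweep algorithm.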
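To make the "mark bits in a bitmap" point concrete, here is a minimal sketch of a side bitmap. It is not Dalvik's code: the MarkBitmap class, its method names, and the 8-byte slot granularity are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical side bitmap: one mark bit per 8-byte-aligned heap slot.
// The 8-byte granularity and every name here are illustrative assumptions.
class MarkBitmap {
public:
    MarkBitmap(std::uintptr_t heapBase, std::size_t heapBytes)
        : base_(heapBase), bits_((heapBytes / kAlign + 31) / 32, 0u) {}

    // Sets the mark bit for addr; returns true if it was already set.
    bool testAndSet(std::uintptr_t addr) {
        std::size_t index = slotIndex(addr);
        std::uint32_t mask = 1u << (index % 32);
        bool wasMarked = (bits_[index / 32] & mask) != 0;
        bits_[index / 32] |= mask;
        return wasMarked;
    }

    // Reports whether the object starting at addr has been marked.
    bool isMarked(std::uintptr_t addr) const {
        std::size_t index = slotIndex(addr);
        return ((bits_[index / 32] >> (index % 32)) & 1u) != 0;
    }

private:
    static constexpr std::size_t kAlign = 8;

    // Maps a heap address to its bit index, checking alignment and range.
    std::size_t slotIndex(std::uintptr_t addr) const {
        assert(addr >= base_ && (addr - base_) % kAlign == 0);
        std::size_t index = (addr - base_) / kAlign;
        assert(index / 32 < bits_.size());
        return index;
    }

    std::uintptr_t base_;
    std::vector<std::uint32_t> bits_;
};
```

The marker never writes into the object itself; whether an object is live is answered purely from its address, which is the only difference from a textbook Mark and Sweep that the post points out.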
How are Java objects laid out in memory in Dalvik?
According to vm/oo/Object.h:
    /*
     * Total object size; used when allocating storage on gc heap.  (For
     * interfaces and abstract classes this will be zero.)
     */
    size_t objectSize;
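Together with the line newObj = (Object*)dvmMalloc(clazz->objectSize, flags); in the dvmAllocObject function in vm/alloc/Alloc.cpp, this tells us there's nothing surprising here either: an object is just a block of memory whose size was computed ahead of time.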
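A rough sketch of that idea, under stated assumptions: ClassInfo, ObjectHeader and allocObject are hypothetical stand-ins for the real Dalvik types, and std::calloc stands in for dvmMalloc. It only shows that the allocator needs nothing but the size stored on the class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a class descriptor: it only carries the
// precomputed instance size (header plus all instance fields).
struct ClassInfo {
    std::size_t objectSize;
};

// Hypothetical stand-in for the object header: a pointer back to the class.
struct ObjectHeader {
    const ClassInfo* clazz;
};

// Allocates one zero-filled instance. calloc is an assumption made for
// this sketch; the real VM allocates from its own GC heap.
inline ObjectHeader* allocObject(const ClassInfo* clazz) {
    assert(clazz != nullptr);
    assert(clazz->objectSize >= sizeof(ObjectHeader));
    void* mem = std::calloc(1, clazz->objectSize);
    if (mem == nullptr) {
        return nullptr;
    }
    ObjectHeader* obj = static_cast<ObjectHeader*>(mem);
    obj->clazz = clazz;  // the rest of the block holds the instance fields
    return obj;
}
```

The fields are not allocated one by one: they all live inside that single block, at offsets decided when the class was set up.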
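Does the Dalvik GC distinguish between primitive types and references?
vm/oo/Object.h tells us that all of an object's reference-type instance fields are moved to the very beginning of the area that stores its instance fields: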
    /* instance fields
     *
     * These describe the layout of the contents of a DataObject-compatible
     * Object.  Note that only the fields directly defined by this class
     * are listed in ifields;  fields defined by a superclass are listed
     * in the superclass's ClassObject.ifields.
     *
     * All instance fields that refer to objects are guaranteed to be
     * at the beginning of the field list.  ifieldRefCount specifies
     * the number of reference fields.
     */
    int ifieldCount;
    int ifieldRefCount; // number of fields that are object refs
    InstField* ifields;

    /* bitmap of offsets of ifields */
    u4 refOffsets;
Add to that this function from vm/alloc/MarkSweep.cpp:
/*
 * Scans instance fields.
 */
static void scanFields(const Object *obj, GcMarkContext *ctx)
{
    assert(obj != NULL);
    assert(obj->clazz != NULL);
    assert(ctx != NULL);
    if (obj->clazz->refOffsets != CLASS_WALK_SUPER) {
        unsigned int refOffsets = obj->clazz->refOffsets;
        while (refOffsets != 0) {
            size_t rshift = CLZ(refOffsets);
            size_t offset = CLASS_OFFSET_FROM_CLZ(rshift);
            Object *ref = dvmGetFieldObject(obj, offset);
            markObject(ref, ctx);
            refOffsets &= ~(CLASS_HIGH_BIT >> rshift);
        }
    } else {
        for (ClassObject *clazz = obj->clazz;
             clazz != NULL;
             clazz = clazz->super) {
            InstField *field = clazz->ifields;
            for (int i = 0; i < clazz->ifieldRefCount; ++i, ++field) {
                void *addr = BYTE_OFFSET(obj, field->byteOffset);
                Object *ref = ((JValue *)addr)->l;
                markObject(ref, ctx);
            }
        }
    }
}
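So yes: during the Mark phase, the Dalvik GC only follows reference fields and marks the Java objects they point to. Both branches visit reference fields only (the bits set in refOffsets, or the first ifieldRefCount entries of ifields), so primitive fields are never touched.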
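The bitmap branch of scanFields can be sketched on its own. The constants below (a hypothetical 8-byte header, 4-byte slots, bit 31 standing for the first slot) are assumptions chosen for illustration rather than values copied from the Dalvik headers, and countLeadingZeros, bitForOffset and referenceOffsets are made-up helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative constants (assumptions, not copied from Dalvik headers):
// bit 31 of the bitmap stands for the first field slot, which sits right
// after a hypothetical 8-byte object header; slots are 4 bytes apart.
constexpr std::uint32_t kHighBit = 0x80000000u;
constexpr std::size_t kFirstFieldOffset = 8;
constexpr std::size_t kSlotSize = 4;

// Portable count-leading-zeros for a non-zero 32-bit value.
inline unsigned countLeadingZeros(std::uint32_t v) {
    assert(v != 0);
    unsigned n = 0;
    while ((v & kHighBit) == 0) {
        v <<= 1;
        ++n;
    }
    return n;
}

// Encodes the byte offset of one reference field as a bit of the bitmap.
inline std::uint32_t bitForOffset(std::size_t byteOffset) {
    assert(byteOffset >= kFirstFieldOffset);
    assert((byteOffset - kFirstFieldOffset) % kSlotSize == 0);
    std::size_t slot = (byteOffset - kFirstFieldOffset) / kSlotSize;
    assert(slot < 32);
    return kHighBit >> slot;
}

// Returns the byte offsets of all reference fields, lowest offset first.
inline std::vector<std::size_t> referenceOffsets(std::uint32_t refOffsets) {
    std::vector<std::size_t> offsets;
    while (refOffsets != 0) {
        unsigned rshift = countLeadingZeros(refOffsets);
        offsets.push_back(kFirstFieldOffset + rshift * kSlotSize);
        refOffsets &= ~(kHighBit >> rshift);  // clear the bit just handled
    }
    return offsets;
}
```

With bits set only for offsets 8 and 16, the walk yields exactly those two offsets; a primitive field sitting at offset 12 has no bit in the bitmap, so the marker never looks at it.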
In what unit does the Dalvik GC free memory?
dvmHeapSourceFreeList in vm/alloc/HeapSource.cpp tells us:
/*
 * Frees the first numPtrs objects in the ptrs list and returns the
 * amount of reclaimed storage. The list must contain addresses all in
 * the same mspace, and must be in increasing order. This implies that
 * there are no duplicates, and no entries are NULL.
 */
size_t dvmHeapSourceFreeList(size_t numPtrs, void **ptrs)
{
..........
        // mspace_merge_objects takes two allocated objects, and
        // if the second immediately follows the first, will merge
        // them, returning a larger object occupying the same
        // memory. This is a local operation, and doesn't require
        // dlmalloc to manipulate any freelists. It's pretty
        // inexpensive compared to free().

        // ptrs is an array of objects all in memory order, and if
        // client code has been allocating lots of short-lived
        // objects, this is likely to contain runs of objects all
        // now garbage, and thus highly amenable to this optimization.

        // Unroll the 0th iteration around the loop below,
        // countFree ptrs[0] and initializing merged.
........
}
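From this we can see that the Dalvik GC frees memory in units of a whole Java object's storage, and will even merge two adjacent garbage objects and free them in one go.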
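The merging described in those comments can be modelled with a small sketch. This is not the dlmalloc-based code: Block and coalesce are hypothetical names, and the sketch only does the bookkeeping of joining adjacent dead objects into runs, assuming the sizes are known up front.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A dead object as a sweep might see it: start address and size in bytes.
// Both this struct and the function below are hypothetical illustrations.
struct Block {
    std::uintptr_t start;
    std::size_t size;
};

// Merges runs of blocks that touch each other. The input must be sorted by
// ascending start address with no overlaps, mirroring the precondition
// stated in the comment on dvmHeapSourceFreeList.
inline std::vector<Block> coalesce(const std::vector<Block>& dead) {
    std::vector<Block> merged;
    for (const Block& b : dead) {
        if (!merged.empty()) {
            Block& last = merged.back();
            assert(last.start + last.size <= b.start);  // sorted, disjoint
            if (last.start + last.size == b.start) {
                last.size += b.size;  // b immediately follows: extend run
                continue;
            }
        }
        merged.push_back(b);
    }
    return merged;
}
```

Three dead objects at 0x100 (16 bytes), 0x110 (32 bytes) and 0x200 (8 bytes) collapse into two regions to release. The smallest thing that ever gets released is still a whole object, never a single field inside one.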
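As for WeakReference / SoftReference and PhantomReference
Android's JavaDoc has long told us that PhantomReference.get always returns null, and that WeakReference.get and SoftReference.get return only null once the object they originally pointed to is gone.
In other words, if you can still get a reference through one of these after the object has been collected, then Android / the Dalvik VM itself is no longer in a normal state.
Conclusion
However I look at it, I can't see how the Dalvik VM, under normal conditions (that is, behaving as designed and free of bugs), would do any of the following:
- Automatically reclaim memory that currently has few references to it (Mark and Sweep has nothing to do with reference counts in the first place)
- Reclaim a single primitive integer instance variable as a Dalvik feature rather than a bug (as shown above, Dalvik is designed to reclaim whole objects)
- Have a reference management strategy that causes an integer, which isn't a reference type at all, to be reclaimed, or that lets you still obtain a reference to an object after it has been reclaimed.
I'll say it once more: if anyone can show me that my understanding of how the Dalvik VM's GC works is wrong, I'd be glad to hear it.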