Skip to main content

Refrence Counting Garbage Collector (Sorry Re-Post as Question)

8 replies [Last post]
mirza2m
Offline
Joined: 2008-12-04
Points: 0

Hi,
I want to implement a reference counting garbage collector for my school project. Any advice what would be a good place to start? How much time would it take?

Thank you

Babar

Reply viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.
eric_arseneau
Offline
Joined: 2004-07-15
Points: 0

Derek would be the one to provide you with the appropriate location to look to inject your code.

However, I have a few observations
- how many bits do you intent to keep for this level
-- in a memory constrained device, the extra memory per object could be an issue
-- we go a long way towards trying to minimize the object header size
-- it could be that you just want to experiment, and memory constraints are not your concern :)
- a typical problem with reference counting is indicating that a max has been reached, which renders decrementing or incrementing count meaningless
-- example, I have 8 bits for ref count, and the object gets its 256th reference, there is no more bits to count the number of references
-- it seems that for completeness, your level count should also include this concept
--- you could say that an object graph will never be any deeper than X, but that would seem unsafe
- I am not sure I completely get your algorithm, but it seems to me that you would be traversing object graphs on assignments to every field of an object
-- this seems to me to be a performance issue no ?
-- without this you only have to do graph traversal when you run out of memory

mirza2m
Offline
Joined: 2008-12-04
Points: 0

Hi,
Due to dead line for my school project, I could not implement a Reference Counting Garbage Collector for squawk. I ended up writing a frame work of GC using AspectJ, but I would still like to implement one for squawk and see how my algorithm measure up against tracers. Could some one point me to code or code pattern that gets executed when references are updated?

If someone is interested in helping me, here is how my algorithm works. It is able to collect reference cycles

[i]I Introduced a notation of Level in each object. I defined the level to be its shortest distance from the root of its object graph. Every time a reference is added to or removed from an object the algorithm checks, if this reference update caused the level of the object to change. If so the level of the objects in current objects sub graph might also change, so the sub graph is scanned and checked for levels changes. A level of the object can either increase, decrease, stay the same or become undefined when the object becomes disconnected from the root, which indicates that the object should be collected. A level of the object increases when it is referred to by an object with level at least one level lesser than the current objects level. An objects level increases when a reference is removed causing the object to fall down to another level. The new level is determined by the next shallowest (lowest level) object pointing to it, if no such reference exists the object has undefined level and should be collected [/i]

Thank you

eric_arseneau
Offline
Joined: 2004-07-15
Points: 0

Derek would be the one to provide you with the appropriate location to look to inject your code.

However, I have a few observations
- how many bits do you intent to keep for this level
- in a memory constrained device, the extra memory per object could be an issue
- we go a long way towards trying to minimize the object header size
- it could be that you just want to experiment, and memory constraints are not your concern :)
- a typical problem with reference counting is indicating that a max has been reached, which renders decrementing or incrementing count meaningless
- example, I have 8 bits for ref count, and the object gets its 256th reference, there is no more bits to count the number of references
- it seems that for completeness, your level count should also include this concept
- you could say that an object graph will never be any deeper than X, but that would seem unsafe
- I am not sure I completely get your algorithm, but it seems to me that you would be traversing object graphs on assignments to every field of an object
- this seems to me to be a performance issue no ?
- without this you only have to do graph traversal when you run out of memory

mirza2m
Offline
Joined: 2008-12-04
Points: 0

> However, I have a few observations
> - in a memory constrained device, the extra memory
> per object could be an issue
> - we go a long way towards trying to minimize the
> object header size
> - it could be that you just want to experiment, and
> memory constraints are not your concern :)

I am concerned about the additional memory requirement in my algorithm but I figure if it works efficiently, it might be better at recycling memory and make up for the extra memory use.

> a typical problem with reference counting is
> indicating that a max has been reached, which
> renders decrementing or incrementing count
> meaningless
> - example, I have 8 bits for ref count, and the
> object gets its 256th reference, there is no more
> bits to count the number of references
> - it seems that for completeness, your level count
> should also include this concept
> - you could say that an object graph will never be
> any deeper than X, but that would seem unsafe
> - how many bits do you intent to keep for this level

One solution to this problem is to make level a configurable property of the JVM (like amount of ram to use ) and when the graph depth grows beyond this depth the virtual machine could throw a exception, like out of memory exception, any other solutions are more than welcome:)

> - I am not sure I completely get your algorithm, but
> it seems to me that you would be traversing object
> graphs on assignments to every field of an object
> - this seems to me to be a performance issue no ?
> - without this you only have to do graph traversal
> when you run out of memory

I think I might not able to explain my algorithm very well :). Object sub graph is traversed only when an objects level changes, which would happen generally when a reference is removed from an object that causes the current objects shortest distance from the root to change. Another situation in which object's level changes is when it is referenced by a shallowest object that has ever referred it. In other words if a object is created at deep parts of the graph and propagated up to shallower part of the object graph. Intuitively it seems like and I hope that levels of objects with big sub graph would not change that often.

derek_white
Offline
Joined: 2006-09-08
Points: 0

The place to update the write barrier is setObjectAndUpdateWriteBarrier() in memory.c.spp. Note the callers in bytecodes.c.spp. This does not track writes to local variables though (they are always scanned. There is a class of global variables which are not per-isolate, but shared across the VM. These are also always treated as roots.

I'm still not following your algorithm though. It seems that if slot in object A starts out pointing to Object B, and then is overwriten to point to object C, you would have to scan the heap to find out B's new level. You know that B's new level is somewhere in the range [A.level+1 .. inf], but I don't see how that helps. C's new level is simply the min of (B's old level and C's old level).

In any case, good luck - and keep thinking outside the box!

mirza2m
Offline
Joined: 2008-12-04
Points: 0

> The place to update the write barrier is
> setObjectAndUpdateWriteBarrier() in memory.c.spp.
> Note the callers in bytecodes.c.spp. This does not
> track writes to local variables though (they are
> always scanned. There is a class of global variables
> which are not per-isolate, but shared across the VM.
> These are also always treated as roots.
Thank you for pointing this out. I do wish this code was in java though. :). Ill see what I can do .
> I'm still not following your algorithm though. It
> seems that if slot in object A starts out pointing to
> Object B, and then is overwriten to point to object
> C, you would have to scan the heap to find out B's
> new level. You know that B's new level is somewhere
> in the range [A.level+1 .. inf], but I don't see how
> that helps. C's new level is simply the min of (B's
> old level and C's old level).
>
> In any case, good luck - and keep thinking outside
> the box!
You are right. In order to recalculate the new level for Object B I have to know the levels of all the objects that point to B, which requires a little bit more space. Here is some java code defining ObjectHeader, illustrating how I deal with this issue. (Just in case you are curious please ignore other wise : ) )

[i]final class ObjectHeader implements Cloneable {

private static final int[] PRIMES_1000 = new int[] {
2, 3, 5, 7, 11, 13, 17, 19, .... 1000 primes...};

private int level_;

private int refCounts_;

public int getLevel()

public void incRefCount(ObjectHeader from){
if(level_ - from.level_ >1)
level_ =from.level_ +1;

refCounts_ *= PRIMES_1000[ from.level_];
}
public void decRefCount(ObjectHeader from){
refCounts_ /= PRIMES_1000[ from.level_];

if(refCounts_ % PRIMES_1000[ from.level_] != 0)
level_ =calculateNextLevel(refCounts_, level_);

}
public boolean isGarbage(){

return level_ ==-1;
}

private static int calculateNextLevel(int refCounts, int fromLevel){
for(int i =fromLevel ; i < PRIMES_1000.length; i++){
if(refCounts < PRIMES_1000[i])
return -1;
else if(refCounts % PRIMES_1000[i] == 0)
return i+1;

}

return -1;
}

@Override
protected Object clone() throws CloneNotSupportedException {
// TODO Auto-generated method stub
return super.clone();
}
}[/i]

Thank you for your help

derek_white
Offline
Joined: 2006-09-08
Points: 0

HI,

The question is a bit vague. Do you mean for any language, or for Java, or all objects in Squawk, or for some objects in Squawk?

I'll try answering the 3rd option.

To add traditional reference counting in Squawk for all objects, you need to add code to all reads and write of object values, including reads and writes to local variables (AKA a "read barrier" and a "write barrier"). There is no support for this in Squawk right now.

Squawk does support a write barrier for write to object fields and global variables though.

Another approach is to use some form of deferred reference counting, which would add less machinery to the VM (see links from http://www.memorymanagement.org/glossary/d.html, http://en.wikipedia.org/wiki/Reference_counting#Dealing_with_inefficienc...).

I can't give any estimates without spending a bit of time thinking where all of barriers would have to be called, etc.

Good luck!

mirza2m
Offline
Joined: 2008-12-04
Points: 0

Thank you for your reply Derek

> I'll try answering the 3rd option.
That's exactly what I meant should have been more precise.

>There is no support for this in Squawk
> right now.
Does this mean that currently there is no way to intercept when a reference is added to or removed from an object?

> Another approach is to use some form of deferred
> reference counting
Wouldn't this also require intercepting reference updates?

babar