
feat: implement mark_sweep_branded based on approved API redesign #83

Open

shruti2522 wants to merge 7 commits into boa-dev:main from shruti2522:prototype-impl

Conversation

@shruti2522 (Contributor)

Implement the mark_sweep_branded GC based on the approved API redesign for oscars. This is a draft implementation; it also includes changes from #82.

@shruti2522 shruti2522 marked this pull request as ready for review April 23, 2026 17:57
@nekevss (Member) left a comment

Some early feedback. I may try a second pass at this within a couple days.

In general, a lot of this is super interesting. Super excited to see it progress 😄

Comment thread: oscars/src/collectors/mark_sweep_branded/gc.rs (Outdated)
/// Use `Tracer::mark` for every reachable `Gc` pointer.
pub trait Trace {
/// Marks all `Gc` pointers reachable from `self`.
fn trace(&mut self, tracer: &mut Tracer);
Member

Hmmmm, a couple of questions here:

  1. Have you looked into using a trace color vs. a `Tracer`? I'd prefer that approach vs. keeping `Tracer` around.
  2. This should still be `&` vs. `&mut`.

Contributor Author

Yes, the TraceColor refactor slipped my mind 😅, on it now. Also switching `&mut self` to `&self`, since the mark bits already use interior mutability via `Cell<bool>`, so `&mut` is overkill.
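The `&self` point can be sketched in isolation. This is an illustrative toy, not the PR's actual types: once the mark bit lives in a `Cell`, setting it needs only a shared reference, so a `trace(&self)` signature suffices.

```rust
use std::cell::Cell;

// Illustrative sketch (not the PR's exact layout): the mark bit sits
// behind a `Cell`, so marking works through a shared reference.
struct MarkBit {
    marked: Cell<bool>,
}

impl MarkBit {
    fn new() -> Self {
        MarkBit { marked: Cell::new(false) }
    }

    /// Sets the mark through `&self`; returns the previous value.
    fn mark(&self) -> bool {
        self.marked.replace(true)
    }
}
```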

let slot = self
.pool
.borrow_mut()
.try_alloc_bytes(layout)
Member

I think this is incorrect. It should be `try_alloc`, not `try_alloc_bytes`.

Is there a reason for allocating the `BumpPage` here rather than the slots?

If we are using one of the bump pages here, then we should probably use one of the arena allocators.

// and we ensure that `Gc` objects and pool allocations do not outlive
// the `Collector` instance
pub(crate) pool: RefCell<PoolAllocator<'static>>,
pool_entries: RefCell<Vec<PoolEntry>>,
Member

We shouldn't need to maintain a `Vec` of pool entries here, nor the allocation count. That should be managed by the allocator.

use core::ptr::NonNull;

/// A weak reference to a GC managed value
pub struct WeakGc<'id, T: Trace + ?Sized> {
Member

Hmmmm, I think we will still need an ephemeron implementation, but we could probably iterate on this

pub fn root<T: Trace + Finalize + 'gc>(&self, gc: Gc<'gc, T>) -> Root<'id, T> {
let gc_ptr = gc.ptr;

let node = Box::new(RootNode {
Member

`RootNode` should ideally be allocated on our heap. It would probably be ideal to have a mempool specifically for `RootNode`s.

pub(crate) fn iter_from_sentinel(
sentinel: NonNull<Self>,
) -> impl Iterator<Item = NonNull<Self>> {
struct Iter {
Member

thought: this should probably just be a separate iterator, `RootLinkIter`, with an `iter` method that returns `RootLinkIter`, rather than an `iter_from_sentinel` call that returns an anonymous iterator.
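A minimal sketch of the suggested shape, assuming a plain singly linked list as a stand-in for the PR's intrusive `RootLink` list: a named `RootLinkIter` type returned from `iter`, instead of an anonymous `impl Iterator` built inside `iter_from_sentinel`.

```rust
// Stand-in list node for illustration; the real type is intrusive.
struct RootLink {
    value: u32,
    next: Option<Box<RootLink>>,
}

// Named iterator type: explicit in signatures, nameable by callers.
struct RootLinkIter<'a> {
    cursor: Option<&'a RootLink>,
}

impl<'a> Iterator for RootLinkIter<'a> {
    type Item = &'a RootLink;

    fn next(&mut self) -> Option<Self::Item> {
        let node = self.cursor?;
        self.cursor = node.next.as_deref();
        Some(node)
    }
}

impl RootLink {
    /// The named return type keeps the API explicit.
    fn iter(&self) -> RootLinkIter<'_> {
        RootLinkIter { cursor: Some(self) }
    }
}
```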

// the `Collector` instance
pub(crate) pool: RefCell<PoolAllocator<'static>>,
pool_entries: RefCell<Vec<PoolEntry>>,
pub(crate) sentinel: Pin<Box<RootLink>>,
Member

How about a new `RootSentinel` type that can be a transparent newtype:

pub struct RootSentinel(Pin<Box<RootLink>>);
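A sketch of what that newtype could look like, with a placeholder `RootLink` standing in for the PR's actual type: `#[repr(transparent)]` guarantees the wrapper has the same layout as the inner `Pin<Box<RootLink>>`, so it costs nothing.

```rust
use std::pin::Pin;

// Placeholder stand-in for the PR's real `RootLink`.
struct RootLink {
    next: Option<u8>, // illustrative field only
}

// Transparent newtype: same layout as the inner Pin<Box<RootLink>>.
#[repr(transparent)]
struct RootSentinel(Pin<Box<RootLink>>);

impl RootSentinel {
    fn new() -> Self {
        RootSentinel(Box::pin(RootLink { next: None }))
    }
}
```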

/// Reachability flag set by the mark phase.
pub(crate) marked: Cell<bool>,
/// Type-erased trace function.
pub(crate) trace_fn: Option<TraceFn>,
Member

Is there ever an instance where `trace_fn` would be `None`? This should be fine to just be `TraceFn`, as it's never unset.

@shruti2522 shruti2522 force-pushed the prototype-impl branch 2 times, most recently from 57590c5 to 85b9952 Compare April 24, 2026 21:20
@nekevss (Member) left a comment

I have another small batch of comments / feedback.

let size = mem::size_of::<PoolItem<gc::RootNode<'id, T>>>();
let mut pool = self.root_pool.borrow_mut();
let slot_ptr = pool
.alloc_slot_raw(size)
Member

Why are we creating new methods on the pool rather than using the pre-existing `try_alloc`?


let mut pool = self.pool.borrow_mut();
let slot_ptr = pool
.alloc_slot_raw(size)
Member

Same question here: why would we be using a new `alloc_slot_raw` vs. the pre-existing `try_alloc`?

// Initialize the PoolItem<GcBox<T>> in place
core::ptr::write(
pool_item_ptr.as_ptr(),
PoolItem(GcBox {
Member

This should ideally be defined as a constructor on `GcBox` rather than created here.
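A minimal sketch of the suggestion, with placeholder field names rather than the PR's exact layout: the header initialization moves into a `GcBox::new` constructor, and the allocation site only calls `ptr::write(ptr, PoolItem(GcBox::new(...)))`.

```rust
use std::cell::Cell;

// Placeholder layout for illustration, not the PR's exact fields.
struct GcBox<T> {
    marked: Cell<bool>,
    alloc_id: u64,
    value: T,
}

impl<T> GcBox<T> {
    /// Constructor owning the header setup, instead of assembling
    /// the struct literal at the allocation call site.
    fn new(value: T, alloc_id: u64) -> Self {
        GcBox {
            marked: Cell::new(false),
            alloc_id,
            value,
        }
    }
}
```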


self.generic_alloc_id.set(alloc_id.wrapping_add(1));

unsafe fn trace_value<T: trace::Trace>(
Member

This definition can be moved to the `gc_box` module.

///
/// The `'a` lifetime ties the color to the collection cycle,
/// preventing it from being stored or escaping the collector.
pub struct TraceColor<'a> {
Member

This change just kept the tracer implementation and renamed it to `TraceColor`. Is there a reason for the name change vs. switching to the full tri-color approach?

Is the worklist somehow necessary for the current implementation?
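For contrast, a toy sketch of the full tri-color scheme the comment refers to (indices stand in for GC pointers; `edges[i]` lists the objects reachable from object `i`). Objects go white → gray → black, and the worklist holds exactly the gray set, which is where a worklist naturally earns its keep.

```rust
// Tri-color invariant: white = undiscovered, gray = discovered but
// not yet scanned, black = fully scanned.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Color {
    White,
    Gray,
    Black,
}

fn mark(colors: &mut [Color], edges: &[Vec<usize>], roots: &[usize]) {
    let mut worklist = Vec::new();
    for &root in roots {
        if colors[root] == Color::White {
            colors[root] = Color::Gray; // discovered, pending scan
            worklist.push(root);
        }
    }
    while let Some(obj) = worklist.pop() {
        for &child in &edges[obj] {
            if colors[child] == Color::White {
                colors[child] = Color::Gray;
                worklist.push(child);
            }
        }
        colors[obj] = Color::Black; // all children scanned
    }
    // Anything still white afterwards is unreachable and can be swept.
}
```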

@shruti2522 shruti2522 force-pushed the prototype-impl branch 3 times, most recently from e270d95 to 08de4ff Compare April 29, 2026 14:24
@nekevss (Member) left a comment

Hopefully the last round of general review. 😅 There are several things that I think I'd prefer to address in this first pass at the API.


/// A weak reference to a GC managed value
pub struct WeakGc<'id, T: Trace + ?Sized> {
pub(crate) ptr: NonNull<PoolItem<GcBox<T>>>,
Member

Use `PoolPointer`.

use core::ptr::NonNull;

pub struct Ephemeron<'id, K: Trace, V: Trace> {
pub(crate) key_ptr: NonNull<PoolItem<GcBox<K>>>,
Member

issue: `key_ptr` here should be an `Option<pointer>`.

Ephemeron keys are removed and set to `None` when the underlying pointer is invalid (i.e. null).
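A minimal sketch of that shape, with placeholder types rather than the PR's actual `PoolItem<GcBox<K>>` layout: the collector clears the key to `None` when the key dies, so liveness is a plain `is_some()` check instead of comparing alloc ids against a possibly stale pointer.

```rust
use std::ptr::NonNull;

// Placeholder layout for illustration; the real key would point at
// the pool slot, not directly at K.
struct Ephemeron<K, V> {
    key: Option<NonNull<K>>,
    value: V,
}

impl<K, V> Ephemeron<K, V> {
    /// Called by the collector once the key is found unreachable.
    fn clear_key(&mut self) {
        self.key = None;
    }

    /// Liveness is now a simple Option check.
    fn key_is_live(&self) -> bool {
        self.key.is_some()
    }
}
```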

/// A transient pointer to a GC-managed value.
#[derive(Debug)]
pub struct Gc<'gc, T: Trace + ?Sized + 'gc> {
pub(crate) ptr: NonNull<PoolItem<GcBox<T>>>,
Member

Here and elsewhere: is there a reason for using `NonNull<PoolItem<GcBox<T>>>`? If there's not a particular reason, then we should be using `PoolPointer`.

// the `Collector` instance
pub(crate) pool: RefCell<PoolAllocator<'static>>,
/// Dedicated pool for RootNode allocations
pub(crate) root_pool: RefCell<PoolAllocator<'static>>,
Member

thought: hmmmm, intriguing. I'm not entirely convinced immediately that this is needed, but we can leave it as is currently.

for link_ptr in self.sentinel.iter() {
unsafe {
// Read the `gc_ptr` field using the stable offset
let gc_ptr_ptr = link_ptr
Member

Instead of doing pointer offsets here, we should be able to cast to an `ErasedRootNode` type and use that specific type to begin the marking, correct? My initial reaction is that this may be a better approach, as it allows us to be more explicit and not deal as much in `NonNull<u8>` land.

Is there any benefit to manually calculating the pointer and dealing with `NonNull<u8>`s over explicit types?

drop_and_free::<T>,
alloc_id,
))
.expect("branded GC: pool allocation failed");
Member

We should be returning this error rather than panicking.

for entry in self.ephemerons.borrow().iter() {
use crate::alloc::mempool3::PoolItem;
unsafe {
let key_box = entry.key_ptr.cast::<PoolItem<GcBox<()>>>();
Member

thought: If we are always casting to this value, then we should probably just store the pointer as that type.

unsafe {
let key_box = entry.key_ptr.cast::<PoolItem<GcBox<()>>>();
// Skip entries invalidated by a previous collection cycle.
if (*key_box.as_ptr()).0.alloc_id != entry.key_alloc_id {
Member

This should probably be handled by the ephemeron key being an `Option`.
