Cuite design (1/?): QObject in OCaml

2020-05-10 18:39:59+02:00

Two years ago, I worked on "Cuite", an OCaml binding to Qt5. It stalled when I got to the point where all core concepts were mapped to OCaml. The remaining work was very repetitive: go through the huge hierarchy of Qt classes and bind each method, accommodating the occasional ad-hoc behavior.

There are also some shortcomings to revisit in my approach:

• The mapping between C++ and OCaml types is quite ad-hoc: there is no principled way to handle all the variations (some types behave like values, some like references, some exist as part of a graph, some make sense on their own, etc.).
• The runtime support library relies heavily on internals of the OCaml runtime, and would benefit from a cleanup.
• The lack of ad-hoc polymorphism means that C++ method invocation has to be very explicit (e.g. foo->setBar(baz) translates to Foo.setBar foo baz). Also, the huge number of methods sometimes significantly slows down compilation.

This post is the first of a series in which I explain the thoughts that went into the design of the library and how these issues are addressed.

Exposing QObjects

QObject is the root of the main class hierarchy in Qt. It is used everywhere: all widgets are QObject instances.

The binding needs to expose QObject classes, instances and functions to OCaml programs. In this post we will take a look at memory management: how QObjects are allocated and released when manipulated from OCaml.

There are a few properties that I wanted the binding to preserve. This is subjective: another binding might look for other properties. Here is what I was looking for:

• Runtime safety. Incorrect use of the API should translate to an exception, not to a segmentation fault or memory corruption.
• Automatic memory management with opt-out. Most of the time, programmers should not have to worry about memory management. Occasionally, they might want to make sure memory is released on time. For instance, when allocating large objects such as a picture, it is nice to release memory as early as possible.
• No arbitrary restrictions or ad-hoc rules for objects (unless there is no alternative). Programmers should not have to worry about cyclic references or manage certain objects differently (except maybe for performance reasons).
• QObjects should interact well with other OCaml features. Physical equality, ordering, and hashing should make sense.

I ended up with a scheme that provides all these properties to the binding. The rest of the post focuses on memory management for QObjects.

QObject values

Each QObject instance visible from the OCaml program is mapped to a unique value. This graph shows all the infrastructure involved.

An instance QObject *obj is made accessible from OCaml code via the mlproxy value. In other words, we want the functions:

value Val_QObject(QObject *obj);
QObject *QObject_val(value v);


QObject_val: from value to QObject

The OCaml block mlproxy contains a pointer to an object cproxy in the C++ heap. In turn cproxy has a pointer to obj, the QObject.

To get to the QObject from the OCaml value, we just need to follow two pointers.

Handling QObject destruction

We need to keep track of when the QObject is deleted: the OCaml value might still be reachable and we don't want to accidentally dereference the QObject past that point.

This is not too difficult. We can either:

• Use a QPointer<QObject> instead of a QObject*: Qt will clear the QPointer on object deletion.
• Listen on the destroyed signal of the QObject.
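The second option, combined with the two-pointer chain from the previous section, can be sketched without Qt. This is only an illustration under assumptions: Object is a hypothetical stand-in for QObject that models nothing but the destroyed notification, MLProxy stands in for the OCaml block, and attach and object_of are made-up names (object_of plays the role of QObject_val). The point is that a stale value leads to an exception rather than a dangling dereference.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for QObject: it only models the "destroyed" signal.
struct Object {
    std::vector<std::function<void()>> on_destroyed;
    ~Object() {
        // Notify every listener before the object goes away.
        for (auto &callback : on_destroyed) callback();
    }
};

// C++-side proxy: `obj` is reset to nullptr once the object is destroyed.
struct CProxy {
    Object *obj = nullptr;
    int weakid = -1;
};

// Stand-in for the OCaml block, which only stores a pointer to the CProxy.
struct MLProxy {
    CProxy *cproxy = nullptr;
};

// Point the proxy at the object and listen for its destruction.
// Assumption of this sketch: the proxy outlives the object.
inline void attach(CProxy &proxy, Object &object) {
    assert(proxy.obj == nullptr);
    proxy.obj = &object;
    object.on_destroyed.push_back([&proxy] { proxy.obj = nullptr; });
}

// Analogue of QObject_val: follow two pointers, and fail with an
// exception instead of returning a pointer to a deleted object.
inline Object *object_of(const MLProxy &ml) {
    Object *obj = ml.cproxy->obj;
    if (obj == nullptr) throw std::runtime_error("object was already deleted");
    return obj;
}
```

With the first option, the obj field would be a QPointer<QObject> and the explicit listener would not be needed, since Qt clears the QPointer itself.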

From QObject to CProxy

The Val_QObject function will be invoked many times, and we don't want to create a new proxy each time. The ProxyTable remembers the CProxy associated with a QObject. It is a hash table indexed by object addresses. It is populated by the helper function:

static CProxy *QObject_proxy(QObject *obj);


QObject_proxy starts by looking up the object in the hash table. If a valid proxy is found, it is returned. Otherwise, the object has not yet been exported to the OCaml world: we allocate and initialize a new CProxy and add it to the table. The weakid field is initialized to -1.
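Here is a minimal sketch of that lookup-or-create step in plain C++. It is not the actual Cuite code: Object is a hypothetical stand-in for QObject, proxy_table and proxy_of are made-up names (proxy_of plays the role of QObject_proxy), and std::unordered_map is just one possible hash table.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>

struct Object {};  // hypothetical stand-in for QObject

struct CProxy {
    Object *obj;
    int weakid;  // index in the OCaml weak table, -1 while unassigned
};

// Hash table indexed by object address; it owns the proxies.
using ProxyTable = std::unordered_map<const Object *, std::unique_ptr<CProxy>>;

inline ProxyTable &proxy_table() {
    static ProxyTable table;
    return table;
}

// Analogue of QObject_proxy: return the existing proxy for `obj`,
// or allocate, initialize, and register a new one.
inline CProxy *proxy_of(Object *obj) {
    assert(obj != nullptr);
    std::unique_ptr<CProxy> &slot = proxy_table()[obj];
    if (!slot) slot.reset(new CProxy{obj, -1});
    return slot.get();
}
```

The sketch leaves out the "valid" part of the lookup: since addresses can be reused after an object is deleted, a real table also has to recognize and replace entries whose object is gone.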

From CProxy to value: the weakid field

We have a CProxy, but not yet an OCaml value. The weakid field is an index into the WeakTable, a global OCaml table that weakly references MLProxy values:

• If the field is not -1, a cell is already allocated. We can look directly in the weak table.
• If the field is -1, we allocate and initialize a new MLProxy value that points to the CProxy and index it in the weak table.

This is done through a primitive exported by OCaml code, which also registers a finalizer to handle the cleanup of unreachable objects:

val finalize_and_index : ml_proxy -> int


Why go through the hoops of this weak table? Because C++ code needs to access the OCaml values but normal roots are strong references. That would prevent MLProxy values from being collectible by the GC.
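The weakid logic can be illustrated with an analogy in plain C++. To be clear about the assumptions: the real table lives on the OCaml side and is managed by the GC, whereas here std::shared_ptr stands in for "reachable from the OCaml program", std::weak_ptr for a cell of the weak table, and a custom deleter for the finalizer. The names weak_table and value_of are hypothetical, and resetting weakid to -1 in the "finalizer" is a choice of this sketch, not something the post specifies.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct CProxy {
    int weakid = -1;  // -1 means no cell is allocated in the weak table
};

// Stand-in for the OCaml MLProxy block.
struct MLProxy {
    CProxy *cproxy;
};

// Stand-in for the global weak table: its cells do not keep values alive.
inline std::vector<std::weak_ptr<MLProxy>> &weak_table() {
    static std::vector<std::weak_ptr<MLProxy>> table;
    return table;
}

// Analogue of the CProxy-to-value step.
// Assumption of this sketch: the CProxy outlives every value made from it.
inline std::shared_ptr<MLProxy> value_of(CProxy &proxy) {
    auto &table = weak_table();
    if (proxy.weakid != -1) {
        // A cell is already allocated: look directly in the weak table.
        std::shared_ptr<MLProxy> existing = table[proxy.weakid].lock();
        assert(existing);  // the "finalizer" below keeps weakid in sync
        return existing;
    }
    // No cell yet: allocate a value and register its "finalizer", which
    // runs once the value is no longer strongly referenced.
    CProxy *raw = &proxy;
    std::shared_ptr<MLProxy> fresh(new MLProxy{raw}, [raw](MLProxy *ml) {
        raw->weakid = -1;
        delete ml;
    });
    table.push_back(fresh);  // stored as a weak reference
    proxy.weakid = static_cast<int>(table.size() - 1);
    return fresh;
}
```

If the table held std::shared_ptr instead, the "finalizer" would never run: that is the problem with strong roots described above.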

QObject_val/Val_QObject: ✔️

We now have both functions:

value Val_QObject(QObject *obj);
QObject *QObject_val(value v);


They:

• can convert from value to QObject and from QObject to value
• safely handle explicit QObject deletion
• enable automatic deletion of unreachable objects