
Progress on racket-vulkan C Type Declarations

Earlier I announced work generating FFI bindings for Vulkan in Racket from XML and C. You can see the code generated at time of writing here.

Here I share my notes on generating type declarations, and how the project will evolve based on what I’ve learned so far. I offer this article to help other developers understand implementation challenges in bringing Vulkan to a new language, and to reflect on how that relates to Racket in particular.

I’ll lead by admitting that I am NOT a Vulkan expert, and this is the first serious effort I’ve made to generate a complete set of bindings against a C library. I consider this a good technical challenge that is part of my education, so please report mistakes at moc.drarblahegegas@egas.

How I Welcomed Pain

The XML representing the API registry, vk.xml, is a testament to, and a test of, the maintainers’ determination. While it borrows much from the OpenGL spec, there is enough organic growth to call vk.xml its own beast.

The Racket user list and a close friend both suggested just parsing the C and generating bindings that way. I opted not to take this natural shortcut because the XML can help me populate contracts/types, and help me filter to subsets of the API depending on the host platform.

In taking on this project I want to generate as much helpful code as possible using only the Racket ecosystem. This means I am not using Khronos’ own Python helper scripts, and am entering the vk.xml dragon’s den with nothing but a sword and shield engraved with parentheses.

What this says about my intellect is an exercise for the reader.

vk.xml Challenges

vk.xml is a machine readable specification of the Vulkan API, inclusive of released versions, supported platforms, and extensions.

Despite what you might expect, vk.xml is optimized for humans. This makes sense because humans are expected to manually edit the spec when making API changes, and strict structural rules would make the XML difficult to edit, let alone refactor if the structure turned out to be infeasible.

This does of course mean that any program parsing the XML file has to stay aware of the semantic context of any one element. A fatal assumption of mine was that the type-declaring elements had enough data to piece together a complete declaration in a new language.

If you are trying to interface with Vulkan from a new language, please take a minute to review these pain points.

All of the examples I cite here are from the version of vk.xml written on October 13th, 2019.

I will also refer to the Registry Guide by section.

Ordering FFI Declarations

Let’s start with my “fatal assumption.”

vk.xml has a plethora of <type> elements describing the data types in Vulkan.

From §9.2:

Zero or more type and comment tags, in arbitrary order (though they are typically ordered by putting dependencies of other types earlier in the list).

Mmkay. Well, I want to guarantee an order of declarations. Racket won’t like it if I refer to an unbound identifier as a result of blind trust in the order of <type> elements.

Ah, §10.1 speaks of a requires attribute!

requires - optional. Another type name this type requires to complete its definition.

Ok, cool. I’ll build a directed graph using this attribute and perform a topological sort to order my declarations.

Aaaaannd it didn’t work. The VkPhysicalDeviceLimits struct is defined on line 1449 in vk.xml, but used on line 691. There was no “requires” attribute naming VkPhysicalDeviceLimits.

If I wanted to do the graph approach, I’d have to draw edges based on direct name reference. Due to the way vk.xml mixes XML and C, this was past my pain threshold. I instead opted to “forward declare” all structs and unions since Racket lets me refer to custom types symbolically:

(define _VkMemoryAllocateInfo 'VkMemoryAllocateInfo)

(UPDATE Oct 19th 2019: This actually did not work, because Racket started complaining about me redefining identifiers. I ended up ignoring my aforementioned pain threshold and writing a topological sort of all relevant types.)
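For anyone attempting the same thing, here is a minimal sketch of a topological sort over direct name references. It is not the project's actual code: the deps table is hand-written for illustration (a real generator would build it by scanning vk.xml), and declaration-order is a hypothetical name. The one Vulkan fact it leans on is that VkPhysicalDeviceProperties embeds a VkPhysicalDeviceLimits member.

```racket
#lang racket/base

;; deps maps each type name to the names it refers to directly.
;; These entries are written by hand for illustration only.
(define deps
  (hash "VkPhysicalDeviceProperties" '("VkPhysicalDeviceLimits" "uint32_t")
        "VkPhysicalDeviceLimits"     '("uint32_t")
        "uint32_t"                   '()))

;; Depth-first topological sort: a name is emitted only after
;; every name it depends on has been emitted.
(define (declaration-order deps)
  (define state (make-hash))   ; name -> 'active or 'done
  (define order '())           ; built in reverse
  (define (visit! name)
    (case (hash-ref state name #f)
      [(done) (void)]
      [(active) (error 'declaration-order "cycle at ~a" name)]
      [else
       (hash-set! state name 'active)
       (for ([dep (in-list (hash-ref deps name '()))])
         (visit! dep))
       (hash-set! state name 'done)
       (set! order (cons name order))]))
  ;; Sort the keys so the result does not depend on hash ordering.
  (for ([name (in-list (sort (hash-keys deps) string<?))])
    (visit! name))
  (reverse order))

(declaration-order deps)
```

With this table, uint32_t comes out first and VkPhysicalDeviceProperties last. The sketch treats any cycle as an error, so self-referential pointers would have to be left out of the table or handled separately.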

I wish someone had told me that requires did not mean what I thought it meant. That could have saved me a few hours.

You could argue that my assumption did hold since I could still discover enough information to complete a declaration. You would be right, but only in this case. I assumed everything I needed was readily parseable as XML.

But this isn’t only XML.

C Macros Are Mixed with Platform Types

<type> elements are categorized with a category attribute.

One of those categories is "define". This is one of the categories that §10.2.3 simply declares as holding “legal C code” in the text of the element.

If the category attribute is one of basetype, bitmask, define, funcpointer, group, handle or include, or is not specified, type contains text which is legal C code for a type declaration.

Here are two cherry-picked "define" types near the top of the XML:

<type category="define">
 #define <name>VK_NULL_HANDLE</name> 0
</type>
<type category="define">
 struct <name>ANativeWindow</name>;
</type>

So how can you tell, in general, which <type> is a C macro and which is an actual C type declaration?

Without parsing the C fragments, there is no reliable solution. This is one of the big reasons why some would give up on the XML entirely and just parse Vulkan headers directly.

Like many others before me, I opted for string analysis to avoid roping in clang. I filtered out what looked like macros and moved on.
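To give a feel for what that string analysis can look like, here is a minimal sketch. It assumes the text of each element has already been flattened into a string (tags removed), and looks-like-macro? is a hypothetical helper, not the filter the project actually uses. It only recognizes #define at the start of a line, so it is a heuristic and nothing more.

```racket
#lang racket/base
(require racket/string)

;; Flattened text of the two "define" elements quoted above.
(define null-handle-text   "\n #define VK_NULL_HANDLE 0\n")
(define native-window-text "\n struct ANativeWindow;\n")

;; Heuristic: a fragment is a macro if any of its lines begins
;; (after optional whitespace) with a #define directive.
(define (looks-like-macro? c-text)
  (for/or ([line (in-list (string-split c-text "\n"))])
    (regexp-match? #px"^\\s*#\\s*define\\b" line)))

(looks-like-macro? null-handle-text)
(looks-like-macro? native-window-text)
```

A line such as //#define FOO would not count as a macro here because the line does not start with #, which is the behavior I would want for commented-out definitions.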

Recursive Type Definitions and Pointer Variance

Some Vulkan struct types have recursive definitions.

<type category="struct" name="VkBaseOutStructure">
  <member>
    <type>VkStructureType</type>
    <name>sType</name>
  </member>
  <member>
    struct <type>VkBaseOutStructure</type>* 
    <name>pNext</name>
  </member>
</type>

Racket’s FFI library chokes on this if you try to generate a struct type definition using the name twice. Going by the user list, the solution is to use a symbol to represent the type in advance of the signature.

But if the goal is to be specific in your declarations, consider structextends from §10.1:

structextends - only applicable if category is struct or union. This is a comma-separated list of structures whose pNext can include this type.

When structextends exists, it appears that pNext is of type void* in practice.

So I figure if void* is good enough for Vulkan, it’s good enough for me. I ignored structextends for now, and will consider it again when generating a helper layer.
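A minimal sketch of what that choice looks like with Racket's FFI follows. The assumptions are mine: VkStructureType is stood in for by a plain _int (C enums are usually int-sized), and the generated code in the project may be shaped differently.

```racket
#lang racket/base
(require ffi/unsafe)

;; Assumption: a C enum is represented as a plain int.
(define _VkStructureType _int)

;; pNext is declared as an untyped pointer (void*), so the struct
;; never has to mention its own name inside its definition.
(define-cstruct _VkBaseOutStructure
  ([sType _VkStructureType]
   [pNext _pointer]))

;; #f is how Racket's FFI spells the NULL pointer.
(define s (make-VkBaseOutStructure 0 #f))

(VkBaseOutStructure-sType s)
(VkBaseOutStructure-pNext s)
```

The trade-off is that nothing stops a caller from putting the wrong kind of pointer into pNext, which is where structextends could earn its keep later.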

Mixed-Mode XML/C

I brought it up in passing before, but the types that simply blast C code into the element are full of corner cases. Just take a look at this function pointer declaration.

Unlike some other excerpts, I maintained the original spacing here. Forgive the horizontal scrolling.

<type category="funcpointer">typedef void (VKAPI_PTR *<name>PFN_vkInternalAllocationNotification</name>)(
<type>void</type>*                                       pUserData,
<type>size_t</type>                                      size,
<type>VkInternalAllocationType</type>                    allocationType,
<type>VkSystemAllocationScope</type>                     allocationScope);</type>

Cue internal screaming!

The return type is not in a tag, and the parameter types do not include pointer information. That’s all in the text. No single regular expression will handle this gracefully over time, since some of the parameters in other elements have const peppered about.
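To make the problem concrete, here is a minimal sketch that pulls the return type, name, and parameters out of this one excerpt. It assumes the tags have already been flattened out of the text, and parse-param and parse-funcpointer are hypothetical names. It tokenizes each parameter instead of matching types with a pattern, so that const and * stay attached to the type, but it is tuned to the shape of this excerpt and will not cover every declaration in vk.xml.

```racket
#lang racket/base
(require racket/string)

;; The element's text with the <type> and <name> tags flattened away.
(define decl
  (string-append
   "typedef void (VKAPI_PTR *PFN_vkInternalAllocationNotification)(\n"
   "void*                                       pUserData,\n"
   "size_t                                      size,\n"
   "VkInternalAllocationType                    allocationType,\n"
   "VkSystemAllocationScope                     allocationScope);"))

;; "const void* pUserData" -> '("const void *" "pUserData").
;; The last token is taken as the name; the rest is the type.
(define (parse-param text)
  (define tokens (reverse (string-split (string-replace text "*" " * "))))
  (list (string-join (reverse (cdr tokens)) " ") (car tokens)))

;; Returns (list return-type name params).
(define (parse-funcpointer text)
  (define flat (string-normalize-spaces text)) ; collapse newlines and runs of spaces
  (define m
    (regexp-match
     #px"^typedef (.+?) ?\\( ?VKAPI_PTR ?\\* ?(\\w+) ?\\) ?\\((.*)\\) ?;$"
     flat))
  (unless m
    (error 'parse-funcpointer "unrecognized declaration: ~a" flat))
  (define params-text (string-trim (cadddr m)))
  (list (cadr m)
        (caddr m)
        (if (equal? params-text "void")
            '()
            (map parse-param (string-split params-text ",")))))

(parse-funcpointer decl)
```

For this excerpt the first parameter comes back with the type "void *", so the pointer information that lives outside the tags is preserved.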

Racket’s FFI Library

Everyone I’ve seen comment on Racket’s FFI library says it’s awesome, and they’re right. The documentation is clear, and building a type declaration feels intuitive.

I suspect there’s ignorance on my part when dealing with the below speed bumps, so I welcome comments.

Unions

Racket’s FFI includes a construct for unions, but for reasons I do not yet understand you cannot refer to union members by name. There’s a (define-cstruct) form that lets you allocate and jimmy with a C struct as if it were a Racket struct.

But with unions, everything is positional. You declare a union type and fetch the intended values via an ordinal.

Here’s an example from the docs:

(require ffi/unsafe)

(define a-union-type
  (_union (_list-struct _int _int)
          (_list-struct _double _double)))

(define a-union-val
  (cast (list 3.14 2.71)
        (_list-struct _double _double)
        a-union-type))

(union-ref a-union-val 1) ; '(3.14 2.71)

So given this Vulkan union (Yes, the [4]s are in the text):

<type category="union" name="VkClearColorValue">
  <member>
    <type>float</type>
    <name>float32</name>[4]
  </member>
  <member>
    <type>int32_t</type>
    <name>int32</name>[4]
  </member>
  <member>
    <type>uint32_t</type>
    <name>uint32</name>[4]
  </member>
</type>

I can’t say (_VkClearColorValue-int32 val), I have to say (union-ref val 1).

It looks like I’d have to generate helper procedures to map member names to ordinals, unless the maintainers are open to me contributing a (define-cunion-type) form. The docs suggest that unions are just treated as structs, so does that mean (define-cstruct) is the intended substitute?
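A minimal sketch of such a helper, for VkClearColorValue only, might look like this. The names VkClearColorValue-ordinals and VkClearColorValue-ref are hypothetical, and I am assuming that _array/list is an acceptable way to model the [4] members so that values come back as Racket lists.

```racket
#lang racket/base
(require ffi/unsafe)

;; Each member is a 4-element C array, read back as a Racket list.
(define _VkClearColorValue
  (_union (_array/list _float 4)
          (_array/list _int32 4)
          (_array/list _uint32 4)))

;; Member names in declaration order, mapped to union ordinals.
(define VkClearColorValue-ordinals
  (hash 'float32 0 'int32 1 'uint32 2))

;; Look a member up by name instead of by position.
(define (VkClearColorValue-ref val member)
  (union-ref val (hash-ref VkClearColorValue-ordinals member)))

;; Build a value the same way the docs example does, via cast.
(define val
  (cast (list 1 2 3 4) (_array/list _int32 4) _VkClearColorValue))

(VkClearColorValue-ref val 'int32)
```

A generator could emit one ordinal table per union straight from the order of the member elements in vk.xml.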

Opinionated Base Type Mapping

Racket has bindings for basic or common C types like _void and _stdbool. Their names do not exactly match the type names as they appear in C. For example, size_t becomes _size. In fact, Racket seems content to drop the _t wherever it would appear.

Racket also has a signed byte C type binding, _int8. _sbyte is an alias for it, but there is no equivalent for char, even though Racket does have a _wchar.

So when Vulkan declared that it uses char, I ended up adding it myself. And I had to tack on the missing _t suffixes again.
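The additions amount to a handful of aliases along these lines. This is a sketch rather than the exact list in the project, and treating char as a signed 8-bit integer is an assumption on my part, since C leaves the signedness of char up to the platform.

```racket
#lang racket/base
(require ffi/unsafe)

;; Assumption: char is treated as a signed 8-bit integer.
(define _char     _int8)

;; Put the _t suffixes back so generated code can use the
;; names exactly as they appear in vk.xml.
(define _size_t   _size)
(define _int32_t  _int32)
(define _int64_t  _int64)
(define _uint8_t  _uint8)
(define _uint16_t _uint16)
(define _uint32_t _uint32)
(define _uint64_t _uint64)

(ctype-sizeof _uint32_t)
```

With aliases like these in place, a generator can prepend an underscore to a C type name and get a bound identifier without a lookup table.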

I’m not sure why this is the intended state. Hopefully a maintainer can weigh in.

Conclusion

The project is still ongoing. The data declarations are nearly done. They are only missing enumerants whose values appear outside of C enums.

Following this I will fetch the actual functions from libvulkan and provide them as Racket procedures.


† To the maintainers of dynamic-ffi: if you ever provide a Racket interface into your clang plugin to parse and visit C fragments, you’ll be my hero.