
marvin at rectangular
Oct 28, 2007, 7:18 PM
Post #4 of 14
Re: Towards a stable C API... via indirect dispatch
On Oct 28, 2007, at 9:05 AM, Aaron Crane wrote:

> Theoretically, object pointers (including void pointers) and function
> pointers are incommensurate according to the C standard -- you get
> undefined behaviour when you cast between them.

Ah, yes, I'd forgotten that.

Presently, the vtables are actual objects themselves, with a 'refcount'
member and the whole bit. Having them be objects makes it easier to
implement dynamic subclassing, a feature which is required by both
Schema and FieldSpec, and which may come in handy elsewhere in the
future.

The vtable objects belong to the class "KinoSearch::Util::VirtualTable".
Here's the definition for KinoSearch::Index::Term's vtable object:

    KINO_TERM_VTABLE KINO_TERM = {
        (KINO_OBJ_VTABLE*)&KINO_VIRTUALTABLE, /* vtable object's vtable */
        1,                                    /* refcount */
        (KINO_OBJ_VTABLE*)&KINO_OBJ,          /* parent */
        "KinoSearch::Index::Term",            /* class name */
        (kino_Obj_clone_t)kino_Term_clone,
        (kino_Obj_destroy_t)kino_Term_destroy,
        (kino_Obj_equals_t)kino_Term_equals,
        (kino_Obj_hash_code_t)kino_Obj_hash_code,
        (kino_Obj_is_a_t)kino_Obj_is_a,
        (kino_Obj_to_string_t)kino_Term_to_string,
        (kino_Obj_serialize_t)kino_Term_serialize,
        (kino_Term_get_field_t)kino_Term_get_field,
        (kino_Term_get_text_t)kino_Term_get_text,
        (kino_Term_copy_t)kino_Term_copy
    };

The first four member variables aren't function pointers, and I'd kinda
sorta been hoping to sneak them into the array somehow. ;) A fifth
member var will actually be needed as well: 'size' (or something like
that), describing the size of the vtable either in bytes or in array
members.

One approach is to keep the vtables as structs, with the last member a
"flexible array" of function pointers:

    typedef struct kino_VTable {
        KINO_OBJ_VTABLE *_;
        chy_u32_t        refcount;
        KINO_OBJ_VTABLE *parent;
        const char      *class_name;
        size_t           size;
        kino_method_t    methods[];
    } kino_VTable;

Flexible arrays are C99, but you can get away with them on C89 if you
declare them to be at least length 1.

    kino_method_t methods[1];

You then take advantage of C's lack of bounds checking to malloc()
enough memory for however many elements you need. :) It's a hack, but
widely portable -- Perl's regex engine depends on it, for example.

The downside of having the vtable be a struct rather than an array is
that it adds an extra addition op to the process of finding the right
function pointer.

    method_OFFSET * sizeof(kino_method_t)
    method_OFFSET * sizeof(kino_method_t) + FIXED_OFFSET

Here's some AT&T assembler, for code implementing the array technique:

    # %eax register holds method_OFFSET
    # %edx register holds address of "methods" array
    movl (%edx,%eax,4), %eax

Here's assembler for code using a vtable struct containing a "methods"
array:

    # %eax register holds method_OFFSET
    # %edx register holds vtable struct pointer
    movl 20(%edx,%eax,4), %eax    # <----------- NOTE extra "20"

(To see the whole context, view the attached file "need_meth.s", which
was generated from the attached file "need_meth.c" using the command
"gcc -S -Wall -Os need_meth.c" on an x86 Linux box.)

I'm not sure how much of a penalty you pay for the extra addition op --
only a benchmark would tell -- but I'm reasonably sure it doesn't help
matters. :)

It seems to me that the only way to get away with using the array
rather than the struct containing the array involves some nasty casting
hacks. Worth it, y'think?

>> Say we remove the Kino_Term_Destroy method... then this code
>> will crash at run-time, because the kino_Term_destroy_OFFSET
>> symbol cannot be resolved:
>>
>>     destroy_meth = self->_[kino_Term_destroy_OFFSET];
>>
>> Of course a run-time crash would be bad -- but that just means that
>> we can't redact public methods -- which we wouldn't be doing anyway.
>
> More specifically, the failure would be at link-time, right? Unless
> I'm misunderstanding, code using the new macro will contain a
> reference to the kino_Term_destroy_OFFSET symbol, so the linker should
> fail when trying to resolve uses of that symbol in callers. Of course,
> assuming that most uses of the Kinosearch code rely on a dynamically
> loaded KinoSearch.so (or local equivalent), that turns out to be
> roughly the same thing as run-time anyway.

Exactly.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
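
For readers who want to try the struct-with-trailing-array layout discussed in this message, here is a minimal sketch using the C99 flexible array member. It is an illustration under stated assumptions, not KinoSearch's actual code: the names VTable, method_t, get_str_t, demo_get_field, demo_get_text, demo_vtable_new, call_str_method and the two offsets are all hypothetical, and the header is reduced to two fields. Methods are stored as a generic function-pointer type and cast back to their real type before the call, which sidesteps the object-pointer/function-pointer conversion raised in the quoted text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Generic function-pointer type (hypothetical name). Converting one
 * function-pointer type to another and back is well-defined in C;
 * converting a function pointer to void* is not. */
typedef void (*method_t)(void);

/* Hypothetical, simplified vtable: fixed header plus trailing methods. */
typedef struct VTable {
    const char *class_name;
    size_t      num_methods;
    method_t    methods[];      /* C99 flexible array member */
} VTable;

/* Hypothetical method offsets and method signature for this sketch. */
enum { GET_FIELD_OFFSET, GET_TEXT_OFFSET, NUM_METHODS };
typedef const char *(*get_str_t)(void);

static const char *demo_get_field(void) { return "content"; }
static const char *demo_get_text(void)  { return "foo"; }

/* Allocate the header and num_methods slots with a single malloc(). */
static VTable *VTable_new(const char *class_name, size_t num_methods)
{
    VTable *vt = malloc(sizeof(VTable) + num_methods * sizeof(method_t));
    size_t i;
    if (vt == NULL) { return NULL; }
    vt->class_name  = class_name;
    vt->num_methods = num_methods;
    for (i = 0; i < num_methods; i++) { vt->methods[i] = NULL; }
    return vt;
}

/* Build the demo vtable with both slots filled in. */
static VTable *demo_vtable_new(void)
{
    VTable *vt = VTable_new("Demo::Term", NUM_METHODS);
    if (vt != NULL) {
        vt->methods[GET_FIELD_OFFSET] = (method_t)demo_get_field;
        vt->methods[GET_TEXT_OFFSET]  = (method_t)demo_get_text;
    }
    return vt;
}

/* Look up a slot, cast it back to its real type, then call it. */
static const char *call_str_method(const VTable *vt, size_t offset)
{
    assert(offset < vt->num_methods);
    return ((get_str_t)vt->methods[offset])();
}
```

A caller would obtain a table with demo_vtable_new(), dispatch with call_str_method(vt, GET_TEXT_OFFSET), and release it with free(vt). For the C89 variant mentioned in the message, declare the last member as methods[1] and allocate sizeof(VTable) + (num_methods - 1) * sizeof(method_t) for num_methods of at least 1; indexing past the declared length is the "hack" part and relies on the compiler tolerating it.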