Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Python: Dev

redesigning the extension module initialisation protocol (was: Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others))

 

 

Python dev RSS feed   Index | Next | Previous | View Threaded


stefan_ml at behnel

Aug 11, 2013, 6:52 AM

Post #1 of 3 (15 views)
Permalink
redesigning the extension module initialisation protocol (was: Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others))

Nick Coghlan, 11.08.2013 15:19:
> On 11 Aug 2013 09:02, "Stefan Behnel" wrote:
>>> BTW, this already suggests a simple module initialisation interface. The
>>> extension module would expose a function that returns a module type, and
>>> the loader/importer would then simply instantiate that. Nothing else is
>>> needed.
>>
>> Actually, strike the word "module type" and replace it with "type". Is
>> there really a reason why Python needs a module type at all? I mean, you
>> can stick arbitrary objects in sys.modules, so why not allow arbitrary
>> types to be returned by the module creation function?
>
> That's exactly what I have in mind, but the way extension module imports
> currently work means we can't easily do it just yet. Fortunately, importlib
> means we now have some hope of fixing that :)

Well, what do we need? We don't need to care about existing code, as long
as the current scheme is only deprecated and not deleted. That won't happen
before Py4 anyway. New code would simply export a different symbol when
compiling for a CPython that supports it, which points to the function that
returns the type.

Then, there's already the PyType_Copy() function, which can be used to
create a heap type from a statically defined type. So extension modules can
simply define an (arbitrary) additional type in any way they see fit, copy
it to the heap, and return it.

Next, we need to define a signature for the type's __init__() method. This
can be done in a future proof way by allowing arbitrary keyword arguments
to be added, i.e. such a type must have a signature like

def __init__(self, currently, used, pos, args, **kwargs)

and simply ignore kwargs for now.

Actually, we may get away with not passing all too many arguments here if
we allow the importer to add stuff to the type's dict in between,
specifically __file__, __path__ and friends, so that they are available
before the type gets instantiated. Not sure if this is a good idea, but it
would at least relieve the user from having to copy these things over from
some kind of context or whatever we might want to pass in.

Alternatively, we could split the instantiation up between tp_new() and
tp_init(), and let the importer set stuff on the instance dict in between
the two. But given that this context won't actually change once the shared
library is loaded, the only reason to prefer modifying the instance instead
of the type would be to avoid requiring a tp_dict for the type. Open for
discussion, I guess.

Did I forget anything? Sounds simple enough to me so far.

Stefan


_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


eliben at gmail

Aug 11, 2013, 10:43 AM

Post #2 of 3 (12 views)
Permalink
Re: redesigning the extension module initialisation protocol (was: Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)) [In reply to]

On Sun, Aug 11, 2013 at 6:52 AM, Stefan Behnel <stefan_ml [at] behnel> wrote:

> Nick Coghlan, 11.08.2013 15:19:
> > On 11 Aug 2013 09:02, "Stefan Behnel" wrote:
> >>> BTW, this already suggests a simple module initialisation interface.
> The
> >>> extension module would expose a function that returns a module type,
> and
> >>> the loader/importer would then simply instantiate that. Nothing else is
> >>> needed.
> >>
> >> Actually, strike the word "module type" and replace it with "type". Is
> >> there really a reason why Python needs a module type at all? I mean, you
> >> can stick arbitrary objects in sys.modules, so why not allow arbitrary
> >> types to be returned by the module creation function?
> >
> > That's exactly what I have in mind, but the way extension module imports
> > currently work means we can't easily do it just yet. Fortunately,
> importlib
> > means we now have some hope of fixing that :)
>
> Well, what do we need? We don't need to care about existing code, as long
> as the current scheme is only deprecated and not deleted. That won't happen
> before Py4 anyway. New code would simply export a different symbol when
> compiling for a CPython that supports it, which points to the function that
> returns the type.
>
> Then, there's already the PyType_Copy() function, which can be used to
> create a heap type from a statically defined type. So extension modules can
> simply define an (arbitrary) additional type in any way they see fit, copy
> it to the heap, and return it.
>
> Next, we need to define a signature for the type's __init__() method. This
> can be done in a future proof way by allowing arbitrary keyword arguments
> to be added, i.e. such a type must have a signature like
>
> def __init__(self, currently, used, pos, args, **kwargs)
>
> and simply ignore kwargs for now.
>
> Actually, we may get away with not passing all too many arguments here if
> we allow the importer to add stuff to the type's dict in between,
> specifically __file__, __path__ and friends, so that they are available
> before the type gets instantiated. Not sure if this is a good idea, but it
> would at least relieve the user from having to copy these things over from
> some kind of context or whatever we might want to pass in.
>
> Alternatively, we could split the instantiation up between tp_new() and
> tp_init(), and let the importer set stuff on the instance dict in between
> the two. But given that this context won't actually change once the shared
> library is loaded, the only reason to prefer modifying the instance instead
> of the type would be to avoid requiring a tp_dict for the type. Open for
> discussion, I guess.
>
> Did I forget anything? Sounds simple enough to me so far.
>

Out of curiosity - can we list actual use cases for this new design? The
previous thread, admittedly, deals with an isoteric corner-cases that comes
up in overly-clever tests. If we plan to serious consider these changes -
and this appears to be worth a PEP - we need a list of actual advantages
over the current approach. It's not that a more conceptually pure design is
an insufficient reason, IMHO, but it would be interesting to hear about
other implications.

Eli


ncoghlan at gmail

Aug 11, 2013, 3:41 PM

Post #3 of 3 (12 views)
Permalink
Re: redesigning the extension module initialisation protocol (was: Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)) [In reply to]

On 11 Aug 2013 09:55, "Stefan Behnel" <stefan_ml [at] behnel> wrote:
>
> Nick Coghlan, 11.08.2013 15:19:
> > On 11 Aug 2013 09:02, "Stefan Behnel" wrote:
> >>> BTW, this already suggests a simple module initialisation interface.
The
> >>> extension module would expose a function that returns a module type,
and
> >>> the loader/importer would then simply instantiate that. Nothing else
is
> >>> needed.
> >>
> >> Actually, strike the word "module type" and replace it with "type". Is
> >> there really a reason why Python needs a module type at all? I mean,
you
> >> can stick arbitrary objects in sys.modules, so why not allow arbitrary
> >> types to be returned by the module creation function?
> >
> > That's exactly what I have in mind, but the way extension module imports
> > currently work means we can't easily do it just yet. Fortunately,
importlib
> > means we now have some hope of fixing that :)
>
> Well, what do we need? We don't need to care about existing code, as long
> as the current scheme is only deprecated and not deleted. That won't
happen
> before Py4 anyway. New code would simply export a different symbol when
> compiling for a CPython that supports it, which points to the function
that
> returns the type.
>
> Then, there's already the PyType_Copy() function, which can be used to
> create a heap type from a statically defined type. So extension modules
can
> simply define an (arbitrary) additional type in any way they see fit, copy
> it to the heap, and return it.
>
> Next, we need to define a signature for the type's __init__() method.

We need the "ModuleSpec" object to pass here, which is what we're currently
working on in import-sig.

We're not going to define something specifically for C extensions when
other modules suffer related problems.

Cheers,
Nick.

This
> can be done in a future proof way by allowing arbitrary keyword arguments
> to be added, i.e. such a type must have a signature like
>
> def __init__(self, currently, used, pos, args, **kwargs)
>
> and simply ignore kwargs for now.
>
> Actually, we may get away with not passing all too many arguments here if
> we allow the importer to add stuff to the type's dict in between,
> specifically __file__, __path__ and friends, so that they are available
> before the type gets instantiated. Not sure if this is a good idea, but it
> would at least relieve the user from having to copy these things over from
> some kind of context or whatever we might want to pass in.
>
> Alternatively, we could split the instantiation up between tp_new() and
> tp_init(), and let the importer set stuff on the instance dict in between
> the two. But given that this context won't actually change once the shared
> library is loaded, the only reason to prefer modifying the instance
instead
> of the type would be to avoid requiring a tp_dict for the type. Open for
> discussion, I guess.
>
> Did I forget anything? Sounds simple enough to me so far.
>
> Stefan
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev [at] python
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com

Python dev RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.