Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Python: Dev

SIGCHECK() in longobject.c

 

 

Python dev RSS feed   Index | Next | Previous | View Threaded


solipsis at pitrou

Oct 18, 2009, 1:01 PM

Post #1 of 16 (956 views)
Permalink
SIGCHECK() in longobject.c

Hello,

In Objects/longobject.c, there's the SIGCHECK() macro which periodically checks
for signals when doing long integer computations (divisions, multiplications).
It does so by messing with the _Py_Ticker variable.

It was added in 1991 under the title "Many small changes", and I suppose it was
useful back then.

However, nowadays long objects are ridiculously fast, witness for example:

$ ./py3k/python -m timeit -s "a=eval('3'*10000+'5');b=eval('8'*6000+'7')"
"str(a//b)"
1000 loops, best of 3: 1.47 msec per loop

Can we remove this check, or are there people doing million-digits calculations
they want to interrupt using Control-C ?

Regards

Antoine.


_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


dickinsm at gmail

Oct 18, 2009, 1:08 PM

Post #2 of 16 (931 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

On Sun, Oct 18, 2009 at 9:01 PM, Antoine Pitrou <solipsis [at] pitrou> wrote:

> In Objects/longobject.c, there's the SIGCHECK() macro which periodically checks
> for signals when doing long integer computations (divisions, multiplications).
> It does so by messing with the _Py_Ticker variable.
>
> It was added in 1991 under the title "Many small changes", and I suppose it was
> useful back then.
>
> However, nowadays long objects are ridiculously fast, witness for example:
>
> $ ./py3k/python -m timeit -s "a=eval('3'*10000+'5');b=eval('8'*6000+'7')"
> "str(a//b)"
> 1000 loops, best of 3: 1.47 msec per loop
>
> Can we remove this check, or are there people doing million-digits calculations
> they want to interrupt using Control-C ?

Yes, I suspect there are. Though you don't need millions of digits for a single
operation to take a noticeable amount of time: try str(10**100000),
for example.

Is there a benefit to removing the check?

Mark
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


solipsis at pitrou

Oct 18, 2009, 1:23 PM

Post #3 of 16 (930 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

Mark Dickinson <dickinsm <at> gmail.com> writes:
>
> Yes, I suspect there are. Though you don't need millions of digits for a
single
> operation to take a noticeable amount of time: try str(10**100000),
> for example.
>
> Is there a benefit to removing the check?

Apart from casual cleanup, the reason I'm asking is that I'm experimenting with
removing _Py_Ticker, and I want to know what I need to emulate :-)


Antoine.


_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


daniel at stutzbachenterprises

Oct 18, 2009, 3:16 PM

Post #4 of 16 (926 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

On Sun, Oct 18, 2009 at 3:01 PM, Antoine Pitrou <solipsis [at] pitrou> wrote:

> Can we remove this check, or are there people doing million-digits
> calculations
> they want to interrupt using Control-C ?
>

I sometimes do million-digits calculations that I want to interrupt using
Control-C.

(particularly when I didn't *intend* to do a million-digits calculation...
;) )

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>


solipsis at pitrou

Oct 18, 2009, 3:46 PM

Post #5 of 16 (933 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

Daniel Stutzbach <daniel <at> stutzbachenterprises.com> writes:
>
> I sometimes do million-digits calculations that I want to interrupt using
Control-C.(particularly when I didn't *intend* to do a million-digits
calculation... ;) )--

Sure, but it's no different than doing, e.g.:
list(range(100000000)).sort()

(don't try this, it just made by computer slow down to a crawl and I had to kill
-9 the Python interpreter)

The question is whether there is still any reason to special-case long objects
and not, say, list.sort() or re.sub().

Regards

Antoine.


_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


daniel at stutzbachenterprises

Oct 18, 2009, 4:05 PM

Post #6 of 16 (937 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

On Sun, Oct 18, 2009 at 5:46 PM, Antoine Pitrou <solipsis [at] pitrou> wrote:

> Daniel Stutzbach <daniel <at> stutzbachenterprises.com> writes:
> > I sometimes do million-digits calculations that I want to interrupt using
> Control-C.(particularly when I didn't *intend* to do a million-digits
> calculation... ;) )--
>
> Sure, but it's no different than doing, e.g.:
> list(range(100000000)).sort()
>

That's a good point, although I can't recall the last time I accidently
created a painfully large list. I can recall the last time I started a
painfully large integer computation.

Being able to stop the interpretter with Control-C instead of kill -9 is a
minor convenience, though. I could live without it. (Although I can't
speak for everyone, of course)

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>


dickinsm at gmail

Oct 19, 2009, 2:28 AM

Post #7 of 16 (926 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

On Sun, Oct 18, 2009 at 11:46 PM, Antoine Pitrou <solipsis [at] pitrou> wrote:
> Sure, but it's no different than doing, e.g.:
>    list(range(100000000)).sort()
>
> (don't try this, it just made by computer slow down to a crawl and I had to kill
> -9 the Python interpreter)

Maybe you were running out of RAM? On a 64-bit machine,
list(range(10**8)) takes over 3 Gb.

For me, x = list(range(10**8)) takes around 6 seconds, and then
x.sort() takes around 2 seconds. This is on a machine with
16 Gb of RAM. (Though I do seem to recall that timsort is
extra fast for already sorted lists: O(n) rather than O(n log n)?)

So I'd say that it *is* different. A million digit integer takes less
than half a megabyte of RAM, so a single operation can be
horribly slow long before memory limitations are reached.

I'd prefer to keep the SIGCHECK unless there's a really
compelling advantage to getting rid of it.

Mark
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


dickinsm at gmail

Oct 19, 2009, 4:27 AM

Post #8 of 16 (924 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

On Sun, Oct 18, 2009 at 9:01 PM, Antoine Pitrou <solipsis [at] pitrou> wrote:

> Can we remove this check, or are there people doing million-digits calculations
> they want to interrupt using Control-C ?

By the way, here's an example of an *almost* real-life use of million digit
calculations.

For an elementary number theory course that I taught a while ago, there
was an associated (optional) computer lab, where the students used
Python to investigate various ideas, conjectures, examples, etc. One
of the less open-ended questions might have been[1] something like:

"On August 23rd, 2008 a computer at UCLA found the first example
of a prime with more than 10 million digits: p = 2**43112609-1. Find

(1) the exact number of digits in p, when written out in decimal
(2) the last 100 digits of p
(3) the first 100 digits of p."

It's wonderfully easy to get answers to these questions with Python:

>>> import math
>>> e = 43112609
>>> p = 2**e - 1
>>> ndigits = int(math.ceil(math.log10(p)))
>>> last_100_digits = '{:0100d}'.format(p % 10**100)
>>> first_100_digits = str(p // 10**(ndigits-100))

Getting the first 100 digits takes a good few seconds; the
other operations are faster.

But if you (not unreasonably) try to compute str(p),
you'll find it's impossibly slow, and it's very handy
that it's possible to interrupt that calculation, attempt
to understand *why* it's slow, and then try different
methods.

(And yes, there are better ways to get both the first
and last hundred digits.)

Mark


[1] Might have been, but wasn't. Hence the *almost*. If I'd been
teaching that course this semester I'd almost certainly have included
something like this.
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


tim.peters at gmail

Oct 19, 2009, 1:58 PM

Post #9 of 16 (905 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

[Mark Dickinson]
> By the way, here's an example of an *almost* real-life use of million digit
> calculations.
>
> For an elementary number theory course that I taught a while ago, there
> was an associated (optional) computer lab, where the students used
> Python to investigate various ideas, conjectures, examples, etc.  One
> of the less open-ended questions might have been[1] something like:
>
> "On August 23rd, 2008 a computer at UCLA found the first example
> of a prime with more than 10 million digits: p = 2**43112609-1.  Find
>
> (1) the exact number of digits in p, when written out in decimal
> (2) the last 100 digits of p
> (3) the first 100 digits of p."
>
> It's wonderfully easy to get answers to these questions with Python:
...
> But if you (not unreasonably) try to compute str(p),
> you'll find it's impossibly slow, and it's very handy
> that it's possible to interrupt that calculation, attempt
> to understand *why* it's slow, and then try different
> methods.

Don't want to hijack this thread, but this is the kind of use case
justifying keeping the 3-argument pow in the decimal module. People
"playing" with number theory questions can learn a bag of tricks to
worm around that non-decimal arithmetic can make it inconvenient &
slow to get answers about decimal digits, but most people -- and
especially beginners -- find life easier when doing calculations in
decimal to begin with. Then they can use obvious methods, and not get
sidetracked by wondering why they take forever to finish.

Although, to be fair, I started

>>> decimal.getcontext().prec = 20000000
>>> p10 = pow(decimal.Decimal(2), 43112609)

before I started typing this, and am still waiting for it to finish ;-)

OTOH,

>>> decimal.getcontext().prec = 200
>>> pow(decimal.Decimal(2), 43112609) # get the first 100 digits (& then some)

and

>>> pow(decimal.Decimal(2), 43112609, 10**100) - 1 # get the last 100 digits

both appear to work instantaneously.
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


dickinsm at gmail

Oct 20, 2009, 1:39 AM

Post #10 of 16 (903 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

On Mon, Oct 19, 2009 at 9:58 PM, Tim Peters <tim.peters [at] gmail> wrote:
> Don't want to hijack this thread, but this is the kind of use case
> justifying keeping the 3-argument pow in the decimal module.  People
> "playing" with number theory questions can learn a bag of tricks to
> worm around that non-decimal arithmetic can make it inconvenient &
> slow to get answers about decimal digits, but most people -- and
> especially beginners -- find life easier when doing calculations in
> decimal to begin with.  Then they can use obvious methods, and not get
> sidetracked by wondering why they take forever to finish.

That's true. I wouldn't relish the task of trying to teach the decimal
module at the same time as introducing students to Python and to
elementary number theory, though. It's not the most friendly module
to get started with.

> Although, to be fair, I started
>
>>>> decimal.getcontext().prec = 20000000
>>>> p10 = pow(decimal.Decimal(2), 43112609)
>
> before I started typing this, and am still waiting for it to finish ;-)

Hmm, yes. The decimal module isn't (currently) well set up
for this sort of calculation, mostly because almost every operation
is implemented by converting to binary, invoking the appropriate
int/long arithmetic operation, and converting back. Since the
conversion is O(n^2), even addition and multiplication of Decimal
instances end up being O(n^2) operations at the moment, instead
of getting the linear and Karatsuba (resp.) running times that they
deserve.

(The exception is rounding operations, which don't do any
decimal <-> binary operations, but operate directly on the
coefficient in its representation as a string of decimal digits.)

This high-precision inefficiency could easily be fixed by
using a dedicated 'decimal natural number' extension
type for the Decimal coefficient, stored internally in base
a suitable power of 10. I think this may be worth
considering seriously. I'm not proposing this as an alternative
to the huge task of rewriting the entire decimal module in C,
by the way; it would be more a stop-gap measure that would
be easy to implement, would slightly increase efficiency for
normal operations, and vastly increase efficiency for high-precision
operations.

>
> OTOH,
>
>>>> decimal.getcontext().prec = 200
>>>> pow(decimal.Decimal(2), 43112609)   # get the first 100 digits (& then some)
>
> and
>
>>>> pow(decimal.Decimal(2), 43112609, 10**100) - 1  # get the last 100 digits
>
> both appear to work instantaneously.
>

Right. Low precision Decimal operations should be fine. I don't
really see the advantage of the second operation over a simple

pow(2, 43112609, 10**100)

though.

By the way, despite the above use-case, and despite the fact
that I use it frequently, I still don't believe 3-argument pow belongs
in the core. But that's probably a discussion topic for Python 4000.

Mark
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


ncoghlan at gmail

Oct 20, 2009, 3:49 AM

Post #11 of 16 (907 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

Mark Dickinson wrote:
> This high-precision inefficiency could easily be fixed by
> using a dedicated 'decimal natural number' extension
> type for the Decimal coefficient, stored internally in base
> a suitable power of 10. I think this may be worth
> considering seriously. I'm not proposing this as an alternative
> to the huge task of rewriting the entire decimal module in C,
> by the way; it would be more a stop-gap measure that would
> be easy to implement, would slightly increase efficiency for
> normal operations, and vastly increase efficiency for high-precision
> operations.

Didn't you already start work on that concept with the deccoeff patch?

Cheers,
Nick.

--
Nick Coghlan | ncoghlan [at] gmail | Brisbane, Australia
---------------------------------------------------------------
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


dickinsm at gmail

Oct 20, 2009, 4:22 AM

Post #12 of 16 (897 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

On Tue, Oct 20, 2009 at 11:49 AM, Nick Coghlan <ncoghlan [at] gmail> wrote:
> Mark Dickinson wrote:
>> This high-precision inefficiency could easily be fixed by
>> using a dedicated 'decimal natural number' extension
>> type for the Decimal coefficient, stored internally in base
>> a suitable power of 10. [...]
>
> Didn't you already start work on that concept with the deccoeff patch?

I did: the code can be seen at:

http://svn.python.org/view/sandbox/trunk/decimal/decimal_in_c/

This code defines a Deccoeff type as above, providing base 10 natural
number arithmetic, along with a skeletal _Decimal type. The work on
Deccoeff is pretty much complete, but the _Decimal type is only just
a beginning; the original intent was to slowly translate the Decimal
code into C and move it into the _Decimal type.

The code was working a few months ago (with all Decimal tests
passing), but there have been some changes and bugfixes since
then. I might try to resurrect that code, dropping the _Decimal type and
just concentrating on Deccoeff.

Mark
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


eric at trueblade

Oct 20, 2009, 7:50 AM

Post #13 of 16 (888 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

> On Tue, Oct 20, 2009 at 11:49 AM, Nick Coghlan <ncoghlan [at] gmail> wrote:
>> Mark Dickinson wrote:
>>> This high-precision inefficiency could easily be fixed by
>>> using a dedicated 'decimal natural number' extension
>>> type for the Decimal coefficient, stored internally in base
>>> a suitable power of 10. [...]
>>
>> Didn't you already start work on that concept with the deccoeff patch?
>
> I did: the code can be seen at:
>
> http://svn.python.org/view/sandbox/trunk/decimal/decimal_in_c/
>
> This code defines a Deccoeff type as above, providing base 10 natural
> number arithmetic, along with a skeletal _Decimal type. The work on
> Deccoeff is pretty much complete, but the _Decimal type is only just
> a beginning; the original intent was to slowly translate the Decimal
> code into C and move it into the _Decimal type.
>
> The code was working a few months ago (with all Decimal tests
> passing), but there have been some changes and bugfixes since
> then. I might try to resurrect that code, dropping the _Decimal type and
> just concentrating on Deccoeff.

My only concern about this is the effect it would have on IronPython,
Jython, PyPy, and other alternate implementations that use the stdlib.

_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


dickinsm at gmail

Oct 20, 2009, 7:57 AM

Post #14 of 16 (886 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

On Tue, Oct 20, 2009 at 3:50 PM, Eric Smith <eric [at] trueblade> wrote:
>> The code was working a few months ago (with all Decimal tests
>> passing), but there have been some changes and bugfixes since
>> then.  I might try to resurrect that code, dropping the _Decimal type and
>> just concentrating on Deccoeff.
>
> My only concern about this is the effect it would have on IronPython,
> Jython, PyPy, and other alternate implementations that use the stdlib.

Yes, that worries me a bit, too. I have the same worry with the idea
of rewriting the entire decimal module in C.

The Deccoeff type is very simple, though. It would be easy to create
a pure Python version of it, and then do something like:

try:
from _decimal import Deccoeff # try to get C version
except ImportError:
from deccoeff import Deccoeff # else use Python fallback code.

Mark
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


brett at python

Oct 20, 2009, 1:54 PM

Post #15 of 16 (880 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

On Tue, Oct 20, 2009 at 07:57, Mark Dickinson <dickinsm [at] gmail> wrote:

> On Tue, Oct 20, 2009 at 3:50 PM, Eric Smith <eric [at] trueblade> wrote:
> >> The code was working a few months ago (with all Decimal tests
> >> passing), but there have been some changes and bugfixes since
> >> then. I might try to resurrect that code, dropping the _Decimal type
> and
> >> just concentrating on Deccoeff.
> >
> > My only concern about this is the effect it would have on IronPython,
> > Jython, PyPy, and other alternate implementations that use the stdlib.
>
> Yes, that worries me a bit, too. I have the same worry with the idea
> of rewriting the entire decimal module in C.
>
> The Deccoeff type is very simple, though. It would be easy to create
> a pure Python version of it, and then do something like:
>
> try:
> from _decimal import Deccoeff # try to get C version
> except ImportError:
> from deccoeff import Deccoeff # else use Python fallback code.


And this is why you shouldn't have to worry. As long as the tests exercise
both the pure Python and C versions then the other VMs are covered.

-Brett


ncoghlan at gmail

Oct 21, 2009, 3:27 AM

Post #16 of 16 (854 views)
Permalink
Re: SIGCHECK() in longobject.c [In reply to]

Brett Cannon wrote:
> On Tue, Oct 20, 2009 at 07:57, Mark Dickinson <dickinsm [at] gmail
> <mailto:dickinsm [at] gmail>> wrote:
> The Deccoeff type is very simple, though. It would be easy to create
> a pure Python version of it, and then do something like:
>
> try:
> from _decimal import Deccoeff # try to get C version
> except ImportError:
> from deccoeff import Deccoeff # else use Python fallback code.
>
>
> And this is why you shouldn't have to worry. As long as the tests
> exercise both the pure Python and C versions then the other VMs are covered.

We need to worry at least a little bit, as the pure Python version
doesn't exist at this point in time.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan [at] gmail | Brisbane, Australia
---------------------------------------------------------------
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com

Python dev RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.