
fraserbn at gmail
Aug 10, 2013, 11:04 PM
Post #8 of 8
Re: Custom infix operators (was: feature request: string-or)
On Fri, Aug 9, 2013 at 7:45 AM, Lukas Atkinson (amon) <latk [at] gmx> wrote:
> On 07.08.2013 at 22:43, David Nicol wrote:
> >
> > On Tue, Aug 6, 2013 at 5:08 AM, Ed Avis <eda [at] waniasset> wrote:
> >
> > > Adding custom infix operators would make it easier to get rid of
> > > smartmatch.
> >
> > yes, but how? Without breaking indirect method invocation? Would there
> > be a readily identifiable set of tokens that are reserved as infix
> > operators? An "insub" keyword? A special prototype that means "infix
> > binary" with variations indicating precedence? And how would this
> > interoperate with the yacc-generated parser?
> >
> > If wishes were bicycles...
>
> There are various sane ways in which custom binops could be added.
> A brainstorm:
>
> 0. Ignore custom operators, as autoboxing and operator overloading
> provide sufficient chances for syntactic mayhem.
>
> 1.1 Have a special binop operator. E.g. in Haskell, you can write
> "x `foo` y", which is syntactic sugar for "foo(x, y)". Perl would need
> another character. Interesting candidates are:
>
>   - Some already-used symbols (the code would already have a meaning,
>     but spacing could be used to disambiguate): "$x :myop: $y",
>     "$x ?myop? $y", "$x !myop! $y", "$x |myop| $y", "$x ^myop^ $y",
>     "$x ~myop~ $y", "$x %myop% $y", ...
>     As e.g. "^" isn't used that much, this might not be so bad after
>     all. Only one such character should be chosen. The Sub::Infix
>     module implements this in a very inefficient way.
>
>   - Non-ASCII characters. "$x 〈myop〉 $y", "$x ★myop★ $y",
>     "$x «myop» $y". This is probably a bad idea, as this would require
>     "utf8", and opens Pandora's Box for other Unicode syntax. OTOH,
>     this is very orthogonal.
>
> If any such solution is chosen, precedence and context have to be fixed.
> Custom operators should be non-associative to impose minimum assumptions
> on user operators. The precedence level should be roughly as low as
> other operators with word characters. Choosing the precedence of the
> "eq" or "cmp" operator would feel nice.
>
> It would be sensible to ignore any prototypes of the "myop" sub, and
> impose semantics like the (+) or ($) prototype on each operand.
>
> 1.2 Allow existing operators to enclose a sub name. This custom
> operator would then take on the precedence, associativity etc. of the
> enclosing operator. This makes sufficient sense for the operators
> "* / % + - . << >> < > <= >= == != ~~ & | ^ ..". Silly example:
>
>     my $matrix = [1..3] *x* [-1..1];
>     # my $matrix = &x([1..3], [-1..1]);
>
> Maybe more complex stuff could be enclosed in curlies, so that "+foo+"
> is equivalent to "+{\&foo}+". Then:
>
>     say for 3 ..{range_step(2)}.. 19;
>     # say for range_step(2)->(3, 19);
>
> The above points about ambiguity still hold. I am not that fond of this
> idea, because having an operator contain both symbols and word
> characters could be a bit confusing: "foo() if $x <<a_lot_less<< $y".
>
> 2. Pick a few Unicode blocks and let those symbols be used as binops
> (save for those that look too similar to existing operators). This would
> require an "operators" pragma (or a feature providing declaration
> syntax) to map symbols to subs. The major disadvantage is the difficulty
> of typing these characters. But a "$x ∈ %hash" would be cool (∈ could
> cover half of the smartmatch cases). I do not like this, because the
> next logical step would be custom circumfix operators, and parsing
> *that* in a sane way is unrealistic for Perl5.
>
> 3. Using prototypes to declare binops is unrealistic in my opinion.
> This is a lot of action at a distance, and renders the use of stacking
> subs like "foo bar baz $x" utterly incomprehensible for a human reader.
>
> Incidentally, these ideas are sorted by estimated difficulty of
> implementation.
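For readers who have not seen the "$x |myop| $y" spelling from 1.1 in action: it can be faked today in pure Perl with the overload pragma, without touching the parser. The following is only a minimal sketch of that general trick, not a copy of how Sub::Infix is written; the package name InfixOp and the sub plus are hypothetical names made up for this illustration.

```perl
use strict;
use warnings;

package InfixOp;    # hypothetical helper class, for illustration only

# Overload the existing "|" operator for objects of this class.
use overload '|' => \&_pipe;

sub new {
    my ($class, $code) = @_;
    return bless { code => $code }, $class;
}

sub _pipe {
    my ($self, $other, $swapped) = @_;
    if ($swapped) {
        # "$left | op": the object was the right-hand operand, so
        # remember the left operand in a new, bound copy.
        return bless { %$self, left => $other, bound => 1 }, ref $self;
    }
    # "bound_op | $right": both operands are known, call the sub.
    die "infix operator used without a left operand\n"
        unless $self->{bound};
    return $self->{code}->($self->{left}, $other);
}

package main;

my $plus_op = InfixOp->new(sub { $_[0] + $_[1] });

# The empty prototype lets the bareword "plus" parse as a term, so the
# "|" that follows it is read as an operator.
sub plus () { return $plus_op }

print( (1 |plus| 2), "\n" );    # prints 3
```

Since the parser only ever sees two ordinary "|" operations, the fake operator inherits the precedence and left associativity of "|" and cannot choose its own, and every use allocates a temporary object. Both limitations are part of why a parser-level solution is being discussed at all.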
Having already implemented a subset of 2 and 3[*], I can safely say
those are easier to implement than 1.1 and 1.2, not even taking into
account that overloading current operators even more would just cause
headaches and edge cases. 1.1 + Unicode would be doable but seems
needlessly restrictive; I'd rather have 2, then use D::CP to attach a
custom parser to search for the next symbol.

That being said, I don't think that "this has the potential to be
written in a confusing way" is a valid argument. You can write all sorts
of "incomprehensible" code right now (e.g. foo bar baz quux, or
foo + bar, or the example you provided, which has several meanings right
now), and our reactions generally aren't to wish that we didn't have
prototypes/indirect object syntax[**], but instead to modify the code to
be sensible, like by adding parentheses, or explicit method calls, or
abstracting to a sub. Perl mostly gives you rope, and it's up to the
programmer to decide how to use it.

Implementation-wise, I believe that there are two big roadblocks for
infix operators:

First, there's currently no easy way to pick a given level of precedence
without adding a huge chunk of code to perly.y. Mind you, I'm barely
knowledgeable in yacc/bison, so maybe there's a super easy way to do
this? I would be more than glad to be corrected here. This is because,
taking and/or as an example, a user might want a precedence level that
is "higher than and/or", "same as and/or", or "lower than and/or", and
then each of those comes in three versions: one non-associative, one
left-associative and one right-associative. That nets you nine new
entries in the precedence table in perly.y, just for one existing level.
Peter settled for adding two levels for starters; since I was
implementing infix subs, I only needed to add one.

Second, we don't have lazy arguments yet, so some cool uses of infix
operators go right out of the window. Self-plug here, because I'm
working on this :D [***]

[*] Well, implemented 3, then wrote a quick hack for "use operators" and
set ∈ to an infix sub.
[**] Er, not all of us.
[***] https://github.com/Hugmeir/Params-Lazy