
Mailing List Archive: Python: Python

simplified Python parsing question

 

 



esj at harvee

Jul 29, 2012, 4:21 PM

Post #1 of 13 (877 views)
Permalink
simplified Python parsing question

As some folks may remember, I have been working on making Python and its tool
base more accessible to disabled programmers. I've finally come up with a really
simple technique which should solve 80% of the problem. What I need to figure
out is how to find a spot in the code where a symbol exists and, potentially,
its rough type (class name, instance, etc.). This is really a much bigger
question than I want to get into right now, but I'm looking just to build a demo
to back up a storyboard plus video.

When you are sitting on or in a name and you look to the left or look to the
right, what would you see that would tell you that you have gone past the end of
that name? For example:

a = b + c

If you are sitting on a, the boundaries are beginning of line and =. If you are
sitting on b, the boundaries are = and +. If you are sitting on c, the
boundaries are + and end of line. I call the region between those boundaries
the symbol region.
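
A minimal sketch of the boundary idea for the a = b + c example, assuming the line is already valid enough for Python 3's standard `tokenize` module to scan. The function name `symbol_region` and the placeholder strings for the line edges are made up for illustration and are not part of any existing tool.

```python
import io
import tokenize

# Token types that carry no boundary information for this purpose.
_SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER,
         tokenize.INDENT, tokenize.DEDENT, tokenize.COMMENT}


def symbol_region(line, col):
    """Return (left_boundary, name, right_boundary) for the name at column col.

    Returns None when the cursor is not sitting on a name token.
    """
    tokens = [t for t in tokenize.generate_tokens(io.StringIO(line).readline)
              if t.type not in _SKIP]
    for i, tok in enumerate(tokens):
        if tok.type == tokenize.NAME and tok.start[1] <= col < tok.end[1]:
            left = tokens[i - 1].string if i > 0 else "<start of line>"
            right = tokens[i + 1].string if i + 1 < len(tokens) else "<end of line>"
            return left, tok.string, right
    return None


print(symbol_region("a = b + c", 4))
```

This only works on text the tokenizer accepts; an unterminated string or bracket raises `tokenize.TokenError`, so spoken-form input would likely need a more forgiving scanner.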

If this example is clear to you, what would you suggest as a method of finding a
whole statement, or a whole symbol region? Note that it doesn't have to be a
perfect or complete solution, just good enough to let me do a moderately complex
demo and seek funding from the accessibility world to build a complete
environment.

I appreciate the help because I believe that once this is working, it'll make a
significant difference in the ability of disabled programmers to write code
again, as well as to integrate with existing development teams and their
naming conventions.

Looking forward to responses.

--- eric

first draft write up of technique
https://docs.google.com/document/d/1In11apApKozw_UOPAhVz0ePqns72_6652Dra34xWp4E/edit
--
http://mail.python.org/mailman/listinfo/python-list


steve+comp.lang.python at pearwood

Jul 29, 2012, 8:33 PM

Post #2 of 13 (839 views)
Permalink
Re: simplified Python parsing question [In reply to]

On Sun, 29 Jul 2012 19:21:49 -0400, Eric S. Johansson wrote:

> When you are sitting on or in a name, you look to the left or look to
> the right what would you see that would tell you that you have gone past
> the end of that name. For example

Have you read the docs? They give full details of the Python syntax.

http://docs.python.org/reference/index.html

For example:

http://docs.python.org/reference/simple_stmts.html#assignment-statements

See also:

http://docs.python.org/library/language.html
http://effbot.org/zone/simple-top-down-parsing.htm
http://nedbatchelder.com/text/python-parsers.html


Here's a Python parser using the pyparsing library. It's a bit old
(written for Python 2.4) but it shouldn't be hard to update it to new
syntax:

http://pyparsing.wikispaces.com/file/view/pythonGrammarParser.py



--
Steven


esj at harvee

Jul 29, 2012, 11:17 PM

Post #3 of 13 (839 views)
Permalink
Re: simplified Python parsing question [In reply to]

On 7/29/2012 11:33 PM, Steven D'Aprano wrote:
> On Sun, 29 Jul 2012 19:21:49 -0400, Eric S. Johansson wrote:
>
>> When you are sitting on or in a name, you look to the left or look to
>> the right what would you see that would tell you that you have gone past
>> the end of that name. For example
> Have you read the docs? It gives full details of the Python syntax.

Yes, I have. I was hoping for a different perspective because what I'm trying to
do is middle-out parsing: top-down when the scanner focus moves from left to
right and bottom-up when the scanner focus moves from right to left.

It sounds kind of odd when I describe it that way, but the cursor is in the
middle of a name string and I need to look to either end of that name string
before I can do a conversion to a symbol string; I have to look at both ends in
different ways. If you've read the documentation I've provided, would there be a
better example to use for describing some of the issues? Here's a very rough
draft of a storyboard:

https://docs.google.com/presentation/d/1fuKyo9AE6i9ZdX2lucwK0v_W5Kx9M3Mezavm40wzCo8/edit

The first 13-14 slides are the working content for the storyboard. The rest is
mostly "memory" of things I was thinking about, so if it doesn't make sense or
seems wrong, don't give me grief. :-)

> Here's a Python parser using the pyparsing library. It's a bit old
> (written for Python 2.4) but it shouldn't be hard to update it to new
> syntax:
>
> http://pyparsing.wikispaces.com/file/view/pythonGrammarParser.py
>

Thanks for the reference. I'll take a look at it as well.


dieter at handshake

Jul 30, 2012, 2:11 AM

Post #4 of 13 (837 views)
Permalink
Re: simplified Python parsing question [In reply to]

"Eric S. Johansson" <esj [at] harvee> writes:

> When you are sitting on or in a name, you look to the left or look to
> the right what would you see that would tell you that you have gone
> past the end of that name. For example
>
> a = b + c
>
> if you are sitting on a, the boundaries are beginning of line and =,
> if you are sitting on b, the boundaries are = and +, if you are
> sitting on c, the boundaries are + and end of line. I call the region
> between those boundaries the symbol region.

Check the lexical definitions. They essentially define what
a "symbol region" is.

In essence, you have names, operators, literals, whitespace and comments,
each with quite a simple definition.



gandalf at shopzeus

Jul 30, 2012, 2:25 AM

Post #5 of 13 (842 views)
Permalink
Re: simplified Python parsing question [In reply to]

> I appreciate the help because I believe that once this is working,
> it'll make a significant difference in the ability for disabled
> programmers to write code again as well as be able to integrate within
> existing development team and their naming conventions.

Did you try to use pygments?

http://pygments.org/docs/api/

It already contains a lexer for Python source code. You can create a
Lexer (pygments.lexer.Lexer) then call its get_tokens method.

Then you can use this to identify statements:

http://docs.python.org/reference/simple_stmts.html

Fortunately, almost all statements begin with a keyword. There are some
exceptions:

expression statement
assignment statement

I would first tokenize the code, then divide it by statement keywords.
Finally, you just need to find expression/assignment statements in the
remaining sections. (Maybe there is a better way to do it.)
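
A rough sketch of the "tokenize, then divide by statement keywords" idea. It swaps in the standard library's `tokenize` and `keyword` modules in place of pygments so that it runs without extra packages; the helper names `split_statements` and `classify` are hypothetical.

```python
import io
import keyword
import tokenize

# Tokens that never belong to a statement's visible text.
_IGNORABLE = {tokenize.NL, tokenize.COMMENT, tokenize.INDENT,
              tokenize.DEDENT, tokenize.ENDMARKER}


def split_statements(source):
    """Yield (first_token, statement_text) for each logical line."""
    current = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _IGNORABLE:
            continue
        if tok.type == tokenize.NEWLINE:
            # NEWLINE closes a logical line, even one spanning several
            # physical lines inside brackets.
            if current:
                yield current[0].string, " ".join(t.string for t in current)
            current = []
        else:
            current.append(tok)


def classify(source):
    """Label each logical line by whether it starts with a keyword."""
    result = []
    for first, text in split_statements(source):
        if keyword.iskeyword(first):
            kind = "keyword statement"
        else:
            kind = "expression/assignment"
        result.append((kind, text))
    return result


sample = "import os\nx = os.getcwd()\nif x:\n    print(x)\n"
for kind, text in classify(sample):
    print(kind, "|", text)
```

The keyword test is deliberately naive: an expression statement can also start with a keyword such as `not`, `lambda`, or `None`, and several simple statements separated by semicolons end up in one logical line, so a real tool would need extra cases.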


esj at harvee

Jul 30, 2012, 2:57 AM

Post #6 of 13 (844 views)
Permalink
Re: simplified Python parsing question [In reply to]

On 7/30/2012 5:25 AM, Laszlo Nagy wrote:
>
> Did you try to use pygments?
>
> http://pygments.org/docs/api/
>

Thanks, I'll take a look.

>
> I would first tokenize the code, then divide it by statement keywords.
> Finally, you just need to find expression/assignment statements in the
> remaining sections. (Maybe there is a better way to do it.)
>
>
>

Yeah, the problem is also a little more complicated than simple parsing of Python
code. For example, here is one (from the white paper):

meat space blowback = Friends and family [well-meaning attempt]

Could that be parsed by the tools you mention? I suspect not, but this is what I
need to generate using speech recognition because it's easily spoken. A more
complex example might be something like

new base = OS path-base name (old path)

or

if OS base exists (current path): new base name = OS path base name(current path)

What's particularly cute here is that using the translation technique I can
actually describe the full object method path with a minimum of speaking
overhead. Python is great. :-)

But the question remains: will these tools handle stuff like this?


gandalf at shopzeus

Jul 30, 2012, 7:59 AM

Post #7 of 13 (839 views)
Permalink
Re: simplified Python parsing question [In reply to]

>
> yeah the problem is also little more complicated than simple parsing
> of Python code. For example, one example (from the white paper)
>
> meat space blowback = Friends and family [well-meaning attempt]
>
> Could that be parsed by the tools you mention?

It is not valid Python code. Pygments is able to tokenize code that is
not valid Python code. Because it is not parsing, it is just tokenizing.
But if you put a bunch of random tokens into a file, then of course you
will never be able to split that into statements.

Probably, you will need to process indent/dedent tokens to identify the
"level" of the statement. Then you can tell what file, class, inner
class, or method you are in. Inside one "level" or code block, you
could try to divide the code into statements.
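
One way to sketch the indent/dedent idea, assuming valid Python 3 source and the standard `tokenize` module (not pygments). The function name `statement_levels` is invented for this example.

```python
import io
import tokenize


def statement_levels(source):
    """Return (nesting_level, line_number, first_token) per logical line."""
    level = 0
    at_start = True  # True until the first real token of a logical line
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.INDENT:
            level += 1
        elif tok.type == tokenize.DEDENT:
            level -= 1
        elif tok.type == tokenize.NEWLINE:
            at_start = True
        elif tok.type in (tokenize.NL, tokenize.COMMENT, tokenize.ENDMARKER):
            continue
        elif at_start:
            out.append((level, tok.start[0], tok.string))
            at_start = False
    return out


sample = "class A:\n    def f(self):\n        return 1\nx = A()\n"
for level, lineno, first in statement_levels(sample):
    print(level, lineno, first)
```

INDENT and DEDENT tokens arrive before the first token of the line they affect, which is why the level is already correct when that line's first token is recorded. Tracking which class or method each level belongs to would mean remembering the `class` and `def` headers seen at each depth.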

Otherwise, I have no idea how a blind person could navigate in a Python
source. In fact I have no idea how they use regular programs. So I'm
afraid I cannot help too much with this. :-(




esj at harvee

Jul 30, 2012, 8:40 AM

Post #8 of 13 (846 views)
Permalink
Re: simplified Python parsing question [In reply to]

On 7/30/2012 10:59 AM, Laszlo Nagy wrote:
>
>>
>> yeah the problem is also little more complicated than simple parsing of
>> Python code. For example, one example (from the white paper)
>>
>> meat space blowback = Friends and family [well-meaning attempt]
>>
>> Could that be parsed by the tools you mention?
>
> It is not valid Python code. Pygments is able to tokenize code that is not
> valid Python code. Because it is not parsing, it is just tokenizing. But if
> you put a bunch of random tokens into a file, then of course you will never be
> able to split that into statements.

If you have been reading the papers, you would understand what I'm doing. I'm
trying to take Python code with speech-recognition-friendly symbols and
translate the symbols into a code-friendly form. My conjecture is that if you
change your perspective on the code and look for the edge that would normally be
used to define the start of a symbol, you should be able to define the name
string. Another possibility is looking at the region which contains just
letters, numbers, and spaces, with everything else outside it, and using that as
your definition of a name string. It would probably help to verify that each
word is found in a dictionary, although that adds extra complexity if you are
trying to grow the dictionary at the same time as the translation table.

I'm beginning to think for the first generation I should just use regular
expressions looking forwards and backwards and try to enumerate the possible cases.
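
A minimal sketch of the "regular expressions looking forwards and backwards" approach, treating a name string as a run of ASCII letters, digits, and spaces around the cursor. The helper names and the underscore-joining rule in `to_symbol` are assumptions made for illustration; the actual translation table is not described in this thread.

```python
import re

# Look backwards: the run of name characters that ends at the cursor.
_LEFT = re.compile(r"[A-Za-z0-9 ]*$")
# Look forwards: the run of name characters that starts at the cursor.
_RIGHT = re.compile(r"[A-Za-z0-9 ]*")


def name_string_at(line, col):
    """Return (start, end, text) of the name string under col, or None."""
    left = _LEFT.search(line[:col]).group()
    right = _RIGHT.match(line[col:]).group()
    raw = left + right
    text = raw.strip(" ")
    if not text:
        return None
    # Adjust for spaces trimmed from the left edge of the raw region.
    start = col - len(left) + (len(raw) - len(raw.lstrip(" ")))
    end = start + len(text)
    if not start <= col < end:
        return None  # cursor sits on a boundary, not inside the name
    return start, end, text


def to_symbol(name_string):
    """Hypothetical translation: lowercase words joined by underscores."""
    return "_".join(name_string.lower().split())


line = "new base = OS path-base name (old path)"
region = name_string_at(line, 2)
print(region, to_symbol(region[2]))
```

Everything that is not a letter, digit, or space counts as an edge here, so `=`, `-`, and `(` all end a region. Enumerating the cases where that is wrong (string literals, comments, keywords such as `if`) is the part this sketch leaves open.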
>
> Probably, you will need to process indent/dedent tokens, identify the "level"
> of the statement. And then you can tell what file, class, inner class, method
> you are staying in. Inside one "level" or code block, you could try to divide
> the code into statements.

I was starting in that direction so that is good confirmation

>
> Otherwise, I have no idea how a blind person could navigate in a Python
> source. In fact I have no idea how they use regular programs. So I'm afraid I
> cannot help too much with this. :-(

I'm sorry; I am, and I'm trying to help, hand-disabled programmers. There are
more disabilities than blindness, and after almost 20 years of encountering this
shortsightedness, I do get a little cranky at times. :-)
>
>



mail at paultjuh

Jul 30, 2012, 10:13 AM

Post #9 of 13 (842 views)
Permalink
RE: simplified Python parsing question [In reply to]

Another possibility is to use the ast module of Python: http://docs.python.org/library/ast.html

The only problem with that module is that everything you parse must be correct; otherwise it throws an exception. I don't know if that's a problem for your project?
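
A short sketch of what the ast module offers and where it stops, assuming Python 3. The helper `find_names` is made up for this example; `ast.parse`, `ast.walk`, and the `lineno`/`col_offset` attributes are part of the standard module.

```python
import ast


def find_names(source):
    """Return sorted (line, col, name, context) tuples for every Name node.

    Returns None when the source is not valid Python.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return None
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            found.append((node.lineno, node.col_offset, node.id,
                          type(node.ctx).__name__))
    return sorted(found)


print(find_names("a = b + c"))
print(find_names("new base = OS path"))
```

For valid code this gives each name's position plus whether it is being stored or loaded, which is a start on the "rough type" question. The spoken form with spaces inside names fails to parse, so ast would only help after translation into real identifiers.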
 



steve+comp.lang.python at pearwood

Jul 30, 2012, 6:54 PM

Post #10 of 13 (846 views)
Permalink
Re: simplified Python parsing question [In reply to]

On Mon, 30 Jul 2012 11:40:50 -0400, Eric S. Johansson wrote:

> If you have been reading the papers, you would understand what I'm
> doing.

That is the second time, at least, that you have made a comment like that.

Understand that most people are not going to follow links to find out
whether or not they are interested in what you have to say. If you can't
give a brief explanation of what you are doing in your email or news
post, many people aren't going to read on. Perhaps they intend to but are
too busy, or they have email access but web access is restricted, or
they've already got 200 tabs open in their browser and don't want any
more (I'm not exaggerating, I know people like that).

People use email because it is a "push" technology -- you don't have to
go out and look for information, it gets pushed into your inbox. Clicking
on links is a "pull" technology -- you have to make the explicit decision
to click the link, open a browser, go out to the Internet and read who
knows what. That requires a different frame of mind. Expect to lose some
of your audience every time you require them to follow a link.

And *especially* so if that is a link to Google Docs, instead of a
normal web page. Google Docs is, in my opinion, a nasty piece of rubbish
that doesn't run on any of my browsers. As far as I'm concerned, I'd
rather download a Word doc, because at least I can open that in
OpenOffice or Abiword and read it. Something in Google Docs might as well
be locked in a safe as far as I'm concerned.


--
Steven


esj at harvee

Jul 30, 2012, 7:11 PM

Post #11 of 13 (846 views)
Permalink
Re: simplified Python parsing question [In reply to]

On 7/30/2012 9:54 PM, Steven D'Aprano wrote:
> On Mon, 30 Jul 2012 11:40:50 -0400, Eric S. Johansson wrote:
>
>> If you have been reading the papers, you would understand what I'm
>> doing.
> That is the second time, at least, that you have made a comment like that.

Actually, it's probably more like the four hundredth time. :-) I apologize, I was
wrong and I would back up and start over again if I could.
>
> Understand that most people are not going to follow links to find out
> whether or not they are interested in what you have to say. If you can't
> give a brief explanation of what you are doing in your email or news
> post, many people aren't going to read on. Perhaps they intend to but are
> too busy, or they have email access but web access is restricted, or
> they've already got 200 tabs open in their browser and don't want any
> more (I'm not exaggerating, I know people like that).

I accept the criticism. I'm still working on an elevator pitch for this concept.
I've been living with the technology and all its variations for about 10 years
and it's not easy to explain to someone who is not disabled. People with working
hands don't understand how isolating and sometimes humiliating software can be.
Advocates like myself sometimes get a little tired of saying the same thing over
and over and over again, and people who are disabled just don't care. So you find
yourself using shorthand because you're going to be ignored anyway.
>
> People use email because it is a "push" technology -- you don't have to
> go out and look for information, it gets pushed into your inbox. Clicking
> on links is a "pull" technology -- you have to make the explicit decision
> to click the link, open a browser, go out to the Internet and read who
> knows what. That requires a different frame of mind. Expect to lose some
> of your audience every time you require them to follow a link.

Okay, this implies the need to really work on more of an elevator/summary
speech. Thank you for your input. I appreciate it.
>
> And *especially* so if that is a link to Google Docs, instead of a
> normal web page. Google Docs is, in my opinion, a nasty piece of rubbish
> that doesn't run on any of my browsers. As far as I'm concerned, I'd
> rather download a Word doc, because at least I can open that in
> OpenOffice or Abiword and read it. Something in Google Docs might as well
> be locked in a safe as far as I'm concerned.

The ability for multiple people to work on the same document at the same time is
really important. You can't do that with Word or LibreOffice. Revision tracking
in traditional word processors is unpleasant to work with, especially if your
hands are broken.

It would please me greatly if you would be willing to try an experiment. Live my
life for a while. Sit in a chair and tell somebody what to type and where to
move the mouse without moving your hands. Keep your hands gripping the arms or
the sides of the chair. The rule is you can't touch the keyboard, you can't touch
the mouse, and you can't point at the screen. I suspect you would have a hard time
surviving half a day with these limitations. No embarrassment in that; most
people wouldn't make it as far as half a day. I've had to live with it since
1994. Not trying to brag, just pointing out the facts.

I'm going to try again from a different angle in a different thread. I will take
your advice to heart, and I would appreciate some feedback on how well I do at
addressing the issues you have described.


rosuav at gmail

Jul 31, 2012, 3:15 PM

Post #12 of 13 (837 views)
Permalink
Re: simplified Python parsing question [In reply to]

On Tue, Jul 31, 2012 at 11:54 AM, Steven D'Aprano
<steve+comp.lang.python [at] pearwood> wrote:
> Google Docs is, in my opinion, a nasty piece of rubbish
> that doesn't run on any of my browsers. As far as I'm concerned, I'd
> rather download a Word doc, because at least I can open that in
> OpenOffice or Abiword and read it. Something in Google Docs might as well
> be locked in a safe as far as I'm concerned.

I go the opposite way. Google Docs works fine in my web browser, but
if it's a Word doc, I need to hunt down something that can read it.
I've yet to find any browser that can't handle a GDocs "publish" page,
but I have plenty of computers that don't have any
{Open|Libre|Liber|Libra|whatever the next generation is}-Office
installed.

Best is to put the information into your email/post. Next best is to
have a link to the information. Definitely worst is to force people to
download your file and try to read it.

ChrisA


bc at freeuk

Aug 3, 2012, 1:04 PM

Post #13 of 13 (797 views)
Permalink
Re: simplified Python parsing question [In reply to]

"Eric S. Johansson" <esj [at] harvee> wrote in message
news:mailman.2752.1343700723.4697.python-list [at] python
> On 7/30/2012 9:54 PM, Steven D'Aprano wrote:

> It would please me greatly if you would be willing to try an experiment.
> live my life for a while. Sit in a chair and tell somebody what to type
> and where to move the mouse without moving your hands. keep your hands
> gripping the arms or the sides of the chair. The rule is you can't touch
> the keyboard, you can't touch the mouse, you can't point at the screen. I
> suspect you would have a hard time surviving half a day with these
> limitations. no embarrassment in that, most people wouldn't make it as far
> as a half a day.

Just using speech? Probably more people than you might think have had such
experiences: anyone who's done software support over the telephone for a
start! And in that scenario, they are effectively 'blind' too.

--
Bartc

