You are not logged in or registered. Please login or register to use the full functionality of this board...
SIGN IN Join Our Community For FREE


minimalPCP - precedence climbing parser
06-15-2017, 05:21 AM (This post was last modified: 06-15-2017 05:38 AM by Aurel.)
Post: #1
 (Print Post)
minimalPCP - precedence climbing parser
When you think that you can't expect something from me in python
i found something written in python..ha?
I found this article in my research about 'precedence climbing parser'
and what a frak is written in python and looks simple . Smile
so i copy/paste/run code in Thonny IDE ...and
looks like is executed BUT nothing. Sad
code is for py3* or 2.7 as is stated grrr...i hate to use this language
i remeber that winPython work but this one I am not sure Dodgy
/ so i must repeat myself- python is not good language for windows ! /
why such a examples is not possible to find in BASIC...i will look more
in the meantime i will show you code which seems that can be
translated to basic with some tricks ,it is written in oop...another heck !!!""!#$##$%&%$&$ grrr
but looking clear.,.maybe fiOsdev can help in translation to procedural ???????
see code is licensed Big Grin


Code Snippet: [Select]
# minimalpcp: Minimal demo precedence climbing parser for infix, prefix
# and postfix operator expressions. A rudimentary tokenizer is included.
# Python version 3.* (or 2.7.*); no imports required.
# ---------------------------------------------------------------------
# Notes on this Python module can be found in 'minimalpcp.pdf'

# Licence (2-clause BSD License)

# Copyright 2017 JoeCreu

# Redistribution and use in source and binary forms, with or without
# modification, are permitted, provided that the following conditions are met:

# 1. Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.

# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.

# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
# IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.

# JoeCreu, 2017-04-23.

# Example usage (see test data ts1, ts2, ..., iopsA below):
# Import this module:      from minimalpcp import *
# Display data, e.g.       ts4; iopsA
# Display token sequence:  Tokenizer(ts4,iopsA).TokSeq()
# Create fresh tokenizer:  tok = Tokenizer(ts4,iopsA)
# Parse (using tokenizer): parseExpr(tok)
# Parse (using 'parse'):   parse(ts4,iopsA)
# Use literal arguments:   parse("not 3 < 4",{"not"100,5),"<"6,7)})

# There is a unittest; run it from shell: python3 minimalpcp_test.py -v

# 'Tokenizer' and 'parse' take the string to be parsed as first and the
# dictionary of binding powers as second argument. A tokenizer can't be
# used twice; create a fresh one if required. Tokens in the string must
# be space-separated. Pre- and postfix operators are recognized by
# their special left (resp. right) binding power. The tokenizer inserts
# dummy operands before pre- and after postfix operators.
# 'TokSeq', 'PL2LL' and 'parse' are not needed for the actual parsing.

class Tokenizer:                   # Create a tokenizer instance using
   def __init__(self,ts,iops):    # e.g.      tok=Tokenizer(ts6,iopsA)
       self.tl = ts.split()       # then try: tok.Current
       self.length = len(self.tl) #           tok.Next()
       self.pos = -1              #           tok.Next()
       self.iops = iops           #           ...
       self.iops["$END"] = (-1,0) # Use a fresh tokenizer instance as
       self.Current = "$BEGIN"    # input for parsing: parseExpr(tok)

   def LBP(self,t) : return self.iops[t][0] # Left Binding power
   def RBP(self,t) : return self.iops[t][1] # Right Binding power
   
   def Next(self) :               # Advance to next token, return it
       if self.Current == "$PREDUMMY" :
           self.Current = self.tl[self.pos]
       elif (self.Current in self.iops
                             and self.RBP(self.Current)) == 100 :
           self.Current = "$POSTDUMMY"
       else :
           self.pos += 1
           if self.pos >= self.length :
               self.Current = "$END"
           else :
               atok = self.tl[self.pos]
               if atok in self.iops and self.LBP(atok) == 100 :
                   self.Current = "$PREDUMMY"
               else :
                   self.Current = atok
       return self.Current

   def TokSeq(self) :             # token sequence as seen by the
       s = self.Current           # parser, converted to a string.
       while self.Current != "$END" : s += (" " + self.Next())
       return s

# The function parseExpr is the actual infix parser. If tok is a fresh
# tokenizer, parseExpr(tok) will return the parse tree as Python list.
# 'parse' combines tokenization, parsing and output formatting.

def parseExpr(tok,minrbp=0):       # tok : Tokenizer
   s = tok.Next()                 # minrbp: minimal right binding
   n = tok.Next()                 # power. Start with minrbp = 0.
   while minrbp < tok.LBP(n) :    #        
      s = [n,s,parseExpr(tok,tok.RBP(n))]
      n = tok.Current             # If LBP(o)<=RBP(o) for operator o,
   return s                       # o will be parsed left associative.

def PL2LL(pl) :                    # Format Python list as a string
   if type(pl) is list :          # that looks like a lisp list. PL2LL
      s = "("                     # is a utility function for 'parse'.
      if len(pl) > 0  : s += PL2LL(pl[0])
      for t in pl[1:] : s += " " + PL2LL(t)
      s += ")"
   else : s = str(pl)
   return s

def parse(ts,iops):                # tokenize, parse, format
   return PL2LL(parseExpr(Tokenizer(ts,iops)))

# Test data

ts1 = "12 * x"         # ts1, ..., t6 are test strings for parsing.
ts2 = "3 * not a !"    # The tokens must be space separated.
ts3 = "res = 4 * 3 + n ^ 12 - x2"
ts4 = "res = 4 * 3 * a + n ^ 3 ^ 2 - x2"
ts5 = "f := x -> A and B"           # Try different binding powers,
ts6 = "f := x -> not A and not B !" # e.g., for '->' (see iopsA below).

iopsA = {              # iopsA is a dictionary containing operators
     "^" :  (81,80),  # and their LBP and RBP (left and right binding
     "+" :  (70,71),  # power). Prefix operators have LBP 100, postfix
     "-" :  (70,71),  # fix operators have RBP 100 ("not" is prefix,
     "*" :  (75,76),  # "!" is postfix). "Real" binding powers should
     "/" :  (75,76),  # be integers in range 10 to 90.
     "=" :  (51,50),
    "->" :  (90,25),  # LBP and RBP of "->" are intentionally "very"
    ":=" :  (41,40),  # different. See token strings ts5 and ts6.
   "not" :  (100,33), # LBP("*")<RBP("*") means "*" is left associa-
     "!" :  (78,100), # tive, so a*b*c is parsed as (a*b)*c. But a^b^c
   "and" :  (30,31)}  # is parsed as a^(b^c) since LBP("^")>RBP("^").
interesting this is:
 no imports required. Big Grin
also good thing:
Tokenizer and parser together consist of about 35
lines of Python code (not counting comment lines, code for utility functions and test data).

as far i understand this , this is just python module..right?

basicPro forum:
http://basicpro.mipropia.com/smf/index.php
EU Radioboard forum:
http://euradioboard.createmybb3.com/index.php
AurelSoft main site:
http://aurelsoft.ucoz.com
Find all posts by this user
Like Post
06-15-2017, 09:33 AM
Post: #2
 (Print Post)
RE: minimalPCP - precedence climbing parser
first things first-- the version of python shouldnt matter for this.

and of course there arent imports-- of what use is trig or drawing pixels to a parser?

based on the unit test for this parser i know you can simply do this:

Code Snippet: [Select]
print parse("a + not b !", {"+":(6,7),"not":(100,3),"!":(11,100)})

and get output without an error.

however, since i really dont understand what the point of a "precedence climbing parser" is (or what it does, other than "parse") im not sure if i can get this back to procedural style code or not.

if you want to produce some tests, to ensure that what comes out of the "procedural" version is the same, i might possibly be able to fix it.

most of the code is procedural, and since i dont know how the parser works, i dont know why they want an instance of it.

its not that the code is impenetrable, but id basically have to take the thing apart and put it back together, without either instructions for doing that *or* an understanding of how the thing works.

i really need one of those, for it to be worth the effort. but perhaps someone who is better with python (there are many of those) could fix it up for you.
Find all posts by this user
Like Post
The following 1 user Likes figosdev's post:
Aurel
06-15-2017, 11:24 AM
Post: #3
 (Print Post)
RE: minimalPCP - precedence climbing parser
Hi FIG

Quote:
however, since i really dont understand what the point of a "precedence climbing parser"

Precedence climbing parser -is version of Pratt Parser
It is infix notation expression parser for expression like 1+2*3
in this example he is also expression evaluator without need
to transform infix to postfix notation like 1 2 3 + * like some other
parsers do.
Also he dont require stack for opration.
it is really interesting parser and some people says that is very
efficient in direct interpreting,much more than well known recursive descent.
Do you undrstand now better, i have tried to explain how i can.

It would be nice to have this one translated to BASIC!

basicPro forum:
http://basicpro.mipropia.com/smf/index.php
EU Radioboard forum:
http://euradioboard.createmybb3.com/index.php
AurelSoft main site:
http://aurelsoft.ucoz.com
Find all posts by this user
Like Post
The following 1 user Likes Aurel's post:
figosdev
06-15-2017, 11:46 AM
Post: #4
 (Print Post)
RE: minimalPCP - precedence climbing parser
thats wonderful, i think youve explained it well enough for anyone to understand.

one of the things that keeps me in python and away from basic is the array handling. smallbasic has the best array handling i know of, but if you want this working in a faster basic it usually means taking time out to deal with things like redim and creating your own split command, etc.

so if i wasnt creating a parser in python, smallbasic is my next choice. but you want speed, which is a reasonable thing to want.
Find all posts by this user
Like Post
06-16-2017, 07:17 AM
Post: #5
 (Print Post)
RE: minimalPCP - precedence climbing parser
Hi fig
thanks..
I can't run this on my version of python..
do you can run this program to see how work?

basicPro forum:
http://basicpro.mipropia.com/smf/index.php
EU Radioboard forum:
http://euradioboard.createmybb3.com/index.php
AurelSoft main site:
http://aurelsoft.ucoz.com
Find all posts by this user
Like Post
06-16-2017, 08:50 AM
Post: #6
 (Print Post)
RE: minimalPCP - precedence climbing parser
your version of python is probably out of date somehow.

it would have to be REALLY out of date though. i mean it works in 2.7.

this parser only uses core python features, what error does it give when you try to run it?

if i put this on the end of the script and run it:

Code Snippet: [Select]
print parse("a + not b !", {"+":(6,7),"not":(100,3),"!":(11,100)})

it produces this:

Code Snippet: [Select]
(+ a (not $PREDUMMY (! b $POSTDUMMY)))
Find all posts by this user
Like Post
06-16-2017, 09:42 AM
Post: #7
 (Print Post)
RE: minimalPCP - precedence climbing parser
        I don't know what might be wrong
it throw error - invalid syntax ?
As i said i use thonny IDE + python
look in screenshots

basicPro forum:
http://basicpro.mipropia.com/smf/index.php
EU Radioboard forum:
http://euradioboard.createmybb3.com/index.php
AurelSoft main site:
http://aurelsoft.ucoz.com
Find all posts by this user
Like Post
06-16-2017, 01:31 PM
Post: #8
 (Print Post)
RE: minimalPCP - precedence climbing parser
oh ok. thats because the entire program works in your version, except the part i wrote. you have to do this:

print ( parse("a + not b !", {"+":(6,7),"not":(100,3),"!":(11,100)}) )

the spaces are there to show the difference, you dont need those spaces.
Find all posts by this user
Like Post
07-01-2017, 12:30 PM (This post was last modified: 07-02-2017 11:21 AM by EdDavis.)
Post: #9
 (Print Post)
RE: minimalPCP - precedence climbing parser
Quote:
I found this article in my research about 'precedence climbing parser'
and what a frak is written in python and looks simple . Smile
...
why such a examples is not possible to find in BASIC...i will look more

For those asking about Precedence Climbing: Precedence Climbing is a way to parse expressions.  Precedence Climbing (as well as other expression parsing methods) is explained by Theodore Norvell at Expression Parsing

The above is highly recommended - it is very well written, and gives you the pros and cons of Recursive Descent, Shunting Yard, Precedence Climbing, and even shows how Pratt Parsing is a superset of Precedence Climbing.

For Aurel:

Precedence Climbing is pretty cool.  Basically, you set the precedence of your operators (either via a table or returned via a function), and do the same for associativity. Then you write two relativity compact functions, and you're all set.  Of course once you start adding other operators and strings and other data types, the functions become larger.  But the principles remain the same.

For precedence, typically, power has the highest precedence, followed by unary operators, followed by multiplication/division/modulus, followed by addition/subtraction, then relational operators, equality operators, and finally logical operators.

Easiest way is to just use a case statement and return the precedence of the operator.

Code Snippet: [Select]
function unary_prec(op as string)
   select case op
       case "+", "-" : unary_prec = 3
       case else     : unary_prec = 0
   end select
end function

Code Snippet: [Select]
function binary_prec(op as string)
   select case op
       case "+", "-" : binary_prec = 1
       case "*", "/" : binary_prec = 2
       case "^"      : binary_prec = 4
       case else     : binary_prec = 0
   end select
end function

Associativity describes which side of a binary expression is processed first.  Most operators are left-associative, however, generally the power operator (^ or **) is right-associative.

Code Snippet: [Select]
function associativity(op as string)
   associativity = left_assoc
   if op = "^" then associativity = right_assoc
end function

Once you've got that done, you only need to write two functions (ignoring lexing, and a few simple house-cleaning functions): The main expression driver:

Code Snippet: [Select]
function expr#(p as integer)
   dim n  as double
   dim op as string
   dim q  as integer

   n# = primary#
   while is_binary_operator(sym) and binary_prec(sym) >= p
       op = sym

       call get_sym
       select case associativity(op)
           case right_assoc : q = binary_prec(op)
           case left_assoc  : q = 1 + binary_prec(op)
       end select

       select case op
           case "*" : n# = n# * expr#(q)
           case "/" : n# = n# / expr#(q)
           case "+" : n# = n# + expr#(q)
           case "-" : n# = n# - expr#(q)
           case "^" : n# = n# ^ expr#(q)
       end select
   wend

   expr# = n#
end function

And the function to handle non-binary operators and operands:

Code Snippet: [Select]
function primary#()
   dim op as string

   primary# = 0                    'prepare for errors
   if is_unary_operator(sym) then
       op = sym
       call get_sym
       select case op
           case "-" : primary# = -expr#(unary_prec(op))
           case "+" : primary# = expr#(unary_prec(op))
       end select
   elseif sym = "(" then
       call get_sym
       primary# = expr#(0)
       if sym <> ")" then
           print "expecting ')'"
       else
           call get_sym
       end if
   elseif is_digit(left$(sym, 1)) then
       primary# = val(sym)
       call get_sym
   else
       print "syntax error: expecting a primary"
   end if
end function

Complete code:

Code Snippet: [Select]
'Simple calculator.  Based on http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm
'
'Accepts */+-^ and parenthesis.
'
'Precedence:
'   ^ (power)
'   unary +,-
'   *, /
'   +, -
'
'Written by Ed Davis.  Contact: ed_davis2 at that yahoo place.
'Use at your own risk.

const right_assoc=0, left_assoc=1
dim shared user_str as string
dim shared sym as string
dim shared the_char as string

call main

function is_digit(ch as string)
   is_digit = ch >= "0" and ch <= "9"
end function

sub next_char()
   the_char = left$(user_str, 1)
   if len(user_str) <= 1 then
       user_str = ""
   else
       user_str = right$(user_str, len(user_str) - 1)
   end if
end sub

sub get_sym()
   user_str = ltrim$(user_str)
   call next_char
   sym = the_char
   select case sym
       case "(", ")", "+", "-", "*", "/", "^"  'all set
       case "0" to "9"
           sym = ""
           do
               sym = sym + the_char
               call next_char
           loop while is_digit(the_char)
           user_str = the_char + user_str
       case ""
       case else
           print "unrecognized character:", sym
           sym = ""
   end select
end sub

function is_unary_operator(op as string)
   select case op
       case "+", "-" : is_unary_operator = -1
       case else : is_unary_operator = 0
   end select
end function

function is_binary_operator(op as string)
   select case op
       case "+", "-", "*", "/", "^" : is_binary_operator = -1
       case else : is_binary_operator = 0
   end select
end function

function unary_prec(op as string)
   select case op
       case "+", "-" : unary_prec = 3
       case else : unary_prec = 0
   end select
end function

function binary_prec(op as string)
   select case op
       case "+", "-" : binary_prec = 1
       case "*", "/" : binary_prec = 2
       case "^" : binary_prec = 4
       case else : binary_prec = 0
   end select
end function

function associativity(op as string)
   associativity = left_assoc
   if op = "^" then associativity = right_assoc
end function

function primary#()
   dim op as string

   primary# = 0                    'prepare for errors
   if is_unary_operator(sym) then
       op = sym
       call get_sym
       select case op
           case "-" : primary# = -expr#(unary_prec(op))
           case "+" : primary# = expr#(unary_prec(op))
       end select
   elseif sym = "(" then
       call get_sym
       primary# = expr#(0)
       if sym <> ")" then
           print "expecting ')'"
       else
           call get_sym
       end if
   elseif is_digit(left$(sym, 1)) then
       primary# = val(sym)
       call get_sym
   else
       print "syntax error: expecting a primary"
   end if
end function

function expr#(p as integer)
   dim n as double
   dim op as string
   dim q as integer

   n# = primary#
   while is_binary_operator(sym) and binary_prec(sym) >= p
       op = sym

       call get_sym
       select case associativity(op)
           case right_assoc : q = binary_prec(op)
           case left_assoc  : q = 1 + binary_prec(op)
       end select

       select case op
           case "*" : n# = n# * expr#(q)
           case "/" : n# = n# / expr#(q)
           case "+" : n# = n# + expr#(q)
           case "-" : n# = n# - expr#(q)
           case "^" : n# = n# ^ expr#(q)
       end select
   wend

   expr# = n#
end function

sub main()
   do
       line input "Enter expression:", user_str
       if user_str = "" then exit do
       call get_sym
       print expr#(0)
   loop
end sub

If you have any questions, I am happy to answer them.

Update: I tried to make my original version a little smaller, and of course I introduced a bug, in the scanner.  The version in this message has been updated.
Find all posts by this user
Like Post
07-01-2017, 12:55 PM (This post was last modified: 07-01-2017 01:25 PM by bplus.)
Post: #10
 (Print Post)
RE: minimalPCP - precedence climbing parser
WELCOME Ed Davis

Some light shined on this subject! Wink

Thanks

APPEND: Oh, researching flukiluke's program, I learned that Python already had EVAL as well as EXEC in DOC's. This may be something also to consider.

B += _
Find all posts by this user
Like Post



Forum Jump:


User(s) browsing this thread: 1 Guest(s)




QB64 Member Project - OpenGL Triangles
QB64 Member Project - Exit
QB64 Member Project - Spinning Color Wheel
QB64 Member Project - Martin Fractals version two
QB64 Member Project - Score 4
QB64 Member Project - MAPTRIANGLE
QB64 Member Project - Color Rotating Text
QB64 Member Project - Kings Vallery version two
QB64 Member Project - Kobolts Monopoly
QB64 Member Project - Input
QB64 Member Project - 9 Board
QB64 Member Project - Martin Fractals version four
QB64 Member Project - Point Blank
QB64 Member Project - Qubic
QB64 Member Project - ARB Checkers
QB64 Member Project - STxAxTIC 3D World
QB64 Member Project - Red Scrolling LED Sign
QB64 Member Project - Dakapo
QB64 Member Project - Kings Valley verion one
QB64 Member Project - Splatter
QB64 Member Project - Martin Fractals version three
QB64 Member Project - Rubix's Magic
QB64 Member Project - Kings Court
QB64 Member Project - Full Color LED Sign
QB64 Member Project - Swirl
QB64 Member Project - Blokus
QB64 Member Project - Spiro Roses
QB64 Member Project - Touche
QB64 Member Project - Domain
QB64 Member Project - Martin Fractals version one
QB64 Member Project - Bowditch curve
QB64 Member Project - Pivot version two
QB64 Member Project - Sabotage
QB64 Member Project - Othello
QB64 Member Project - Quarto
QB64 Member Project - Connect Four
QB64 Member Project - Amazon
QB64 Member Project - Inside Moves
QB64 Member Project - Dreamy Clock
QB64 Member Project - Basic Dithering
QB64 Member Project - Foursight
QB64 Member Project - RGB Color Wheel
QB64 Member Project - Overboard
QB64 Member Project - Isolation
QB64 Member Project - Line Thickness
QB64 Member Project - Pivet version one
QB64 Member Project - Rotating Background
QB64 Member Project - Algeria Weather
QB64 Member Project - Color Triangles