A simple way to represent a mathematical expression in a form that a calculator can easily evaluate is to represent it as a Binary Expression Tree. This is a data structure based on the concept of a

In order to evaluate this expression, a post-order traversal of the tree is done. That is, starting from the

(5-(6/2))+(3*4) = (5-3)+(3*4) = 2+12 = 14

This document presents a new, simple and very efficient iterative algorithm to parse a mathematical expression into a Binary Expression Tree. The expression is given in the generally used form also known as infix notation. For an expression of size n the performance is O(n), with a very small implementation constant. The algorithm is a type of operator-precedence parser. It was originally developed to parse the equation of a graph in the Graph Plotter 3D software.

- The basic idea
- Using right associative operators
- Using the open and close brackets
- Using functions like sin() and cos()
- Using the negative and factorial operators
- Using the comparison and logical operators
- Analysis of performance and correctness
- Example implementation of the algorithm
- Parsing while evaluating with stack based algorithm
- Example implementation of stack based algorithm
- Links
- See also

- Get the first item and initialise the tree with it.
- Currently the root node is also the current node. The
**current node**is the node we currently lie on. - Get the next item. This will be called the
**new item**. **Climb up**the tree as long as the precedence of the current node item is*greater than or equal to*that of the new item. When this is over, the current node item has a precedence*strictly less than*that of the new item.- Create a new node for the new item.
**Then set**the left child of the new node to be the old right child of the current node. Finally set the new right child of the current node to be the new node (also set the parent of the new node). - Set the
*current node*to be the new node. - Repeat steps 3, 4, 5 and 6 till there is no item left.

It can be observed that 1)

Now consider the following list of operators and their precedence levels:

- operator
**plus +**with precedence**2** - operator
**minus -**with precedence**2** - operator
**times ***with precedence**4** - operator
**divide /**with precedence**4**

- Step 4: Climb up the tree as long as the precedence of the current node item is
*strictly greater than*that of the new item. When this is over, the current node item has a precedence*less than or equal to*that of the new item.

This is inserted as if it had highest precedence. This essentially means

First of all the slightly modified version of step 4 as used by right-associative operators is performed. The purpose is to climb up the tree and stop at the first encounter of an '(' node. Then what is done is simple:

- Step 5: Set the new right child of the parent node to be the old right child of the current node, then delete the current node.
- Step 6: Set the
*current node*to be the parent node.

Lets add the factorial

Our complete list of operators is now:

- operators
**( and )**with precedence**1** - operator
**plus +**with precedence**2** - operator
**minus -**with precedence**2** - operator
**negative -**with precedence**3** - operator
**times ***with precedence**4** - operator
**divide /**with precedence**4** - operator
**exponent ^**with precedence**5** - operator
**factorial !**with precedence**6**

Lets also try to fit in the following three logical operators into our list of operators:

- operator
with precedence*or***1.2** - operator
with precedence*and***1.4** - operator
with precedence*not***1.6**

The logical operators

The logical operator

The comparison operators are non-associative. That is the expression (4==a==9) cannot be evaluated as ((4==a)==9) nor as (4==(a==9)). The reason is that 4==a evaluates to either true or false, but then the expression (true==9) is invalid. Actually what we will want is for the expression to evaluate as (4==a

The performance analysis of the algorithm is very simple. The key is to note that

- On step 5 where the node is created and set.
- On step 4 when the node is passed by another climbing up.

Therefore the algorithm is given n items, creates n nodes for each of these and may visit a node again only once. This gives an algorithmic performance of O(2n). However step 5 involves a larger implementation constant than step 4.

The necessary and sufficient condition for the constructed binary expression tree to be valid is for it to respect the precedence orders of operations, which in turn will ensure proper order of evaluation of these operations. Therefore the algorithm is correct if it guarantees a valid constructed tree.

Respecting precedence orders implies that items with higher precedence levels must be found lower down the tree. Notice that this is exactly what step 4 of the algorithm ensures when it

Some operators exhibit the important behaviour of skipping step 4 of the algorithm. This simply goes by the nature of the operators themselves. That is for the '(' and '-ve' operators for example, their first encounter implies highest precedence on insertion, which implies skipping step 4.

```
/*
To compile with GCC, execute:
gcc -Wall -ansi -pedantic -o calc expression-parsing-algorithm.c -lm
To run, execute: ./calc 1+1 5-6/2+3*4
*/
#include <stdio.h>
#include <string.h>
#include <malloc.h>
#include <math.h>
void* parse_expression (const char* expression);
double evaluate_expression_tree (const void* tree);
void print_expression_tree (const void* tree);
void delete_expression_tree (void* tree);
int main (int argc, char** argv)
{
void* tree;
double result;
int i;
if(argc<=1) printf("Usage: <program> <expression1> <expression2> ...\n");
for(i=1; i<argc; i++)
{
printf("________________________________%d_\n", i);
tree = parse_expression(argv[i]);
if(tree==NULL) continue;
print_expression_tree(tree);
result = evaluate_expression_tree(tree);
delete_expression_tree(tree);
printf("result = %f\n", result);
}
return 0;
}
/* ... rest of source code ... */
```

Download the complete C source code

Below is the portion of the source code that implements the algorithm.

```
typedef struct _Node {
enum NodeID ID;
int pre; /* precedence */
double number;
struct _Node *left, *right, *parent;
} Node;
Node* insert_node_item (Node* current, Node item, enum NodeInfo info)
{
Node* node;
if(info != SkipClimbUp)
{
/* step 4: climb up */
if(info != RightAssociative)
{
/* for left-associative */
while(current->pre >= item.pre)
current = current->parent;
}
else
{
/* for right-associative */
while(current->pre > item.pre)
current = current->parent;
}
}
if(item.ID == CLOSEBRACKET)
{
/* step 5.1: remove the '(' node */
node = current->parent;
node->right = current->right;
if(current->right) current->right->parent = node;
/* step 5.2: delete the '(' node */
memset(current, 0, sizeof(Node));
free(current);
/* step 6: Set the 'current node' to be the parent node */
current = node;
}
else
{
/* step 5.1: create the new node */
node = malloc(sizeof(Node));
*node = item;
node->right = NULL;
/* step 5.2: add the new node */
node->left = current->right;
if(current->right) current->right->parent = node;
current->right = node;
node->parent = current;
/* step 6: Set the 'current node' to be the new node */
current = node;
}
return current;
}
```

It turns out that this stack-based version is essentially the Precedence Climbing algorithm, for the reason that it considers a

...to be updated...

Consider donating at https://www.rhyscitlema.com/donate to encourage improvement!

```
/*
To compile with GCC, execute:
gcc -Wall -ansi -pedantic -o calc expression-parse-and-evaluate.c -lm
To run, execute: ./calc 1+1 5-6/2+3*4
*/
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include <math.h>
double parse_and_evaluate_expression (const char* expression);
int main (int argc, char** argv)
{
double result;
int i;
if(argc<=1) printf("Usage: <program> <expression1> <expression2> ...\n");
for(i=1; i<argc; i++)
{
printf("________________________________%d_\n", i);
result = parse_and_evaluate_expression(argv[i]);
printf("result = %f\n", result);
}
return 0;
}
/* ... rest of source code ... */
```

Download the complete C source code

Below is the portion of the source code that implements the algorithm.

```
typedef struct _Node {
enum NodeID ID;
int pre; /* precedence */
double number;
} Node;
Node* insert_node_item (Node* current, Node item, enum NodeInfo info)
{
double left, right;
if(info == SkipClimbUp) current+=1;
else
{
while(1)
{
if(info == RightAssociative)
{ if(current->pre <= item.pre) break; }
else { if(current->pre < item.pre) break; }
left = (current-1)->number;
right = (current+1)->number;
switch(current->ID)
{
case POSITIVE: ( current)->number = +right; break;
case NEGATIVE: ( current)->number = -right; break;
case PLUS: (--current)->number = left + right; break;
case MINUS: (--current)->number = left - right; break;
case TIMES: (--current)->number = left * right; break;
case DIVIDE: (--current)->number = left / right; break;
case EXPONENT: (--current)->number = pow(left, right); break;
case FACTORIAL: (--current)->number = factorial(left); break;
case SIN: ( current)->number = sin(right); break;
case COS: ( current)->number = cos(right); break;
case TAN: ( current)->number = tan(right); break;
default: break;
}
current->ID = NUMBER; /* force node to be a number */
current->pre = 10; /* set to highest precedence */
current--; /* prepare for next iteration */
}
switch(item.ID)
{
case CLOSEBRACKET:
*current = *(current+1);
/* current-=1; this part is optional */
break;
case PLUS:
case MINUS:
case TIMES:
case DIVIDE:
case EXPONENT:
case FACTORIAL: current+=2; break;
default: current+=1; break;
}
}
if(item.ID != CLOSEBRACKET)
*current = item;
return current;
}
```

- http://en.wikipedia.org/wiki/Binary_expression_tree
- https://en.wikipedia.org/wiki/Tree_traversal
- https://en.wikipedia.org/wiki/Infix_notation
- https://en.wikipedia.org/wiki/Time_complexity#Linear_time
- http://en.wikipedia.org/wiki/Operator-precedence_parser
- https://github.com/rhyscitlema/librfet

- Evaluate any math expression online
- Convert sorted list to complete BST
- Generate nth palindromic number
- Find maximum bipartite matching