← Home

Calculator with Functional Parsing

2025-09-01


Writing an expression parser evaluator using Scala and Cats

Introduction

To write a program that could take input such as "(1 + 3) * (4 * 5 - 6)" return the answer, we need a program that parses the data and evaluates it. Parsing is reading input and extracting data.

The structure of one possible example of a parser could be represented as this signature.

type Parser[A] = String => Option[(String, A)]

So Parser[A] is a function that consumes some text input and depending on that input may return the Some case of Option containing the parsed value and the rest of the input to be consumed by subsequent parsers, or else fail with None. A parser for parsing integers could be a Parser[Int] and may consume some characters from the string, returning the parsed number and the rest of the string.

Parsing step
”56+42”Some((“+42”, 56))

The next useful thing to understand is the State Monad. and for our purposes the StateT monad transformer. You can think of it like this.

case class StateT[F[_], S, A](f: S => F[(S, A)])

StateT is a monad so it has flatMap for chaining StateT computations together. It therefore is also an Applicative (so we can use mapN and we can use it in a traverse) and a Functor (so we can use map).

It is useful for our purposes because it essentially is a wrapper for our Parser form above.

Parser as StateT
Parser[A]StateT[String, Option, A]

And we can use the flatMap functions on StateT to chain together parse steps.

val parseAddition: Parser[Int] = for {
  x <- parseInt
  _ <- parseChar('+')
  y <- parseInt
} yield x + y
Parsing illustration
parseAddition.run(“56+42”)Some((""), 100)

Writing our first parsers

def anyChar: Parser[Char] = StateT { s =>
  if s.isEmpty then None
  else Some((s.substring(1), s.charAt(0)))
}

The above parser is the most simple, it accepts the first character of the string and returns it along with the rest of the string, or fails if empty.

a non empty string
anyChar.run(“abc”)Some((“bc”), ‘a’)
an empty string
anyChar.run("")None
def ifChar(p: Char => Boolean): Parser[Char] = anyChar.filter(p)

The above parser takes a predicate and returns the character if the predicate returns true. While the state monad doesn’t have the filter function. StateT[F, *] can use the filter from F if it has one, (in this case the filter on Option). So it may turn a Some into a None.

def digit: Parser[Char] = ifChar(_.isDigit)
a valid digit
digit.run(“42”)Some((“2”, ‘4’))
an invalid digit
digit.run(“abc”)None

In order to use a parser as many times as possible we need to build two more combinators.

def some[A](fa: Parser[A]): Parser[List[A]] = for {
  a <- fa
  as <- many(fa)
} yield a :: as
 
def many[A](fa: Parser[A]): Parser[List[A]] =
  some(fa) <+> List.empty[A].pure[Parser]

They are defined in terms of each other. Both take a Parser[A] and consume as many As from the stream until it fails and return them in a list. However some will return at least one A or fail. And many may return an empty list if it is unable to parse any As. Here we use <+> which isn’t a function of StateT but comes from Option’s Alternative instance. You can think of it as or else.

behaviour of some
some(digit).run(“42+58”)Some((“+58”, List(‘4’, ‘2’)))
some(digit).run(“+58”)None
behaviour of many
many(digit).run(“42+58”)Some((“+58”, List(‘4’, ‘2’)))
some(digit).run(“+58”)Some((“+58”, List()))

Now we have some we can write a parser for natural numbers (positive or zero). Also negative numbers. And then integers in general.

val nat: Parser[Int] = some(digit).map(_.mkString.toInt)
def char(c: Char): Parser[Char] = ifChar(_ == c)
val neg: Parser[Int] = char('-') *> nat.map(-_)
val int: Parser[Int] = neg <+> nat
int behaviour
int.run(“42+58”)Some((“+58”, 42))

We have a lot of the techniques we need do do a whole expression parser now. If we refer to Professor Graham Hutton’s Functional Parsing Video on youtube. He defines an expression parser like this

-- expr ::= term + expr | term
-- term ::= factor * term | factor
-- factor ::= (expr) | int

This conforms to BIDMAS because while that tells us to do brackets … then multiplication … then addition. This is a bottom up way of thinking. To think of it in the order of a recursive descent parser you need to reverse those. That is, you could think of it in a top down way as being additions of multiplications of expressions in brackets (or numbers).

We can translate this directly to parsers like this.

def expr: Parser[Int] = {
  for {
    t <- term
    _ <- char('+')
    e <- expr
  } yield t + e
} <+> term
 
def term: Parser[Int] = {
  for {
    f <- factor
    _ <- char('*')
    t <- term
  } yield f * t
} <+> factor
 
def factor: Parser[Int] = {
  for {
    _ <- char('(')
    e <- expr
    _ <- char(')')
  } yield e
} <+> int

Let’s extract some common logic and rewrite expr and term:

def add: Parser[(Int, Int) => Int] = char('+').as(_ + _)
 
def multiply: Parser[(Int, Int) => Int] = char('*').as(_ * _)
 
def chainR[A](fa: Parser[A], op: Parser[(A, A) => A]): Parser[A] = {
  for {
    a <- fa
    f <- op
    r <- chainR(fa, op)
  } yield f(a, r)
} <+> fa
 
def expr: Parser[Int] = chainR(term, add)
 
def term: Parser[Int] = chainR(factor, multiply)
expr behaviour
expr.run(“1+2*(3+4)+3*5”)Some(("", 30))

Nice!

If we expand this to include subtraction and integer division however:

def subtract: Parser[(Int, Int) => Int] = char('-').as(_ - _)
 
def divide: Parser[(Int, Int) => Int] = char('/').as(_ / _)
 
def expr: Parser[Int] = chainR(term, add <+> subtract)
 
def term: Parser[Int] = chainR(factor, multiply <+> divide)

We find that it doesn’t behave how we would wish it to:

expr behaviour using chainR
expr.run(“6-3+2-1”)Some(("", 2))

This isn’t what we would expect our expression should return (4). The reason for this is that chainR is right associative. So it evaluates 6-3+2-1 as 6-(3+(2-1)). But addition and subtraction together needs to associate to the left. As ((6-3)+2)-1. As does multiplication and division.

Our chainR function is quite analogous to foldRight which is named as such as it associates to the right.

def foldRight[A, B](l: List[A], z: B, f: (A, B) => B): B = l match {
  case Nil     => z
  case a :: as => f(a, foldRight(as, z, f))
}
foldRight behaviour
foldRight(List(a, b, c), z, f)f(a, f(b, f(c, z)))

So to remedy our function we want something more akin to foldLeft.

def foldLeft[A, B](l: List[A], z: B, f: (B, A) => B): B = l match {
  case Nil     => z
  case a :: as => foldLeft(as, f(z, a), f)
}
foldLeft behaviour
foldRight(List(a, b, c), z, f)f(f(f(z, a), b), c)

The implementation gives us a clue: For our new function chainL, instead of passing our recursion step to our binary operator, we need to apply arguments and then recurse. Here is one possible solution.

def chainL[A](fa: Parser[A], op: Parser[(A, A) => A]): Parser[A] = {
  def rest(x: A): Parser[A] = {
    for {
      f <- op
      y <- fa
      r <- rest(f(x, y))
    } yield r
  } <+> x.pure[Parser]
  fa.flatMap(rest)
}

With expr and term using chainL they now work correctly

def expr: Parser[Int] = chainL(term, add <+> subtract)
 
def term: Parser[Int] = chainL(factor, multiply <+> divide)
expr behaviour using chainL
expr.run(“6-3+2-1”)Some(("", 4))