Test Driven Development - A practical Example - Page 10 of 12

Step Ten

We have implemented all the basic operators for arbitrarily long expressions. Only one item remains on our list. In this step we will start to implement bracketed expressions. This last feature will show if our current architecture is extendable, as we will have to touch all parts of the application.

We need to extend Tokenizer to accept brackets. ExpressionValidator has to check the order of the brackets is valid. Last but not least, our Calculator class needs to deal with the brackets in the expression.

Let’s start again with some integration tests.

Integration Tests

In the beginning, we have to define how exactly we are going to use bracketed expressions. We make the following definitions:

We use parentheses only: ‘(‘ and ‘)’.
Nested brackets are allowed, like in: (8 * (3 – 2)) + 2
A pair of brackets must contain a valid expression, which is at least two operands and an operator.

We put these rules into integration tests. We start with some valid examples:

def test_valid_brackets(self):
        output = self.app.calculate("(3 + 5) * 2")
        self.assertEqual(output, "16")

        output = self.app.calculate("(8 * (3 - 2)) + 2 ")
        self.assertEqual(output, "10")

        output = self.app.calculate("20 / ((5 - 2.5) + (1 + 0.5) )")
        self.assertEqual(output, "5")

The test fails, because the result is “Invalid Expression” instead of the expected result. This is what we expect, because Tokenizer does not know about brackets yet.

Next we need some examples for invalid brackets.

def test_empty_brackets(self):
    output = self.app.calculate("2 * ()")
    self.assertEqual(output, "Invalid expression")

def test_bracket_without_closing_bracket(self):
    output = self.app.calculate("2 * ( 6")
    self.assertEqual(output, "Invalid expression")

def test_bracket_without_opening_bracket(self):
    output = self.app.calculate("2 * 6)")
    self.assertEqual(output, "Invalid expression")

def test_too_many_opening_brackets(self):
    output = self.app.calculate("2 * ((6 + 3)")
    self.assertEqual(output, "Invalid expression")

def test_too_many_closing_brackets(self):
    output = self.app.calculate("2 * (6 + 3))")
    self.assertEqual(output, "Invalid expression")

def test_bracket_containing_invalid_expression(self):
    output = self.app.calculate("2 * (6 + )")
    self.assertEqual(output, "Invalid expression")

These tests pass now, because Tokenizer raises an exception every time it encounters a bracket.

With our tests in place, let’s start extending our classes step by step.

Tokenizing Brackets

Right now, Tokenizer does not know how to interpret brackets. We have to teach it how. To do this, we will need a new enum to represent brackets. We call it Bracket and give it two members:

from enum import Enum

class Bracket(Enum):
    Opening = 1
    Closing = 2

With this enum, we can formulate a unit test to tokenize brackets:

def test_tokenize_brackets(self):
    output = self.tokenizer.tokenize("(1.23 + -3)")
    self.assertEqual(output, [Bracket.Opening, Number(1.23), Operator.Addition, Number(-3), Bracket.Closing])

This test fails because Tokenizer raises a SyntaxError. Let’s add the brackets to the regular expression of Tokenizer. It now looks like this:

expr = re.compile(r"-?\d+.\d+|-?\d+|[+*/\-()]")

We simply add the brackets to the last part of the expression that currently extracts all the operators. Now that the expression captures the brackets, we have to convert them to instances of the Bracket enum. This is done in create_token:

def create_token(self, element):
    if self.can_be_converted_to_int(element):
        return Number(int(element))
    elif self.can_be_converted_to_float(element):
        return Number(float(element))
    elif(element == '+'):
        return Operator.Addition
    elif(element == '-'):
        return Operator.Subtraction
    elif(element == '*'):
        return Operator.Multiplication
    elif(element == '/'):
        return Operator.Division
    elif(element == '('):
        return Bracket.Opening
    elif(element == ')'):
        return Bracket.Closing
    else:
        raise SyntaxError("Invalid Token")

This change makes the new unit test pass. Tokenizer handles brackets correctly now.

At this stage, only the integration test for the valid expressions with brackets fails. The tests for invalid expressions still pass, even though Tokenizer should not raise an exception anymore. This means that ExpressionValidator also considers the expressions incorrect. Let’s tackle this next.

Validating Expressions with Brackets

To validate an expression with brackets, we must ensure that two conditions apply. First, the number of opening and closing brackets must be the same. Second, each pair of brackets must contain a valid expression.

Going through the expression from left to right, we count the number of opening and closing brackets. At any given time, the number of closing brackets must never be higher than the number of opening brackets. At the end of the expression, the number of both bracket types must be the same.

To check if a pair of brackets contains a valid expression, we go from left to right again. We check the first pair of brackets and extract its content. We check if this subexpression is a valid expression. As this subexpression might contain brackets again, we will have to create a recursive algorithm.

If the expression fulfills these conditions, we consider it valid. Let’s implement these checks test-driven.

We start with a unit test that gives several examples of valid and invalid expressions with brackets.

def test_expressions_with_brackets(self):
    # One pair of Brackets
    isValid = self.validator.validate([Number(1), Operator.Subtraction, Bracket.Opening, Number(3), Operator.Addition, Number(5), Bracket.Closing, Operator.Addition, Number(2)])
    self.assertTrue(isValid)

    # Nested Brackets
    isValid = self.validator.validate([Bracket.Opening, Bracket.Opening, Number(3), Operator.Addition, Number(5), Bracket.Closing, Operator.Addition, Number(2), Bracket.Closing])
    self.assertTrue(isValid)

    # Closing Bracket missing
    isValid = self.validator.validate([Bracket.Opening, Number(3), Operator.Addition, Number(5), Operator.Addition, Number(2)])
    self.assertFalse(isValid)

    # Opening Bracket missing
    isValid = self.validator.validate([Number(3), Operator.Addition, Number(5), Bracket.Closing, Operator.Addition, Number(2)])
    self.assertFalse(isValid)

    # Invalid sub-expression in bracket
    isValid = self.validator.validate([Bracket.Opening, Number(3), Number(5), Bracket.Closing, Operator.Addition, Number(2)])
    self.assertFalse(isValid)

We run the test to assert that it fails. The first check fails in the test fails, which means that ExpressionValidator considers this expression invalid. We have to fix that. Let’s look at the current implementation:

from Number import Number
from Operator import Operator

class ExpressionValidator:
    def validate(self, tokens):
        if(len(tokens) == 0):
            return True

        if(len(tokens) >= 3 and len(tokens) % 2 == 1):
            even_elements = tokens[::2]
            odd_elements = tokens[1::2]
            if(all(isinstance(x, Number) for x in even_elements) and all(isinstance(x, Operator) for x in odd_elements)):
                return True

        return False

As we can see, the current implementation checks that all elements are either of type Number or of type Operator. We will have to remodel this check, because with brackets in place, we cannot rely on the simple rule of odd and even elements.

Before we can change that rule, we need to check that the brackets match. If they don’t match, then it does not make sense to compare the expression even further.

from Number import Number
from Operator import Operator
from Bracket import Bracket

class ExpressionValidator:
    def validate(self, tokens):
        if(len(tokens) == 0):
            return True

        if self.compare_brackets(tokens) == False:
            return False

        if(len(tokens) >= 3 and len(tokens) % 2 == 1):
            even_elements = tokens[::2]
            odd_elements = tokens[1::2]
            if(all(isinstance(x, Number) for x in even_elements) and all(isinstance(x, Operator) for x in odd_elements)):
                return True

        return False

    def compare_brackets(self, tokens):
        num_opening_brackets = 0
        num_closing_brackets = 0
        for token in tokens:
            if isinstance(token, Bracket):
                if(token == Bracket.Opening):
                    num_opening_brackets += 1
                elif(token == Bracket.Closing):
                    num_closing_brackets += 1

            if(num_closing_brackets > num_opening_brackets):
                return False

        if(num_opening_brackets == num_closing_brackets):
            return True
        else:
            return False

This does not fix our test yet, but is one step further. Now we can extract the content of the brackets and check if it is valid.

class ExpressionValidator:
    def validate(self, tokens):
        if(len(tokens) == 0):
            return True

        if self.compare_brackets(tokens) == False:
            return False

        while self.contains_brackets(tokens):
            tokens = self.replace_brackets(tokens)

        if self.is_valid_expression(tokens) == False:
            return False

        return True

    def compare_brackets(self, tokens):
        num_opening_brackets = 0
        num_closing_brackets = 0
        for token in tokens:
            if isinstance(token, Bracket):
                if(token == Bracket.Opening):
                    num_opening_brackets += 1
                elif(token == Bracket.Closing):
                    num_closing_brackets += 1

            if(num_closing_brackets > num_opening_brackets):
                return False

        if(num_opening_brackets == num_closing_brackets):
            return True
        else:
            return False

    def contains_brackets(self, tokens):
        for token in tokens:
            if isinstance(token, Bracket) and token == Bracket.Opening:
                return True

        return False

    def is_valid_expression(self, tokens):
        if(len(tokens) >= 1 and len(tokens) % 2 == 1):
            even_elements = tokens[::2]
            odd_elements = tokens[1::2]
            if(all(isinstance(x, Number) for x in even_elements) and all(isinstance(x, Operator) for x in odd_elements)):
                return True

        return False

    def replace_brackets(self, tokens):
        """Replaces the first bracketed expression with either a Number if the bracket contains a valid Expression or with None if it is not valid"""
        opening_index, closing_index = self.find_first_pair_of_brackets(tokens)

        if(opening_index >= 0 and closing_index >= 0):
            subexpression = tokens[opening_index + 1 : closing_index]

            if self.contains_brackets(subexpression):
                subexpression = self.replace_brackets(subexpression)

            if self.is_valid_expression(subexpression):
                replacement_value = Number(0)
            else:
                replacement_value = None

            if(opening_index == 0):
                first_part = []
            else:
                first_part = tokens[0:opening_index]

            if(closing_index == len(tokens)-1):
                last_part = []
            else:
                last_part = tokens[closing_index+1: ]

            new_expression = first_part + [replacement_value] + last_part
            return new_expression

        else:
            return tokens

    def find_first_pair_of_brackets(self, tokens):
        num_open_brackets = 0
        opening_index = -1
        closing_index = -1
        for token_index, token in enumerate(tokens):
            if isinstance(token, Bracket):
                if token == Bracket.Opening:
                    num_open_brackets += 1

                    if opening_index == -1:
                        opening_index = token_index

                if token == Bracket.Closing:
                    num_open_brackets -= 1
                    if(num_open_brackets == 0):
                        closing_index = token_index
                        break

        return (opening_index, closing_index)

This implementation fixes the unit test for expressions containing brackets. It works as follows:

Check if all brackets match. Otherwise, it does not make sense to continue.
Replace all brackets in the expression with replacement values.
- Find the first opening bracket and the corresponding closing bracket.
- If the content of a bracket is a valid expression, replace the bracket with a Number. At this place, we don’t know the result of the expression inside the brackets, but we do know that it is a Number. So we insert a Number with value 0.
- If the content is invalid, we replace it with an instance of None.
- If the bracket contains another bracket, replace the content of the bracket recursively.
After all brackets have been replaced, check if the resulting expression is valid. It is valid if it only contains Numbersand Operators. If it contains an element of type None, then it is invalid.

Summary of Step Ten

At this stage, we can parse expressions with brackets and check if they are valid. Next we need to calculate the result of the expressions. This will be the topic of Step Eleven.

Looking at the code of ExpressionValidator, we see that it has grown considerably to account for the brackets. This functionality seems like a good candidate for refactoring into a separate class. We might even reuse the functionality to extract the content of the brackets when we are going to calculate the expressions.

We will deal with this in the next step.

Tags TDD

Step Ten

Integration Tests

Tokenizing Brackets

Validating Expressions with Brackets

Summary of Step Ten

Leave a Reply Cancel reply