Test Driven Development - A practical Example - Page 11 of 12

Step Eleven

In Step Ten, we started the implementation of bracketed expressions. We can tokenize and validate those expressions. In this step, we are going to implement the calculation of these expressions.

Extract Bracket Handling Code

At the end of the last step, we extended ExpressionValidator to handle brackets. This functionality increased the size of that class several times. These methods are suitable for extraction into a new class. As we are going to need some of these methods for calculation of the bracketed expressions, we do the refactoring at the beginning of this step.

As the new class helps us to resolve expressions that contain brackets, we call it BracketResolver. It looks like this:

class BracketResolver:
    def contains_brackets(self, tokens):
        for token in tokens:
            if isinstance(token, Bracket) and token == Bracket.Opening:
                return True

        return False

    def compare_brackets(self, tokens):
        num_opening_brackets = 0
        num_closing_brackets = 0
        for token in tokens:
            if isinstance(token, Bracket):
                if(token == Bracket.Opening):
                    num_opening_brackets += 1
                elif(token == Bracket.Closing):
                    num_closing_brackets += 1

            if(num_closing_brackets > num_opening_brackets):
                return False

        if(num_opening_brackets == num_closing_brackets):
            return True
        else:
            return False

    def find_first_pair_of_brackets(self, tokens):
        num_open_brackets = 0
        opening_index = -1
        closing_index = -1
        for token_index, token in enumerate(tokens):
            if isinstance(token, Bracket):
                if token == Bracket.Opening:
                    num_open_brackets += 1

                    if opening_index == -1:
                        opening_index = token_index

                if token == Bracket.Closing:
                    num_open_brackets -= 1
                    if(num_open_brackets == 0):
                        closing_index = token_index
                        break

        return (opening_index, closing_index)

Now ExpressionValidator is much shorter.

class ExpressionValidator:
    def __init__(self):
        self.bracket_resolver = BracketResolver()

    def validate(self, tokens):
        if(len(tokens) == 0):
            return True

        if self.bracket_resolver.compare_brackets(tokens) == False:
            return False

        while self.bracket_resolver.contains_brackets(tokens):
            tokens = self.replace_brackets(tokens)

        if self.is_valid_expression(tokens) == False:
            return False

        return True

    def is_valid_expression(self, tokens):
        if(len(tokens) >= 1 and len(tokens) % 2 == 1):
            even_elements = tokens[::2]
            odd_elements = tokens[1::2]
            if(all(isinstance(x, Number) for x in even_elements) and all(isinstance(x, Operator) for x in odd_elements)):
                return True

        return False

    def replace_brackets(self, tokens):
        """Replaces the first bracketed expression with either a Number if the bracket contains a valid Expression or with None if it is not valid"""
        opening_index, closing_index = self.bracket_resolver.find_first_pair_of_brackets(tokens)

        if(opening_index >= 0 and closing_index >= 0):
            subexpression = tokens[opening_index + 1 : closing_index]

            if self.bracket_resolver.contains_brackets(subexpression):
                subexpression = self.replace_brackets(subexpression)

            if self.is_valid_expression(subexpression):
                replacement_value = Number(0)
            else:
                replacement_value = None

            if(opening_index == 0):
                first_part = []
            else:
                first_part = tokens[0:opening_index]

            if(closing_index == len(tokens)-1):
                last_part = []
            else:
                last_part = tokens[closing_index+1: ]

            new_expression = first_part + [replacement_value] + last_part
            return new_expression

        else:
            return tokens

The only method relating to brackets that is left in ExpressionValidator is replace_brackets. This method is very specific for ExpressionValidator, because it replaces the brackets with dummy replacement values that are only useful for the expression evaluation.

We check that our code still works by running the unit tests. Do we need to write unit tests for BracketResolver? In a real application, we should indeed do this, especially as this is a central functionality. In this example, we are going to skip the implementation of the unit test, as this does not show anything new.

Now that we have a class to handle brackets, we can finally implement the calculation.

Calculate the Result of Bracketed Expressions

Let’s add a unit test for our Calculator.

def test_brackets(self):
    # One pair of Brackets
    result = self.calculator.calculate([Number(1), Operator.Multiplication, Bracket.Opening, Number(3), Operator.Addition, Number(5), Bracket.Closing, Operator.Division, Number(2)])
    self.assertEqual(result, 4)

    # Nested Brackets
    result = self.calculator.calculate([Bracket.Opening, Bracket.Opening, Number(3), Operator.Addition, Number(5), Bracket.Closing, Operator.Multiplication, Number(2), Bracket.Closing])
    self.assertEqual(result, 16)

This test fails with an interesting error message:

ERROR: test_brackets (testCalculator.TestCalculator)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".../testCalculator.py", line 60, in test_brackets
    result = self.calculator.calculate([Number(1), Operator.Multiplication, Bracket.Opening, Number(3), Operator.Addition, Number(5), Bracket.Closing, Operator.Division, Number(2)])
  File ".../Calculator.py", line 16, in calculate
    root_term = self.build_term(root_term, *rest[0:2])
  File ".../Calculator.py", line 24, in build_term
    if operator.has_higher_precedence_than(operand_1.get_operator()):
AttributeError: 'Number' object has no attribute 'has_higher_precedence_than'

The output “Number’ object has no attribute ‘has_higher_precedence_than” tells us that something is going wrong building the terms inside calculator. The assumption that three tokens that follow each other are a number, an operator and a number does not hold any longer for bracketed expressions. This means we have to adapt our term building algorithm to handle brackets.

Whenever the term building algorithm encounters an opening bracket, it has to find the content of this bracket and replace it with the result of the corresponding term. To enable the method build_term to do this, it needs to know not only the next three tokens, but all tokens. So our first step is to pass all remaining tokens into the method and let it not only return the new term, but also the rest of the tokens that remain after the next bracket has been processed.

class Calculator:
    def calculate(self, tokens):
        if len(tokens) < 3:
            return 0

        # Build first root term
        root_term, rest = self.build_term(*tokens[0:3], tokens[3:])

        # Consume expression from left to right and update the root term
        while(type(rest) is list and len(rest) > 0):
            root_term, rest = self.build_term(root_term, *rest[0:2], rest[2:])

        # Calculate the expression, starting with the root term
        return self.calculate_term(root_term).get_value()

    def build_term(self, operand_1, operator, operand_2, rest):
        if type(operand_1) is Term:
            if operator.has_higher_precedence_than(operand_1.get_operator()):
                # Rebuild term to reflect higher operator precedence
                term_1 = Term(operand_1.get_operand_2(), operator, operand_2)
                term_2 = Term(operand_1.get_operand_1(), operand_1.get_operator(), term_1)
                return term_2, rest

        # No operator precedence, build new term from left to right
        term = Term(operand_1, operator, operand_2)
        return term, rest

Now we can begin replacing the bracketed expressions. We start with the case that the second operand in an expression is an opening bracket. In this case, we have to extract the complete bracketed expression, determine its content as a term and then calculate the result of that term. Then we can replace the bracketed expression with the resulting number. The code looks like this:

def build_term(self, operand_1, operator, operand_2, rest):
    if type(operand_2) is Bracket and operand_2 == Bracket.Opening:
        term_2, removed_tokens = self.replace_bracket_with_expression([operand_2] + rest)
        rest = rest[removed_tokens-1:]
        operand_2 = self.calculate_term(term_2)

    if type(operand_1) is Term:
        if operator.has_higher_precedence_than(operand_1.get_operator()):
            # Rebuild term to reflect higher operator precedence
            term_1 = Term(operand_1.get_operand_2(), operator, operand_2)
            term_2 = Term(operand_1.get_operand_1(), operand_1.get_operator(), term_1)
            return term_2, rest

    # No operator precedence, build new term from left to right
    term = Term(operand_1, operator, operand_2)
    return term, rest

def replace_bracket_with_expression(self, tokens):
    (opening_index, closing_index) = self.bracket_resolver.find_first_pair_of_brackets(tokens)
    removed_tokens = closing_index + 1

    tokens = tokens[opening_index+1 : closing_index]

    # Build first root term
    root_term, rest = self.build_term(*tokens[0:3], tokens[3:])

    # Consume expression from left to right and update the root term
    while(type(rest) is list and len(rest) > 0):
        root_term, rest = self.build_term(root_term, *rest[0:2], rest[2:])

    return root_term, removed_tokens

In replace_bracket_with_expression, we make use of BracketResolver that we created at the beginning of the step. After determining the content of the expression, we determine a Term for the content of the brackets. It turns out to be the same code as in calculate. This is a good candidate for a refactoring. Let’s keep it in mind for the refactoring step.

This implementation makes the first check in test_brackets pass. But the second step is still failing. The reason is that the second step has a parentheses right at the beginning. We need to handle this case now. If the first operand is an opening bracket, then the resulting term should be the content of this bracket, and the rest should be everything following the bracket. We implement it in the following way:

 def build_term(self, operand_1, operator, operand_2, rest):
    if type(operand_1) is Bracket and operand_1 == Bracket.Opening:
        term_1, rest_after_brackets = self.replace_bracket_with_expression([operand_1, operator, operand_2] + rest)
        rest = rest_after_brackets
        return term_1, rest

    if type(operand_2) is Bracket and operand_2 == Bracket.Opening:
        term_2, rest_after_brackets = self.replace_bracket_with_expression([operand_2] + rest)
        rest = rest_after_brackets
        operand_2 = self.calculate_term(term_2)

    if type(operand_1) is Term:
        if operator.has_higher_precedence_than(operand_1.get_operator()):
            # Rebuild term to reflect higher operator precedence
            term_1 = Term(operand_1.get_operand_2(), operator, operand_2)
            term_2 = Term(operand_1.get_operand_1(), operand_1.get_operator(), term_1)
            return term_2, rest

    # No operator precedence, build new term from left to right
    term = Term(operand_1, operator, operand_2)
    return term, rest

def replace_bracket_with_expression(self, tokens):
    (opening_index, closing_index) = self.bracket_resolver.find_first_pair_of_brackets(tokens)
    rest_after_brackets = tokens[closing_index+1:]

    tokens = tokens[opening_index+1 : closing_index]

    # Build first root term
    root_term, rest = self.build_term(*tokens[0:3], tokens[3:])

    # Consume expression from left to right and update the root term
    while(type(rest) is list and len(rest) > 0):
        root_term, rest = self.build_term(root_term, *rest[0:2], rest[2:])

    return root_term, rest_after_brackets

Running the tests, we don’t get an exception any more, but the result is incorrect:

AssertionError: '13' != '16'

The reason is that if we replace a bracket with a Term, then this Term will not have any special precedence. In this implementation, we lose the precedence of the brackets, and thus the algorithm actually calculates 3 + 5 * 2, which gives 13. A potential remedy would be to return the result of the Term instead of the Term. We try it out and notice that we now receive an exception. All the users of build_term expect a Term as a result, not a Number.

To solve this, we use a little trick to return the result of the bracket as a Term. We simply return a Term with the same value, which can be constructed by multiplying the result with one:

def build_term(self, operand_1, operator, operand_2, rest):
    if type(operand_1) is Bracket and operand_1 == Bracket.Opening:
        term_1, rest_after_brackets = self.replace_bracket_with_expression([operand_1, operator, operand_2] + rest)
        rest = rest_after_brackets
        result = Term(self.calculate_term(term_1), Operator.Multiplication, Number(1))
        return result, rest

     ....

This fixes our unit test for Calculator. It also advances our tests for CalculatorApplication. But we receive a new error now.

Fixing the Integration Tests

FAIL: test_valid_brackets (testIntegrationCalculatorApplication.TestIntegrationCalculatorApplication)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".../testIntegrationCalculatorApplication.py", line 163, in test_valid_brackets
    self.assertEqual(output, "5")
AssertionError: 'Invalid expression' != '5'
- Invalid expression
+ 5

This problem has not been visible before because the result of the preceding examples in that test was incorrect. Now it is correct. It seems that either our Tokenizer or ExpressionValidator does not consider this example valid. Maybe we have missed something and our integration test has found it. The failing example looks like this:

output = self.app.calculate("20 / ((5 - 2.5) + (1 + 0.5) )")
self.assertEqual(output, "5.00")

Let’s check if Tokenizer can handle this input by adding a new example to test_tokenize_brackets:

def test_tokenize_brackets(self):
    output = self.tokenizer.tokenize("(1.23 + -3)")
    self.assertEqual(output, [Bracket.Opening, Number(1.23), Operator.Addition, Number(-3), Bracket.Closing])

    output = self.tokenizer.tokenize("20 / ((5 - 2.5) + (1 + 0.5) )")
    self.assertEqual(output, [Number(20), Operator.Division, Bracket.Opening, Bracket.Opening, Number(5), Operator.Subtraction, Number(2.5), Bracket.Closing, Operator.Addition, Bracket.Opening, Number(1), Operator.Addition, Number(0.5), Bracket.Closing, Bracket.Closing])

The test still passes, so the problem is in ExpressionValidator. Let’s add an example there, too:

def test_expressions_with_brackets(self):
    ...
    isValid = self.validator.validate([Number(20), Operator.Division, Bracket.Opening, Bracket.Opening, Number(5), Operator.Subtraction, Number(2.5), Bracket.Closing, Operator.Addition, Bracket.Opening, Number(1), Operator.Addition, Number(0.5), Bracket.Closing, Bracket.Closing])
    self.assertTrue(isValid)
    ...

That test fails, which proves we have a problem in ExpressionValidator. It turns out the source of the problem is that this expressions contains multiple bracketed expressions inside a bracket, which is not handled correctly. The solution is to call replace_brackets until all brackets have been replaced:

def replace_brackets(self, tokens):
    """Replaces the first bracketed expression with either a Number if the bracket contains a valid Expression or with None if it is not valid"""
    opening_index, closing_index = self.bracket_resolver.find_first_pair_of_brackets(tokens)

    if(opening_index >= 0 and closing_index >= 0):
        subexpression = tokens[opening_index + 1 : closing_index]

        while self.bracket_resolver.contains_brackets(subexpression): # FIX: Replace until all brackets have been replaced
            subexpression = self.replace_brackets(subexpression)

        if self.is_valid_expression(subexpression):
            replacement_value = Number(0)
        else:
            replacement_value = None

        if(opening_index == 0):
            first_part = []
        else:
            first_part = tokens[0:opening_index]

        if(closing_index == len(tokens)-1):
            last_part = []
        else:
            last_part = tokens[closing_index+1: ]

        new_expression = first_part + [replacement_value] + last_part
        return new_expression

    else:
        return tokens

Now all tests pass and we have successfully implemented bracketed expressions.

Refactoring

As we noticed above, we introduced some duplicated code in Calculator::replace_bracket_with_expression. Before we conclude this step, we extract this functionality to a method to get rid of the duplication.

class Calculator:
    def __init__(self):
        self.bracket_resolver = BracketResolver()

    def calculate(self, tokens):
        if len(tokens) < 3:
            return 0

        root_term = self.build_root_term_from_tokens(tokens)
        return self.calculate_term(root_term).get_value()

    def build_root_term_from_tokens(self, tokens):
        # Build first root term
        root_term, rest = self.build_term(*tokens[0:3], tokens[3:])

        # Consume expression from left to right and update the root term
        while(type(rest) is list and len(rest) > 1):
            root_term, rest = self.build_term(root_term, *rest[0:2], rest[2:])

        return root_term

    ...

    def replace_bracket_with_expression(self, tokens):
        (opening_index, closing_index) = self.bracket_resolver.find_first_pair_of_brackets(tokens)
        rest_after_brackets = tokens[closing_index+1:]
        tokens = tokens[opening_index+1 : closing_index]

        root_term = self.build_root_term_from_tokens(tokens)

        return root_term, rest_after_brackets

    ....

We moved this common functionality to build_root_term_from_tokens. Now the code looks much more concise.

Summary of Step Eleven

In this step, we finished the implementation of bracketed expressions.

First, we extracted our implementation to handle brackets from the last step into a separate class. Then we could re-use this code to do the calculation of bracketed expressions.

After the calculation was running, we received another error from the integration tests. With unit tests, we tracked down the source of the error in ExpressionValidator.

With bracketed expressions completed, we cross the last item off our list and have finished the implementation of our Calculator.

In the last step we will look back at our implementation process and recapitulate what we have learned about Test Driven Development.

Tags TDD