
Test Driven Development – A practical Example

Step Five

At this point, our list looks as follows:

  • Addition of decimals
  • Addition of negative numbers
  • Addition of multiple numbers
  • Multiplication
  • Division
  • Bracketed Expressions

The last two items took quite a lot of time. Let’s choose an item that we can implement a little faster in the next step. We continue with the addition of decimals.

The handling of decimals can get complicated because of the precision of floating point calculation. To keep the implementation simple, we make the following assumptions:

  • Decimal numbers must have at most two decimal places.
  • The result is rounded to two decimal places. If the result does not have any decimal places at all, then they shall be omitted.
  • Decimal numbers must be written in the form x.xx. Neither the digits before nor the digits after the dot may be omitted.
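
To make these assumptions concrete, here is how a few expressions should behave according to them, written in the style of our integration tests:

self.app.calculate("2 + 3")        # -> "5"      (integer result, decimal places omitted)
self.app.calculate("1.25 + 1.25")  # -> "2.50"   (always two decimal places)
self.app.calculate("7 + 0.812")    # -> "Invalid expression"  (more than two decimal places)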

Integration Tests

To begin with, we extend the integration tests with some examples of decimal numbers. We also add some examples of invalid expressions involving decimal numbers, e.g. a missing digit after the dot.

def test_invalid_decimals(self):
    # Decimal places missing
    output = self.app.calculate("7. + 81")
    self.assertEqual(output, "Invalid expression")

    # Pre-Decimal places missing
    output = self.app.calculate("7 + .81")
    self.assertEqual(output, "Invalid expression")

    # Too many decimal places
    output = self.app.calculate("7 + 0.812")
    self.assertEqual(output, "Invalid expression")

def test_addition_of_two_decimal_numbers(self):
    output = self.app.calculate("120.5+ 80.6")
    self.assertEqual(output, "201.10")

def test_subtraction_of_two_decimal_numbers(self):
    output = self.app.calculate("120.5- 80.6")
    self.assertEqual(output, "39.90")

The test cases for addition and subtraction fail, because the Tokenizer cannot tokenize decimal numbers yet. On the other hand, the test case for invalid decimals passes.

Tokenization of Decimals

The failure of the integration tests tells us that the Tokenizer needs to be extended to support decimals. We add a test case for tokenizing valid decimals. Additionally, we write a test case for tokenizing invalid decimals:

def test_tokenize_decimals(self):
    output = self.tokenizer.tokenize("120.0 1.23")
    self.assertEqual(output, [Number(120.0), Number(1.23)])

def test_tokenize_invalid_decimals(self):
    with(self.assertRaises(SyntaxError)):
        self.tokenizer.tokenize("1.")

    with(self.assertRaises(SyntaxError)):
        self.tokenizer.tokenize(".1")

    with(self.assertRaises(SyntaxError)):
        self.tokenizer.tokenize("1.234")

The test result confirms that the Tokenizer considers decimals as invalid:

ERROR: test_tokenize_decimals (testTokenizer.TestTokenizer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".../testTokenizer.py", line 30, in test_tokenize_decimals
    output = self.tokenizer.tokenize("120.0 1.23")
  File ".../Tokenizer.py", line 10, in tokenize
    raise SyntaxError("Invalid Token Found")
  File "<string>", line None
SyntaxError: Invalid Token Found

To find the correct regular expression that will capture both integers and decimals, let’s use the interactive Python console. After playing around with some expressions there, we end up with this one:

expr = re.compile(r"\d+\.\d{1,2}|\d+|[+*/\-]")
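
We can convince ourselves in the console that this pattern splits an expression into the tokens we expect; re.findall is used here only to show the matched substrings:

>>> import re
>>> re.findall(r"\d+\.\d{1,2}|\d+|[+*/\-]", "120.5+ 80.6")
['120.5', '+', '80.6']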

But even with this expression, the test fails. This time, the exception is coming from create_token:

def create_token(self, element):
    if(element.isdigit()):
        return Number(int(element))
    elif(element == '+'):
        return Operator.Addition
    elif(element == '-'):
        return Operator.Subtraction
    elif(element == '*'):
        return Operator.Multiplication
    elif(element == '/'):
        return Operator.Division
    else:
        raise SyntaxError("Invalid Token")

The exception happens because element.isdigit() returns False for decimals. We add two helper methods that determine whether the element can be converted to an int or to a float. Depending on the result, we parse the element as an int or a float and pass it into a Number instance, which can hold both kinds of numbers.

def create_token(self, element):
    if self.can_be_converted_to_int(element):
        return Number(int(element))
    elif self.can_be_converted_to_float(element):
        return Number(float(element))
    elif(element == '+'):
        return Operator.Addition
    elif(element == '-'):
        return Operator.Subtraction
    elif(element == '*'):
        return Operator.Multiplication
    elif(element == '/'):
        return Operator.Division
    else:
        raise SyntaxError("Invalid Token")

def can_be_converted_to_int(self, string):
    try:
        value = int(string)
        return True
    except:
        return False

def can_be_converted_to_float(self, string):
    try:
        value = float(string)
        return True
    except:
        return False

This makes the test for the valid tokens pass. But now one of the tests for invalid tokens fails:

FAIL: test_tokenize_invalid_decimals (testTokenizer.TestTokenizer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".../testTokenizer.py", line 41, in test_tokenize_invalid_decimals
    self.tokenizer.tokenize("1.234")
AssertionError: SyntaxError not raised

This test had passed before only because decimals were not recognized at all: every decimal raised a SyntaxError. Now that the Tokenizer handles decimals, it turns out that the regular expression is not completely correct. It accepts decimals with three decimal places. But why is that? Our regular expression explicitly captures only up to two decimal places.

It turns out that the expression does capture the first two decimal places correctly. The remaining digits, however, are then captured by the rule for integers, so the Tokenizer raises no error.
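
Running the pattern over the offending input in the console makes this visible:

>>> re.findall(r"\d+\.\d{1,2}|\d+|[+*/\-]", "1.234")
['1.23', '4']

Instead of rejecting the input, the Tokenizer happily returns Number(1.23) and Number(4).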

It is hard to find a regular expression that enforces this on its own. So we remove the check for the number of decimal places from the expression and move it into can_be_converted_to_float instead. The matched string then contains all decimal places; if the part after the dot is longer than two digits, we treat the string as invalid.
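
With the limit removed, the pattern would look something like this:

expr = re.compile(r"\d+\.\d+|\d+|[+*/\-]")

The extended check in can_be_converted_to_float then looks as follows: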

def can_be_converted_to_float(self, string):
    try:
        value = float(string)

        decimal_places = len(string.split('.')[1])
        if(decimal_places > 2):
            return False

        return True
    except:
        return False

Now all unit tests for Tokenizer pass. And the error messages of the integration tests have changed:

AssertionError: '201.1' != '201.10'
...
AssertionError: '39.900000000000006' != '39.90'

This proves that the addition and subtraction of decimals are already working! Only the rounding of the result string is not yet implemented.

Rounding of the Result

To implement the rounding of the result, we add a new unit test for CalculatorApplication:

def test_decimals_are_rounded(self):
    self.calculator.calculate.return_value = 42.0
    result = self.app.calculate("")
    self.assertEqual(result, "42.00")

Of course, this test case fails. We extend calculate in the following way: if the result is an int, it is simply converted to a str. If it is a float, it is formatted as a str with exactly two decimal places. In all other cases, something inside the application has gone wrong and we return an error message:

...
result = self.calculator.calculate(tokens)
if isinstance(result, int):
    return str(result)
elif isinstance(result, float):
    return "{0:.2f}".format(result)
else:
    return "Invalid result"

Now all test cases pass. As we did not introduce any duplication in this step, we don’t need to do any refactoring.

Summary of Step Five

The introduction of decimal numbers went quite smoothly. We had some trouble finding the correct regular expression to use, but the unit tests helped us get it right.

Note that we did not have to touch ExpressionValidator and Calculator at all because we removed the string handling from there and put it all into Tokenizer. All that work in Step Three has paid off.
