Validation - part 1

Introduction

In this blog series I’ll take you, the reader, on a journey through the world of validation. Validation is present in all software, but especially in business software, which always depends on external sources that cannot be trusted to deliver proper data.

I’ll investigate validation systems of differing degrees of sophistication and effectiveness, and how we can use them. I will also try to raise the level of sophistication with each part of the series.

This is part 1 of the series; you can find the other parts here:

The model

Our model will be a very simple one, but it must be complex enough to illustrate some key properties of the more advanced systems. We’ll use a simplified invoice structure with invoice lines, a total amount and a billing address. We need to define the following four classes:

using System;
using System.Collections.Generic;

public class Invoice {
    public Customer Debtor {get;set;}
    public DateTime Date {get;set;}
    public Address BillingAddress {get;set;}
    public decimal TotalAmount {get;set;}
    public List<InvoiceLine> Lines {get;set;}
}
public class Customer {
    public string FirstName {get;set;}
    public string LastName {get;set;}
}
public class Address {
    public string Street {get;set;}
    public string HouseNumber {get;set;}
    public string City {get;set;}
    public string Country {get;set;}
}
public class InvoiceLine {
    public string Description {get;set;}
    public decimal Price {get;set;}
}

In this case the Invoice class is the aggregate root for the rest of the data, so we need to be able to validate it as a whole, which in turn means validating each of its properties. When a property is a primitive data type we can implement a rule for it; when it is another class, we can delegate validation to that class.

Validation

These days declarative programming is very fashionable, and while I think it is a great idea, there are many ways of doing it, some better than others.

A declarative take on validation is to introduce attributes that specify the conditions that properties must meet. For instance, if the first and last names of our customer are mandatory, we might do something like this:

public class Customer {
    [Required]
    public string FirstName {get;set;}
    [Required]
    public string LastName {get;set;}
}

We then need to implement a set of attributes that covers all the types of validation we need. Let’s simplify for the sake of argument and ignore the existence of credit invoices: our total amount should then be at least 0, and we would need some kind of bounds-checking attribute:

public class Invoice {
    public Customer Debtor {get;set;}
    [Required]
    public DateTime Date {get;set;}
    public Address BillingAddress {get;set;}
    [Required]
    [Bounds(Min = 0)] // Min is assumed to be a double: decimal is not a valid attribute argument type in C#
    public decimal TotalAmount {get;set;}
    public List<InvoiceLine> Lines {get;set;}
}
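
To make this concrete, here is a minimal sketch of what such attributes could look like. The ValidationRuleAttribute base class and its IsValid method are my own assumptions for illustration, not a prescribed design. Note that C# does not accept decimal as an attribute argument, so the sketch stores the bounds as doubles.

```csharp
using System;

// Hypothetical base class: every validation attribute can test one property value.
[AttributeUsage(AttributeTargets.Property)]
public abstract class ValidationRuleAttribute : Attribute
{
    public abstract bool IsValid(object value);
}

// A value counts as present when it is not null and, for strings, not blank.
public sealed class RequiredAttribute : ValidationRuleAttribute
{
    public override bool IsValid(object value) =>
        value != null && !(value is string s && string.IsNullOrWhiteSpace(s));
}

// Bounds are doubles because attribute arguments cannot be of type decimal.
public sealed class BoundsAttribute : ValidationRuleAttribute
{
    public double Min { get; set; } = double.MinValue;
    public double Max { get; set; } = double.MaxValue;

    public override bool IsValid(object value)
    {
        // A missing value is the concern of [Required], not of [Bounds].
        if (value == null) return true;

        // Comparing as double loses some decimal precision; acceptable for a sketch.
        double number = Convert.ToDouble(value);
        return number >= Min && number <= Max;
    }
}
```

One thing this sketch makes visible: for value types such as DateTime and decimal the null check in RequiredAttribute always passes, so a stricter rule (for example comparing against the default value) would be needed to make [Required] meaningful on those properties.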

Besides these attributes, we need a class that reflects on them, checks whether the properties meet the specified requirements, and returns the result of the checks to the caller. Something like this:

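public static class Validator {
    // Signature only; the body would reflect over the attributes on T's properties.
    public static bool Validate<T>(T item) {
        throw new NotImplementedException();
    }
}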
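
A minimal sketch of how the body of Validate could work, using reflection over the public properties of the item. It assumes a hypothetical ValidationRuleAttribute base class with an IsValid method (my own naming) and checks only the properties of the item itself, without descending into nested objects.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical base class for all property-level rules.
[AttributeUsage(AttributeTargets.Property)]
public abstract class ValidationRuleAttribute : Attribute
{
    public abstract bool IsValid(object value);
}

public sealed class RequiredAttribute : ValidationRuleAttribute
{
    public override bool IsValid(object value) =>
        value != null && !(value is string s && string.IsNullOrWhiteSpace(s));
}

public static class Validator
{
    // Returns true when every rule on every public instance property passes.
    public static bool Validate<T>(T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));

        return item.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(property => property.GetIndexParameters().Length == 0) // skip indexers
            .All(property =>
            {
                object value = property.GetValue(item);
                return property
                    .GetCustomAttributes<ValidationRuleAttribute>(inherit: true)
                    .All(rule => rule.IsValid(value));
            });
    }
}

public class Customer
{
    [Required] public string FirstName { get; set; }
    [Required] public string LastName { get; set; }
}
```

With this in place, Validator.Validate(new Customer { FirstName = "Ann" }) would return false because LastName is missing. A bare bool does not tell the caller which rule failed; that is a limitation of the signature, not of the reflection approach.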

I personally don’t think it is a good idea to have too much static state, as it is just another name for global variables (and that goes for singleton-scoped DI registrations as well). Just to keep the code in these articles simple, I’m going with this design for now.

How should this class determine which properties to validate recursively? A number of conventions can be thought of, but the most straightforward way is to use an attribute:

public class Invoice {
    [ValidateChildren]
    public Customer Debtor {get;set;}
    [Required]
    public DateTime Date {get;set;}
    [ValidateChildren]
    public Address BillingAddress {get;set;}
    [Required]
    [Bounds(Min = 0)]
    public decimal TotalAmount {get;set;}
    [ValidateChildren]
    public List<InvoiceLine> Lines {get;set;}
}
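
The recursion itself could be sketched as follows. This is an illustration under assumptions: ValidationRuleAttribute and its IsValid method are hypothetical names, [ValidateChildren] is treated as a pure marker, collections are validated element by element, and the object graph is assumed to contain no cycles.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical base class for property-level rules.
[AttributeUsage(AttributeTargets.Property)]
public abstract class ValidationRuleAttribute : Attribute
{
    public abstract bool IsValid(object value);
}

public sealed class RequiredAttribute : ValidationRuleAttribute
{
    public override bool IsValid(object value) => value != null;
}

// Marker attribute: it carries no rule, it only tells the validator to descend.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ValidateChildrenAttribute : Attribute { }

public static class Validator
{
    public static bool Validate<T>(T item) => item != null && ValidateObject(item);

    // Assumes an acyclic object graph; a cycle would recurse forever.
    private static bool ValidateObject(object item)
    {
        foreach (PropertyInfo property in item.GetType().GetProperties())
        {
            if (property.GetIndexParameters().Length > 0) continue; // skip indexers
            object value = property.GetValue(item);

            // First apply the rules declared on the property itself.
            if (!property.GetCustomAttributes<ValidationRuleAttribute>().All(r => r.IsValid(value)))
                return false;

            // A null child is only an error when the property is also [Required].
            if (value == null || !property.IsDefined(typeof(ValidateChildrenAttribute)))
                continue;

            // Strings are IEnumerable too, so they are excluded from element-wise validation.
            IEnumerable children = value is IEnumerable many && !(value is string)
                ? many
                : new[] { value };

            foreach (object child in children)
                if (child == null || !ValidateObject(child)) return false;
        }
        return true;
    }
}

public class InvoiceLine
{
    [Required] public string Description { get; set; }
}

public class Invoice
{
    [ValidateChildren] public List<InvoiceLine> Lines { get; set; }
}
```

Here an Invoice whose Lines list contains an InvoiceLine without a Description fails validation, even though Invoice itself declares no rule on Lines other than the marker.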

Some validation requirements, however, are not easily implemented in this way. For instance, we would like the TotalAmount property to equal the sum of the prices of all the invoice lines. In this model we would need an attribute at the class level (Invoice) to get all the necessary details in scope. The next question is how to express this requirement in an attribute. We are forced to either:

  • Implement a specific attribute for Invoice. This is not a desirable option because attributes are designed for reuse.
  • Implement some expression language so a string specifying the constraint can be passed to some generic attribute. This is also not a desirable option because this string is not type-checked and therefore error-prone.
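
To illustrate the first option, here is a sketch of an Invoice-specific class-level attribute. All names in it (ClassRuleAttribute, TotalMatchesLinesAttribute, ClassRules) are hypothetical and chosen only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical class-level rule: it receives the whole instance instead of one property value.
[AttributeUsage(AttributeTargets.Class)]
public abstract class ClassRuleAttribute : Attribute
{
    public abstract bool IsValid(object instance);
}

// Only ever applicable to Invoice, so nothing about it can be reused elsewhere.
public sealed class TotalMatchesLinesAttribute : ClassRuleAttribute
{
    public override bool IsValid(object instance) =>
        instance is Invoice invoice
        && invoice.Lines != null
        && invoice.TotalAmount == invoice.Lines.Sum(line => line.Price);
}

public class InvoiceLine
{
    public string Description { get; set; }
    public decimal Price { get; set; }
}

[TotalMatchesLines]
public class Invoice
{
    public decimal TotalAmount { get; set; }
    public List<InvoiceLine> Lines { get; set; }
}

public static class ClassRules
{
    // Runs every class-level rule declared on the runtime type of the instance.
    public static bool Check(object instance) =>
        instance.GetType()
            .GetCustomAttributes<ClassRuleAttribute>(inherit: true)
            .All(rule => rule.IsValid(instance));
}
```

The rule works, but the attribute has to cast back to Invoice and knows about its properties, which is exactly the coupling that makes it a one-off. The second option would trade this for a string that the compiler cannot check, so a misspelled property name would only surface at run time.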