4GL Patterns #14 – Data Validation Patterns
Note, this is Part 14 in a series of posts documenting patterns used in RAD development of Line of Business applications. You can find more here – 4GL Patterns
Lately, I’ve seen software frameworks starting to mark up properties with validation attributes. These are all simple single-field affairs [1]; for example, rules such as:
- Regular expressions (WCF RIA Service example)
  [RegularExpression("[A-Z][A-Za-z0-9]*")]
- String length (minimum, maximum)
- Required field
- Non-nullable text fields
These are not much different to the In-Schema validation mentioned earlier.
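As a rough illustration of how such single-field rules compose, here is a minimal sketch in Python rather than in the attribute syntax shown above. The names `required`, `length_between`, `matches` and `validate_field` are hypothetical and do not come from any framework; the pattern is the one from the example, and whole-string matching is an assumption of this sketch.

```python
import re

# Each rule takes a field value and returns an error message, or None if it passes.
# All names here are hypothetical, chosen only for this sketch.

def required(value):
    return "value is required" if value is None or value == "" else None

def length_between(minimum, maximum):
    def check(value):
        if value is None:
            return None  # absence is reported by required(), not here
        if not (minimum <= len(value) <= maximum):
            return f"length must be between {minimum} and {maximum}"
        return None
    return check

def matches(pattern):
    compiled = re.compile(pattern)
    def check(value):
        if value is None:
            return None
        # Assumption: the whole value must match, not just a prefix.
        if compiled.fullmatch(value) is None:
            return f"must match {pattern}"
        return None
    return check

def validate_field(value, rules):
    """Run every rule against one value and collect the failures."""
    return [msg for rule in rules if (msg := rule(value)) is not None]

identifier_rules = [required, length_between(1, 20), matches("[A-Z][A-Za-z0-9]*")]
```

Calling `validate_field("customer", identifier_rules)` reports only the pattern failure, because the value is present and within the length limits. Every rule looks at one value in isolation, which is what keeps this kind of validation simple.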
Complex validation
We have so far dealt with simple validation on a single field. There are some practical situations which require validation of two fields together.
- Password confirmation boxes (both passwords must be the same)
- Date range (the “from” date must be earlier than the “to” date) – Airline booking applications, for example, update the “return” date if the “departure” date is later than the “return” date.
- Time range – In DabbleDB, the user interface is presented as an Outlook-style scheduler.
- In MS Lightswitch, phone numbers and addresses are first class objects. These can be formatted according to specific masks.
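The password and date-range cases above could be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and `adjust_return_date` assumes the booking form moves the return date up to the departure date, which is one plausible reading of the airline behaviour rather than a documented one.

```python
from datetime import date

def validate_password_confirmation(password, confirmation):
    """Both boxes must hold the same text."""
    if password != confirmation:
        return ["passwords do not match"]
    return []

def validate_date_range(from_date, to_date):
    """The "from" date must be strictly earlier than the "to" date."""
    if not from_date < to_date:
        return ['the "from" date must be earlier than the "to" date']
    return []

def adjust_return_date(departure, return_date):
    """Alternative to an error: correct the second field from the first.

    Assumption: the return date is moved to the departure date when the
    departure is later than the return.
    """
    return departure if departure > return_date else return_date
```

The first two functions reject the pair of values; the third repairs it. Which one suits a form depends on whether a silent correction would surprise the user.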
Highly complex validation
Sometimes an entire data structure may need to be validated. These are rare. For instance, when splitting payments on an accounting system, we need to validate that the totals balance. This type of validation spans multiple records.
Another example is validation of addresses. One might wish to validate that the ZIP codes are consistent with the states provided. In addition, there may be third-party services that provide address validation. These validations span multiple fields (Street Number, Street Name, City, Zip, State).
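The split-payment check mentioned above might look like the sketch below. The function name and the shape of the data (one total plus a list of split amounts) are assumptions made for illustration; a real accounting system would validate its own record structures.

```python
from decimal import Decimal

def validate_split_payment(total, splits):
    """Check that the split lines add up to the payment total.

    `total` is a Decimal and `splits` is a list of Decimal amounts;
    both shapes are assumptions of this sketch.
    """
    errors = []
    if not splits:
        errors.append("at least one split line is required")
    allocated = sum(splits, Decimal("0"))
    if allocated != total:
        errors.append(f"split lines total {allocated}, expected {total}")
    return errors
```

`Decimal` is used instead of `float` so that amounts such as 0.10 compare exactly. The rule cannot be attached to any single record, since no individual split line is wrong on its own.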
Sanity Checks
Just because a value is permitted doesn’t mean that it is correct. Sanity checking is a form of validation where users are prompted to confirm a value when it is outside typical bounds. One example is a birthdate entered with the current year. There was a case where a minister was embarrassed in Parliament because her advisers had run a “SELECT COUNT(*) FROM MOTHERS WHERE YEAR(DOB) > 2000”; it turned out that some dates of birth were wrong, and some mothers were apparently as young as 2 years old.
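One way to model a sanity check is to return warnings separately from errors and accept the value only once the user has confirmed it. The sketch below assumes a hypothetical threshold (`min_plausible_age`) and hypothetical function names; the right bounds depend entirely on the data being collected.

```python
from datetime import date

def birthdate_warnings(dob, today, min_plausible_age=12):
    """Return warnings, not errors: the value is legal but looks unlikely.

    `min_plausible_age` is an illustrative threshold, not a rule from any system.
    """
    # Age in whole years; subtract one if the birthday has not occurred yet this year.
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    if age < min_plausible_age:
        return [f"date of birth implies an age of {age}; please confirm"]
    return []

def accept_birthdate(dob, today, confirmed=False):
    """Accept the value if it raises no warning, or if the user has confirmed it."""
    return confirmed or not birthdate_warnings(dob, today)
```

Unlike a hard validation rule, the unusual value can still be saved, but only after someone has looked at it twice.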
When not to validate
Sometimes, it is necessary not to validate data.
- The user is postponing data entry, but wishes to save a draft.
- The data has already been validated when the record was created. Two examples come to mind:
  - A task has to be scheduled for a future date (this is validated). However, when the task is completed and marked as complete, the scheduled date should not be validated again.
  - When a signed document is recorded in the database, we validate that the signature is current and has neither expired nor been revoked. However, if we later record other metadata against the document (such as archival schedules), we should not revalidate the signature, because the signature may no longer be valid.
  In other words, fields should only be validated when a record is created or when those fields are updated, and left alone otherwise.
- Increasingly, UI technologies that permit two-way data binding (such as XAML and JavaFX) find that validation can get in the way of data entry. For instance, an email property may require a properly formatted email address. If the user tabs away from a control with a mistyped email address, the field still needs to store the incorrect value so that the user can fix it later. One design pattern is to generate a ViewModel, which is a “Plain Old Object” or a struct that permits bad data, but will return a list of error messages if the data does not validate.
- It is also worth mentioning Jon Udell, who suggested that users should be allowed to override certain types of rigid validation rules, subject to these events being logged and audited. This would certainly help with customer service, especially in exceptional situations.
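Two of the ideas above, validating only the fields that changed and keeping bad input in a ViewModel that reports errors on request, can be combined in a small sketch. `TaskViewModel` and its methods are hypothetical names, and the single rule (a scheduled date must be in the future) is taken from the task example; this is not how XAML or JavaFX implement binding.

```python
from datetime import date

class TaskViewModel:
    """Holds whatever the user typed; never rejects a value on assignment."""

    def __init__(self, saved=None):
        # `saved` is the record as last persisted; None means a new record.
        self.is_new = saved is None
        self._saved = dict(saved or {})
        self.values = dict(self._saved)

    def set(self, field, value):
        self.values[field] = value  # bad data is stored, not refused

    def changed(self, field):
        """A field counts as changed on a new record or when its value differs."""
        return self.is_new or self.values.get(field) != self._saved.get(field)

    def errors(self, today):
        """Return a list of messages; an empty list means the data validates."""
        messages = []
        scheduled = self.values.get("scheduled")
        # Only validate the scheduled date when it was created or edited.
        if self.changed("scheduled"):
            if not isinstance(scheduled, date):
                messages.append("scheduled date is missing or not a date")
            elif scheduled <= today:
                messages.append("scheduled date must be in the future")
        return messages
```

Marking an old task as complete touches only the status, so its past scheduled date produces no error. Typing "next week" into the date field is stored as-is and surfaces as a message the user can act on later, which also leaves room for saving a draft.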
Footnotes
[1] XForms, through XML Schemas, can also validate the number of items in a list, using the minOccurs and maxOccurs attributes.