5.7 Validating User Input

User input must never be trusted. It could be missing. It might be in the wrong format. It might even contain JavaScript or SQL intended to cause some type of havoc. Thus, user input must almost always be tested for validity.

5.7.1 Types of Input Validation

The following list indicates most of the common types of user input validation.

5.7.2 Notifying the User

What should your pages do when a validation check fails? Clearly, the user needs to be notified, but how? Most user validation problems need to answer the following questions:

5.7.3 How to Reduce Validation Errors

Users dislike having to do things again, so if possible, we should construct user input forms in a way that minimizes user validation errors. The basic technique for doing so is to provide the user with helpful information about the expected data before she enters it. Some of the most common ways of doing so include:

Pro Tip

One of the most common problems facing the developers of real-world web forms is how to ensure that the user submitting the form is actually a human and not a bot (i.e., a piece of software). The reason for this is that automated form bots (often called spam bots) can flood a web application form with hundreds or thousands of bogus requests.

This problem is generally solved by a test commonly referred to as a CAPTCHA (which stands for Completely Automated Public Turing test to tell Computers and Humans Apart). Most forms of CAPTCHA ask the user to enter a string of numbers and letters that are displayed in an obscured image that is difficult for a software bot to understand. Other CAPTCHAs ask the user to solve a simple mathematical question or trivia question.

We think it is safe to state that most human users dislike filling in CAPTCHA fields, as quite often the text is unreadable for humans as well as for bots. CAPTCHAs also present a usability challenge for users with visual disabilities. As such, you should generally add CAPTCHA capabilities to a form only if your site provides some type of free service or provides a mechanism for users to post content that will appear on the site. Both of these scenarios are especially vulnerable to spam bots.

If you do need CAPTCHA capability, there is a variety of third-party solutions. Perhaps the most common is reCAPTCHA, which is a free service available from Google. It comes with a JavaScript component and PHP libraries that make it quite easy to add to any form.

5.7.4 Where to Perform Validation

Validation can be performed at three different levels: by the browser itself using HTML5, in the browser using JavaScript, and on the server. With HTML5, the browser can perform basic validation. Figure 5.39 illustrates how HTML5 validation appears in the browser. For instance, in the following example, the required and pattern attributes are used to validate a date in the format ##/##/#### (the pattern also accepts one-digit months and days).

<input type="text" pattern="\d{1,2}/\d{1,2}/\d{4}" required>

Figure 5.39 HTML5 browser validation

The figure consists of a browser window with textboxes.

What is that strange set of text used in this pattern attribute? It is a regular expression, a popular pattern language supported by a wide variety of programming languages and platforms for matching and manipulating text. Regular expressions will be covered in a bit more detail in Chapter 9.
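To make the pattern a little less mysterious, the following is a minimal sketch of the same expression written as a JavaScript regular expression. The helper name looksLikeDate is an assumption made for illustration. Because the browser matches the pattern attribute against the entire field value, the sketch anchors the expression with ^ and $.

```javascript
// Same expression as the pattern attribute, anchored so that it must match
// the whole value (the browser does this implicitly for pattern).
const datePattern = /^\d{1,2}\/\d{1,2}\/\d{4}$/;

// Hypothetical helper: true when the text has the shape the pattern expects.
function looksLikeDate(text) {
  return datePattern.test(text);
}

console.log(looksLikeDate("12/25/2024")); // true
console.log(looksLikeDate("1/5/2024"));   // true: \d{1,2} allows one digit
console.log(looksLikeDate("2024-12-25")); // false: different separators
console.log(looksLikeDate("99/99/9999")); // true: only the shape is checked
```

Note the last case: the pattern checks only the shape of the text, not whether it names a real calendar date.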

However, since the validation that can be achieved in HTML5 is quite basic (and there is no real control over how it looks and behaves), many web applications do not use this level of validation and instead perform validation in the browser using JavaScript (covered in Chapters 8–11). If you wish to disable browser validation (perhaps because you want a unified visual appearance to all validations), you can do so by adding the novalidate attribute to the form element:

<form id="sampleForm" method="..." action="..." novalidate>

The advantage of validation using JavaScript is that it reduces server load and provides immediate feedback to the user. The immediacy of JavaScript validation dramatically improves the user experience of data-entry forms, and for this reason it is an essential feature of any real-world web site that uses forms.
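As a rough illustration of JavaScript validation, the sketch below checks two fields before a form is submitted. The field names email and birthDate, the error messages, and the function name validateFields are all assumptions made for this example; the form id sampleForm is the one used earlier. The email check is deliberately loose and is not a complete test of address validity.

```javascript
// Hypothetical field names (email, birthDate); adapt them to the actual form.
// Returns an object that maps each invalid field to an error message.
function validateFields(fields) {
  const errors = {};

  const email = (fields.email ?? "").trim();
  if (email === "") {
    errors.email = "Email is required.";
  } else if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
    errors.email = "Email must look like name@example.com.";
  }

  // The date is optional, but if present it must have the expected shape.
  const birthDate = (fields.birthDate ?? "").trim();
  if (birthDate !== "" && !/^\d{1,2}\/\d{1,2}\/\d{4}$/.test(birthDate)) {
    errors.birthDate = "Date must be in the format ##/##/####.";
  }

  return errors;
}

// Browser-only wiring: skipped when the script runs outside a web page.
if (typeof document !== "undefined") {
  const form = document.querySelector("#sampleForm");
  if (form) {
    form.addEventListener("submit", (event) => {
      const errors = validateFields(Object.fromEntries(new FormData(form)));
      if (Object.keys(errors).length > 0) {
        event.preventDefault(); // stop the submission; no server round trip
        console.log(errors);    // a real page would show these beside each field
      }
    });
  }
}
```

Keeping the checks in a plain function that receives the field values makes them easy to test on their own, separately from the event handling.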

Unfortunately, JavaScript validation cannot be relied on: for instance, it might be turned off in the user’s browser, or a request might be sent to the server without using the form at all. For these reasons, validation should always be done on the server side as well. Indeed, server-side validation is arguably the most important since it is the only validation that is guaranteed to run. Figure 5.40 illustrates the interaction of the different levels of validation.
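The sketch below suggests what the server-side check might look like. It is written in JavaScript (as it would run under Node.js) only to stay in one language; the same idea applies to PHP or any other server environment. The field names, the month/day/year ordering, and the function names isRealDate and validateSubmission are assumptions made for this example. The key point is that the server starts from the raw request body and repeats every check, trusting nothing that the browser may or may not have validated.

```javascript
// Hypothetical helper: true only for a real calendar date written as
// month/day/year (the ordering is an assumption of this sketch).
function isRealDate(text) {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(text);
  if (!match) return false;
  const [month, day, year] = match.slice(1).map(Number);
  // Build the date, then confirm that no part rolled over (e.g., 2/30).
  const date = new Date(Date.UTC(year, month - 1, day));
  return date.getUTCFullYear() === year &&
         date.getUTCMonth() === month - 1 &&
         date.getUTCDate() === day;
}

// Validates a URL-encoded request body such as "email=a%40b.co&birthDate=...".
// The body is untrusted text: it may not have come from the form at all.
function validateSubmission(rawBody) {
  const params = new URLSearchParams(rawBody);
  const errors = [];

  const email = (params.get("email") ?? "").trim();
  if (email === "") {
    errors.push("email is required");
  } else if (email.length > 254 || !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
    errors.push("email is not valid");
  }

  const birthDate = (params.get("birthDate") ?? "").trim();
  if (birthDate !== "" && !isRealDate(birthDate)) {
    errors.push("birthDate is not a real date");
  }

  return { ok: errors.length === 0, errors };
}

console.log(validateSubmission("email=a%40b.co&birthDate=2%2F29%2F2024"));
console.log(validateSubmission("birthDate=2%2F30%2F2024"));
```

The first call passes both checks; the second is rejected for a missing email and for a date that does not exist, even though a value such as 2/30/2024 would satisfy the pattern attribute shown earlier.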

Figure 5.40 Visualizing levels of validation

The figure shows 4 Blocks that display 2 User Forms, Browser, and Server along with various steps involved in visualizing the levels of validation.