Input Validation and Sanitization

In web application security, user input and its associated data are a security risk if left unchecked. We address this risk by using “Input Validation” and “Input Sanitization”. These should be performed in every tier of the application, according to the server’s function. An important note is that all data validation procedures must be done on trusted systems (i.e. on the server).

As noted in the OWASP SCP Quick Reference Guide, there are sixteen bullet points that cover the issues that developers should be aware of when dealing with Input Validation. A lack of consideration for these security risks when developing an application is one of the main reasons Injection ranks as the number 1 vulnerability in the “OWASP Top 10”.

User interaction is a fundamental requirement of the current development paradigm in web applications. As web applications become increasingly rich in content and possibilities, user interaction and the amount of submitted user data also increase. It is in this context that Input Validation plays a significant role.

When applications handle user data, the input data must be considered insecure by default, and only accepted after the appropriate security checks have been made. Data sources must also be identified as trusted, or untrusted, and in the case of an untrusted source, validation checks must be made.

In this section an overview of each technique is provided, along with a sample in Go to illustrate the issues.

Validation

In validation checks, the user input is checked against a set of conditions in order to guarantee that the user is indeed entering the expected data.

IMPORTANT: If the validation fails, the input must be rejected.

This is important not only from a security standpoint but from the perspective of data consistency and integrity, since data is usually used across a variety of systems and applications.

This article lists the security risks developers should be aware of when developing web applications in Go.

User Interactivity

Any part of an application that allows user input is a potential security risk. Problems can come not only from threat actors seeking a way to compromise the application, but also from erroneous input caused by human error (most invalid data is usually the result of human error rather than an attack). In Go there are several ways to protect against such issues.

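Go has native packages that include functions to help ensure such errors are not made. When dealing with strings, we can use packages such as strconv (conversion of strings to other data types), strings (string handling, such as trimming and case conversion), regexp (regular expressions for custom formats), and unicode/utf8 (validation and decoding of UTF-8 encoded text).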
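As a rough illustration of how the standard string packages can be combined, the sketch below validates an age and a username. The function names, the username format, and the 0 to 150 range are assumptions made for this example and are not taken from the guide.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
	"unicode/utf8"
)

// usernameRe is a hypothetical format rule: 3 to 16 lowercase letters,
// digits or underscores.
var usernameRe = regexp.MustCompile(`^[a-z0-9_]{3,16}$`)

// parseAge converts the raw string to an int and rejects values
// outside the (assumed) range 0..150.
func parseAge(raw string) (int, error) {
	age, err := strconv.Atoi(strings.TrimSpace(raw))
	if err != nil {
		return 0, fmt.Errorf("age is not a number: %w", err)
	}
	if age < 0 || age > 150 {
		return 0, fmt.Errorf("age %d is out of range", age)
	}
	return age, nil
}

// validUsername rejects invalid UTF-8, then normalizes the input and
// checks it against the format rule.
func validUsername(raw string) bool {
	if !utf8.ValidString(raw) {
		return false
	}
	return usernameRe.MatchString(strings.ToLower(strings.TrimSpace(raw)))
}

func main() {
	age, err := parseAge(" 42 ")
	fmt.Println(age, err)

	_, err = parseAge("forty")
	fmt.Println(err)

	fmt.Println(validUsername("Gopher_01"))
	fmt.Println(validUsername("a b"))
}
```

If any of these checks fails, the input should be rejected rather than repaired silently.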

Note: Forms are treated by Go as maps of string values: Request.Form has the type url.Values, which is a map[string][]string.
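Because form data arrives as a map of string slices, each field has to be looked up and converted explicitly. The following is a minimal sketch, assuming a hypothetical "quantity" field that must appear exactly once and hold an integer between 1 and 100; quantityFromForm and quantityFromQuery are illustrative names.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
)

// quantityFromForm reads the hypothetical "quantity" field and rejects
// it when it is missing, repeated, non-numeric or outside 1..100.
func quantityFromForm(form url.Values) (int, error) {
	values, ok := form["quantity"]
	if !ok || len(values) != 1 {
		return 0, errors.New("quantity must be supplied exactly once")
	}
	n, err := strconv.Atoi(values[0])
	if err != nil || n < 1 || n > 100 {
		return 0, errors.New("quantity must be an integer between 1 and 100")
	}
	return n, nil
}

// quantityFromQuery parses an encoded form body or query string into
// url.Values and validates it.
func quantityFromQuery(raw string) (int, error) {
	form, err := url.ParseQuery(raw)
	if err != nil {
		return 0, err
	}
	return quantityFromForm(form)
}

func main() {
	inputs := []string{"quantity=3&item=book", "quantity=3&quantity=4", "quantity=abc"}
	for _, raw := range inputs {
		n, err := quantityFromQuery(raw)
		fmt.Println(raw, "->", n, err)
	}
}
```

In an HTTP handler the same check would typically run on r.Form after calling r.ParseForm().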

Other techniques to ensure the validity of the data include:

  • Whitelisting - whenever possible, validate the input against a whitelist of allowed characters. See Sanitization - Strip all tags.
  • Boundary checking - the length of both data and numbers should be verified.
  • Character escaping - for special characters such as standalone quotation marks.
  • Numeric validation - if input is numeric.
  • Check for Null Bytes - (%00)
  • Checks for new line characters - %0d, %0a, \r, \n
  • Checks for path alteration characters - ../ or \\..
  • Checks for Extended UTF-8 - check for alternative representations of special characters
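Several of the checks listed above can be sketched with the standard library alone. The example below assumes that the input has already been URL-decoded (so %00 has become a zero byte), and the whitelist pattern and the 64-character limit are hypothetical values chosen for illustration. A strict whitelist would already reject most of these cases on its own; the explicit checks are kept separate only so that each one is visible.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
	"unicode/utf8"
)

// allowed is a hypothetical whitelist: letters, digits, dot, dash and
// underscore.
var allowed = regexp.MustCompile(`^[A-Za-z0-9._-]+$`)

// maxLen is a hypothetical upper bound on the number of characters.
const maxLen = 64

// checkInput returns an error describing the first check that fails,
// or nil when the (already decoded) input passes all of them.
func checkInput(s string) error {
	switch {
	case !utf8.ValidString(s):
		return errors.New("invalid UTF-8")
	case len(s) == 0 || utf8.RuneCountInString(s) > maxLen:
		return errors.New("length out of bounds")
	case strings.Contains(s, "\x00"):
		return errors.New("null byte")
	case strings.ContainsAny(s, "\r\n"):
		return errors.New("new line character")
	case strings.Contains(s, "../") || strings.Contains(s, `..\`):
		return errors.New("path alteration sequence")
	case !allowed.MatchString(s):
		return errors.New("character outside the whitelist")
	}
	return nil
}

func main() {
	for _, in := range []string{"report_2024.txt", "a\x00b", "../etc/passwd", "<script>"} {
		fmt.Printf("%q -> %v\n", in, checkInput(in))
	}
}
```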

Note: Ensure that the HTTP request and response headers only contain ASCII characters.
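One possible way to enforce the ASCII-only rule for headers is to scan every header name and value byte by byte. The helper names isASCII and headersAreASCII below are illustrative and not part of net/http.

```go
package main

import (
	"fmt"
	"net/http"
	"unicode"
)

// isASCII reports whether every byte of s is in the 7-bit ASCII range.
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] > unicode.MaxASCII {
			return false
		}
	}
	return true
}

// headersAreASCII checks every header name and every header value.
func headersAreASCII(h http.Header) bool {
	for name, values := range h {
		if !isASCII(name) {
			return false
		}
		for _, v := range values {
			if !isASCII(v) {
				return false
			}
		}
	}
	return true
}

func main() {
	h := http.Header{}
	h.Set("Content-Type", "text/plain; charset=utf-8")
	fmt.Println(headersAreASCII(h))

	h.Set("X-Display-Name", "José")
	fmt.Println(headersAreASCII(h))
}
```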

Third-party packages exist that handle security in Go:

  • Gorilla - One of the most used packages for web application security. It has support for websockets, cookie sessions, RPC, among others.
  • Form - Decodes url.Values into Go value(s) and Encodes Go value(s) into url.Values. Dual Array and Full map support.
  • Validator - Go Struct and Field validation, including Cross Field, Cross Struct, Map as well as Slice and Array diving.

File Manipulation

Any time file usage is required (reading or writing a file), validation checks should also be performed, since most file manipulation operations deal with user data.

Other file check procedures include “File existence check”, to verify that a filename exists.
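A file existence check can be sketched with os.Stat. The helper name fileExists is an assumption for this example, and treating only regular files as "existing" (so that directories are rejected) is a design choice made here rather than something the guide requires.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// fileExists reports whether path refers to an existing regular file.
// A missing file is not an error; other failures (for example a
// permission problem) are returned to the caller.
func fileExists(path string) (bool, error) {
	info, err := os.Stat(path)
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return info.Mode().IsRegular(), nil
}

func main() {
	// Create a temporary file so that the positive case can be shown.
	f, err := os.CreateTemp("", "upload-*.txt")
	if err != nil {
		fmt.Println("cannot create temp file:", err)
		return
	}
	defer os.Remove(f.Name())
	f.Close()

	ok, err := fileExists(f.Name())
	fmt.Println("temp file:", ok, err)

	ok, err = fileExists("no/such/file.txt")
	fmt.Println("missing file:", ok, err)
}
```

Note that a check like this is subject to race conditions: the file can appear or disappear between the check and its later use, so errors from the actual open call still have to be handled.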

Additional file information is in the File Management section, and information regarding Error Handling can be found in the Error Handling section of the document.

Data sources

Any time data is passed from a trusted source to a less-trusted source, integrity checks should be made. This guarantees that the data has not been tampered with and that we are receiving the intended data. Other data source checks include:

  • Cross-system consistency checks
  • Hash totals
  • Referential integrity

Note: In modern relational databases, if values in the primary key field are not constrained by the database’s internal mechanisms then they should be validated.

  • Uniqueness check
  • Table look up check
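As one way of implementing an integrity check on data that crosses a trust boundary, the sketch below attaches a keyed hash (HMAC-SHA256) to a payload and verifies it on receipt. The choice of HMAC, the function names sign and verify, and the hard-coded key are assumptions for illustration; the guide does not prescribe a particular mechanism.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA256 tag of data under key.
func sign(key, data []byte) string {
	mac := hmac.New(sha256.New, key)
	mac.Write(data)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the tag and compares it in constant time.
func verify(key, data []byte, tag string) bool {
	want, err := hex.DecodeString(tag)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(data)
	return hmac.Equal(mac.Sum(nil), want)
}

func main() {
	// Hypothetical key; a real key would come from secure configuration.
	key := []byte("example-secret-key")
	payload := []byte("account=42&amount=100")

	tag := sign(key, payload)
	fmt.Println("untouched:", verify(key, payload, tag))
	fmt.Println("tampered: ", verify(key, []byte("account=42&amount=900"), tag))
}
```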

Post-validation Actions

According to data validation best practices, input validation is only the first part of the data validation guidelines. Post-validation Actions should therefore also be performed. The Post-validation Actions used vary with the context and are divided into three separate categories:

  • Enforcement Actions: Several types of Enforcement Actions exist in order to better secure our application and data.

    • inform the user that submitted data has failed to comply with the requirements and therefore the data should be modified in order to comply with the required conditions.
    • modify user submitted data on the server side without notifying the user of the changes made. This is most suitable in systems with interactive usage.

    Note: The latter is used mostly for cosmetic changes (modifying sensitive user data can lead to problems like truncation, which results in data loss).

  • Advisory Actions: Advisory Actions usually allow unchanged data to be entered, but the source actor is informed that there were issues with said data. This is most suitable for non-interactive systems.
  • Verification Actions: Verification Actions are a special case of Advisory Actions. In these cases, the user submits the data and the source actor asks the user to verify the data and suggests changes. The user then accepts these changes or keeps the original input.

    A simple way to illustrate this is a billing address form, where the user enters an address and the system suggests addresses associated with the account. The user then accepts one of these suggestions or ships to the address that was initially entered.


Sanitization

Sanitization refers to the process of removing or replacing submitted data. When dealing with data, after the proper validation checks have been made, sanitization is an additional step that is usually taken to strengthen data safety.

The most common uses of sanitization are as follows:

Convert single less-than characters < to entity

In the native package html there are two functions used for sanitization: one for escaping HTML text and another for unescaping HTML. The function EscapeString() accepts a string and returns the same string with the special characters escaped, e.g. < becomes &lt;. Note that this function only escapes the following five characters: <, >, &, ' and ". Other characters should be encoded manually, or you can use a third-party library that encodes all relevant characters. Conversely, there is also the UnescapeString() function to convert from entities to characters.
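A short sketch of the two functions in use follows. The wrapper name sanitizeForHTML is hypothetical and only exists to make the example easy to call.

```go
package main

import (
	"fmt"
	"html"
)

// sanitizeForHTML escapes <, >, &, ' and " so that the text can be
// placed inside HTML content.
func sanitizeForHTML(s string) string {
	return html.EscapeString(s)
}

func main() {
	raw := `<a href="/x?a=1&b=2">it's</a>`

	escaped := sanitizeForHTML(raw)
	fmt.Println(escaped)

	// UnescapeString reverses the conversion.
	fmt.Println(html.UnescapeString(escaped) == raw)
}
```

Single and double quotes are emitted as the numeric entities &#39; and &#34;.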

Strip all tags

Although the html/template package has a stripTags() function, it is unexported. Since no other native package has a function to strip all tags, the alternatives are to use a third-party library, or to copy the whole function along with the unexported types and functions it depends on.

Some of the third-party libraries available to achieve this are:

  • https://github.com/kennygrant/sanitize
  • https://github.com/maxwells/sanitize
  • https://github.com/microcosm-cc/bluemonday

Remove line breaks, tabs and extra white space

The text/template and html/template packages include a way to remove whitespace from the template output, by using a minus sign - inside the action's delimiters ({{- trims the whitespace before the action, -}} trims the whitespace after it).
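The trimming behaviour can be seen in a small text/template example. The template text and the render helper are made up for illustration; the same delimiter syntax applies to html/template.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// page contains a line break and indentation around the action.
// "{{- " removes the whitespace before it and " -}}" removes the
// whitespace after it.
const page = "<p>\n    {{- . -}}\n</p>"

var tmpl = template.Must(template.New("page").Parse(page))

// render executes the template with value and returns the output.
func render(value string) (string, error) {
	var out strings.Builder
	if err := tmpl.Execute(&out, value); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	s, err := render("hi")
	fmt.Printf("%q %v\n", s, err)
}
```

Note that this only trims whitespace that is part of the template text next to an action; whitespace inside the submitted data itself is left untouched.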

URL request path

In the net/http package there is an HTTP request multiplexer type called ServeMux. It is used to match the incoming request to the registered patterns, and calls the handler that most closely matches the requested URL. In addition to its main purpose, it also takes care of sanitizing the URL request path, redirecting any request containing . or .. elements or repeated slashes to an equivalent, cleaner URL.

A simple Mux example to illustrate:

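package main

import (
  "log"
  "net/http"
)

func main() {
  mux := http.NewServeMux()

  // Redirect /login with a 307 Temporary Redirect.
  rh := http.RedirectHandler("http://codeahoy.com", http.StatusTemporaryRedirect)
  mux.Handle("/login", rh)

  log.Println("Listening...")
  // ListenAndServe only returns on failure, so log the error and exit.
  log.Fatal(http.ListenAndServe(":3000", mux))
}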

NOTE: Keep in mind that ServeMux doesn't change the URL request path for CONNECT requests, thus possibly making an application vulnerable to path traversal attacks if the allowed request methods are not limited.
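One way to limit the allowed request methods is a small wrapper around the mux. The names allowMethods, newHandler and statusFor are illustrative, and the decision to allow only GET and POST is an assumption for this sketch; statusFor exists only to exercise the handler without opening a network port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// allowMethods wraps next and answers 405 for any request whose
// method is not in the allowed list, before the mux sees it.
func allowMethods(next http.Handler, methods ...string) http.Handler {
	allowed := make(map[string]bool, len(methods))
	for _, m := range methods {
		allowed[m] = true
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !allowed[r.Method] {
			w.Header().Set("Allow", strings.Join(methods, ", "))
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newHandler builds a mux with one route and restricts it to GET and POST.
func newHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	return allowMethods(mux, http.MethodGet, http.MethodPost)
}

// statusFor sends a synthetic request with the given method through h
// and returns the recorded status code.
func statusFor(h http.Handler, method string) int {
	req := httptest.NewRequest(http.MethodGet, "/login", nil)
	req.Method = method
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := newHandler()
	fmt.Println("GET:    ", statusFor(h, http.MethodGet))
	fmt.Println("CONNECT:", statusFor(h, http.MethodConnect))
}
```

In a real server the wrapped handler would be passed to http.ListenAndServe in place of the bare mux.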

Third-party packages offer alternatives to the native HTTP request multiplexer and provide additional features. Always choose well-tested and actively maintained packages.

  1. Before writing your own regular expression, have a look at the OWASP Validation Regex Repository.

