Input Validation and Sanitization
In web application security, user input and its associated data are a security risk if left unchecked. We address this risk by using “Input Validation” and “Input Sanitization”. These should be performed in every tier of the application, according to the server’s function. An important note is that all data validation procedures must be done on trusted systems (i.e. on the server).
As noted in the OWASP SCP Quick Reference Guide, there are sixteen bullet points that cover the issues that developers should be aware of when dealing with Input Validation. A lack of consideration for these security risks when developing an application is one of the main reasons Injection ranks as the number 1 vulnerability in the “OWASP Top 10”.
User interaction is a fundamental requirement of the current development paradigm in web applications. As web applications become richer in content and possibilities, user interaction and the volume of submitted user data also increase. It is in this context that Input Validation plays a significant role.
When applications handle user data, the input data must be considered insecure by default, and only accepted after the appropriate security checks have been made. Data sources must also be identified as trusted, or untrusted, and in the case of an untrusted source, validation checks must be made.
In this section an overview of each technique is provided, along with a sample in Go to illustrate the issues.
Validation
In validation checks, the user input is checked against a set of conditions in order to guarantee that the user is indeed entering the expected data.
IMPORTANT: If the validation fails, the input must be rejected.
This is important not only from a security standpoint but from the perspective of data consistency and integrity, since data is usually used across a variety of systems and applications.
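The rule above can be sketched as an HTTP handler that refuses the request as soon as validation fails. The `age` parameter, the `/profile` path and the accepted range of 1 to 150 are assumptions made for this illustration, not part of the original guidance.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// ageHandler rejects the request unless the hypothetical "age" parameter is a
// whole number between 1 and 150 (an assumed range for this example).
func ageHandler(w http.ResponseWriter, r *http.Request) {
	age, err := strconv.Atoi(r.FormValue("age"))
	if err != nil || age < 1 || age > 150 {
		// Validation failed: reject the input instead of trying to repair it.
		http.Error(w, "invalid age", http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "age accepted: %d\n", age)
}

// statusFor sends a GET request with the given query string to ageHandler
// and returns the response status code.
func statusFor(query string) int {
	req := httptest.NewRequest(http.MethodGet, "/profile?"+query, nil)
	rec := httptest.NewRecorder()
	ageHandler(rec, req)
	return rec.Code
}

func main() {
	for _, q := range []string{"age=42", "age=-1", "age=abc"} {
		fmt.Printf("%-10q -> %d\n", q, statusFor(q))
	}
}
```

The check runs on the server, so it still applies when a client bypasses any browser-side validation.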
This article lists the security risks developers should be aware of when developing web applications in Go.
User Interactivity
Any part of an application that allows user input is a potential security risk. Problems can come not only from threat actors seeking a way to compromise the application, but also from erroneous input caused by human error (statistically, most invalid data is caused by human error). In Go there are several ways to protect against such issues.
Go has native libraries which include methods to help ensure such errors are not made. When dealing with strings we can use packages like the following:
- strconv - handles string conversion to other datatypes.
- strings - contains functions that handle strings and their properties.
- regexp - supports regular expressions to accommodate custom formats [1].
- utf8 (import path unicode/utf8) - implements functions and constants to support text encoded in UTF-8. It includes functions to translate between runes and UTF-8 byte sequences.
  - Validating UTF-8 encoded runes: Valid, ValidRune, ValidString
  - Encoding UTF-8 runes: EncodeRune
  - Decoding UTF-8: DecodeLastRune, DecodeLastRuneInString, DecodeRune, DecodeRuneInString
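A short sketch of how the strings, regexp and unicode/utf8 packages can be combined for a single check. The username format (3 to 16 lowercase letters, digits or underscores) is a hypothetical requirement chosen for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode/utf8"
)

// usernameRe is a hypothetical format: 3 to 16 lowercase letters, digits or
// underscores. Adjust it to the application's real requirements.
var usernameRe = regexp.MustCompile(`^[a-z0-9_]{3,16}$`)

// validUsername trims surrounding whitespace, then checks that the input is
// well-formed UTF-8 and matches the expected format.
func validUsername(raw string) bool {
	s := strings.TrimSpace(raw)
	return utf8.ValidString(s) && usernameRe.MatchString(s)
}

func main() {
	fmt.Println(validUsername(" gopher_42 ")) // surrounding spaces are trimmed first
	fmt.Println(validUsername("ab"))          // shorter than the assumed minimum
	fmt.Println(validUsername("bad\xffname")) // contains an invalid UTF-8 byte

	// Encoding a rune into bytes and decoding it back.
	buf := make([]byte, utf8.UTFMax)
	n := utf8.EncodeRune(buf, '€')
	r, size := utf8.DecodeRune(buf[:n])
	fmt.Println(n, string(r), size)
}
```

The pattern is anchored with `^` and `$` so that the whole input has to match, not just a substring of it.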
Note: Forms are treated by Go as Maps of String values.
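To make the note concrete: in the standard library, form data is held in `url.Values`, which is defined as `map[string][]string`. The sketch below parses a made-up form body and reads every value submitted for one key; the field names are illustrative only.

```go
package main

import (
	"fmt"
	"net/url"
)

// valuesFor parses a URL-encoded form body and returns every value submitted
// for key. url.Values is a map[string][]string, so it can be indexed like any
// other map.
func valuesFor(body, key string) ([]string, error) {
	form, err := url.ParseQuery(body)
	if err != nil {
		return nil, err
	}
	return form[key], nil
}

func main() {
	// Hypothetical form body in which "tag" is submitted twice.
	body := "name=gopher&tag=go&tag=security"
	tags, err := valuesFor(body, "tag")
	if err != nil {
		fmt.Println("malformed form:", err)
		return
	}
	fmt.Println(len(tags), tags)
}
```

Because a field can carry several values, validation should consider all of them and not only the first one returned by `Get`.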
Other techniques to ensure the validity of the data include:
- Whitelisting - whenever possible validate the input against a whitelist of allowed characters. See Validation - Strip tags.
- Boundary checking - both data and numbers length should be verified.
- Character escaping - for special characters such as standalone quotation marks.
- Numeric validation - if input is numeric.
- Check for Null Bytes - (%00)
- Checks for new line characters - %0d, %0a, \r, \n
- Checks for path alteration characters - ../ or \\..
- Checks for Extended UTF-8 - check for alternative representations of special characters
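Several of the checks in the list above can be sketched as one function. This is an illustration under stated assumptions: the value has already been URL-decoded (so `%00` appears as a zero byte), and the 64-character limit is an arbitrary bound picked for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
	"unicode/utf8"
)

const maxLen = 64 // assumed upper bound for this example

// checkInput applies several of the listed checks to an already URL-decoded
// value and returns the first problem found.
func checkInput(s string) error {
	if !utf8.ValidString(s) {
		// Also rejects overlong (alternative) encodings of characters.
		return errors.New("invalid UTF-8")
	}
	if n := utf8.RuneCountInString(s); n == 0 || n > maxLen {
		return errors.New("length out of bounds")
	}
	if strings.Contains(s, "\x00") {
		return errors.New("null byte")
	}
	if strings.ContainsAny(s, "\r\n") {
		return errors.New("new line character")
	}
	if strings.Contains(s, "../") || strings.Contains(s, `..\`) {
		return errors.New("path alteration sequence")
	}
	return nil
}

func main() {
	for _, raw := range []string{"report.pdf", "report%00.pdf", "..%2Fetc%2Fpasswd", "a%0d%0ab"} {
		decoded, err := url.QueryUnescape(raw)
		if err != nil {
			fmt.Printf("%q: cannot decode: %v\n", raw, err)
			continue
		}
		fmt.Printf("%q: %v\n", raw, checkInput(decoded))
	}
}
```

Decoding before checking matters here: running the same checks on the still-encoded string would miss `%00`, `%0d` and `%2F`.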
Note: Ensure that the HTTP request and response headers only contain ASCII characters.
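One possible way to enforce this note is a small helper applied to header values before they are used or written. The helper below is a sketch and is deliberately stricter than the note: it accepts printable ASCII only, which is an assumption of this example and also excludes carriage return and line feed.

```go
package main

import "fmt"

// isPrintableASCII reports whether s contains only printable ASCII characters
// (bytes 0x20 to 0x7E). Control characters such as CR and LF are rejected too.
func isPrintableASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] < 0x20 || s[i] > 0x7e {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isPrintableASCII("text/html; charset=utf-8"))
	fmt.Println(isPrintableASCII("caf\u00e9"))
	fmt.Println(isPrintableASCII("value\r\nSet-Cookie: injected=1"))
}
```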
Third-party packages exist that handle security in Go:
- Gorilla - One of the most used packages for web application security. It has support for websockets, cookie sessions, RPC, among others.
- Form - Decodes url.Values into Go value(s) and Encodes Go value(s) into url.Values. Dual Array and Full map support.
- Validator - Go Struct and Field validation, including Cross Field, Cross Struct, Map as well as Slice and Array diving.
File Manipulation
Any time file usage is required (reading or writing a file), validation checks should also be performed, since most file manipulation operations deal with user data.
Other file check procedures include the “File existence check”, which verifies that a filename exists.
Additional file information is in the File Management section, and information regarding error handling can be found in the Error Handling section of the document.
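A file existence check could look like the sketch below. Treating only regular files as "existing" is a choice made for this example, and the temporary file in `main` is there just so the program has something to test against.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// fileExists reports whether name refers to an existing regular file.
// Errors other than "does not exist" (for example permission problems) are
// returned to the caller rather than treated as "missing".
func fileExists(name string) (bool, error) {
	info, err := os.Stat(name)
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return info.Mode().IsRegular(), nil
}

func main() {
	// Create a throwaway file so there is something that exists.
	f, err := os.CreateTemp("", "example-*.txt")
	if err != nil {
		fmt.Println("cannot create temp file:", err)
		return
	}
	defer os.Remove(f.Name())
	f.Close()

	ok, err := fileExists(f.Name())
	fmt.Println(ok, err)
	ok, err = fileExists(f.Name() + ".missing")
	fmt.Println(ok, err)
}
```

The file can still be removed or replaced between the check and a later open, so the error returned by the open call should be handled as well.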
Data sources
Anytime data is passed from a trusted source to a less-trusted source, integrity checks should be made. This guarantees that the data has not been tampered with and we are receiving the intended data. Other data source checks include:
- Cross-system consistency checks
- Hash totals
- Referential integrity
Note: In modern relational databases, if values in the primary key field are not constrained by the database’s internal mechanisms then they should be validated.
- Uniqueness check
- Table look up check
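One common way to implement an integrity check on data crossing a trust boundary is a keyed hash (HMAC). This technique is swapped in here for illustration and is not named in the list above; the key and payload are made-up values.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns a hex-encoded HMAC-SHA256 tag for data under key.
func sign(key, data []byte) string {
	mac := hmac.New(sha256.New, key)
	mac.Write(data)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the tag for data and compares it with the received tag
// in constant time.
func verify(key, data []byte, tag string) bool {
	want, err := hex.DecodeString(tag)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(data)
	return hmac.Equal(mac.Sum(nil), want)
}

func main() {
	key := []byte("example-key-not-for-production") // hypothetical shared secret
	payload := []byte(`{"user":"gopher","role":"reader"}`)

	tag := sign(key, payload)
	fmt.Println(verify(key, payload, tag))

	tampered := []byte(`{"user":"gopher","role":"admin"}`)
	fmt.Println(verify(key, tampered, tag))
}
```

A plain hash detects accidental corruption only; the key is what prevents a party who alters the data from recomputing a matching tag.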
Post-validation Actions
According to data validation best practices, input validation is only the first part of the data validation guidelines. Therefore, Post-validation Actions should also be performed. The Post-validation Actions used vary with the context and are divided into three separate categories:
- Enforcement Actions - Several types of Enforcement Actions exist in order to better secure our application and data.
  - Inform the user that the submitted data has failed to comply with the requirements and must therefore be modified to meet the required conditions.
  - Modify user-submitted data on the server side without notifying the user of the changes made. This is most suitable in systems with interactive usage.
  Note: The latter is used mostly for cosmetic changes (modifying sensitive user data can lead to problems like truncation, which results in data loss).
- Advisory Actions - Advisory Actions usually allow unchanged data to be entered, but the source actor is informed that there were issues with said data. This is most suitable for non-interactive systems.
- Verification Actions - Verification Actions refer to special cases of Advisory Actions. In these cases, the user submits the data and the source actor asks the user to verify the data and suggests changes. The user then accepts these changes or keeps the original input.
A simple way to illustrate this is a billing address form, where the user enters an address and the system suggests addresses associated with the account. The user then accepts one of these suggestions or ships to the address that was initially entered.
Sanitization
Sanitization refers to the process of removing or replacing submitted data. When dealing with data, after the proper validation checks have been made, sanitization is an additional step that is usually taken to strengthen data safety.
The most common uses of sanitization are as follows:
Convert single less-than characters < to entity
In the native package html there are two functions used for sanitization: one for escaping HTML text and another for unescaping HTML.
The function EscapeString() accepts a string and returns the same string with the special characters escaped, i.e. < becomes &lt;.
Note that this function only escapes the following five characters: <, >, &, ' and ". Other characters should be encoded manually, or you can use a third-party library that encodes all relevant characters.
Conversely, there is also the UnescapeString() function to convert from entities to characters.
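A brief sketch of both functions in use. The `roundTrip` helper and the sample markup are illustrative names and values, not part of the html package.

```go
package main

import (
	"fmt"
	"html"
)

// roundTrip escapes s and reports whether unescaping the result gives back
// the original string.
func roundTrip(s string) (escaped string, same bool) {
	escaped = html.EscapeString(s)
	return escaped, html.UnescapeString(escaped) == s
}

func main() {
	raw := `<a href="/?q=1&r=2">it's</a>`
	escaped, same := roundTrip(raw)
	fmt.Println(escaped)
	fmt.Println(same)
}
```

Escaping should happen at the point where the value is written into HTML, so that the stored data stays in its original form.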
Strip all tags
Although the html/template package has a stripTags() function, it’s unexported. Since no other native package has a function to strip all tags, the alternatives are to use a third-party library, or to copy the whole function along with its private types and functions.
Some of the third-party libraries available to achieve this are:
- https://github.com/kennygrant/sanitize
- https://github.com/maxwells/sanitize
- https://github.com/microcosm-cc/bluemonday
Remove line breaks, tabs and extra white space
The text/template and html/template packages include a way to remove whitespace from the template, by using a minus sign - inside the action’s delimiter.
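The effect of the minus sign can be seen by rendering the same template text with and without it. The `render` helper and the template strings are made up for this example; note that the minus sign must be separated from the action's content by a space.

```go
package main

import (
	"fmt"
	"log"
	"strings"
	"text/template"
)

// render parses tmplText, executes it with name as the data, and returns the
// resulting text.
func render(tmplText, name string) (string, error) {
	t, err := template.New("greeting").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, name); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Without trim markers the surrounding whitespace is kept as written.
	plain, err := render("Hello, \n\t {{ . }} \n !", "Gopher")
	if err != nil {
		log.Fatal(err)
	}
	// "{{- " trims whitespace before the action, " -}}" trims it after.
	trimmed, err := render("Hello, \n\t {{- . -}} \n !", "Gopher")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%q\n%q\n", plain, trimmed)
}
```

The trimming applies to whitespace in the template text itself, not to whitespace inside the data passed to the template.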
URL request path
In the net/http package there is an HTTP request multiplexer type called ServeMux. It is used to match the incoming request to the registered patterns, and calls the handler that most closely matches the requested URL.
In addition to its main purpose, it also takes care of sanitizing the URL request path, redirecting any request containing . or .. elements or repeated slashes to an equivalent, cleaner URL.
A simple Mux example to illustrate:
package main

import (
	"log"
	"net/http"
)

func main() {
	mux := http.NewServeMux()
	rh := http.RedirectHandler("http://codeahoy.com", 307)
	mux.Handle("/login", rh)
	log.Println("Listening...")
	log.Fatal(http.ListenAndServe(":3000", mux))
}
NOTE: Keep in mind that ServeMux doesn’t change the URL request path for CONNECT requests, thus possibly making an application vulnerable to path traversal attacks if allowed request methods are not limited.
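The path sanitizing behaviour can be observed without starting a server by using the net/http/httptest package. In this sketch the `/login` route, the `/static/../login` request path and the `probe` helper are all illustrative; the exact redirect status code is printed and not assumed.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// probe sends a GET request for path to a ServeMux with a single "/login"
// route and returns the status code and Location header of the response.
func probe(path string) (int, string) {
	mux := http.NewServeMux()
	mux.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "login page")
	})
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	for _, p := range []string{"/login", "/static/../login"} {
		code, loc := probe(p)
		fmt.Printf("%-18s status=%d location=%q\n", p, code, loc)
	}
}
```

The request containing `..` is answered with a redirect to the cleaned path instead of being passed to the handler as written.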
The following third-party packages are alternatives to the native HTTP request multiplexer, providing additional features. Always choose well tested and actively maintained packages.
-
[1] Before writing your own regular expression, have a look at the OWASP Validation Regex Repository.