CppCMS
cppcms::xss::rules Class Reference

The class that holds XSS filter rules.

#include <cppcms/xss.h>


Public Types

enum  html_type { xhtml_input, html_input }
enum  tag_type { invalid_tag = 0, opening_and_closing = 1, stand_alone = 2, any_tag = 3 }
typedef booster::function< bool(char const *begin, char const *end)> validator_type

Public Member Functions

 rules (rules const &)
rules const & operator= (rules const &)
 rules (json::value const &r)
 rules (std::string const &file_name)
html_type html () const
void html (html_type t)
void add_tag (std::string const &name, tag_type=any_tag)
void add_entity (std::string const &name)
bool numeric_entities_allowed () const
void numeric_entities_allowed (bool v)
void add_boolean_property (std::string const &tag_name, std::string const &property)
void add_property (std::string const &tag_name, std::string const &property, validator_type const &val)
void add_property (std::string const &tag_name, std::string const &property, booster::regex const &r)
void add_integer_property (std::string const &tag_name, std::string const &property)
void add_uri_property (std::string const &tag_name, std::string const &property)
void add_uri_property (std::string const &tag_name, std::string const &property, std::string const &schema)
bool comments_allowed () const
void comments_allowed (bool comments)
void encoding (std::string const &enc)

Static Public Member Functions

static CPPCMS_DEPRECATED booster::regex uri_matcher ()
static CPPCMS_DEPRECATED booster::regex uri_matcher (std::string const &schema)
static validator_type uri_validator ()
static validator_type uri_validator (std::string const &scheme, bool absolute_only=false)
static validator_type relative_uri_validator ()

Detailed Description

The class that holds XSS filter rules.

This is the main class that defines the white-list rules used to handle correct HTML input.

When using these rules you should be very strict about what you need and what you allow.

Basically you need to specify:

  1. The XHTML or HTML parsing rules. This should be done first.
  2. The encoding of the text. If you do not specify the encoding, the text is assumed to be ASCII compatible. You may omit the encoding only if you know that the text was already validated, for example by widgets::text; otherwise always specify it.
  3. The list of tags that may be used. Specify only those you need. Never allow tags like style, object, embed or, of course, script, as they can easily be used for XSS attacks.
  4. The essential HTML attributes (properties) for the tags you need. Use add_uri_property for links such as src for img or href for a. It checks the correctness of the URI syntax and ensures that only white-listed schemas are allowed (i.e. no javascript: URIs). Never allow style attributes unless you specify a very strict white list of the styles you really use. Styles can easily be exploited for both XSS and click-jacking. For example:
         <p style="width: expression(alert('XSS'));"></p>
    
    If you want to use styles, specify a very strict list of what you need, for example:
         add_property("p","style",booster::regex("text-align:(left|right|center)"));
    
  5. Whether comments are allowed. Do not allow them unless you need them. Note that not all comments are accepted: comments containing "<", ">" or "&" are considered invalid, as some exploits use them.
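
The effect of the strict style pattern from step 4 can be illustrated with a minimal sketch. It uses std::regex purely as a stand-in for booster::regex so that it compiles without CppCMS, and it assumes that the whole property value has to match the expression. The helper names style_allowed and style_allowed_examples are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true when the whole value matches the white-list
// pattern. std::regex stands in for booster::regex here.
bool style_allowed(std::string const &value)
{
    static std::regex const allowed("text-align:(left|right|center)");
    return std::regex_match(value, allowed);
}

// Behaviour of the helper on a harmless value and on the exploit above.
void style_allowed_examples()
{
    assert(style_allowed("text-align:center"));
    assert(!style_allowed("width: expression(alert('XSS'));"));
    // Appending an exploit to an allowed value is rejected as well,
    // because the match must cover the whole value.
    assert(!style_allowed("text-align:center;width: expression(alert('XSS'))"));
}
```

Because the pattern enumerates the accepted values instead of trying to describe the dangerous ones, anything unexpected is rejected by default.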

Remember: the stricter you are, the harder it is to mount an attack. Read about XSS and study existing attacks to understand how they work, and then decide what you allow.

The rules class can be treated as a value for the purpose of thread-safe access, i.e. you can safely use a const reference and const member functions as long as you don't change the rules under the hood.

The simplest way: at application startup define a global rules object, configure it, and use it for filtering and validation - and make your attackers cry :-).
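
The five steps of the detailed description could be sketched as follows. The sketch is written as a template so that it compiles without CppCMS; it only calls member functions listed on this page, and with the library the template argument would presumably be cppcms::xss::rules. The names configure_comment_rules and recording_rules are hypothetical, recording_rules is only a stand-in that records the calls, and the chosen tags and schemas are illustrative.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Applies the five steps to any type that exposes the member functions
// documented for cppcms::xss::rules.
template <typename Rules>
void configure_comment_rules(Rules &r)
{
    r.html(Rules::xhtml_input);                      // 1. parsing mode comes first
    r.encoding("UTF-8");                             // 2. always state the encoding
    r.add_tag("p", Rules::opening_and_closing);      // 3. only the tags you need
    r.add_tag("a", Rules::opening_and_closing);
    r.add_tag("br", Rules::stand_alone);
    r.add_uri_property("a", "href", "(http|https)"); // 4. URI property with white-listed schemas
    r.comments_allowed(false);                       // 5. no comments
}

// Hypothetical stand-in so that the sketch runs without the library.
struct recording_rules {
    enum html_type { xhtml_input, html_input };
    enum tag_type { invalid_tag = 0, opening_and_closing = 1, stand_alone = 2, any_tag = 3 };

    std::vector<std::string> calls;

    void html(html_type t) { calls.push_back(t == xhtml_input ? "html:xhtml" : "html:html"); }
    void encoding(std::string const &enc) { calls.push_back("encoding:" + enc); }
    void add_tag(std::string const &name, tag_type = any_tag) { calls.push_back("tag:" + name); }
    void add_uri_property(std::string const &tag, std::string const &prop, std::string const &schema)
    {
        calls.push_back("uri:" + tag + "." + prop + ":" + schema);
    }
    void comments_allowed(bool v) { calls.push_back(v ? "comments:on" : "comments:off"); }
};
```

Keeping the configuration in one function makes it easy to review the complete white list in a single place.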


Member Typedef Documentation

typedef booster::function<bool(char const *begin,char const *end)> cppcms::xss::rules::validator_type

A functor that allows you to provide custom validation for different properties.
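
A custom validator only has to match the call signature bool(char const *begin, char const *end). The sketch below uses std::function as a stand-in for booster::function so that it is self-contained; is_hex_color and check are hypothetical names, and the sketch assumes that the range [begin, end) holds the property value.

```cpp
#include <cassert>
#include <cstring>
#include <functional>

// Same call signature as validator_type; std::function stands in for
// booster::function.
typedef std::function<bool(char const *begin, char const *end)> validator_type;

// Hypothetical validator: accepts only "#" followed by exactly six hex digits.
bool is_hex_color(char const *begin, char const *end)
{
    if (end - begin != 7 || *begin != '#')
        return false;
    for (char const *p = begin + 1; p != end; ++p) {
        bool digit = ('0' <= *p && *p <= '9');
        bool lower = ('a' <= *p && *p <= 'f');
        bool upper = ('A' <= *p && *p <= 'F');
        if (!digit && !lower && !upper)
            return false;
    }
    return true;
}

// Convenience wrapper that runs a validator over a NUL-terminated string.
bool check(validator_type const &v, char const *text)
{
    return v(text, text + std::strlen(text));
}
```

With the library, such a function could presumably be passed as the third argument of add_property in place of a regular expression.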


Member Enumeration Documentation

enum cppcms::xss::rules::html_type

How to treat the input.

Enumerator:
xhtml_input 

Assume that the input is XHTML.

html_input 

Assume that the input is HTML.

enum cppcms::xss::rules::tag_type

The type of the tag.

Enumerator:
invalid_tag 

This tag is invalid (returned by validate)

opening_and_closing 

This tag should be opened and closed, like em or strong.

stand_alone 

This tag should stand alone (like hr or br)

any_tag 

This tag can be used in both roles (like input)
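
The enumerator values are chosen so that any_tag (3) equals opening_and_closing (1) combined with stand_alone (2). The sketch below mirrors the documented values and assumes that they are meant to be combined as bit flags, which this page does not state explicitly; role_permitted is a hypothetical helper.

```cpp
#include <cassert>

// Mirror of the documented enumerator values.
enum tag_type { invalid_tag = 0, opening_and_closing = 1, stand_alone = 2, any_tag = 3 };

// Hypothetical helper: does a rule registered as `allowed` permit a tag
// that appears in the role `used`? Assumes the values act as bit flags.
bool role_permitted(tag_type allowed, tag_type used)
{
    return used != invalid_tag && (allowed & used) == used;
}
```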


Constructor & Destructor Documentation

cppcms::xss::rules::rules ( json::value const &  r)

Create rules from the JSON object r.

The JSON object defines the XSS prevention rules. It has the following properties:

  • "xhtml" - boolean; default true - use XHTML (true) or HTML input
  • "comments" - boolean; setting it to true allows comments, default false
  • "numeric_entities" - boolean; setting it to true allows numeric_entities, default false
  • "entities" - array of strings: list of allowed HTML entities besides lt, gt and amp
  • "encoding" - string; the encoding of the text to validate. By default it is not checked and the input is assumed to be ASCII compatible. Always specify it for multibyte encodings like Shift-JIS or GBK, as they are not ASCII compatible.
  • "tags" - object with 3 properties of type array of string:
    • "opening_and_closing" - the tags that should come in pair like "<b></b>"
    • "stand_alone" - the tags that should appear stand alone like "<br/>"
    • "any_tag" - the tags that can be both like "<input>"
  • "attributes" - array of objects that define HTML attributes. Each object consists of the following properties:
    • "type" - string - the type of the attribute, one of: "boolean", "uri", "relative_uri", "absolute_uri", "integer", "regex".
    • "scheme" - string - the allowed URI scheme, a regular expression like "(http|ftp)". Used with the "uri" and "absolute_uri" types.
    • "expression" - string - the regular expression that the value of the attribute should match.
    • "tags" - array of strings - list of tags that this attribute is allowed for.
    • "attributes" - array of strings - list of names of the attribute.
    • "pairs" - array of objects that consist of two properties, "tag" and "attr", of type string, which define the tag and the attribute that this type of property should be allowed for.

Extra properties that are not defined by this scheme are ignored.

For example:

 {
        "xhtml" : true,
        "encoding" : "UTF-8",
        "entities" : [ "nbsp" , "copy" ],
        "comments" : false,
        "numeric_entities" : false,
        "tags" : {
                "opening_and_closing" : [
                        "p", "b", "i", "tt",
                        "a",
                        "strong", "em",
                        "sub", "sup",
                        "ol", "ul", "li",
                        "dd", "dt", "dl",
                        "blockquote","code", "pre",
                        "span", "div"
                ],
                "stand_alone" : [ "br", "hr", "img" ]
        },
        "attributes": [
                {
                        "tags" : [ "p", "li", "ul" ],
                        "attr" : [ "style" ],
                        "type" : "regex",
                        "expression" : "\\s*text-align:\\s*(center|left|right|justify);?\\s*"
                },
                {
                        "tags" : [ "span", "div" ],
                        "attr" : [ "class", "id" ],
                        "type" : "regex",
                        "expression" : "[a-zA-Z_0-9]+"
                },
                {
                        "pairs" : [ 
                                { "tag" : "a",   "attr" : "href" },
                                { "tag" : "img", "attr" : "src"  }
                        ],
                        "type" : "absolute_uri",
                        "scheme" : "(http|https|ftp)"
                },
                {
                        "tags" : [ "img" ],
                        "attr" : [ "alt" ],
                        "type" : "regex",
                        "expression" : ".*"
                }
        ]
 }

cppcms::xss::rules::rules ( std::string const &  file_name)

Create rules from the JSON object stored in the file file_name

See also:
rules(json::value const&)

Member Function Documentation

void cppcms::xss::rules::add_boolean_property ( std::string const &  tag_name,
std::string const &  property 
)

Add a property that is allowed to appear for a specific tag as a boolean property, like checked="checked". When the type is HTML, it is case insensitive.

The property should be ASCII only.

void cppcms::xss::rules::add_entity ( std::string const &  name)

Add an allowed HTML entity. By default only "lt", "gt", "quot" and "amp" are allowed.

void cppcms::xss::rules::add_integer_property ( std::string const &  tag_name,
std::string const &  property 
)

Add a numeric property. It is the same as add_property(tag_name,property,booster::regex("-?[0-9]+")) but a little bit more efficient.
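
As an illustration of why a dedicated integer check can be cheaper than a regular expression, here is a hand-written equivalent of the pattern "-?[0-9]+". It is only a sketch under the assumption that the whole value must match; is_integer is a hypothetical name and not necessarily how the library implements the check.

```cpp
#include <cassert>
#include <cstring>

// Hand-written equivalent of the pattern "-?[0-9]+": an optional minus
// sign followed by at least one decimal digit, in a single pass.
bool is_integer(char const *begin, char const *end)
{
    if (begin != end && *begin == '-')
        ++begin;
    if (begin == end)
        return false; // empty value, or a lone "-"
    for (; begin != end; ++begin)
        if (*begin < '0' || *begin > '9')
            return false;
    return true;
}

// Overload for NUL-terminated strings.
bool is_integer(char const *text)
{
    return is_integer(text, text + std::strlen(text));
}
```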

void cppcms::xss::rules::add_property ( std::string const &  tag_name,
std::string const &  property,
validator_type const &  val 
)

Add the property that should be checked using custom functor

void cppcms::xss::rules::add_property ( std::string const &  tag_name,
std::string const &  property,
booster::regex const &  r 
)

Add the property that should be checked using regular expression.

void cppcms::xss::rules::add_tag ( std::string const &  name,
tag_type  = any_tag 
)

Add a tag that is allowed to appear in the text. For HTML the name is case insensitive, i.e. "br", "Br", "bR" and "BR" are all valid tags for the name "br".

The name should be ASCII only.
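
The case handling described for add_tag can be sketched with an ASCII-only comparison. The helper names are hypothetical, and the sketch assumes that names are compared exactly in XHTML mode, which this page implies but does not spell out.

```cpp
#include <cassert>
#include <string>

enum html_type { xhtml_input, html_input };

// ASCII-only lower-casing; tag names are required to be ASCII.
std::string ascii_lower(std::string s)
{
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if ('A' <= s[i] && s[i] <= 'Z')
            s[i] = static_cast<char>(s[i] - 'A' + 'a');
    return s;
}

// Hypothetical helper: does the tag name seen in the input match the
// name registered in the rules for the given parsing mode?
bool tag_name_matches(std::string const &rule_name, std::string const &seen, html_type mode)
{
    if (mode == html_input)
        return ascii_lower(rule_name) == ascii_lower(seen);
    return rule_name == seen; // assumed: exact match for XHTML
}
```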

void cppcms::xss::rules::add_uri_property ( std::string const &  tag_name,
std::string const &  property 
)

Add a URI property. It should be used for properties like "href" or "src". It is a very good idea to use it in order to prevent URLs like javascript:alert('XSS').

Its behavior is the same as add_property(tag_name,property,rules::uri_validator());

void cppcms::xss::rules::add_uri_property ( std::string const &  tag_name,
std::string const &  property,
std::string const &  schema 
)

Add a URI property, using a regular expression that matches the allowed schemas. It should be used for properties like "href" or "src". It is a very good idea to use it in order to prevent URLs like javascript:alert('XSS').

Its behavior is the same as add_property(tag_name,property,rules::uri_validator(schema));

bool cppcms::xss::rules::comments_allowed ( ) const

Check if the comments are allowed in the text.

void cppcms::xss::rules::comments_allowed ( bool  comments)

Set to true if the comments are allowed in the text
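
Even when comments are allowed, the detailed description notes that comments containing "<", ">" or "&" are considered invalid. A minimal sketch of that restriction, with the hypothetical helper comment_body_acceptable operating on the text between the comment delimiters:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when a comment body contains none of the
// characters "<", ">" and "&".
bool comment_body_acceptable(std::string const &body)
{
    return body.find_first_of("<>&") == std::string::npos;
}
```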

void cppcms::xss::rules::encoding ( std::string const &  enc)

Set the character encoding of the source; otherwise the encoding is not checked and is assumed to be valid. When the encoding is set, all invalid characters are removed from the text or replaced with a default character.

It is very important to specify this option. You may skip it only if you are sure that the input encoding was already validated, for example by the cppcms::form::text widget, which handles character encoding validation by default.

In any case it is generally better to always specify this option.

Note:
The replace functionality is not supported for all encodings. Only UTF-8, ISO-8859-* and single-byte windows-12XX encodings support replacement with a default character; for all other encodings, like Shift-JIS, invalid characters and characters that are invalid for use in HTML are removed.
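
To give an idea of what checking an encoding involves, here is a small stand-alone check for well-formed UTF-8. It is a sketch written for this page, not the library's implementation: the name is_valid_utf8 is hypothetical, and it only checks byte-level well-formedness, not whether a character is acceptable in HTML.

```cpp
#include <cassert>
#include <string>

// Returns true when [p, end) is well-formed UTF-8: no overlong forms,
// no surrogates and nothing above U+10FFFF.
bool is_valid_utf8(unsigned char const *p, unsigned char const *end)
{
    while (p != end) {
        unsigned char c = *p++;
        int extra = 0;          // number of continuation bytes expected
        unsigned long cp = 0;   // decoded code point
        unsigned long min = 0;  // smallest code point allowed for this length
        if (c < 0x80)
            continue;           // plain ASCII
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else
            return false;       // stray continuation byte or invalid lead byte
        for (; extra > 0; --extra) {
            if (p == end || (*p & 0xC0) != 0x80)
                return false;   // truncated or malformed sequence
            cp = (cp << 6) | (*p++ & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (0xD800 <= cp && cp <= 0xDFFF))
            return false;       // overlong, out of range, or surrogate
    }
    return true;
}

// Overload for std::string.
bool is_valid_utf8(std::string const &s)
{
    unsigned char const *b = reinterpret_cast<unsigned char const *>(s.data());
    return is_valid_utf8(b, b + s.size());
}
```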

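html_type cppcms::xss::rules::html ( ) const

Get how to treat input - HTML or XHTML.

void cppcms::xss::rules::html ( html_type  t)

Set how to treat input - HTML or XHTML. It should be called first, before you add any other rules.

bool cppcms::xss::rules::numeric_entities_allowed ( ) const

Get if numeric entities are allowed. The default is false.

void cppcms::xss::rules::numeric_entities_allowed ( bool  v)

Set if numeric entities are allowed.

static validator_type cppcms::xss::rules::relative_uri_validator ( ) [static]

Create a validator that checks that the URI is relative and that it is safe for inclusion in a URI property like href or src.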

static CPPCMS_DEPRECATED booster::regex cppcms::xss::rules::uri_matcher ( ) [static]
Deprecated:
use uri_validator

Create a regular expression that checks URI for safe inclusion in the property. By default it allows only: http, https, ftp, mailto, news, nntp.

If you need finer control over allowed schemas, use uri_matcher(std::string const&).

static CPPCMS_DEPRECATED booster::regex cppcms::xss::rules::uri_matcher ( std::string const &  schema) [static]
Deprecated:
use uri_validator

Create a regular expression that checks URI for safe inclusion in the text, where schema is a regular expression that matches specific protocols that can be used.

Note:
Don't add "^" or "$" anchors, as this expression is used in the construction of another regular expression.

For example:

 booster::regex uri = uri_matcher("(http|https)");

static validator_type cppcms::xss::rules::uri_validator ( ) [static]

Create a validator that checks a URI for safe inclusion in the property. By default it allows only: http, https, ftp, mailto, news, nntp.

If you need finer control over allowed schemas, use uri_validator(std::string const&).

static validator_type cppcms::xss::rules::uri_validator ( std::string const &  scheme,
bool  absolute_only = false 
) [static]

Create a validator that checks URI for safe inclusion in the property.

  • scheme is a regular expression that matches the specific protocols that can be used.
  • absolute_only - set to true to prevent accepting relative URIs like "/files/img.png" or "test.html"
Note:
You don't need to add "^" or "$" anchors to scheme.

For example:

 uri_validator("(http|https)");
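
The roles of the two parameters can be illustrated with a deliberately simplified sketch. It is not the library's implementation: a real validator also checks the rest of the URI syntax, while this one only looks at the scheme. The name uri_scheme_allowed is hypothetical, and std::regex stands in for booster::regex.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical, simplified check of the scheme part only.
bool uri_scheme_allowed(std::string const &uri,
                        std::string const &scheme_pattern,
                        bool absolute_only = false)
{
    // A scheme, if present, is the text before the first ':' provided
    // that the ':' comes before any '/', '?' or '#'.
    std::string::size_type colon = uri.find(':');
    std::string::size_type delim = uri.find_first_of("/?#");
    bool has_scheme = colon != std::string::npos &&
                      (delim == std::string::npos || colon < delim);
    if (!has_scheme)
        return !absolute_only; // relative reference such as "/files/img.png"

    // The whole scheme must match, so no "^" or "$" is needed in the pattern.
    std::regex const allowed(scheme_pattern);
    return std::regex_match(uri.substr(0, colon), allowed);
}
```

Matching the complete scheme against a white list rejects javascript: and any other scheme that was not explicitly allowed.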



The documentation for this class was generated from the following file: