The woes of sanitizing SVGs

https://muffin.ink/blog/scratch-svg-sanitization/

219 points | varun_ch | 18 hours ago | 90 comments | HN

I'm glad this article includes the only credible fix for the HTTP leak problems: CSP.

A useful thing I learned recently is that, while CSP policies are usually delivered as HTTP headers, you can also reliably set them directly in HTML - for example for HTML generated directly on a page where HTTP headers don't come into play:

  <iframe sandbox="allow-scripts" srcdoc="
    <meta http-equiv='Content-Security-Policy'
        content=&quot;default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline';&quot;>
    <!-- untrusted content here -->
  "></iframe>

It feels like this shouldn't work, because JavaScript in the untrusted content could use the DOM to delete or alter that meta tag... but it turns out all modern browsers specifically lock that down, treating those CSP rules as permanent as soon as that meta tag has loaded, before any malicious code has the chance to subvert them.

I had Claude Code run some experiments to help demonstrate this a few weeks ago: https://github.com/simonw/research/tree/main/test-csp-iframe...
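
For anyone generating that markup server-side, here is a minimal Python sketch of the same idea. The function name `sandboxed_iframe` and the exact policy string are assumptions for illustration, not something from the article. The point it shows is the quoting: CSP keywords such as 'none' and 'unsafe-inline' need single quotes, so the whole srcdoc value is entity-escaped rather than hand-quoted.

```python
import html

# Assumed policy for this sketch: block every load by default,
# allow only inline scripts and inline styles.
CSP = "default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline'"


def sandboxed_iframe(untrusted_html: str) -> str:
    """Wrap untrusted markup in a sandboxed iframe whose first element is a CSP meta tag."""
    # The meta tag comes first so the policy is in place before the untrusted content is parsed.
    meta = f'<meta http-equiv="Content-Security-Policy" content="{CSP}">'
    inner = meta + untrusted_html
    # Escaping &, <, >, " and ' keeps the inner document from breaking out of the
    # srcdoc attribute; the browser decodes the entities before parsing the inner HTML.
    escaped = html.escape(inner, quote=True)
    return f'<iframe sandbox="allow-scripts" srcdoc="{escaped}"></iframe>'
```

Calling `sandboxed_iframe('<img src="https://evil.example/x">')` returns one iframe tag whose srcdoc contains no raw double quotes, so the untrusted markup cannot close the attribute early.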

andybak 18 hours ago | parent | next

My first thought is "support a tiny subset of svg that probably still covers 90% of real-world use cases".

I do feel that there are two distinct types of SVG - "bunch of paths with fills" and "clever dangerous stuff" - where most real SVGs are of the former type.

Fully expect this to be shot down by someone that's thought about this problem for longer than the 120 seconds I just spent. :)
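
A rough Python sketch of what such a "tiny subset" filter could look like, using only the standard library. The tag list, attribute list, value patterns and the name `filter_svg` are all assumptions picked for illustration. It is not a vetted sanitizer, and Python's ElementTree is not hardened against hostile XML, so DTDs are rejected up front.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Assumed allowlists for this sketch: plain shapes and basic paint only.
ALLOWED_TAGS = {"svg", "g", "path", "rect", "circle", "ellipse",
                "line", "polyline", "polygon"}
ALLOWED_ATTRS = {"d", "points", "x", "y", "x1", "y1", "x2", "y2", "cx", "cy",
                 "r", "rx", "ry", "width", "height", "viewBox", "fill",
                 "stroke", "stroke-width", "fill-rule", "opacity", "transform"}
# Values may not contain parentheses, colons, slashes or backslashes,
# so url(...) references and CSS escapes cannot appear.
SAFE_VALUE = re.compile(r"^[#A-Za-z0-9 .,\-+%]*$")
# transform needs parentheses for translate(...), rotate(...) and so on.
SAFE_TRANSFORM = re.compile(r"^[A-Za-z0-9 .,\-+()]*$")


def _copy_allowed(el):
    """Return a filtered copy of el, or None if the element is not allowlisted."""
    if not isinstance(el.tag, str) or not el.tag.startswith("{" + SVG_NS + "}"):
        return None
    local = el.tag.split("}", 1)[1]
    if local not in ALLOWED_TAGS:
        return None
    attrs = {}
    for name, value in el.attrib.items():
        pattern = SAFE_TRANSFORM if name == "transform" else SAFE_VALUE
        # Namespaced attributes such as xlink:href never match the allowlist.
        if name in ALLOWED_ATTRS and pattern.match(value):
            attrs[name] = value
    clean = ET.Element(el.tag, attrs)  # text content is intentionally dropped
    for child in el:
        kept = _copy_allowed(child)
        if kept is not None:
            clean.append(kept)
    return clean


def filter_svg(svg_text: str) -> str:
    """Rebuild the SVG from allowlisted elements and attributes only."""
    if "<!DOCTYPE" in svg_text or "<!ENTITY" in svg_text:
        raise ValueError("DTDs are not accepted in this subset")
    clean = _copy_allowed(ET.fromstring(svg_text))
    if clean is None:
        raise ValueError("root element is not an allowed SVG element")
    ET.register_namespace("", SVG_NS)  # serialize without an ns0: prefix
    return ET.tostring(clean, encoding="unicode")
```

The design choice here is to copy what is known to be harmless into a new tree instead of deleting what is known to be dangerous, so anything the lists do not mention (scripts, event handlers, `<use>`, `<style>`, `<foreignObject>`) disappears by default.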

nmilo 11 hours ago | parent | next

I'm sorry because I love the scratch project but this has to be said: they found XSS in SVGs in a surface with attacker-controlled access to Node and their fix was sanitizing it using regex??? And this was discovered by a user on scratch?

Even worse, OP's latest post "Every version of Scratch is vulnerable to arbitrary code execution" just tells you how exactly to exploit something similar today in the current version with no mention of responsible disclosure except a plug to say, "hey, check out my project, this one doesn't have RCE!" This is so irresponsible it borders on malicious.

evilpie 17 hours ago | parent | next

The HTML Sanitizer API has a subset of SVG that is allowed by the default configuration. It won't help you with sanitizing CSS at all, however: style is simply not allowed by default.

https://developer.mozilla.org/en-US/docs/Web/API/HTML_Saniti...

philo23 17 hours ago | parent | next

It'd be nice if there was a sandbox attribute you could add to inline <svg> tags, like the <iframe sandbox> attribute that'd let you opt out of all the potentially "dynamic" stuff inside of an SVG like scripts and event handlers, or even just literally sandbox the entire thing from accessing the "parent" HTML page's context/cookies/etc just like an iframe.

I'm sure it'd just open up a whole other can of worms though... not to mention having to wait for browsers to actually support it.

The real solution here is definitely CSP + basic sanitisation though.

ikkun 17 hours ago | parent | next

I do wish TinyVG or similar would take off, but I don't see that ever actually happening. The only thing I think it's missing is animation support, which is pretty niche but not as niche as <script> tags.

https://tinyvg.tech/

spankalee 18 hours ago | parent | next

This is, by the way, why Google Slides doesn't have SVG support even though there's a nearly 15-year-old ticket requesting the feature.

djoldman 10 hours ago | parent | next

Cloudflare went down this road a bit:

https://github.com/cloudflare/svg-hush

codedokode 5 hours ago | parent | next

I don't like that SVG uses things like CSS and JS and requires pulling in the whole browser to display. Instead of being a simple vector image format, it became just an extension of HTML. Maybe we need a new format, and if someone decides to do it, please add the ability to embed fonts, wrap text, and do decent animations.

Springtime 18 hours ago | parent | next

It seems the reason they're inlined in the page at all is to measure things briefly like bounding boxes (not sure the full extent as it didn't cover that), before subsequent removal. I'm not familiar with Scratch and its use of user-submitted SVGs but I'd be curious to read more about what they're doing that required it be inlined specifically.

(This isn't a comment on the challenges in proper sanitization fwiw, as I've needed to do various of the same things myself)

wingi 59 minutes ago | parent | next

thank you for this post.

bawolff 15 hours ago | parent | next

These aren't really SVG specific issues. They are all pretty standard XSS that apply to html and are very well known vectors.

This post didn't even mention presentation attributes, such as how the cursor attribute can contain a URL that gets loaded, or any of the other tricky parts of SVG sanitization, like using a DTD to bypass filters.
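
To make the DTD point concrete, here is a small Python sketch. The payload and both function names are made up for illustration. A check that pattern-matches the raw text misses a "javascript:" URL hidden behind an internal entity, while the same check on the parsed attribute values sees it, because the XML parser has already expanded the entity.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical payload: the URL scheme is hidden behind a DTD entity.
PAYLOAD = """<!DOCTYPE svg [<!ENTITY js "javascript:">]>
<svg xmlns="http://www.w3.org/2000/svg">
  <a href="&js;alert(1)"><text y="20">click me</text></a>
</svg>"""


def raw_text_flags(svg_text: str) -> bool:
    """Naive check: look for href="javascript:..." in the undecoded source text."""
    pattern = r"""href\s*=\s*["']\s*javascript:"""
    return re.search(pattern, svg_text, re.IGNORECASE) is not None


def parsed_flags(svg_text: str) -> bool:
    """Check attribute values after the XML parser has expanded entities."""
    # Note: parsing untrusted DTDs with ElementTree is itself risky;
    # this is only meant to show where the decoded value becomes visible.
    root = ET.fromstring(svg_text)
    for element in root.iter():
        for value in element.attrib.values():
            if value.strip().lower().startswith("javascript:"):
                return True
    return False
```

On this payload the raw-text check finds nothing and the parsed check flags the href, which is one reason filters that run before decoding tend to get bypassed.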

Liftyee 13 hours ago | parent | next

I'm not familiar with the details of real software development, so I don't know why it's not possible to just "not give the SVG part of the code internet access" or "perform sanitization on post-decoding (url, hex, etc) data".

Is it because the SVG parser/renderer being used is an entire library, and it would be prohibitive to write your own SVG parser/renderer or insert your own code into the existing one?

kevinmgranger 17 hours ago | parent | next

> This was fixed by using a regular expression to remove script tags.

The infamous "you can't parse (X)HTML with regex"¹ meme is from 2009, yet this fix was made in 2019. I guess the SO answer never mentioned SVG.

1: https://stackoverflow.com/revisions/1732454/1
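
A quick Python sketch of why a single regex pass is fragile. The pattern below is a hypothetical stand-in, not the regex Scratch actually used. Removing the inner tags stitches the surrounding fragments back together into a working script element.

```python
import re

# Hypothetical "remove script tags" pattern; an assumption for illustration only.
SCRIPT_RE = re.compile(r"<script[^>]*>.*?</script>", re.IGNORECASE | re.DOTALL)


def strip_scripts(markup: str) -> str:
    """Single-pass regex removal of script elements."""
    return SCRIPT_RE.sub("", markup)


# Each outer tag is split by an empty inner script element.
payload = "<scr<script></script>ipt>alert(1)</scr<script></script>ipt>"
print(strip_scripts(payload))  # prints "<script>alert(1)</script>"
```

Looping until the output stops changing closes this particular hole, but event-handler attributes such as onload never contained a script tag in the first place and pass straight through.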

jancsika 17 hours ago | parent | next

For the "<script>" stuff: regardless of how the thing is spelled or otherwise obscured, the HTML5 parser eventually knows when it's gotten hold of a script tag. Oops, we got one in a NOSCRIPTTAG context. Let's poop out.

Tag names, attributes, attribute values, event callback default-cancelers... so many ways to declare that this node and its children shouldn't parse/evaluate scripts.

As Jay-Z said: "I've got 99 solutions, fixing a problem ain't one"

etchalon 17 hours ago | parent | next

I don't understand why it wasn't immediately understood that SVG is as dangerous as HTML.

It is not, and never was, an image format. It's a markup language.

NooneAtAll3 16 hours ago | parent | next

wait... scratch is just a browser?

Theodores 16 hours ago | parent | next

Maybe we need a dumbed down version 3 of SVG where the browser knows it is not to do anything that requires fetching a URL, to make the image as harmless as a JPG.

This version 3 could have its version number changed to 2 in order to do cool SVG things, i.e. full-fat SVG as version 2 is now. But you could just flip the 2 to a 3 on upload, so any embedded URLs are harmless.

This could be useful for the creator too, as it is helpful to have layers of source images in bitmap format to work with, and you can easily export such things accidentally.

Devasta 16 hours ago | parent | next

> In 2019, a few months after the initial release of Scratch 3, Scratch discovered that SVGs can contain <script> tags that Scratch would cause to be executed when the SVG loads. This is known as an XSS.

> Example from Scratch's test suite:

  <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
    "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
  <svg version="1.1" xmlns="http://www.w3.org/2000/svg">
    <circle cx="250" cy="250" r="50" fill="red" />
    <script type="text/javascript"><![CDATA[
        alert('from the svg!')
    ]]></script>
  </svg>

Is this really an issue? This is the method that the Chrome team's XSLT-replacement polyfill suggests you use. https://github.com/mfreed7/xslt_polyfill/tree/main#usage

esafak 18 hours ago | parent | next

Is there a browser-friendly vector alternative?

SpyCoder77 17 hours ago | parent

I did not expect to see GarboMuffin.
