The woes of sanitizing SVGs
https://muffin.ink/blog/scratch-svg-sanitization/A useful thing I learned recently is that, while CSP headers are usually set using HTTP headers, you can also reliably set them directly in HTML - for example for HTML generated directly on a page where HTTP headers don't come into play:
<iframe sandbox="allow-scripts" srcdoc="
<meta http-equiv='Content-Security-Policy'
content='default-src none; script-src unsafe-inline; style-src unsafe-inline;'>
<!-- untrusted content here -->
"></iframe>
It feels like this shouldn't work, because JavaScript in the untrusted content could use the DOM to delete or alter that meta tag... but it turns out all modern browsers specifically lock that down, treating those CSP rules as permanent as soon as that meta tag has loaded before any malicious code has the chance to subvert them.I had Claude Code run some experiments to help demonstrate this a few weeks ago: https://github.com/simonw/research/tree/main/test-csp-iframe...
I do feel that's there's two distinct types of svg - "bunch of paths with fills" and "clever dangerous stuff" where most real SVGs are of the former type.
Fully expect this to be shot down by someone that's thought about this problem for longer than the 120 seconds I just spent. :)
Even worse, OP's latest post "Every version of Scratch is vulnerable to arbitrary code execution" just tells you how exactly to exploit something similar today in the current version with no mention of responsible disclosure except a plug to say, "hey, check out my project, this one doesn't have RCE!" This is so irresponsible it borders on malicious.
https://developer.mozilla.org/en-US/docs/Web/API/HTML_Saniti...
https://developer.mozilla.org/en-US/docs/Web/API/HTML_Saniti...
I'm sure it'd just open up a whole other can of worms though... not to mention having to wait for browsers to actually support it.
The real solution here is definitely CSP + basic sanitisation though.
(This isn't a comment on the challenges in proper sanitization fwiw, as I've needed to do various of the same things myself)
Like this post didn't even mention presentational attributes, like how cursor attribute can contain a url that gets loaded. Or any of the other tricky parts of svg sanitization, like using dtd to bypass things.
Is it because the SVG parser/renderer being used is an entire library, and it would be prohibitive to write your own SVG parser/renderer or insert your own code into the existing one?
The infamous you can't parse (X)HTML with regex¹ meme is from 2009, yet this fix was done in 2019. I guess the SO answer never mentioned SVG.
Tag names, attributes, attribute values, event callback default-cancelers... so many ways to declare that this node and its children shouldn't parse/evaluate scripts.
As Jay-Z said: "I've got 99 solutions, fixing a problem ain't one"
It is not, and never was, an image format. It's a markup language.
This version 3 could have the version number changed to 2 in order to do cool SVG things, so full-fat SVG as version 2 is now. But you could just flip to 2 to a 3 on upload, so any embedded URLs are harmless.
This could be useful for the creator too, as it is helpful to have layers of source images in bitmap format to work with, and you can easily export such things accidentally.
> Example from Scratch's test suite:
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg version="1.1" xmlns="http://www.w3.org/2000/svg">
<circle cx="250" cy="250" r="50" fill="red" />
<script type="text/javascript"><![CDATA[
alert('from the svg!')
]]></script>
</svg>
Is this really an issue? This is the method that the chrome teams polyfill to replace XSLT suggests you do. https://github.com/mfreed7/xslt_polyfill/tree/main#usage