So we have the XSS cheat sheet to test our XSS filtering, but other than an example benign page I can't find any evil or malformed test data to make sure that my UTF-8 code can handle misbehaving data.
Where can I find some good, uh... bad data to test with? Or what is a tricky sequence of bytes?
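Check out Markus Kuhn’s UTF-8 decoder stress test. It is a plain-text file that covers boundary cases, malformed sequences (unexpected continuation bytes, incomplete sequences), overlong encodings, and illegal code positions such as UTF-16 surrogates.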
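If you want a few tricky sequences to start with before pulling in a full test file, here is a minimal sketch in Python. The sample set and the names `MALFORMED` and `is_valid_utf8` are my own illustration, not taken from any test suite; the byte strings are well-known classes of invalid UTF-8 that a strict decoder should reject.

```python
# Hypothetical sample set for illustration; labels and selection are assumptions.
MALFORMED = {
    "lone continuation byte": b"\x80",
    "truncated 2-byte sequence": b"\xc3",
    "overlong encoding of '/'": b"\xc0\xaf",
    "overlong encoding of NUL": b"\xc0\x80",
    "UTF-16 surrogate U+D800": b"\xed\xa0\x80",
    "code point above U+10FFFF": b"\xf4\x90\x80\x80",
    "bytes never valid in UTF-8": b"\xfe\xff",
}


def is_valid_utf8(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    for label, raw in MALFORMED.items():
        verdict = "accepted" if is_valid_utf8(raw) else "rejected"
        print(f"{label}: {verdict}")
```

To test your own code, swap `is_valid_utf8` for a wrapper around your decoder and check that it rejects (or safely replaces) every entry. The overlong forms are the ones that matter most for filtering, since a lenient decoder can turn them into characters like `/` after your checks have already run.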