HTML Comments: comedy gold?

With most security testing, the general approach is to attack the low-hanging fruit first. If that yields nothing, you move on to more difficult, often less likely methods of attack.

When auditing a web application, the obvious low-hanging fruit includes XSS, SQL injection, parameter manipulation and bypass, and so on. One attack that is sometimes overlooked is scanning the page source for useful comments. I couldn't recall a tool that did this, so once again I sat down and wrote one. htcomment is the result.

I took the code for a spin and found some interesting results.

Leftover CVS/RCS/$version-control tags on Slashdot:

http://slashdot.org/faq/tags.shtml line 326:
<!--
($VERSION) = '$Id: tags.shtml,v 1.12 2006/11/10 16:07:45 jamiemccarthy Exp $' =~ /^\$Id: \S+ (\S+)/
-->

Debugging code on CNN:

http://cnn.com line 760:
//alert('Query Variable ' + variable + ' not found');
################################################################################
http://cnn.com line 765:
//alert("yep");
################################################################################
http://cnn.com line 770:
//alert("nope");

Disabled functionality on Craigslist:

http://losangeles.craigslist.org/about/privacy.policy.html line 70:
<!-- <li>Subscribers can manage their subscriptions through the
<a href=/cgi-bin/emailSubscriber.cgi>Subscription Management</a> page. -->

Digg weirdness:

################################################################################
http://www.digg.com line 17:
<!-- digg is up 959595-->
################################################################################
http://www.digg.com line 620:
<!-- digg is done serving you. 2.01355321270u 137.03599911 6.6742x10-11m3kg-1s-2 6.6742x10-11m3kg-1s-2 -->

Yahoo load-balancing:

http://www.yahoo.com line 214:
<!-- f11.www.sp1.yahoo.com uncompressed/chunked Sun Feb 25 15:43:39 PST 2007 -->

The code used to parse comments is quite horrendous, but it seems to work fairly well. It needs to be able to handle UTF-8/16/32, needs more work to better handle script and style tags, and won't parse sites whose HTML is all on a single line. Aside from that, it's pretty fun. If you discover anything interesting with this, drop me a line. Enjoy.
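
For anyone curious how those limitations might be approached, here is a minimal sketch in Python. To be clear, this is not htcomment's code; the names `decode_page`, `CommentFinder`, `find_comments` and `report` are made up for illustration. It assumes a byte-order mark is enough to tell UTF-8/16/32 apart (falling back to UTF-8 otherwise), leans on the standard library's `html.parser` so that single-line pages are not a problem, and only picks up `//` comments that sit on their own line inside a script tag.

```python
import codecs
from html.parser import HTMLParser

# Check the longer UTF-32 BOMs first: the UTF-32 LE BOM starts with
# the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_page(raw: bytes) -> str:
    """Decode by BOM if one is present; otherwise assume UTF-8."""
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            return raw.decode(encoding, errors="replace")
    return raw.decode("utf-8", errors="replace")


class CommentFinder(HTMLParser):
    """Collects HTML comments and whole-line // comments in scripts."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.found = []          # list of (line_number, comment_text)
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_comment(self, data):
        line, _col = self.getpos()
        self.found.append((line, "<!--" + data + "-->"))

    def handle_data(self, data):
        # Script bodies arrive here as raw text, not as parsed markup.
        if not self._in_script:
            return
        line, _col = self.getpos()
        for offset, text in enumerate(data.splitlines()):
            stripped = text.strip()
            if stripped.startswith("//"):
                self.found.append((line + offset, stripped))


def find_comments(raw: bytes):
    """Return (line_number, comment_text) pairs for one page."""
    parser = CommentFinder()
    parser.feed(decode_page(raw))
    parser.close()
    return parser.found


def report(url: str, raw: bytes) -> str:
    """Format findings in the style of the listings above."""
    entries = [f"{url} line {line}:\n{text}"
               for line, text in find_comments(raw)]
    return ("\n" + "#" * 80 + "\n").join(entries)
```

Feeding it the bytes of a fetched page, for example `print(report(url, raw))`, produces listings shaped like the ones quoted earlier. Comments in style tags, `/* ... */` blocks, and `//` comments trailing a line of code are left out on purpose to keep the sketch short.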
