A few days ago, during one of those nights with the baby crying at 2:00 am and the only thing you can do is to read emails, I realised that Gmail shows the content of compressed files when reading them in Google Docs. As often is the case at SensePost, the “think evil ™” came to me and I started to ponder the possibilities of injecting HTML inside the file listing. The idea is actually rather simple. Looking at the file format of a .zip file we see the following:
Every file in the compressed file must have two entries; ZipFileRecord and ZipDirEntry. Both of these entries contain the filename, but only the first one contains the length of filename (it must match the actual length). Our first test case is obvious; if we could modify this name once the file was compressed, would Google sanitise it? Thankfully, the answer is, yes! (go Google!)
As you can see, Google shows the file name inside the compressed file but the tag is displayed with HTML entities. If we then try to see the contents of the file, Google responds by telling us it’s not possible to read the content of the file (it’s empty) and shows you the file “without formatting” after a few seconds:
Finally, the filename is shown but not sanitised:
Why this is possible?
Remember that the zip format has the name of the compressed files twice. Google uses the first one (ZipFileRecord) for displaying the file names, but in the vulnerable page it uses the second one (ZipDirEntry).
Possible attack vectors
Going back to the ‘thinking evil ™’ mindset, it is now possible to leave a “comprehensive” name in the first entry and inject the malicious payload in the second one. When I first discovered the possibility of doing this, I contacted Google, however, the XSS is in the googleusercontent.com domain, which Google’s security team described as a “sandbox” domain (i.e. we aren’t injecting into the DOM of google.com) and therefore not worthy of a bounty. Which I accept, if I had to prove usefulness this could be used as part of a simple social engineering attack, for example:
Leading the victim to my phishing site:
Which then proceeds to steals their Google session, or allows the attacker to use BeEF:
Granted, there are simpler ways of achieving the same result. I just wanted to demonstrate how you can use file meta-information for such an attack.