Tuesday, November 20, 2007

MZ, BSJB, and the Joys of "Magic" Constants in Managed Assemblies.

When I was 10 or so, I thought you could open up a .EXE in a text editor and do meaningful work with it if you only knew some magical secret. I noticed that all EXEs started with the letters "MZ" as in the CMD.EXE example above, but that's as far as I got.

A few years later, I read this article in the special 20th anniversary edition of Byte Magazine that let me know that "MZ" stood for the initials of Mark Zbikowski, who designed the MS-DOS executable file format. Needing some magic number to identify bits as an executable, what better choice than your own initials (in Mark's case, the bytes 0x4D 0x5A)? This is actually very common knowledge that won't get you points at your next nerd dinner.

However, today I will share a point-generating piece of knowledge (well, for at least a year or so)... Without further ado:



In every managed assembly, you'll find the four bytes 0x42 0x53 0x4A 0x42, which are the ASCII characters "BSJB" (0x424A5342 when read as a little-endian DWORD, or 0x42534A42 when read in big-endian order). This signature marks the start of the "General Metadata Header" of an assembly.

The initials correspond to Brian Harry, Susan Radke-Sproull, Jason Zander, and Bill Evans who were part of the team in 1998 that worked on the CLR.

I discovered this gem while reading page 76 of Expert .NET 2.0 IL Assembler. After googling for it, though, it appears that it's been covered elsewhere too.

With this knowledge, you could write a very simple program that checks whether a file is a .NET assembly by verifying that it starts with "MZ" and also contains "BSJB" somewhere after that. It looks like the first "virus" that targeted .NET took advantage of this fact.
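Here is a rough sketch of what such a check might look like in Python. The function names are made up for illustration, and this is only the naive heuristic described above: it does not walk the PE headers to find the CLI header, so a native EXE that happens to contain the bytes "BSJB" would be a false positive.

```python
from pathlib import Path

MZ_MAGIC = b"MZ"      # first two bytes of a DOS/PE executable
BSJB_MAGIC = b"BSJB"  # signature at the start of the CLR metadata header


def looks_like_dotnet_assembly_bytes(data: bytes) -> bool:
    """Naive heuristic: data starts with "MZ" and contains "BSJB" later on."""
    if not data.startswith(MZ_MAGIC):
        return False
    # find() returns -1 when the signature is absent; search after the "MZ" prefix.
    return data.find(BSJB_MAGIC, len(MZ_MAGIC)) != -1


def looks_like_dotnet_assembly(path) -> bool:
    """Hypothetical convenience wrapper: read the whole file and apply the heuristic."""
    return looks_like_dotnet_assembly_bytes(Path(path).read_bytes())
```

Calling looks_like_dotnet_assembly on a compiled C# program should return True, while a plain native EXE would normally return False. Reading the whole file into memory is fine for a toy like this; a sturdier tool would follow the PE data directories to the metadata instead of scanning for the four bytes.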

Now, where can I put "JDM"?

2 comments:

Huseyin Tufekcilerli said...

"When I was 10 or so, I thought you could open up a .EXE in a text editor and do meaningful work with it if you only knew some magical secret."

I was thinking exactly the same when I was first introduced to computers. I was opening any file (text or binary) with a text editor and fiddling around.

Btw the compiled .java files, .class files, start with these 4 bytes: 0xCA 0xFE 0xBA 0xBE aka Cafe Babe and ZIP files start with PK.

Jeff Moser said...

Huseyin: Good point on 0xCAFEBABE and PK. Similar things can be said for WAV, GIF, and other files. I think that Apache's mod_mime_magic module (and similar modules) looks for these to determine the real file type.