Created by John K. Morris
jmorris@evolutioninteractive.com
Version 1.0 – September 29, 2022
Why yet another Macintosh disk image format?
This is probably the question many of you reading this document are asking. It basically comes down to the simple fact that none of the currently existing formats accurately represent the way data is encoded on a Macintosh floppy disk. There is a place for a format that is an accurate representation of a bitstream that is also the exact length of a track so that it can be looped correctly. And since we are creating a format, it is also a great time to ensure that we organize the data in the image file in a way that allows for easy unpacking with as little memory and processing overhead as possible – this provides more performant usage in hardware and software emulators.
A big benefit of the MOOF format is the ability to store copy-protected disk images and to successfully run the protected software under emulation. The second benefit is that the MOOF format is actually much simpler to implement than many of the other disk image formats, as you don’t need to construct track data on the fly, but it does require that you have correctly emulated the IWM/SWIM chip. MOOF files also contain metadata about the disk image – such as disk name, product name, publisher, system requirements and language – that you can use to display additional information in your emulator.
What the heck is MOOF supposed to stand for?
This reference might be pretty obscure to you unless you are a fan of Apple lore from the very beginnings of the Macintosh. Within the Cairo font of the original Macintosh, one of the glyphs was a dog with spots, designed by Susan Kare, who did pretty much all of the early Macintosh icons. Anyway, this dog took on the role of a mascot within the Macintosh development team and was given the name “Clarus the Dogcow”. And what else would a Dogcow say other than “MOOF”? This disk image format was given the name MOOF as a tribute to the work and playful spirit of the early pioneers that created the Macintosh.
MOOF File Format Specification
A MOOF file uses a chunk-based binary file format that provides future-proof expandability in a way that is safe for older software which may not recognize newer data chunks.
All data is stored little-endian.
MOOF files begin with the following 12-byte header in order to identify the file type as well as detect any corruption that may have occurred. The easiest way to detect that a file is indeed a MOOF file is to check the first 8 bytes of the file for the signature. The remaining 4 bytes are a CRC of all remaining data in the file, that is, everything from byte 12 to the end of the file. This is only provided to allow you to ensure file integrity and is not necessary to process the file. If the CRC is 0x00000000, then no CRC has been calculated for the file and the field should be ignored. The exact CRC routine used is shown in Appendix A, and you should pass in 0x00000000 as the initial crc value.
Byte | Value | Purpose |
---|---|---|
0 | 4D 4F 4F 46 | The ASCII string ‘MOOF’. 0x464F4F4D |
4 | FF | Make sure that high bits are valid (no 7-bit data transmission) |
5 | 0A 0D 0A | LF CR LF – File translators will often try to convert these. |
8 | xx xx xx xx | CRC32 of all remaining data in the file. The method used to generate the CRC is described in Appendix A. |
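The header check described above can be sketched in C as follows. This is a minimal illustration, assuming the whole file has already been loaded into memory; the names `moof_has_signature` and `moof_stored_crc` are hypothetical and not part of the specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 8 signature bytes from the header table: 'MOOF', 0xFF, LF CR LF. */
static const uint8_t MOOF_SIGNATURE[8] = {
    0x4D, 0x4F, 0x4F, 0x46, 0xFF, 0x0A, 0x0D, 0x0A
};

/* Returns true when buf holds a complete 12-byte header with a valid signature. */
bool moof_has_signature(const uint8_t *buf, size_t len)
{
    return len >= 12 &&
           memcmp(buf, MOOF_SIGNATURE, sizeof MOOF_SIGNATURE) == 0;
}

/* Reads the little-endian CRC32 stored at bytes 8..11.
   A result of 0 means that no CRC was calculated for the file. */
uint32_t moof_stored_crc(const uint8_t *buf)
{
    return (uint32_t)buf[8] | ((uint32_t)buf[9] << 8) |
           ((uint32_t)buf[10] << 16) | ((uint32_t)buf[11] << 24);
}
```

Comparing all 8 bytes at once also catches files that were damaged by 7-bit transfers or line-ending conversion, which is the purpose of the 0xFF and LF CR LF bytes.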
After the header comes a sequence of chunks which each contain information about the disk image. Using chunks allows for the MOOF disk format to provide forward compatibility as chunks can be added to the specification and will just be safely ignored by applications that do not care (or know) about the information. For lower-performance emulation platforms, the primary data chunks are all located in fixed positions so that direct access to data is possible using just offsets from the start of the file.
All chunks have the following structure:
Offset | Size | Name | Usage |
---|---|---|---|
+0 | 4 bytes | Chunk ID | 4 ASCII characters that make up the ID of the chunk |
+4 | uint32 | Chunk Size | The size of the chunk data in bytes. |
+8 | … | Chunk Data | The chunk data. |
To process the file, you start at the first Chunk ID which will be located at byte 12 of the file, immediately following the header. You read the Chunk ID and the Chunk Size following it. If you want to process this chunk, then your file pointer will be at the start of the data. If you don’t care about this chunk, then skip the number of bytes as Chunk Size indicates and you will now be at the next Chunk ID.
    while(data_stream.availableToRead() > 8) {
        uint32_t chunk_id = data_stream.readU32();
        uint32_t chunk_size = data_stream.readU32();
        switch(chunk_id) {
            case INFO_CHUNK_ID:
                // read the INFO chunk
                break;
            case TMAP_CHUNK_ID:
                // read the TMAP chunk
                break;
            case TRKS_CHUNK_ID:
                // read the TRKS chunk
                break;
            case META_CHUNK_ID:
                // read the META chunk
                break;
            default:
                // no idea what this chunk is, so skip it
                data_stream.skip(chunk_size);
        }
    }
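For readers without a stream class at hand, here is a hedged sketch of the same walk over a file held in a memory buffer. The function name `moof_find_chunk` and its return convention (0 for "not found") are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a little-endian uint32 from p. */
static uint32_t rd_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Walks the chunk list that starts at byte 12 and returns the file offset of
 * the data of the first chunk whose ID equals `id` (e.g. 0x4F464E49 for
 * 'INFO'), storing its size in *size_out. Returns 0 when no such chunk exists
 * or when a chunk would run past the end of the buffer.
 */
size_t moof_find_chunk(const uint8_t *file, size_t len, uint32_t id,
                       uint32_t *size_out)
{
    size_t pos = 12;                     /* first Chunk ID follows the header */
    while (len >= 8 && pos <= len - 8) {
        uint32_t chunk_id   = rd_u32(file + pos);
        uint32_t chunk_size = rd_u32(file + pos + 4);
        pos += 8;                        /* now at the start of the chunk data */
        if (chunk_size > len - pos)
            return 0;                    /* truncated or corrupt chunk */
        if (chunk_id == id) {
            *size_out = chunk_size;
            return pos;
        }
        pos += chunk_size;               /* skip data we do not care about */
    }
    return 0;
}
```

Because unknown chunks are skipped by size alone, this loop keeps working when new chunk types are added to the format.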
INFO Chunk
The first chunk in a MOOF file is always an ‘INFO’ chunk. This contains some fundamental information about the contained image. The data of the ‘INFO’ chunk begins at byte 20 of the file and is 60 bytes long (pad chunk with zeros to full length).
Byte | Offset | Type | Vers | Name | Usage |
---|---|---|---|---|---|
12 | | uint32 | | ‘INFO’ Chunk ID | 0x4F464E49 |
16 | | uint32 | | Chunk Size | Size is always 60. |
20 | +0 | uint8 | 1 | INFO Version | Version number of the INFO chunk. Current version is 1. |
21 | +1 | uint8 | 1 | Disk Type | 1 = SSDD GCR (400K), 2 = DSDD GCR (800K), 3 = DSHD MFM (1.44M), 4 = Twiggy |
22 | +2 | uint8 | 1 | Write Protected | 1 = Floppy is write protected |
23 | +3 | uint8 | 1 | Synchronized | 1 = Cross track sync was used during imaging |
24 | +4 | uint8 | 1 | Optimal Bit Timing | The ideal rate that bits should be delivered to the disk controller card. This value is in 125 nanosecond increments, so 8 is equal to 1 microsecond. A GCR disk is typically 16 and an MFM one is 8. |
25 | +5 | UTF-8 32 bytes | 1 | Creator | Name of software that created the MOOF file. String in UTF-8. No BOM. Padded to 32 bytes using space character (0x20). ex: “Applesauce v1.0 ” |
57 | +37 | uint8 | 1 | – | Padding. Value is always 0 (zero). |
58 | +38 | uint16 | 1 | Largest Track | The number of blocks (512 bytes) used by the largest track. Can be used to allocate a buffer with a size safe for all tracks. |
60 | +40 | uint16 | 1 | FLUX Block | Block number where the FLUX chunk resides relative to the start of the file. A FLUX chunk always occupies its own block. If this MOOF does not utilize a FLUX chunk, then this value will be 0. When checking for the existence of a FLUX chunk, make sure that BOTH this value and the next one (Largest Flux Track) are non-zero. |
62 | +42 | uint16 | 1 | Largest Flux Track | The number of blocks (512 bytes) used by the largest flux track. Can be used to allocate a buffer with a size safe for all tracks. |
The chunk is versioned to allow for adding additional info in the future. The “Vers” column in the table above indicates at which version the data field became available. When reading data from the chunk, make sure that the value you are looking for actually exists within the version of the chunk you are reading. To be sure that your MOOF loading is future-proof, use >= when checking the INFO Version field. The INFO chunk will always be upgraded in a safe way for older consumers.
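As an illustration of the fixed layout and the `>=` version check, the INFO chunk could be read like this. The struct `moof_info` and the function names are hypothetical; only the offsets and field meanings come from the table above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of the INFO chunk; the field names are ours. */
typedef struct {
    uint8_t  version;
    uint8_t  disk_type;
    bool     write_protected;
    bool     synchronized;
    uint8_t  bit_timing;           /* in units of 125 ns */
    char     creator[33];          /* 32 bytes from the file + NUL terminator */
    uint16_t largest_track;        /* in 512-byte blocks */
    uint16_t flux_block;
    uint16_t largest_flux_track;   /* in 512-byte blocks */
} moof_info;

static uint16_t rd_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Parses the INFO chunk at its fixed position (data starts at byte 20). */
bool moof_parse_info(const uint8_t *file, size_t len, moof_info *out)
{
    /* 12-byte header + 8-byte chunk header + 60 bytes of INFO data = 80 */
    if (len < 80 || memcmp(file + 12, "INFO", 4) != 0)
        return false;

    const uint8_t *d = file + 20;
    memset(out, 0, sizeof *out);
    out->version = d[0];
    if (out->version >= 1) {       /* all fields below exist since version 1 */
        out->disk_type          = d[1];
        out->write_protected    = d[2] == 1;
        out->synchronized       = d[3] == 1;
        out->bit_timing         = d[4];
        memcpy(out->creator, d + 5, 32);   /* creator[32] stays NUL */
        out->largest_track      = rd_u16(d + 38);
        out->flux_block         = rd_u16(d + 40);
        out->largest_flux_track = rd_u16(d + 42);
    }
    return true;
}

/* Converts Optimal Bit Timing to nanoseconds per bit. */
uint32_t moof_bit_time_ns(const moof_info *info)
{
    return info->bit_timing * 125u;
}
```

With the typical GCR value of 16, `moof_bit_time_ns` yields 2000 ns, which matches the 2 microsecond bit spacing mentioned for GCR disks.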
TMAP Chunk
The ‘TMAP’ chunk contains a track map. This allows you to map physical drive tracks with the track data contained within the image file ‘TRKS’ chunk. The data of the ‘TMAP’ chunk begins at byte 88 of the file and is 160 bytes long.
Each map entry contains an index number for the track data contained within the ‘TRKS’ chunk. If the map entry is 0, then the correct track data to be using is the first entry in the ‘TRKS’ chunk. Any blank tracks are given a value of 255 (0xFF) in the map and the emulator should be outputting random bits in this case. For single sided disks, every other (side 1) entry will be 255.
This is how the mapping is organized:
Byte | Offset | Type | Name | Usage |
---|---|---|---|---|
80 | | uint32 | ‘TMAP’ Chunk ID | 0x50414D54 |
84 | | uint32 | Chunk Size | Size is always 160. |
88 | +0 | uint8 | Track 0, Side 0 | Index of TRKS entry to use for Track 0, Side 0. |
89 | +1 | uint8 | Track 0, Side 1 | |
90 | +2 | uint8 | Track 1, Side 0 | |
… | … | … | ||
246 | +158 | uint8 | Track 79, Side 0 | |
247 | +159 | uint8 | Track 79, Side 1 |
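The table above implies that the map entry for a given track and side sits at offset `track * 2 + side` within the TMAP data. A small sketch, with hypothetical names, assuming the file is in memory:

```c
#include <stddef.h>
#include <stdint.h>

#define MOOF_TMAP_DATA_OFFSET 88u    /* fixed position of the TMAP data */
#define MOOF_TRACK_BLANK      0xFFu  /* map value for a blank track */

/*
 * Returns the TRKS index for a physical track (0-79) and side (0-1), or
 * MOOF_TRACK_BLANK when the track is blank or the arguments are out of range.
 */
uint8_t moof_tmap_lookup(const uint8_t *file, size_t len,
                         unsigned track, unsigned side)
{
    if (track > 79 || side > 1 || len < MOOF_TMAP_DATA_OFFSET + 160u)
        return MOOF_TRACK_BLANK;
    return file[MOOF_TMAP_DATA_OFFSET + track * 2u + side];
}
```

Treating out-of-range requests as blank is a design choice of this sketch; an emulator would then output random bits for them, just as for any other blank track.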
TRKS Chunk
The ‘TRKS’ chunk contains the data for all of the unique tracks. The data of the ‘TRKS’ chunk begins at byte 256. For more efficient track data copying from SD cards and other block devices, all track data is stored in 512-byte blocks and the blocks are 512 byte aligned relative to the start of the MOOF file. The start of the TRKS chunk has an array of TRK structures to locate all of the actual bit data within the file. The actual bit data begins at byte 1536 (block 3) of the MOOF file. While every track can be a different size, this is rarely the case. GCR disks will have tracks that are all about the same size in each speed zone. If needed, the ‘INFO’ chunk contains a Largest Track value that you can use to safely allocate a storage buffer for any track in this MOOF file.
Byte | Offset | Type | Name | Usage |
---|---|---|---|---|
248 | | uint32 | ‘TRKS’ Chunk ID | 0x534B5254 |
252 | | uint32 | Chunk Size | |
256 | +0 | TRK | Track 00 | First track in track array. TMAP value of 0. |
264 | +8 | TRK | Track 01 | Second track in track array. TMAP value of 1. |
272 | +16 | TRK | Track 02 | Third track in track array. TMAP value of 2. |
… | … | … | ||
1528 | +1272 | TRK | Track 159 | Last track in track array. TMAP value of 159. |
1536 | +1280 | BITS | Beginning of Track Data Blocks | Start of the actual track bits. |
The structure of the TRK type in the previous table is as follows:
Byte | Offset | Type | Name | Usage |
---|---|---|---|---|
– | +0 | uint16 | Starting Block | First block of BITS data. This value is relative to the start of the file, so the first possible starting block is 3. Multiply this value by 512 (x << 9) to get the starting byte of the BITS data. |
– | +2 | uint16 | Block Count | Number of blocks for this BITS data. |
– | +4 | uint32 | Bit Count | The number of bits in the bitstream. |
If your disk doesn’t use all of the entries, then unused TRK structures should be filled out with zeros for all values.
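To make the TRK layout concrete, the following sketch reads one entry from the array at byte 256 and converts its block numbers into byte positions. The struct and function names are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t starting_block;   /* relative to the start of the file */
    uint16_t block_count;
    uint32_t bit_count;        /* a byte count when the track holds flux data */
} moof_trk;

/* Reads TRK entry `index` (0-159) from the array that starts at byte 256. */
bool moof_read_trk(const uint8_t *file, size_t len, unsigned index,
                   moof_trk *out)
{
    if (index > 159 || len < 1536)
        return false;
    const uint8_t *p = file + 256 + 8u * index;
    out->starting_block = (uint16_t)(p[0] | (p[1] << 8));
    out->block_count    = (uint16_t)(p[2] | (p[3] << 8));
    out->bit_count      = (uint32_t)p[4] | ((uint32_t)p[5] << 8) |
                          ((uint32_t)p[6] << 16) | ((uint32_t)p[7] << 24);
    return true;
}

/* Unused entries are filled with zeros for all values. */
bool moof_trk_is_unused(const moof_trk *t)
{
    return t->starting_block == 0 && t->block_count == 0 && t->bit_count == 0;
}

/* Byte offset of the track data within the file (block number * 512). */
size_t moof_trk_byte_offset(const moof_trk *t)
{
    return (size_t)t->starting_block << 9;
}
```

A loader would normally combine this with the TMAP entry for the wanted track: the map value is the `index` passed to `moof_read_trk`.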
The bits are packed into bytes, but the bytes will not necessarily be representative of nibble values or even be guaranteed to be aligned to byte boundaries. When processing the bitstream, the order of bits within each byte is high to low, meaning the high bit goes first and the low bit is last.
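The high-to-low bit order and the looping of the track can be sketched as below. The cursor type is hypothetical, and the code assumes a Bit Count greater than zero.

```c
#include <stdint.h>

/* Returns bit `pos` of a packed bitstream; bit 0 is the high bit of byte 0. */
static unsigned moof_bit_at(const uint8_t *bits, uint32_t pos)
{
    return (bits[pos >> 3] >> (7u - (pos & 7u))) & 1u;
}

/* Hypothetical read cursor over one track's bitstream. */
typedef struct {
    const uint8_t *bits;       /* start of the track's BITS data */
    uint32_t       bit_count;  /* Bit Count from the TRK entry, must be > 0 */
    uint32_t       pos;        /* next bit to deliver */
} moof_bit_cursor;

/* Delivers the next bit and wraps to bit 0 after the last valid bit. */
unsigned moof_next_bit(moof_bit_cursor *c)
{
    unsigned bit = moof_bit_at(c->bits, c->pos);
    if (++c->pos >= c->bit_count)
        c->pos = 0;
    return bit;
}
```

Note that the wrap happens at Bit Count, not at the end of the last byte or block, since the final byte may contain unused padding bits.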
With all of this talk about bitstreams, you may be wondering how to know the difference between GCR and MFM bytes. From the point of view of disk emulation, you don’t really need to know what encoding the track is using. Just pass the bits at the correct speed and it will be exactly what a real drive would be outputting. If you aren’t emulating a drive, but instead trying to extract disk bytes, then there is a bit of a longer conversation to have. The Track Index for MFM is always located at the start of the bits. For GCR encoding, you need to pass bytes through a Logic State Sequencer (IWM/SWIM) in order to synchronize bytes and be able to decode the sector structure. For MFM encoding, every byte value is represented by 16 bits of interleaved clock and data bits. But in order to synchronize your bytes properly, you will need to search for sync markers (like 0x4489).
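For the MFM case, a rough sketch of the two steps mentioned above (finding a sync marker, then dropping the clock bits) might look like this. It assumes the usual MFM cell order of clock bit first, data bit second, and it does not handle a marker that straddles the wrap point of the track; both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Scans a packed, high-bit-first bitstream for the 16-bit pattern 0x4489.
 * Returns the position of the first bit after the marker, or -1 if the
 * marker does not occur.
 */
long mfm_find_sync(const uint8_t *bits, uint32_t bit_count)
{
    uint16_t window = 0;
    for (uint32_t pos = 0; pos < bit_count; pos++) {
        unsigned bit = (bits[pos >> 3] >> (7u - (pos & 7u))) & 1u;
        window = (uint16_t)((window << 1) | bit);
        if (pos >= 15 && window == 0x4489u)
            return (long)pos + 1;
    }
    return -1;
}

/* Extracts the 8 data bits from 16 interleaved clock/data bits. */
uint8_t mfm_data_byte(uint16_t cells)
{
    uint8_t value = 0;
    for (int i = 7; i >= 0; i--)
        value = (uint8_t)((value << 1) | ((cells >> (2 * i)) & 1u));
    return value;
}
```

Under that assumption, the data bits of the 0x4489 marker decode to 0xA1; the marker differs from a normally encoded 0xA1 only in its clock bits, which is what makes it usable for synchronization.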
If you are creating a floppy drive emulator for use with a real Mac, then you will simply be stepping to the next bit in the bitstream at the rate specified by the Optimal Bit Timing in the INFO chunk. If the bit has a 1 value, then you send a 0.5µs pulse on the RDDATA line.
Track data may also be represented as flux timings instead of bits (see FLUX Chunk section for more info). This does not change the structure of the TRKS chunk at all, but for flux data the Bit Count is actually a byte count. The contained flux data is encoded the same way as A2R files. These flux streams are properly looped so there is no time warp when wrapping from the end of the buffer back to the beginning.
A quick explanation of the flux encoding: Each byte represents a single flux transition and its value is the number of ticks since the previous flux transition. A single tick is 125 nanoseconds. Therefore the normal 2 microsecond spacing between sequential GCR 1 bits is represented by approximately 16 ticks. This also puts 101 and 1001 bit sequences at approximately 32 and 48 ticks. You are probably thinking to yourself that when it comes to longer runs of no transitions, how is this unsigned byte going to handle representing the time? That is taken care of via the special value of 255. When you encounter a 255, you need to keep adding the values up until you reach a byte that has a non-255 value. You then add this value to all of your accumulated 255s to give you the tick count. For example 255, 255, 10 should be treated as 255 + 255 + 10 = 520 ticks.
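The accumulation rule for 255 values can be sketched as follows. The function name and the output convention are assumptions; the sketch also ignores a trailing run of 255s, which in a looped stream would continue at the start of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decodes flux bytes into intervals measured in ticks (1 tick = 125 ns).
 * Stores at most max_out intervals in out and returns the number of complete
 * intervals found in the input.
 */
size_t moof_decode_flux(const uint8_t *flux, size_t byte_count,
                        uint32_t *out, size_t max_out)
{
    size_t   n = 0;
    uint32_t ticks = 0;
    for (size_t i = 0; i < byte_count; i++) {
        ticks += flux[i];
        if (flux[i] != 255) {        /* a non-255 byte ends the interval */
            if (n < max_out)
                out[n] = ticks;
            n++;
            ticks = 0;
        }
    }
    return n;
}
```

For the input 16, 32, 255, 255, 10, 48 this produces the four intervals 16, 32, 520 and 48 ticks.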
FLUX Chunk (optional)
The ‘FLUX’ chunk contains a track map that is structured and functions identically to the ‘TMAP’ chunk. It allows you to map physical drive tracks with the track data contained within the ‘TRKS’ chunk. This ’FLUX’ map will only contain valid (non-0xFF) entries for tracks that use flux data. Additionally, any tracks that use flux data should be empty (0xFF) in the ‘TMAP’. If a MOOF file hasn’t been created properly and a single track has valid entries in both the ‘TMAP’ and ’FLUX’ maps, the ’FLUX’ one should be used.
Each map entry contains an index number for the track data contained within the ‘TRKS’ chunk. If the map entry is 0, then the correct track data to be using is the first entry in the ‘TRKS’ chunk.
To determine if a MOOF file is using flux data, the ’INFO’ chunk field INFO Version needs to be greater than or equal to 1. You should then check that the FLUX Block and Largest Flux Track values are BOTH non-zero. The FLUX Block value can be used as a shortcut to locate the ‘FLUX’ chunk if you are not walking through all of the chunks at load time (byte offset from start of file to ‘FLUX’ chunk is FLUX Block * 512).
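The flux check above reduces to a few comparisons on fixed INFO offsets. The sketch below uses a hypothetical function name and simply reports FLUX Block * 512 as stated; verify what sits at that offset in real files before relying on it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true when the file declares flux data, and stores the byte offset
 * FLUX Block * 512 in *flux_offset. The INFO fields are read from their
 * fixed file positions (bytes 20, 60 and 62).
 */
bool moof_has_flux(const uint8_t *file, size_t len, size_t *flux_offset)
{
    if (len < 80)
        return false;
    uint8_t  info_version       = file[20];
    uint16_t flux_block         = (uint16_t)(file[60] | (file[61] << 8));
    uint16_t largest_flux_track = (uint16_t)(file[62] | (file[63] << 8));

    /* BOTH values must be non-zero for a FLUX chunk to exist. */
    if (info_version < 1 || flux_block == 0 || largest_flux_track == 0)
        return false;
    *flux_offset = (size_t)flux_block * 512u;
    return true;
}
```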
META Chunk (optional)
The ‘META’ chunk contains metadata for the disk image and its existence is optional in the MOOF file. The metadata is stored as a tab-delimited UTF-8 list of keys and values. Columns are separated by a tab character (‘\t’ 0x09). All rows end with a linefeed character (‘\n’ 0x0A).
Byte | Offset | Type | Name | Usage |
---|---|---|---|---|
– | | uint32 | ‘META’ Chunk ID | 0x4154454D |
– | | uint32 | Chunk Size | Length of the metadata string in bytes. |
– | +0 | String | Metadata | Metadata string in UTF-8. No BOM. |
This is the list of standard metadata keys. Multiple values are pipe-separated.
Key | Purpose | Example Value |
---|---|---|
title | Name/Title of the product. | Prince of Persia |
subtitle | Subtitle of the product. | |
publisher | Publisher of the software. | Brøderbund Software, Inc. |
developer | Developer of the software. Pipe-delimited list if needed. | Jordan Mechner |
copyright | Copyright date. Free form text allowed. | 1989 1987 Muse Software |
version | Version number of the software. Free form text allowed. | 1.0 19870115P |
language | Language (see table A) | English |
requires | Any requirements for the software as listed on the disk or packaging. Free form text field. | Mac Plus |
colordepth | All of the bit depths that the software is compatible with in a pipe-delimited list. (Possible values: 1, 2, 4, 8, 16, 24) | 1|4|8 |
notes | Additional notes. | |
disk_name | Name of the disk side. If the disk is named on the label like Player, Town, Dungeon, etc then it goes here. | Program Install Disk |
disk_number | If the disk is part of a set, then this can be used to identify the disk number. | Disk 1 |
contributor | Name of the person who imaged the disk. | Mr. Pirate |
image_date | RFC3339 date of the imaging. | 2018-01-07T05:00:02.511Z |
If a standard key has no value, then the value will be an empty string. Key names are case-sensitive. Values cannot contain pipe, linefeed or tab characters. It would also be a good idea to keep all values as ASCII-friendly as possible to ensure compatibility with the widest range of devices that will consume these files. No duplicate keys are allowed and key order does not matter. Standard keys that have values laid out in the tables below cannot have values other than those shown below. Implementors are free to add additional keys to the metadata as long as they follow the same rules laid out here.
TABLE A - LANGUAGES
English | Spanish | French | German |
Chinese | Japanese | Italian | Dutch |
Portuguese | Danish | Finnish | Norwegian |
Swedish | Russian | Polish | Turkish |
Arabic | Thai | Czech | Hungarian |
Catalan | Croatian | Greek | Hebrew |
Romanian | Slovak | Ukrainian | Indonesian |
Malay | Vietnamese | Other |
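Given the META rules above (one key per row, tab between key and value, linefeed at the end of each row), a value lookup can be sketched like this. `moof_meta_get` is a hypothetical helper; it truncates values that do not fit the output buffer and leaves pipe-separated lists for the caller to split.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Looks up `key` in a META string of `len` bytes. On success, copies the
 * value (without the trailing linefeed) into out, NUL-terminates it and
 * returns true. Key comparison is case-sensitive.
 */
bool moof_meta_get(const char *meta, size_t len, const char *key,
                   char *out, size_t out_size)
{
    size_t key_len = strlen(key);
    size_t pos = 0;

    while (pos < len) {
        const char *row = meta + pos;
        const char *nl  = memchr(row, '\n', len - pos);
        size_t row_len  = nl ? (size_t)(nl - row) : len - pos;

        /* The key is everything before the first tab of the row. */
        const char *tab = memchr(row, '\t', row_len);
        if (tab && (size_t)(tab - row) == key_len &&
            memcmp(row, key, key_len) == 0) {
            size_t val_len = row_len - key_len - 1;
            if (out_size == 0)
                return false;
            if (val_len >= out_size)
                val_len = out_size - 1;      /* truncate to fit */
            memcpy(out, tab + 1, val_len);
            out[val_len] = '\0';
            return true;
        }
        pos += row_len + 1;                  /* move past the linefeed */
    }
    return false;
}
```

A key that is present with an empty value returns true with an empty string, while a missing key returns false, so callers can tell the two cases apart.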
Appendix A: CRC Routine
The integrity of MOOF files is protected by a standard 32-bit CRC. The routine that has been chosen for use originated with Gary S. Brown in 1986 and is implemented as follows:
    #include <stddef.h>
    #include <stdint.h>

    static uint32_t crc32_tab[] = {
        0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3,
        0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, 0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91,
        0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de, 0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7,
        0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, 0x14015c4f, 0x63066cd9, 0xfa0f3d63, 0x8d080df5,
        0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, 0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b,
        0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940, 0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59,
        0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, 0x21b4f4b5, 0x56b3c423, 0xcfba9599, 0xb8bda50f,
        0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, 0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d,
        0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a, 0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433,
        0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, 0x7f6a0dbb, 0x086d3d2d, 0x91646c97, 0xe6635c01,
        0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, 0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457,
        0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c, 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65,
        0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, 0x4adfa541, 0x3dd895d7, 0xa4d1c46d, 0xd3d6f4fb,
        0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, 0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9,
        0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086, 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f,
        0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, 0x59b33d17, 0x2eb40d81, 0xb7bd5c3b, 0xc0ba6cad,
        0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, 0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683,
        0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1,
        0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, 0xf762575d, 0x806567cb, 0x196c3671, 0x6e6b06e7,
        0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, 0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5,
        0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252, 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b,
        0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, 0xdf60efc3, 0xa867df55, 0x316e8eef, 0x4669be79,
        0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, 0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f,
        0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d,
        0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, 0x9c0906a9, 0xeb0e363f, 0x72076785, 0x05005713,
        0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, 0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21,
        0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777,
        0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, 0x8f659eff, 0xf862ae69, 0x616bffd3, 0x166ccf45,
        0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, 0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db,
        0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9,
        0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, 0xbad03605, 0xcdd70693, 0x54de5729, 0x23d967bf,
        0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94, 0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d
    };

    uint32_t crc32(uint32_t crc, const void *buf, size_t size)
    {
        const uint8_t *p;

        p = buf;
        crc = crc ^ ~0U;
        while (size--)
            crc = crc32_tab[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
        return crc ^ ~0U;
    }
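To show how the routine ties in with the file header, here is a hedged sketch of a whole-file integrity check. So that the example stands on its own, it uses a compact bit-by-bit CRC-32 (reflected polynomial 0xEDB88320) in place of the table; this should give the same results as the table-driven `crc32()` above, but the function names `crc32_bitwise` and `moof_crc_ok` are ours and not part of the specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-by-bit CRC-32; pass 0 as the initial crc value. */
uint32_t crc32_bitwise(uint32_t crc, const uint8_t *p, size_t size)
{
    crc = ~crc;
    while (size--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/*
 * Checks the CRC stored at bytes 8..11 against all data from byte 12 to the
 * end of the file. A stored value of 0 means no CRC was calculated, so the
 * file is accepted without checking.
 */
bool moof_crc_ok(const uint8_t *file, size_t len)
{
    if (len < 12)
        return false;
    uint32_t stored = (uint32_t)file[8] | ((uint32_t)file[9] << 8) |
                      ((uint32_t)file[10] << 16) | ((uint32_t)file[11] << 24);
    if (stored == 0)
        return true;
    return crc32_bitwise(0, file + 12, len - 12) == stored;
}
```

The bitwise form trades speed for size; on anything but the smallest targets the table-driven routine is the better choice for full disk images.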