A2R 3.x Disk Image Reference

Created by John K. Morris
jmorris@evolutioninteractive.com

Version 3.0 – September 16, 2021

The A2R format began life as the container format for raw flux images as recorded by the Applesauce hardware and software. The A2R format has evolved a few times since I originally started work on Applesauce, and is now more of a generic flux format that is compatible with disk imagers and tools other than Applesauce. It is a simple format with several advantages over other flux formats: it is extensible and content-independent, allows a variable number of captures per track, supports hard-sectored disks, and can hold complex metadata.

The 3.x specification was created in September of 2021 and is not backwards compatible with the 2.x specification, which is documented separately.

A2R File Format Specification

An A2R file uses a chunk-based binary file format that provides future-proof expandability in a way that is safe for older software, which may not recognize newer data chunks.

All data is stored little-endian.

A2R files begin with the following 8-byte header in order to identify the file type as well as detect any corruption that may have occurred.

Byte	Value	Purpose
0	41 32 52 33	The ASCII string ‘A2R3’ (0x33523241). The final character in this string is the major version number of the A2R specification to which this file conforms. Files conforming to the earlier A2R 2.x spec use ‘A2R2’.
4	FF	Make sure that high bits are valid (no 7-bit data transmission)
5	0A 0D 0A	LF CR LF – File translators will often try to convert these.
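
The header check described above can be sketched as a single comparison against the eight expected bytes. This is only an illustration: the function name a2r3_header_ok is hypothetical and not part of the specification, and it assumes the start of the file has already been read into memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns true when buf starts with the 8-byte A2R 3.x header.
 * (a2r3_header_ok is a hypothetical helper name, not defined by the spec.) */
bool a2r3_header_ok(const uint8_t *buf, size_t len)
{
    static const uint8_t magic[8] = {
        0x41, 0x32, 0x52, 0x33,  /* "A2R3" */
        0xFF,                    /* lost if the file passed through a 7-bit channel */
        0x0A, 0x0D, 0x0A         /* LF CR LF, altered by line-ending translation */
    };
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}
```

Comparing the whole header at once means a file damaged by a 7-bit transfer or by line-ending conversion is rejected the same way as a file of the wrong type.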

After the header comes a sequence of chunks which each contain information about the disk image. Using chunks allows for the A2R disk format to provide forward compatibility as chunks can be added to the specification and will just be safely ignored by applications that do not care (or know) about the information.

All chunks have the following structure:

Offset	Size	Name	Usage
+0	4 bytes	Chunk ID	4 ASCII characters that make up the ID of the chunk
+4	uint32	Chunk Size	The size of the chunk data in bytes.
+8	…	Chunk Data	The chunk data.

To process the file, you start at the first Chunk ID which will be located at byte 8 of the file, immediately following the header. You read the Chunk ID and the Chunk Size following it. If you want to process this chunk, then your file pointer will be at the start of the data. If you don’t care about this chunk, then skip the number of bytes as Chunk Size indicates and you will now be at the next Chunk ID.

while(data_stream.availableToRead() >= 8) {	// a chunk header is 8 bytes
	uint32_t chunk_id = data_stream.readU32();
	uint32_t chunk_size = data_stream.readU32();
	switch(chunk_id) {
	case INFO_CHUNK_ID:
		// read the INFO chunk (consumes chunk_size bytes)
		break;
	case RWCP_CHUNK_ID:
		// read the RWCP chunk (consumes chunk_size bytes)
		break;
	case SLVD_CHUNK_ID:
		// read the SLVD chunk (consumes chunk_size bytes)
		break;
	case META_CHUNK_ID:
		// read the META chunk (consumes chunk_size bytes)
		break;
	default:
		// no idea what this chunk is, so skip it
		data_stream.skip(chunk_size);
		break;
	}
}
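
For readers who hold the whole file in memory rather than a stream, the same walk can be sketched over a byte buffer. The names rd_u32le and a2r_find_chunk are hypothetical helpers invented for this example, and the bounds checks are an assumption about how a cautious reader might treat truncated files; the specification itself does not mandate them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian uint32 from p (hypothetical helper). */
static uint32_t rd_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Find the first chunk whose 4-character ID matches id.
 * Returns a pointer to the chunk data and stores its size in *size_out,
 * or returns NULL if the chunk is missing or the file is truncated.
 */
const uint8_t *a2r_find_chunk(const uint8_t *file, size_t len,
                              const char id[4], uint32_t *size_out)
{
    size_t pos = 8;                        /* skip the 8-byte file header */
    while (len >= 8 && pos <= len - 8) {
        const uint8_t *hdr = file + pos;
        uint32_t size = rd_u32le(hdr + 4);
        pos += 8;                          /* now at the chunk data */
        if (size > len - pos)
            return NULL;                   /* chunk runs past end of file */
        if (memcmp(hdr, id, 4) == 0) {
            *size_out = size;
            return file + pos;
        }
        pos += size;                       /* not the one we want: skip it */
    }
    return NULL;
}
```

Unknown chunk IDs are never inspected, only skipped by their Chunk Size, which is what keeps older readers working when new chunk types are added.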

INFO Chunk

The first chunk in an A2R file is always an ‘INFO’ chunk. This contains some fundamental information about the contained image. The data of the ‘INFO’ chunk begins at byte 16 of the file and is 37 bytes long.

Offset	Type	Vers	Name	Usage
	uint32		‘INFO’ Chunk ID	0x4F464E49
	uint32		Chunk Size	Version 1 is 37 bytes long.
+0	uint8	1	INFO Version	Version number of the INFO chunk. Current version is 1.
+1	UTF-8 32 bytes	1	Creator	Name of software that created the A2R file. String in UTF-8. No BOM. Padded to 32 bytes using space character (0x20). ex: “Applesauce v1.0 ”
+33	uint8	1	Drive Type	The drive type used to generate the image. 1 = 5.25″ SS 40trk 0.25 step; 2 = 3.5″ DS 80trk Apple CLV; 3 = 5.25″ DS 80trk; 4 = 5.25″ DS 40trk; 5 = 3.5″ DS 80trk; 6 = 8″ DS; 7 = 3″ DS 80trk; 8 = 3″ DS 40trk
+34	uint8	1	Write Protected	1 = Floppy is write protected
+35	uint8	1	Synchronized	1 = Cross track sync/index was used during imaging
+36	uint8	1	Hard Sector Count	0 = Soft sectored; 1+ = Number of hard sectors on disk

The chunk is versioned to allow for adding additional info in the future. The “Vers” column in the table above indicates at which version the data field became available. When reading data from the chunk, make sure that the value you are looking for actually exists within the version of the chunk you are reading.
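
As a rough sketch of reading the version 1 fields from the table above, the following unpacks the 37 bytes of INFO data into a struct. The struct a2r_info and the function a2r_parse_info are hypothetical names chosen for this example, and trimming the space padding from Creator is a convenience rather than something the specification requires.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of a version 1 INFO chunk. */
typedef struct {
    uint8_t version;
    char    creator[33];        /* 32 bytes plus NUL, space padding trimmed */
    uint8_t drive_type;
    bool    write_protected;
    bool    synchronized;
    uint8_t hard_sector_count;  /* 0 = soft sectored */
} a2r_info;

/* data points at the INFO chunk data (byte 16 of the file), size is Chunk Size. */
bool a2r_parse_info(const uint8_t *data, uint32_t size, a2r_info *out)
{
    if (size < 37 || data[0] < 1)
        return false;           /* too short to hold the version 1 fields */
    out->version = data[0];
    memcpy(out->creator, data + 1, 32);
    int end = 32;
    while (end > 0 && out->creator[end - 1] == ' ')
        end--;                  /* drop trailing 0x20 padding */
    out->creator[end] = '\0';
    out->drive_type        = data[33];
    out->write_protected   = data[34] == 1;
    out->synchronized      = data[35] == 1;
    out->hard_sector_count = data[36];
    return true;
}
```

A reader that later supports fields from a newer INFO version would check out->version before touching any offset beyond +36.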

Having the correct value in the Drive Type field is essential as it is how software that consumes these files knows how to unpack the tracks to the proper locations on the media surface as well as provides hints as to potential disk formats. If an 80-track drive is being used to image a double-sided 40-track disk, but the user is only reading every other cylinder, the Drive Type should still be 3 and the Location fields in the capture should be using the actual cylinder number that the head is physically on. So, when using the Location formula ((cylinder << 1) + side) the sequence of capture locations would be 0, 1, 4, 5, 8, 9, etc (cyl 0 side 0, cyl 0 side 1, cyl 2 side 0, cyl 2 side 1, cyl 4 side 0, cyl 4 side 1, etc). While the Drive Type specifies double-sided (DS), if you have a single-sided drive of this type, you should still use the double-sided Drive Type, but only use side 0 when calculating the Location.
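
The Location arithmetic in the paragraph above can be illustrated with a few small helpers. All function names here are hypothetical; the only thing taken from the specification is the formula ((cylinder << 1) + side) and the every-other-cylinder scenario.

```c
#include <stddef.h>
#include <stdint.h>

/* Location for every Drive Type other than 1. */
uint16_t a2r_location(unsigned cylinder, unsigned side)
{
    return (uint16_t)((cylinder << 1) + side);
}

/* Inverse of the formula above. */
unsigned a2r_location_cylinder(uint16_t location) { return location >> 1; }
unsigned a2r_location_side(uint16_t location)     { return location & 1u; }

/*
 * Locations for a double-sided 40-track disk imaged in an 80-track drive,
 * where the head visits physical cylinders 0, 2, 4, ...
 * out[] must hold 2 * cylinders_read entries; returns the count written.
 */
size_t a2r_locations_every_other(unsigned cylinders_read, uint16_t *out)
{
    size_t n = 0;
    for (unsigned i = 0; i < cylinders_read; i++) {
        unsigned cylinder = i * 2;          /* physical cylinder under the head */
        out[n++] = a2r_location(cylinder, 0);
        out[n++] = a2r_location(cylinder, 1);
    }
    return n;
}
```

Calling a2r_locations_every_other(3, out) fills out with 0, 1, 4, 5, 8, 9, matching the sequence given in the text.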

RWCP Chunk – Raw Captures

An ‘RWCP’ chunk contains raw data streams as captured during the imaging process. The types of captures are based on how Applesauce performs its imaging and are directly linked to its requirement to handle the insanely complex physical formats used for copy protection on the Apple II. Applesauce uses a 2-pass imaging process: a rapid first pass to determine where on the media surface track data exists, then a second pass that captures longer durations for processing and error correction.

A timing capture is the quick flux timing capture. It starts capturing data when it sees the sync/index sensor trigger and grabs flux timing data for 1.25 disk revolutions. I picked 1.25 revolutions as it tends to be enough overlap in order to successfully loop a normally structured track, but still leaves enough time to transmit data, step the head, and be ready in time for the next index signal.
An xtiming (extended timing) capture is basically the same as a timing capture except that it covers 2.25 or more revolutions. While imaging, Applesauce prefers multiple 2.25-revolution captures as opposed to a single longer capture, as I have found that this is a good balance of performance with reliable analysis. If the entire track checksums correctly and looks to be stable, then it can skip over further reads of the track. Normally structured tracks that read cleanly will tend to have at least 3 full revolutions whereas a problematic one can have up to 7 revolutions.
A bits capture (legacy and not recommended for new images) was originally used to perform the same function as xtiming, but has been deprecated in Applesauce v1.0.3. Instead of capturing flux timing, it captures a bit stream that is 16384 bytes long.

It is perfectly legal for an A2R file to have more than one RWCP chunk in it, although there is no real advantage to doing so unless you wanted to have RWCP chunks with differing Resolution fields. It is also possible for an A2R file to have no RWCP chunks if it instead contains SLVD chunks.

In an RWCP chunk, all of the captures are stored end-to-end as packed data. There is no padding between captures. The RWCP chunk begins like this:

Offset	Type	Vers	Name	Usage
	uint32		‘RWCP’ Chunk ID	0x50435752
	uint32		Chunk Size	Length of the chunk data in bytes.
+0	uint8	1	RWCP Version	Version number of the RWCP chunk. Current version is 1.
+1	uint32	1	Resolution	Number of picoseconds per tick for flux and index timings in this chunk. Applesauce defaults to 62,500 picoseconds (62.5 nanoseconds), but other tools using this format may use a different value.
…	…			Reserved for future use (11 bytes zeroed out)
+16	…		Stream Capture Entries

Each Stream Capture Entry has a header with the following format:

Offset	Type	Name	Usage
+0	uint8	Mark	“C” 0x43 = Capture; “X” 0x58 = End of captures
+1	uint8	Capture Type	1 = timing, 2 = bits, 3 = xtiming
+2	uint16	Location	Track where this capture happened. For Drive Type 1 (SS 5.25 @ 0.25 step) disks, this value is in halfphases or quarter tracks. For example track 0.00 is halfphase 0 and track 1.00 is halfphase 4. For all other Drive Types, this value indicates track number as well as side. The formula ((track << 1) + side) can be used (0 = Track 0 Side 0, 1 = Track 0 Side 1, 2 = Track 1 Side 0). Single sided drives should still use this formula, but only use a side value of 0.
+4	uint8	Number of Index Signals	The quantity of index signals in the following array.
+5	Array of uint32	Array of Index Signals	Each entry in the array is an absolute timing (in ticks) from the start of the capture to when the index signal was detected. Hard sectored disks will have multiple signals per disk rotation. Soft sectored disks are not required to have more than one index signal, even for captures that were multiple rotations.
…	uint32	Capture Data Size	Number of bytes of capture data.
…	…	Capture Data

Since Applesauce does multiple captures of each track, it is common for multiple sets of capture data which use the same Location value to exist. When processing each capture, you can verify that the Mark field is 0x43 as a sanity check. Following the final capture entry, the RWCP chunk is ended with a single 0x58 byte that corresponds to the Mark field of the header (it is just a single byte, not a full header) and this is the signal that you are done loading captures for this chunk.
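
Stepping through the capture entries of an RWCP chunk might look like the sketch below. The struct a2r_capture, the function a2r_next_capture and the two little-endian readers are hypothetical names for this example. The caller is assumed to start with *pos = 16, the offset of the first Stream Capture Entry within the chunk data, and the truncation checks are a defensive assumption rather than part of the format.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t rd_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical view of one Stream Capture Entry; pointers refer into the chunk. */
typedef struct {
    uint8_t        type;           /* 1 = timing, 2 = bits, 3 = xtiming */
    uint16_t       location;
    uint8_t        index_count;
    const uint8_t *index_signals;  /* index_count little-endian uint32 values */
    uint32_t       data_size;
    const uint8_t *data;
} a2r_capture;

/*
 * Parse the entry at *pos within the RWCP chunk data.
 * Returns 1 and advances *pos on success, 0 at the end mark 0x58,
 * and -1 if the mark is unexpected or the data is truncated.
 */
int a2r_next_capture(const uint8_t *chunk, size_t size, size_t *pos,
                     a2r_capture *out)
{
    size_t p = *pos;
    if (p >= size)
        return -1;
    if (chunk[p] == 0x58)
        return 0;                            /* "X": no more captures */
    if (chunk[p] != 0x43 || size - p < 5)
        return -1;                           /* sanity check on the Mark */
    out->type        = chunk[p + 1];
    out->location    = rd_u16le(chunk + p + 2);
    out->index_count = chunk[p + 4];
    p += 5;
    size_t index_bytes = (size_t)out->index_count * 4;
    if (size - p < index_bytes + 4)
        return -1;
    out->index_signals = chunk + p;
    p += index_bytes;
    out->data_size = rd_u32le(chunk + p);
    p += 4;
    if (size - p < out->data_size)
        return -1;
    out->data = chunk + p;
    *pos = p + out->data_size;               /* next entry or the end mark */
    return 1;
}
```

A typical caller loops while the function returns 1 and groups the results by Location, since several captures of the same track are expected.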

When capturing index signals, if you trigger the start of a capture based on an index signal then do not include this signal in the Array of Index Signals. For example, if you have a soft sectored disk and are starting the capture at the index signal, the first entry into the array will be after the first rotation has completed. For a timing capture that has only 1.25 rotations, there will only be a single entry in the array with a value around 200ms.

In order to determine the duration of a single track rotation, you can use the Index Signal array along with the Hard Sector Count from the INFO chunk. Simply use the Hard Sector Count as an index into the Index Signals array.

uint32 loop_time = capture.indexSignals[info.hardSectorCount];
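
To turn that tick count into a real duration, multiply by the Resolution field of the chunk. The sketch below assumes the capture was started on an index signal as described above; a2r_loop_ticks and a2r_ticks_to_ns are hypothetical helper names, and returning 0 when the array is too short is an arbitrary choice made for this example.

```c
#include <stdint.h>

/* Convert ticks to nanoseconds; resolution_ps is picoseconds per tick. */
uint64_t a2r_ticks_to_ns(uint32_t ticks, uint32_t resolution_ps)
{
    return (uint64_t)ticks * resolution_ps / 1000u;
}

/*
 * Ticks for one full rotation: the Index Signals entry selected by the
 * Hard Sector Count. Returns 0 when the array has no such entry.
 */
uint32_t a2r_loop_ticks(const uint32_t *index_signals, uint8_t index_count,
                        uint8_t hard_sector_count)
{
    return hard_sector_count < index_count
               ? index_signals[hard_sector_count]
               : 0;
}
```

With the Applesauce default of 62,500 picoseconds per tick, an index entry of 3,200,000 ticks on a soft sectored disk works out to 200,000,000 ns, which is the 200 ms rotation mentioned earlier.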

For the timing and xtiming Capture Types, the Data is stored as a stream of unsigned bytes. Each byte represents a single flux transition and its value is the number of ticks since the previous flux transition. The duration of a single tick is defined by the Resolution field of the RWCP chunk. Therefore with the Applesauce default resolution of 62.5 nanoseconds, the normal double density 4 microsecond spacing between sequential 1 bits is represented by 64 ticks. You are probably thinking to yourself that when it comes to longer runs of no transitions, how is this unsigned byte going to handle representing the time? That is taken care of via the special value of 255. When you encounter a 255, you need to keep adding the values up until you reach a byte that has a non-255 value. You then add this value to all of your accumulated 255s to give you the tick count. For example 255, 255, 10 should be treated as 255 + 255 + 10 = 520 ticks.

Some example code to encode flux timings as capture data:

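// Assumes data points at an output buffer large enough for the encoded stream.
for(uint32 i = 0; i < flux_count; i++) {
    uint32 time = flux_data[i];      // ticks since the previous transition
    while (time >= 255) {            // emit a 255 for each full 255 ticks
        *data++ = 255;
        time -= 255;
    }
    *data++ = time;                  // final byte is always 0..254
}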
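
Going the other way, a decoder accumulates 255 bytes until it reaches a byte below 255. This is a minimal sketch with a hypothetical function name, a2r_decode_flux. It silently drops a trailing run of 255s that has no terminating byte, which is an assumption about how to treat a truncated stream.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode packed flux bytes into tick intervals.
 * Writes at most max_out intervals to out and returns how many were decoded.
 */
size_t a2r_decode_flux(const uint8_t *data, size_t size,
                       uint32_t *out, size_t max_out)
{
    size_t n = 0;
    uint32_t ticks = 0;
    for (size_t i = 0; i < size && n < max_out; i++) {
        ticks += data[i];
        if (data[i] != 255) {      /* a non-255 byte completes the interval */
            out[n++] = ticks;
            ticks = 0;
        }
    }
    return n;
}
```

Note that an interval of exactly 255 ticks is stored as the two bytes 255, 0 by the encoder above, and this decoder turns that pair back into 255.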

SLVD Chunk – Solved Flux Streams (optional)

An ‘SLVD’ chunk contains flux streams that have been solved, which means that there are no extra flux transitions and that the data is a perfect loop (single rotation) of the track contents. This track data is also appropriate for use with emulators. This chunk is completely optional and can also exist in A2R files that contain RWCP chunks. The format of the flux data is the same packed data format that is used for timing and xtiming in RWCP chunks (see that section for additional details about the format).

It is legal for an A2R file to have any number of SLVD chunks in it, including none. But, an A2R file should only have one Track Entry per track. If there is no Track Entry for a track then the track is assumed to be empty/unformatted.

In an SLVD chunk, all of the tracks are stored end-to-end as packed data. There is no padding between tracks. The SLVD chunk begins like this:

Offset	Type	Vers	Name	Usage
	uint32		‘SLVD’ Chunk ID	0x44564C53
	uint32		Chunk Size	Length of the chunk data in bytes.
+0	uint8	1	SLVD Version	Version number of the SLVD chunk. Current version is 2.
+1	uint32	1	Resolution	Number of picoseconds per tick for flux and index timings in this chunk. Applesauce defaults to 62,500 picoseconds (62.5 nanoseconds), but other tools using this format may use a different value.
…	…			Reserved for future use (11 bytes zeroed out)
+16	…		Track Entries

Each Track Entry has a header with the following format:

Offset	Type	Name	Usage
+0	uint8	Mark	“T” 0x54 = Track; “X” 0x58 = End of tracks
+1	uint16	Location	Track where this data is located. For Drive Type 1 (SS 5.25 @ 0.25 step) disks, this value is in halfphases or quarter tracks. For example track 0.00 is halfphase 0 and track 1.00 is halfphase 4. For all other Drive Types, this value indicates track number as well as side. The formula ((track << 1) + side) can be used (0 = Track 0 Side 0, 1 = Track 0 Side 1, 2 = Track 1 Side 0). Single sided drives should still use this formula, but only use a side value of 0.
+3	uint8	Mirror Distance Outward	If identical flux data extends to neighboring Locations, then this value is used to indicate how far it should reach in the outward direction (lower Location numbers). This is typically only used for Drive Type 1 or with special fat track copy protections. Value should be 0 if it isn’t mirrored to neighboring Locations.
+4	uint8	Mirror Distance Inward	If identical flux data extends to neighboring Locations, then this value is used to indicate how far it should reach in the inward direction (higher Location numbers). This is typically only used for Drive Type 1 or with special fat track copy protections. Value should be 0 if it isn’t mirrored to neighboring Locations.
…	…		Reserved for future use (6 bytes zeroed out)
+11	uint8	Number of Index Signals	The quantity of index signals in the following array. It is possible for this value to be zero for soft sectored disks as many platforms don’t use an index sensor at all.
+12	Array of uint32	Array of Index Signals	Each entry in the array is an absolute timing (in ticks) from the start of the track to when an index signal should be triggered. Hard sectored disks will have multiple signals per disk rotation. If the Number of Index Signals is 0, then this array doesn’t exist at all and the Flux Data begins here.
…	uint32	Flux Data Size	Number of bytes of flux data.
…	…	Flux Data

When processing each track, you can verify that the Mark field is 0x54 as a sanity check. Following the final track entry, the SLVD chunk is ended with a single 0x58 byte that corresponds to the Mark field of the header (it is just a single byte, not a full header) and this is the signal that you are done loading tracks for this chunk.
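
A Track Entry can be read in much the same way as a capture entry, using the offsets from the table above. In this sketch a2r_track, a2r_next_track and a2r_track_covers are hypothetical names; the caller is assumed to start with *pos at the first Track Entry (offset 16 of the chunk data). The mirror check reflects one reading of the Mirror Distance fields, namely that they are counted in Location units.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static uint16_t rd_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical view of one Track Entry; pointers refer into the chunk. */
typedef struct {
    uint16_t       location;
    uint8_t        mirror_out, mirror_in;
    uint8_t        index_count;
    const uint8_t *index_signals;   /* index_count little-endian uint32 values */
    uint32_t       flux_size;
    const uint8_t *flux;
} a2r_track;

/* Returns 1 on success, 0 at the end mark 0x58, -1 on malformed data. */
int a2r_next_track(const uint8_t *chunk, size_t size, size_t *pos,
                   a2r_track *out)
{
    size_t p = *pos;
    if (p >= size)
        return -1;
    if (chunk[p] == 0x58)
        return 0;                           /* "X": no more tracks */
    if (chunk[p] != 0x54 || size - p < 12)
        return -1;
    out->location    = rd_u16le(chunk + p + 1);
    out->mirror_out  = chunk[p + 3];
    out->mirror_in   = chunk[p + 4];
    out->index_count = chunk[p + 11];       /* +5..+10 are reserved */
    p += 12;
    size_t index_bytes = (size_t)out->index_count * 4;
    if (size - p < index_bytes + 4)
        return -1;
    out->index_signals = chunk + p;         /* empty when index_count is 0 */
    p += index_bytes;
    out->flux_size = rd_u32le(chunk + p);
    p += 4;
    if (size - p < out->flux_size)
        return -1;
    out->flux = chunk + p;
    *pos = p + out->flux_size;
    return 1;
}

/* True when the track's flux applies to loc, directly or through mirroring. */
bool a2r_track_covers(const a2r_track *t, uint16_t loc)
{
    return loc + t->mirror_out >= t->location &&
           loc <= t->location + t->mirror_in;
}
```

An emulator stepping to a Location with no Track Entry of its own could use a check like a2r_track_covers before treating that Location as unformatted.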

META Chunk (optional)

The ‘META’ chunk contains metadata for the disk image. The metadata is stored as a tab-delimited UTF-8 list of keys and values. Columns are separated by a tab character (‘\t’ 0x09). All rows end with a linefeed character (‘\n’ 0x0A).

Offset	Type	Name	Usage
	uint32	‘META’ Chunk ID	0x4154454D
	uint32	Chunk Size	Length of the metadata string in bytes.
+0	String	Metadata	Metadata string in UTF-8. No BOM.

This is the list of standard metadata keys. If the metadata entry supports multiple values, then they should be pipe-separated.

Key	Purpose	Example Value
title	Name/Title of the product.	Prince of Persia
subtitle	Subtitle of the product.
publisher	Publisher of the software.	Brøderbund Software, Inc.
developer	Developer of the software. Pipe-delimited list if needed.	Jordan Mechner
copyright	Copyright date. Free form text allowed. Include the copyright holder if different than the publisher.	1989 or 1987 Muse Software
version	Version number of the software. Free form text allowed.	1.0 or 19870115P
language	Language (see table A)	English
requires_platform	Which platform does this run on?	apple2, mac, pc, cbm, atari, or amiga
requires_machine	Which models of the platform is this compatible with? Pipe-delimited list.	2+|2e|2c|2gs
requires_ram	RAM requirements	64K or 1M
notes	Additional notes.
side	Physical disk side formatted as: “Disk #, Side [A|B]” or “Disk #”	Disk 1, Side A
side_name	Name of the disk or disk side. If the disk side is named on the label like Player, Town, Dungeon, etc then it goes here.	Front
contributor	Name of the person who imaged the disk.	Mr. Pirate
image_date	ISO8601 date of the imaging.	2018-01-07T05:00:02.511Z

If a standard key has no value, then the value will be an empty string. Key names are case-sensitive. Values cannot contain pipe, linefeed or tab characters. It is also a good idea to keep all values as ASCII-friendly as possible to ensure compatibility with the widest range of devices that will consume these files. No duplicate keys are allowed and key order does not matter. Standard keys that have values laid out in the tables below cannot have values other than those shown below. Implementors are free to add additional keys to the metadata as long as they follow the same rules laid out here.
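
Looking up a key in the metadata string can be sketched as a scan over linefeed-terminated rows. The function name a2r_meta_get is hypothetical. The metadata is treated as a byte range of Chunk Size bytes rather than a NUL-terminated string, and the match is exact and case-sensitive, in line with the rules above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find key in the META chunk data (size bytes, not NUL-terminated).
 * On success stores the start and length of the value and returns 1;
 * returns 0 when the key is absent.
 */
int a2r_meta_get(const char *meta, size_t size, const char *key,
                 const char **value, size_t *value_len)
{
    size_t key_len = strlen(key);
    size_t pos = 0;
    while (pos < size) {
        const char *row = meta + pos;
        const char *nl = memchr(row, '\n', size - pos);
        size_t row_len = nl ? (size_t)(nl - row) : size - pos;
        /* a matching row is: key, one tab, then the (possibly empty) value */
        if (row_len > key_len && row[key_len] == '\t' &&
            memcmp(row, key, key_len) == 0) {
            *value = row + key_len + 1;
            *value_len = row_len - key_len - 1;
            return 1;
        }
        pos += row_len + 1;          /* step over the linefeed */
    }
    return 0;
}
```

For keys that allow several values, such as requires_machine, the returned range would then be split on the pipe character.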

TABLE A – LANGUAGES

English	Spanish	French	German
Chinese	Japanese	Italian	Dutch
Portuguese	Danish	Finnish	Norwegian
Swedish	Russian	Polish	Turkish
Arabic	Thai	Czech	Hungarian
Catalan	Croatian	Greek	Hebrew
Romanian	Slovak	Ukrainian	Indonesian
Malay	Vietnamese	Other