MMCPM
Author: Michael Minn (see michaelminn.com for contact info)
October 10, 2006
Describes usage of MMCPM, a utility for listing and extracting files from CP/M 2.2 disk images
1. Introduction
MMCPM is a simple POSIX utility for listing and extracting files from CP/M 2.2 disk images. MMCPM was written on Linux, but should be compilable on any machine that can provide a POSIX environment: any Unix, Mac OS X, or Windoze with Cygwin. This document describes the operation of MMCPM and provides a specific example of usage with disks for the Intertec SuperBrain, a Z-80 based machine from the early 1980s. Additional information is given on how you can read old files and run CP/M software in an emulator.
CP/M (Control Program for Microcomputers) was an 8-bit operating system initially written for Intel 8080 and Zilog Z-80 microcomputers by Gary Kildall of Digital Research, Inc. in 1974. Digital released it as a commercial product in 1976 and later adapted it to 16-bit processors. CP/M running on S-100 bus computers (patterned on the MITS Altair) was an early standard for microcomputing. This standardization made it easier to port applications to different machines and encouraged wider adoption of microcomputers. Many of the ideas for M$-DOS were stolen from CP/M and the 1981 introduction of the IBM-PC running DOS led to the obsolescence of CP/M by the mid 1980s. (reference)
CP/M still has devotees (although I doubt anyone uses it as their regular OS) and there are a number of excellent sites with CP/M information and files, including:
MMCPM is built specifically for CP/M 2.2, not CP/M 3.1 or the later 16-bit variants (CP/M-86 and DOSPlus). For those disks, you may want to explore the more robust (and harder to use) utility package cpmtools. I couldn't get it to work with my strangely formatted SuperBrain disks, but it might be of value to you if you have any problems with MMCPM.
This document assumes some rudimentary knowledge about CP/M and Unix...but you probably wouldn't be reading this if you didn't already have that.
2. Getting Data From CP/M Disks
MIGRATE NOW - Hardware and expertise for reading CP/M disks are becoming increasingly rare and the physical media can degrade with time. If you have even the slightest inkling that you may want to recover old personal data from old disks, you would be advised to do it immediately. Once it's gone, it's gone forever.
Your major challenge will be just getting the data off the disks. Although CP/M was a standard for filesystem structure and executable code, the actual physical storage of data on disks was different from machine to machine and the diversity of formats and disk controllers in the nascent days of microcomputing was quite stunning. This can make recovery of data from old CP/M disks a challenge.
CP/M disks were not IBM-compatible, although many CP/M disks can be read with PC disk drives given the right software. Some disks, such as the SuperBrain disks described on this page, required a special controller and cannot be read on an IBM-PC.
The easiest way to recover the data is to send it off to a service to have the files copied onto a CD that can then be read with a contemporary machine. Because of the unique hardware needed to read SuperBrain disks, I sent my 23 old SuperBrain floppies from my college days in the mid 80s to Sydex. After examining the disks and getting me a quote (around $200), they turned the job around in a couple of weeks. They were able to provide me with a CD containing the discrete files as well as complete "image" files of the disks. An image file is a complete sequential copy of the contents of a disk and it is these image files that are used with MMCPM.
If you have more time than money, you might try doing the extraction yourself. As of this writing, 5.25" disk drives are still available relatively cheaply from old equipment vendors and on eBay. A 5.25" drive requires a "Universal" floppy cable to connect the old edge-card connector on the drive to the newer (but rapidly disappearing) 34-pin IDC floppy connector on contemporary motherboards. The BIOS should be able to handle the drive, although you may have to change a BIOS setting at boot time.
On Linux you might be able to use FDUtils and the "dd" command to extract an image. The FDUtils site includes a HowTo for examining disks with an unknown format and extracting their data.
Under DOS, you can use the 22DISK utility. 22DISK was written by the folks at Sydex, and though they no longer distribute it, you can google around to find a copy out on the web. You can also google around to find a disk image that you can use to create a boot disk for DOS.
3. Reading the CP/M File System
This section describes the CP/M file system and the use of MMCPM to extract individual files from disk image files. If you sent your disks off to a service for extraction, you can skip this section.
Although CP/M disks often have various kinds of skewing and interleaving, they all have the same basic logical format:
- System track(s) that contain the operating system for booting
- A directory
- Data blocks
As with DOS disks, the system tracks always start at sector 1, track 1, side 1. Even if the disk does not contain an operating system, this space is reserved, and its size varies from manufacturer to manufacturer. On the SuperBrain this is 10K (two tracks of ten 512-byte sectors, or 10,240 bytes), which is why the directory in the example below starts at 0x2800. In an age of operating systems that are sized in gigabytes, it is remarkable that an entire OS could fit in that small a space.
The directory is a set of 32-byte records, one for each file. The number of records is fixed and varies from manufacturer to manufacturer. On the SuperBrain there was space for a maximum of 64 directory entries or 2,048 bytes. Each CP/M 2.2 directory entry has the following layout:
- User number - 1 byte: In lieu of a hierarchical directory structure, CP/M 2.2 had 16 "users" and the owner of a file was specified with a value from 0 to 15. Unused directory entries have a value of 0xe5. Later versions of CP/M used this byte to indicate entries used for disk names, timestamps and passwords.
- File name - 8 bytes upper-case ASCII, padded with spaces
- File type - 3 bytes upper-case ASCII, padded with spaces
- File size (upper byte) - 1 byte 0-31: in 16,384-byte chunks
- Unused - 1 byte
- Extent counter - 1 byte: Files exceeding 32K require multiple directory entries. This field is used to keep track of the entry number.
- File size (low byte) - 1 byte: File size in 128-byte chunks (1 = 128, 10 = 1,280 bytes). While this means file sizes have to be a multiple of 128 bytes, a control-z can be used to indicate end of file before the end of a chunk. Total file size = (upper_byte * 16384) + (lower_byte * 128).
- File block list - 16 bytes: described below
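To make the layout above concrete, here is a minimal C sketch of one directory entry. The struct and function names (cpm_dirent, cpm_file_size, cpm_file_name) are illustrative assumptions and are not necessarily what mmcpm.c uses; the sample bytes are the TURBO.MSG entry from the Turbo Pascal disk shown later in this section.

```c
#include <stdint.h>
#include <string.h>

/* One 32-byte CP/M 2.2 directory entry. Field names are illustrative. */
typedef struct {
    uint8_t user;        /* 0-15, or 0xe5 for an unused entry */
    char    name[8];     /* upper-case ASCII, padded with spaces */
    char    type[3];     /* upper-case ASCII, padded with spaces */
    uint8_t size_high;   /* file size in 16,384-byte chunks */
    uint8_t unused;
    uint8_t extent;      /* extent counter */
    uint8_t size_low;    /* file size in 128-byte chunks */
    uint8_t blocks[16];  /* file block list, 0 = no block */
} cpm_dirent;

/* The TURBO.MSG entry (image offset 0x2800) from the example dump. */
static const uint8_t sample_entry[32] = {
    0x00, 'T', 'U', 'R', 'B', 'O', ' ', ' ', ' ', 'M', 'S', 'G',
    0x00, 0x00, 0x00, 0x0c,
    0x01, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
};

/* Total size as described above: (upper_byte * 16384) + (lower_byte * 128). */
static long cpm_file_size(const cpm_dirent *e)
{
    return (long)e->size_high * 16384L + (long)e->size_low * 128L;
}

/* Build "NAME.TYP" without the space padding; out needs at least 13 bytes. */
static void cpm_file_name(const cpm_dirent *e, char *out)
{
    int n = 0;
    for (int i = 0; i < 8 && e->name[i] != ' '; i++)
        out[n++] = e->name[i];
    if (e->type[0] != ' ')
        out[n++] = '.';
    for (int i = 0; i < 3 && e->type[i] != ' '; i++)
        out[n++] = e->type[i];
    out[n] = '\0';
}
```

Because every member is a single byte or a byte array, the compiler adds no padding, so the 32 raw bytes can be copied straight into the struct with memcpy(&entry, sample_entry, sizeof entry) before calling the two helpers.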
Files are stored as sequences of blocks after the directory. The size of these blocks varied with manufacturer - the SuperBrain block is 2,048 bytes. These blocks do not have to be contiguous on the disk, although they usually are. The file block list in each directory entry provides pointers to up to 16 blocks per file. Block numbers are counted in block-size units from the start of the directory, so on the SuperBrain (where the 2,048-byte directory fills exactly one block) block 1 is the first block after the directory. A block number of zero means no block.
Example entries (the famous Turbo Pascal disk):
002800  00 54 55 52 42 4f 20 20 20 4d 53 47 00 00 00 0c  .TURBO   MSG....
002810  01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
002820  00 54 49 4e 53 54 20 20 20 4d 53 47 00 00 00 1e  .TINST   MSG....
002830  02 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
002840  00 54 55 52 42 4f 4d 53 47 4f 56 52 00 00 00 0b  .TURBOMSGOVR....
002850  04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
002860  00 54 49 4e 53 54 4d 53 47 4f 56 52 00 00 00 1a  .TINSTMSGOVR....
002870  05 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
002880  00 54 49 4e 53 54 20 20 20 43 4f 4d 01 00 00 43  .TINST   COM...C
002890  07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 00 00 00  ................
0028a0  00 54 55 52 42 4f 20 20 20 4f 56 52 00 00 00 08  .TURBO   OVR....
0028b0  14 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0028c0  00 52 45 41 44 20 20 20 20 4d 45 20 00 00 00 2a  .READ    ME ...*
0028d0  15 16 17 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0028e0  00 54 4c 49 53 54 20 20 20 43 4f 4d 00 00 00 77  .TLIST   COM...w
0028f0  18 19 1a 1b 1c 1d 1e 1f 00 00 00 00 00 00 00 00  ................
002900  00 54 55 52 42 4f 20 20 20 43 4f 4d 01 00 00 6e  .TURBO   COM...n
002910  20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 00   !"#$%&'()*+,-..
002920  00 4d 43 20 20 20 20 20 20 50 41 53 00 00 00 31  .MC      PAS...1
002930  2f 30 31 32 00 00 00 00 00 00 00 00 00 00 00 00  /012............
002940  00 4d 43 2d 4d 4f 44 30 31 49 4e 43 00 00 00 0d  .MC-MOD01INC....
002950  33 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  3...............
002960  00 4d 43 2d 4d 4f 44 30 33 49 4e 43 00 00 00 23  .MC-MOD03INC...#
002970  34 35 36 00 00 00 00 00 00 00 00 00 00 00 00 00  456.............
002980  00 4d 43 44 45 4d 4f 20 20 4d 43 53 00 00 00 5c  .MCDEMO  MCS...\
002990  37 38 39 3a 3b 3c 00 00 00 00 00 00 00 00 00 00  789:;<..........
0029a0  00 4d 43 20 20 20 20 20 20 48 4c 50 00 00 00 24  .MC      HLP...$
0029b0  3d 3e 3f 00 00 00 00 00 00 00 00 00 00 00 00 00  =>?.............
0029c0  00 4d 43 2d 4d 4f 44 30 35 49 4e 43 00 00 00 57  .MC-MOD05INC...W
0029d0  40 41 42 43 44 45 00 00 00 00 00 00 00 00 00 00  @ABCDE..........
0029e0  00 4d 43 2d 4d 4f 44 30 32 49 4e 43 00 00 00 19  .MC-MOD02INC....
0029f0  46 47 00 00 00 00 00 00 00 00 00 00 00 00 00 00  FG..............
002a00  00 4d 43 2d 4d 4f 44 30 34 49 4e 43 00 00 00 45  .MC-MOD04INC...E
002a10  48 49 4a 4b 4c 00 00 00 00 00 00 00 00 00 00 00  HIJKL...........
002a20  00 4d 43 2d 4d 4f 44 30 30 49 4e 43 00 00 00 07  .MC-MOD00INC....
002a30  4d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  M...............
002a40  00 54 49 4e 53 54 20 20 20 44 54 41 00 00 00 25  .TINST   DTA...%
002a50  4e 4f 50 00 00 00 00 00 00 00 00 00 00 00 00 00  NOP.............
002a60  e5 e5 e5 e5 e5 e5 e5 e5...
The directory starts at 0x2800 (first column) and ends at 0x3000 (not shown). The first file is turbo.msg and it is 1,536 bytes long (0xc * 128). It occupies only block #1, which is the first block after the directory at 0x3000. The ninth file is turbo.com and is 30,464 bytes long (note that the high byte is set to 1, so the size is 16,384 + 0x6e * 128). It occupies blocks 0x20 - 0x2e, which start at image file byte 0x12800 and run through 0x19fff (the last block starts at 0x19800).
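The offsets in this example follow from a simple rule: offset = directory start + (block number * block size). A minimal sketch of that arithmetic, and of copying a file out of an image held in memory, might look like the following. DIRECTORY_START, block_offset and extract_file are hypothetical names chosen for illustration, the constants are the SuperBrain values from this example, and this is not necessarily how mmcpm.c does it.

```c
#include <stdint.h>
#include <string.h>

/* SuperBrain values from the example above; other formats will differ. */
#define DIRECTORY_START        0x2800L  /* first byte after the system tracks */
#define ALLOCATION_BLOCK_SIZE  2048L

/* Image offset of a block. Block 1 is the first block after the directory. */
static long block_offset(uint8_t block)
{
    return DIRECTORY_START + (long)block * ALLOCATION_BLOCK_SIZE;
}

/*
 * Copy up to `size` bytes of a file into `out`, following one directory
 * entry's 16-byte block list. Stops at the first zero block number or at
 * the end of the image. Returns the number of bytes copied.
 */
static long extract_file(const uint8_t *image, long image_len,
                         const uint8_t blocks[16], long size, uint8_t *out)
{
    long written = 0;
    for (int i = 0; i < 16 && blocks[i] != 0 && written < size; i++) {
        long chunk = size - written;
        if (chunk > ALLOCATION_BLOCK_SIZE)
            chunk = ALLOCATION_BLOCK_SIZE;
        if (block_offset(blocks[i]) + chunk > image_len)
            break;  /* block list points past the end of the image */
        memcpy(out + written, image + block_offset(blocks[i]), (size_t)chunk);
        written += chunk;
    }
    return written;
}
```

With these values block_offset(0x20) is 0x12800 and block_offset(0x2e) is 0x19800, matching the turbo.com figures above. The sketch assumes the image has already been complemented and de-interleaved where necessary.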
4. Downloading, Compiling and Using MMCPM
Once you have determined the physical layout of your disk and extracted a disk image, you can use MMCPM to extract files. MMCPM is an extremely simple program in one small C file.
The image format parameters are configured with C #defines at the top of the file. The default values are for the SuperBrain QD, but you can change them and recompile to suit your particular format:
- SECTOR_SIZE: Obviously, the sector size. CP/M sectors ranged from 128 to 1024 bytes. The SuperBrain had a 512-byte sector.
- SECTORS_PER_TRACK: Again, varies widely. The SuperBrain had 10 sectors per track.
- BYTES_PER_SIDE: This constant is used to determine whether the image is single-sided or double-sided. It only matters because of side interleave (explained below), which doesn't affect all machines. Single-sided SuperBrain disks are 179200 bytes. If the image file is larger, it is assumed to be double-sided.
- ALLOCATION_BLOCK_SIZE: The size of blocks specified in the file block list for each directory entry. This will probably be different from the sector size. The SuperBrain blocks are 2048 bytes long.
- SECTOR_SKEW: Old, slow CPUs, controllers and buses required time to process each sector as it was read. Therefore, sector skewing was often used so that, after a sector was read, another physical sector or two would pass under the head before the next logical sector came around. The SuperBrain used a skew of 2, meaning the 10 sectors appear in each track in the order: 1, 3, 5, 7, 9, 2, 4, 6, 8, 10.
- SIDE_INTERLEAVE: With double-sided disks, some formats read a track on one side and then read the other side before moving to the next track, saving disk head movement. Others used all the tracks on one side before moving to the other side, which made it easier to use both single- and double-sided diskettes. The SuperBrain uses the latter technique, although the disk images I received from Sydex had the sides interleaved (side 1 track 1, side 2 track 1, side 1 track 2, side 2 track 2) while CP/M expected side 1 tracks 1,2,3...side 2 tracks 1,2,3. When set to 1, SIDE_INTERLEAVE indicates that the sides are (erroneously) interleaved and need to be de-interleaved to make the block numbers meaningful.
- SYSTEM_TRACKS: The number of tracks at the start of the disk used for the operating system. This value is needed to know where to start looking for the directory.
- DIRECTORY_SIZE: The size in bytes of the directory. This value is needed to know how many directory entries to expect and where to start the data block area.
- ONES_COMPLIMENT: The SuperBrain stored the bits on the disks inverted from the way we expect today (1 was 0 and 0 was 1) and the images I received from Sydex were stored this way. ONES_COMPLIMENT should be set to 1 if the bits need to be inverted to make the image usable, or 0 if no conversion is necessary.
- BUFFER_SIZE: This is the maximum acceptable disk size. Since CP/M 2.2 floppies were all small by modern standards, this constant should need no changes.
- EXAMINE_FILE: When using the examine option, MMCPM writes the image (with any inversions or de-interleaving) to this file for further examination.
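As a rough illustration of what several of these parameters control, here is a minimal C sketch. The macro values are the SuperBrain figures quoted above; the helper names (TRACK_BYTES, directory_offset, invert_bits, skew_order, deinterleave_sides and so on) are hypothetical and are not taken from mmcpm.c, and how MMCPM applies the skew table internally is not shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SuperBrain values quoted in the text; treat them as assumptions. */
#define SECTOR_SIZE        512
#define SECTORS_PER_TRACK  10
#define BYTES_PER_SIDE     179200L
#define SYSTEM_TRACKS      2
#define DIRECTORY_SIZE     2048
#define TRACK_BYTES        ((long)SECTOR_SIZE * SECTORS_PER_TRACK)  /* hypothetical helper */

/* The directory follows the reserved system tracks. */
static long directory_offset(void) { return SYSTEM_TRACKS * TRACK_BYTES; }

/* Number of 32-byte directory entries to expect. */
static int directory_entries(void) { return DIRECTORY_SIZE / 32; }

/* An image larger than one side is assumed to be double-sided. */
static int is_double_sided(long image_bytes) { return image_bytes > BYTES_PER_SIDE; }

/* Invert every bit in the buffer (the ONES_COMPLIMENT case). */
static void invert_bits(uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (uint8_t)~buf[i];
}

/*
 * Fill order[] with 1-based sector numbers for a given skew, stepping
 * `skew` slots at a time and sliding forward past slots already used.
 * Returns 0 on success, -1 on bad arguments (at most 64 sectors).
 */
static int skew_order(int sectors, int skew, int *order)
{
    int used[64] = {0};
    int pos = 0;
    if (sectors < 1 || sectors > 64 || skew < 1)
        return -1;
    for (int i = 0; i < sectors; i++) {
        while (used[pos])
            pos = (pos + 1) % sectors;
        used[pos] = 1;
        order[i] = pos + 1;
        pos = (pos + skew) % sectors;
    }
    return 0;
}

/*
 * Rearrange an image stored as side 1 track 1, side 2 track 1, side 1
 * track 2, ... into all of side 1 followed by all of side 2.
 */
static void deinterleave_sides(const uint8_t *in, uint8_t *out,
                               int tracks_per_side, size_t track_bytes)
{
    for (int t = 0; t < tracks_per_side; t++)
        for (int side = 0; side < 2; side++)
            memcpy(out + ((size_t)side * tracks_per_side + t) * track_bytes,
                   in + ((size_t)t * 2 + side) * track_bytes,
                   track_bytes);
}
```

Calling skew_order(10, 2, order) reproduces the SuperBrain sequence 1, 3, 5, 7, 9, 2, 4, 6, 8, 10, and directory_offset() gives 0x2800 for the values above.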
Once you have set the #defines (if necessary) to match your disk format, compile with gcc to create an mmcpm executable file:
gcc -o mmcpm mmcpm.c
MMCPM has three modes:
- With only an image file name parameter, MMCPM lists the directory contents with file sizes and a list of file blocks:
mmcpm disk01.img
- With the "copy" option, MMCPM copies the contents of the image file into separate files in the current directory:
mmcpm copy disk01.img
- With the "examine" option, MMCPM performs any specified complementing or de-interleaving and prints a hex dump of the image to stdout. This can be redirected to a text file or piped to "less" for additional examination. This option can be helpful when trying to figure out the format of unfamiliar disks:
mmcpm examine disk01.img | less
5. Reading and Executing CP/M 2.2 Files
Many CP/M files were simply text files that can be read with any text editor.
WordStar was a very popular word processing program for CP/M. The files are partially text and can be inspected with a text editor. There are a number of utilities listed HERE for converting WordStar files to HTML to preserve formatting and import into a contemporary word processing program.
If you want to try running old CP/M programs that you have stored on your disks, there are a number of Z-80 emulators out on the web. Two different emulators by different authors but with similar interfaces are Yaze and Yaze-AG. These compile and run on Linux and once you have extracted the files from your disk images, you can "mount" Unix directories to CP/M drive letters.
6. The Intertec SuperBrain
The Intertec SuperBrain was a Z-80 based microcomputer that ran CP/M 2.2 and went on sale in 1979. The SuperBrain was an "all-in-one" system that contained the monitor, keyboard and two 5.25" floppy drives in a single large but surprisingly light housing. In addition to the main Z-80 CPU, it used a second Z-80 as a disk controller. Intertec released four different models:
- SuperBrain Jr - 170K drives
- SuperBrain QD - 340K drives
- SuperBrain SD - 780K drives
- SuperBrain 10 - a network terminal with no drives
While the SuperBrains never had hard drives, Intertec sold a network server called CompuStar that could connect up to 255 SuperBrains and supported 10MB to 144MB Winchester hard drives. (reference)
My father gave me his old SuperBrain QD around 1984 and I used it until I graduated college in 1987 - long past the time when it had become an obsolete relic. In grad school I wrote some terminal server software in Turbo Pascal that allowed me to dial in to the university's mainframes with a 2K baud modem. Amazing.