Debugging study by Dr Christophe LENCLUD, April 2017
"The IBM Personal Computer MACRO Assembler", also known as MASM, published by Microsoft and IBM since 1981, was one of the first assembler programs to run under MS-DOS / PC DOS on the IBM PC and compatible computers. Lots of code was written for MASM, notably the MS-DOS kernel itself and ROM BIOS code. MASM is therefore of historical importance in the field of personal computing. In this work, we analyze and fix a bug that prevented MASM Version 1.00 from running on newer DOS machines or emulators under some common circumstances.
Material and methods
The studied file is MASM.EXE, date stamp 12 July 1981, size 67584 bytes, MD5=0C68BDE13BF46F813B41FC5B19ED56D8, SHA1=0C68BDE13BF46F813B41FC5B19ED56D8. It displays "(C)Copyright IBM Corp 1981" and has an embedded date stamp "8/24/81 Ver".
This program was disassembled and debugged by dynamic analysis using Goupil Debugger and OSCL, Christophe LENCLUD's own protected mode debugger and operating system. Program tests were carried out using MS-DOS 6.22 running on true PC hardware, in both Real Mode and Virtual 8086 Mode. On the test machine, the amount of low memory reported to DOS by INT 12h is 629 KiB.
Bug description
This version of MASM.EXE hangs, or displays the error message "*Out Of Memory*" and exits to DOS, when it is run with between 578448 and 590656 bytes (about 564 to 577 KB) of free contiguous DOS memory (MEM's "largest executable program size"). This is most probably what was previously reported as a "hang" under some DOS emulators or virtual machines.
This work shows that the failure is due to a programming error in MASM.EXE's memory and stack setup code: an unsigned 16-bit integer (representing the amount of memory available for the data and stack segment) is compared as if it were a signed integer.
The disassembly of MASM's stack setup code is given below, in the section "Disassembly of the buggy code in MASM.EXE".
The SUB at offset 0011h (row 10) yields a result outside the positive range of a signed 16-bit integer if the "available memory" is at least 8000h paragraphs (512 KB). That happens if MASM.EXE is launched with at least 578448 bytes of free contiguous DOS memory. The signed comparison at offset 0017h then treats the value as negative and skips the clamp to 64 KB, so the value used to initialize the stack pointer (at offset 0024h, row 20) is erroneous. When this value is small enough, subsequent stack overflow checks (offset 01B1h, row 62) lead to the "*Out Of Memory*" message.
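The signed/unsigned mismatch described above can be reproduced outside the assembler. The following is a minimal Python sketch, not MASM code: the helper names `to_signed16`, `jng_taken` and `jna_taken` are hypothetical, and the flag logic of CMP is modelled simply by comparing the two operands as signed or unsigned 16-bit values.

```python
def to_signed16(value):
    """Reinterpret a 16-bit value as a two's-complement signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def jng_taken(left, right):
    """JNG (opcode 7Eh) after CMP left,right: jump if left <= right, signed."""
    return to_signed16(left) <= to_signed16(right)

def jna_taken(left, right):
    """JNA (opcode 76h) after CMP left,right: jump if left <= right, unsigned."""
    return (left & 0xFFFF) <= (right & 0xFFFF)

available = 0x82AD  # SUB result from the listing, in paragraphs
limit = 0x1000      # 64 KB expressed in paragraphs

print(to_signed16(available))        # negative when read as a signed value
print(jng_taken(available, limit))   # jump taken: the clamp to 1000h is skipped
print(jna_taken(available, limit))   # jump not taken: the clamp would be applied
```

With the value from the listing, only the unsigned test recognizes that more than 64 KB is available.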
Our tests showed that MASM.EXE 1.00 displays the error message and exits early when the stack pointer (SP) is initialized to about 2FB0h or less (SUB result between 82A6h and 82FBh), that is, with at most 590656 bytes of contiguous free DOS memory before launching MASM.EXE.
But when SP is too small, the procedure at 1952:016E (called at row 66) cannot display the error message and enters an endless loop, seen by the user as a "hang". This happens when SP is initialized to 2A50h or below, that is, for a SUB result between 8000h and 82A5h, i.e. between 578448 and 589280 bytes (about 565 to 575 KB) of contiguous free DOS memory before launch.
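The memory bounds above can be cross-checked with a little arithmetic. The sketch below is an illustration only: it assumes that the free contiguous memory equals the distance from the PSP segment (0D5Ah in the listing) to the segment stored at PSP:0002, which is consistent with the byte counts quoted above, and the function names are hypothetical. The outcome ranges are the ones reported from the tests, not derived from the code.

```python
PSP_SEG = 0x0D5A    # PSP segment in the listing
DATA_SEG = 0x1A93   # value loaded into DS and SS at cs:0000

def sub_result(free_bytes):
    """Model SUB BX,AX: paragraphs from the data segment to the end of memory."""
    top_seg = PSP_SEG + free_bytes // 16   # assumed value of the word at PSP:0002
    return (top_seg - DATA_SEG) & 0xFFFF

def initial_sp(paragraphs, buggy=True):
    """Model CMP/JNG (or JNA when buggy=False), the clamp and the four SHL."""
    signed = paragraphs - 0x10000 if paragraphs & 0x8000 else paragraphs
    if (signed if buggy else paragraphs) > 0x1000:
        paragraphs = 0x1000                # keep at most 64 KB
    return (paragraphs << 4) & 0xFFFF      # paragraphs to byte offset, 16-bit wrap

def reported_outcome(free_bytes):
    """Classify by the SUB-result ranges reported in the tests above."""
    paragraphs = sub_result(free_bytes)
    if 0x8000 <= paragraphs <= 0x82A5:
        return "hang"
    if 0x82A6 <= paragraphs <= 0x82FB:
        return "*Out Of Memory*"
    return "outside the reported failing range"

for free_bytes in (578448, 589280, 589408, 590656):
    paragraphs = sub_result(free_bytes)
    print(free_bytes, hex(paragraphs), hex(initial_sp(paragraphs)),
          reported_outcome(free_bytes))
```

The third value, 589408 bytes, corresponds to the situation shown in the listing (segment 9D40h at PSP:0002, SUB result 82ADh).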
The same error affects other programs built by the same Pascal compiler, albeit with slightly different memory bounds. In another topic, we describe a program to detect these buggy programs.
Bug fix
If you want to run this version of MASM, there are two workarounds:
- Without patching the EXE file: ensure that the free contiguous DOS memory is not between 564 and 577 KB. The safest way is to reduce the available memory to at most 563 KB, for example by loading some TSRs, device drivers (such as RAMDRIVE) or a debugger.
- By patching a copy of the EXE file: copy the file, rename the copy, and then patch it with a hex editor by replacing the byte 7Eh at file offset 64039 (0FA27h) with 76h. This converts the JNG conditional jump into a JNA, which is suitable for unsigned arithmetic. The patched file has MD5=0D4F43922F057E38F4C76697B321294C and SHA1=9F3383D2CB19BC4A78FA90B7A3F26BF3686E48AD. This patched program worked under MS-DOS 6.22 without the described buggy behavior.
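For readers who prefer a script to a hex editor, the second workaround could be sketched as follows. This is only an illustration under stated assumptions: the function name `patch_copy` and the file names in the usage note are hypothetical, and the offset and byte values are the ones given above.

```python
import hashlib
import shutil

PATCH_OFFSET = 0xFA27  # 64039 decimal, file offset of the JNG opcode
OLD_BYTE = 0x7E        # JNG (signed comparison)
NEW_BYTE = 0x76        # JNA (unsigned comparison)

def patch_copy(source_path, target_path):
    """Copy source_path to target_path, patch the copy, return its MD5 in hex."""
    shutil.copyfile(source_path, target_path)
    with open(target_path, "r+b") as exe:
        exe.seek(PATCH_OFFSET)
        if exe.read(1) != bytes([OLD_BYTE]):
            # Not the expected file or version: refuse to modify the copy.
            raise ValueError("byte at patch offset is not 7Eh")
        exe.seek(PATCH_OFFSET)
        exe.write(bytes([NEW_BYTE]))
    with open(target_path, "rb") as exe:
        return hashlib.md5(exe.read()).hexdigest().upper()
```

A call such as `patch_copy("MASM.EXE", "MASMFIX.EXE")` leaves the original untouched, and the returned digest can be compared with the MD5 of the patched file quoted above.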
Disassembly of the buggy code in MASM.EXE
Code:
; Program MASM.EXE loaded with PSP at 0D5A:0000
; MASM.EXE Entry Point: CS:IP = 1C4B:0000
cs:0000 mov ax,1A93 ; Data and stack segment.
cs:0003 mov ds,ax
cs:0005 mov [0472],es ; ES = 0D5Ah (PSP segment)
cs:0009 cli
cs:000A mov ss,ax ; Stack segment = Data segment.
; Compute available memory for the whole data and stack segment.
cs:000C mov bx,es:[0002] ; BX = ES:[0002] = 9D40h = Segment beyond memory allocated to program.
cs:0011 sub bx,ax ; BX = "Available memory" = 9D40h - 1A93h = 82ADh paragraphs (16 bytes each)
cs:0013 cmp bx,1000 ; Less than 64K?
cs:0017 jng 001C ; *** BUG *** Should be UNSIGNED comparison using JNA, not JNG!!!
cs:0019 mov bx,1000 ; If more than 64K, keep at most 64K.
; Convert paragraphs to offset.
cs:001C shl bx,1 ; BX = 055Ah
cs:001E shl bx,1 ; BX = 0AB4h
cs:0020 shl bx,1 ; BX = 1568h
cs:0022 shl bx,1 ; BX = 2AD0h
; Initialize stack pointer.
cs:0024 mov sp,bx ; SP = BX = 2AD0h, but should be 0000h!!!
cs:0026 sti
cs:0027 mov [0468],sp
cs:002B sub bp,bp
cs:002D mov [045A],bp
cs:0031 mov [0450],bp
cs:0035 mov [0454],bp
cs:0039 mov [0456],bp
cs:003D mov [0458],bp
; Compute memory available for the stack.
cs:0041 mov bx,2950 ; BX -> Start of uninitialized memory in MASM data segment.
cs:0044 mov [0462],bx
cs:0048 mov [0464],bx
cs:004C mov word ptr [bx],0001
cs:0050 add bx,0002
cs:0053 mov [0466],bx
cs:0057 add bx,00C8
cs:005B mov [046A],bx ; [046A] = BX = 2A1Ah = Highest offset disallowed for stack pointer.
...
; The WORD variable at offset DS:046A is then used for stack overflow checking during execution.
; This NEAR procedure is called to initialize procedures' stack frames. CS = 18B9h.
cs:0192 pop bx ; BX = Return address.
; Browse parameters at the caller's address.
cs:0193 xor ah,ah
cs:0195 mov al,cs:[bx]
cs:0198 inc bx
cs:0199 cmp al,80
cs:019B jb 01A5
cs:019D and al,7F
cs:019F mov ah,al
cs:01A1 mov al,cs:[bx]
cs:01A4 inc bx
cs:01A5 mov cx,cs:[bx]
cs:01A8 inc bx
cs:01A9 inc bx
; Prepare stack frame for the caller procedure and check for stack overflow.
cs:01AA push bp
cs:01AB mov bp,sp
cs:01AD sub bp,cx
cs:01AF jb 01B7
cs:01B1 cmp bp,[046A]
cs:01B5 ja 01BD
; Branch here in case of stack overflow.
cs:01B7 push bx
cs:01B8 call 1952:016E ; This procedure displays the error message "*Out Of Memory*" and then exits to DOS.
; Checks passed. Return to caller by indirect JMP instead of RET.
cs:01BD mov sp,bp
cs:01BF add bp,cx
cs:01C1 add bp,ax
cs:01C3 jmp near bx
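The stack overflow check at offsets 01AA to 01B5 can be modelled in a few lines. This is a hedged sketch rather than a translation of the whole procedure: it assumes that CX holds the number of bytes to reserve for the new frame, and the name `frame_fits` is hypothetical. The limit is the value 2A1Ah stored at DS:046A in the listing.

```python
STACK_LIMIT = 0x2A1A  # value stored at DS:046A in the listing

def frame_fits(sp, frame_size, limit=STACK_LIMIT):
    """Model PUSH BP / MOV BP,SP / SUB BP,CX / JB / CMP BP,[046A] / JA."""
    sp = (sp - 2) & 0xFFFF           # PUSH BP moves SP down by one word
    if frame_size > sp:              # SUB BP,CX borrows: JB goes to the error path
        return False
    return sp - frame_size > limit   # JA: the frame must end strictly above the limit

print(frame_fits(0x2AD0, 0x80))   # SP from the buggy setup, small frame
print(frame_fits(0x2AD0, 0x100))  # SP from the buggy setup, larger frame
print(frame_fits(0x0000, 0x100))  # SP as it should have been (0000h, full 64 KB)
```

With SP = 2AD0h only about 180 bytes lie between the stack pointer and the limit, so the check fails as soon as a procedure asks for a larger frame.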