Debugging study by Dr Christophe LENCLUD, April 2017

"The IBM Personal Computer MACRO Assembler", also known as MASM, published by Microsoft and IBM since 1981, was one of the first assemblers to run under MS-DOS / PC DOS on IBM PC or compatible computers. A lot of code was written for MASM, notably the MS-DOS kernel itself and ROM BIOS code. MASM is therefore of historical importance in the field of personal computing. In this work, we analyze and fix a bug that prevents MASM Version 1.00 from running on newer DOS machines or emulators under some common circumstances.

Material and methods

The studied file is MASM.EXE, date stamp 12 July 1981, size 67584 bytes, MD5=0C68BDE13BF46F813B41FC5B19ED56D8, SHA1=0C68BDE13BF46F813B41FC5B19ED56D8. It displays "(C)Copyright IBM Corp 1981" and has an embedded date stamp "8/24/81 Ver".

This program was disassembled and debugged by dynamic analysis using Goupil Debugger and OSCL, Christophe LENCLUD's own protected mode debugger and operating system. Program tests were carried out using MS-DOS 6.22 running on true PC hardware, in both Real Mode and Virtual 8086 Mode. On the test machine, the amount of low memory reported to DOS by INT 12h is 629 KiB.

Bug description

This version of MASM.EXE hangs or displays an error message "*Out Of Memory*" and exits to DOS when it is run with between 578448 and 590656 bytes (about 564 to 577 KB) of free contiguous DOS memory (MEM's "largest executable program size"). This is most probably what was previously reported as a "hang" under some DOS emulators or virtual machines.

[Figure: MASM100P.GIF]

This work shows that the cause is a programming error in MASM.EXE's memory and stack setup code: an UNSIGNED 16-bit integer (representing the amount of memory available for the data and stack segment) is compared as if it were a SIGNED integer.
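The effect of the signed comparison can be reproduced outside DOS. The following Python sketch is only a model of the setup code, not part of MASM: the function names are hypothetical, and the segment values 9D40h and 1A93h are the ones observed on the test machine in the disassembly listing of this study.

```python
def to_signed16(value):
    """Reinterpret a 16-bit unsigned value as a two's-complement signed value."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def initial_sp(end_segment, data_segment, signed_compare=True):
    """Model of the clamp-and-shift sequence that produces the initial SP.

    signed_compare=True mimics the original JNG, False mimics a JNA.
    """
    paragraphs = (end_segment - data_segment) & 0xFFFF   # sub bx,ax
    if signed_compare:
        keep = to_signed16(paragraphs) <= 0x1000         # cmp bx,1000 / jng
    else:
        keep = paragraphs <= 0x1000                      # cmp bx,1000 / jna
    if not keep:
        paragraphs = 0x1000                              # keep at most 64K
    return (paragraphs << 4) & 0xFFFF                    # four times shl bx,1


# Values taken from the listing: ES:[0002] = 9D40h, data segment = 1A93h.
buggy = initial_sp(0x9D40, 0x1A93, signed_compare=True)
fixed = initial_sp(0x9D40, 0x1A93, signed_compare=False)
print(f"signed compare: SP = {buggy:04X}h, unsigned compare: SP = {fixed:04X}h")
```

With the signed comparison, 82ADh is seen as a negative number, so the 64K clamp is skipped and the shift discards the upper bits.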

The disassembly of MASM's stack setup code is given at the end of this study; the row numbers below count the lines of that listing.

The result of the SUB at offset 0011h (row 10) falls outside the positive range of a signed 16-bit integer if the "available memory" is at least 8000h paragraphs, or 512 KB. That happens if MASM.EXE is launched with at least 578448 bytes of free contiguous DOS memory. The value used to initialize the stack pointer (at offset 0024h, row 20) is therefore erroneous. When this value is small enough, the subsequent stack overflow checks (offset 01B1h, row 62) lead to the "*Out Of Memory*" message.

Our tests showed that MASM.EXE 1.00 displays the error message and exits early when the stack pointer (SP) is initialized to about 2FB0h or less (SUB result between 82A6h and 82FBh), which corresponds to at most 590656 bytes of contiguous free DOS memory before launching MASM.EXE.

But when SP is too small, the procedure at 1952:016E (called at row 66) cannot display the error message and enters an endless loop, seen by the user as a "hang". This happens when SP is initialized with 2A50h or below, that is, with a SUB result between 8000h and 82A5h, i.e. between 578448 and 589280 bytes (about 565 to 575 KB) of contiguous free DOS memory before launch.
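The byte counts and the SUB results quoted above are related by simple arithmetic, which the following Python sketch makes explicit. It rests on one assumption not stated in the text: that the free block handed to MASM starts at the PSP, so that the SUB result equals the block size in paragraphs minus the distance between the PSP and the data segment (1A93h - 0D5Ah = 0D39h paragraphs in the listing). The function names and the outcome labels are hypothetical; the thresholds are the ones reported by the tests.

```python
PSP_SEGMENT = 0x0D5A     # PSP segment in the listing
DATA_SEGMENT = 0x1A93    # data and stack segment in the listing
CODE_PARAGRAPHS = DATA_SEGMENT - PSP_SEGMENT   # 0D39h paragraphs


def sub_result(free_bytes):
    """SUB result (in paragraphs) for a given amount of free contiguous memory.

    Assumption: the memory block starts at the PSP and spans free_bytes // 16
    paragraphs.
    """
    return (free_bytes // 16 - CODE_PARAGRAPHS) & 0xFFFF


def predicted_behaviour(free_bytes):
    """Classify the outcome using the thresholds observed in the tests."""
    bx = sub_result(free_bytes)
    if 0x8000 <= bx <= 0x82A5:
        return "hang"
    if 0x82A6 <= bx <= 0x82FB:
        return "error message"
    return "no failure reported"


for size in (578432, 578448, 589280, 589296, 590656, 590672):
    print(size, f"{sub_result(size):04X}h", predicted_behaviour(size))
```

The boundaries 578448, 589280 and 590656 bytes map exactly to 8000h, 82A5h and 82FBh, which is consistent with the assumption above.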

The same error affects other programs built by the same Pascal compiler, albeit with slightly different memory bounds. In another topic, we describe a program to detect these buggy programs.

Bug fix

If you want to run this version of MASM, there are two workarounds:
  1. Without patching the EXE file: ensure that the free contiguous DOS memory is not between 564 and 577 KB. The safest way is to reduce the available memory to at most 563 KB, for example by loading a TSR, a device driver (such as RAMDRIVE) or a debugger.
  2. By patching the EXE file: copy the file, rename the copy, and then patch it with a hex editor by replacing the byte 7Eh at file offset 64039d (0FA27h) with 76h. This converts the JNG conditional jump into a JNA, which is suitable for unsigned arithmetic. The patched file has MD5=0D4F43922F057E38F4C76697B321294C and SHA1=9F3383D2CB19BC4A78FA90B7A3F26BF3686E48AD. This patched program worked under MS-DOS 6.22 without the described buggy behavior.
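For readers who prefer a script to a hex editor, the second workaround can be sketched in Python as follows. This is an illustration only: the function name and the file names in the usage note are hypothetical. The offset and the byte values are those given above, and the script refuses to write anything if the byte found at that offset is not 7Eh.

```python
import hashlib

PATCH_OFFSET = 64039   # 0FA27h, file offset of the JNG opcode
OLD_BYTE = 0x7E        # JNG (signed comparison)
NEW_BYTE = 0x76        # JNA (unsigned comparison)


def patch_copy(source, destination):
    """Write a patched copy of `source` to `destination`.

    The original file is left untouched. Returns the MD5 of the patched
    copy as an uppercase hex string.
    """
    with open(source, "rb") as exe:
        image = bytearray(exe.read())
    if len(image) <= PATCH_OFFSET:
        raise ValueError("file is too short to contain the patch offset")
    if image[PATCH_OFFSET] != OLD_BYTE:
        raise ValueError(
            f"expected {OLD_BYTE:02X}h at offset {PATCH_OFFSET}, "
            f"found {image[PATCH_OFFSET]:02X}h"
        )
    image[PATCH_OFFSET] = NEW_BYTE
    with open(destination, "wb") as exe:
        exe.write(image)
    return hashlib.md5(image).hexdigest().upper()
```

A call such as patch_copy("MASM.EXE", "MASMFIX.EXE") should return the MD5 quoted above if the input is the studied file; any other value means that a different build was patched.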

Disassembly of the buggy code in MASM.EXE

Code:
; Program MASM.EXE loaded with PSP at 0D5A:0000
; MASM.EXE Entry Point:       CS:IP = 1C4B:0000
  cs:0000 mov  ax,1A93         ; Data and stack segment.
  cs:0003 mov  ds,ax
  cs:0005 mov  [0472],es       ; ES = 0D5Ah (PSP segment)
  cs:0009 cli                                    
  cs:000A mov  ss,ax           ; Stack segment = Data segment.
; Compute available memory for the whole data and stack segment.
  cs:000C mov  bx,es:[0002]    ; BX = ES:[0002] = 9D40h = Segment beyond memory allocated to program.
  cs:0011 sub  bx,ax           ; BX = "Available memory" = 9D40h - 1A93h = 82ADh paragraphs (16 bytes each)
  cs:0013 cmp  bx,1000         ; Less than 64K?
  cs:0017 jng  001C            ; *** BUG *** Should be UNSIGNED comparison using JNA, not JNG!!!
  cs:0019 mov  bx,1000         ; If more than 64K, keep at most 64K.
; Convert paragraphs to offset.
  cs:001C shl  bx,1            ; BX = 055Ah
  cs:001E shl  bx,1            ; BX = 0AB4h
  cs:0020 shl  bx,1            ; BX = 1568h
  cs:0022 shl  bx,1            ; BX = 2AD0h
; Initialize stack pointer.
  cs:0024 mov  sp,bx           ; SP = BX = 2AD0h, but should be 0000h!!!
  cs:0026 sti                                  
  cs:0027 mov  [0468],sp                      
  cs:002B sub  bp,bp                          
  cs:002D mov  [045A],bp                      
  cs:0031 mov  [0450],bp                        
  cs:0035 mov  [0454],bp                      
  cs:0039 mov  [0456],bp                      
  cs:003D mov  [0458],bp                      
; Compute memory available for the stack.
  cs:0041 mov  bx,2950         ; BX -> Start of uninitialized memory in MASM data segment.
  cs:0044 mov  [0462],bx                      
  cs:0048 mov  [0464],bx                      
  cs:004C mov  word ptr [bx],0001              
  cs:0050 add  bx,0002                        
  cs:0053 mov  [0466],bx                        
  cs:0057 add  bx,00C8                        
  cs:005B mov  [046A],bx       ; [046A] = BX = 2A1Ah = Highest offset disallowed for stack pointer.
  ...

; The WORD variable at offset DS:046A will then be used for stack overflow checking during execution:

; This NEAR procedure is called to initialize the stack frames of procedures. CS = 18B9h.
  cs:0192 pop  bx              ; BX = Return address.
; Browse parameters at the caller's address.
  cs:0193 xor  ah,ah                          
  cs:0195 mov  al,cs:[bx]                      
  cs:0198 inc  bx                              
  cs:0199 cmp  al,80                          
  cs:019B jb   01A5                            
  cs:019D and  al,7F                          
  cs:019F mov  ah,al                            
  cs:01A1 mov  al,cs:[bx]                      
  cs:01A4 inc  bx                              
  cs:01A5 mov  cx,cs:[bx]                      
  cs:01A8 inc  bx                              
  cs:01A9 inc  bx                              
; Prepare stack frame for the caller procedure and check for stack overflow.
  cs:01AA push bp
  cs:01AB mov  bp,sp                          
  cs:01AD sub  bp,cx                          
  cs:01AF jb   01B7                            
  cs:01B1 cmp  bp,[046A]                      
  cs:01B5 ja   01BD                            
; Branch here in case of stack overflow.
  cs:01B7 push bx                              
  cs:01B8 call 1952:016E       ; This procedure displays the error message "*Out Of Memory*" and then exits to DOS.
; Checks passed. Return to caller by indirect JMP instead of RET.
  cs:01BD mov  sp,bp                            
  cs:01BF add  bp,cx                          
  cs:01C1 add  bp,ax                          
  cs:01C3 jmp  near bx
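
The stack check performed at cs:01AA to cs:01B5 can be modelled in a few lines of Python to show how little room the erroneous SP leaves. This is a simplified sketch: the function name is hypothetical, the decoding of the frame parameters is left out, and the SP values passed in are illustrative, since the real SP at the time of the check is lower than the initial one.

```python
STACK_LIMIT = 0x2A1A   # value stored at DS:046A by the setup code


def frame_fits(sp, frame_size, limit=STACK_LIMIT):
    """Return True if a frame of frame_size bytes passes the overflow check."""
    sp_after_push = (sp - 2) & 0xFFFF           # push bp
    if frame_size > sp_after_push:              # sub bp,cx borrows -> jb 01B7
        return False
    return sp_after_push - frame_size > limit   # cmp bp,[046A] / ja 01BD


# With the erroneous SP of the listing, fewer than 180 bytes are usable.
print(frame_fits(0x2AD0, 0xB3))    # True
print(frame_fits(0x2AD0, 0xB4))    # False
# With SP = 0000h, as intended, a 4 KB frame passes easily.
print(frame_fits(0x0000, 0x1000))  # True
```

Once SP starts below the limit 2A1Ah plus a few bytes, every call to this procedure takes the overflow branch.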
 