The Art of
ASSEMBLY LANGUAGE PROGRAMMING

Chapter Thirteen (Part 8)

Table of Content

Chapter Thirteen (Part 10) 

CHAPTER THIRTEEN:
MS-DOS PC-BIOS AND FILE I/O (Part 9)
13.3.12 - Accessing Command Line Parameters
13.3.13 - ARGC and ARGV

13.3.12 Accessing Command Line Parameters

Most programs like MASM and LINK allow you to specify command line parameters when the program is executed. For example by typing

		ML MYPGM.ASM

you can instruct MASM to assemble MYPGM without any further intervention from the keyboard. "MYPGM.ASM;" is a good example of a command line parameter.

When DOS' COMMAND.COM command interpreter parses your command line it copies most of the text following the program name to location 80h in the PSP as described in the previous section. For example the command line above will store the following at PSP:80h

		11
" MYPGM.ASM"
0Dh

The text stored in the command line tail storage area in the PSP is usually an exact copy of the data appearing on the command line. There are however a couple of exceptions. First of all I/O redirection parameters are not stored in the input buffer. Neither are command tails following the pipe operator ("|"). The other thing appearing on the command line which is absent from the data at PSP:80h is the program name. This is rather unfortunate since having the program name available would allow you to determine the directory containing the program. Nevertheless there is lots of useful information present on the command line.

The information on the command line can be used for almost any purpose you see fit. However most programs expect two types of parameters in the command line parameter buffer-- filenames and switches. The purpose of a filename is rather obvious it allows a program to access a file without having to prompt the user for the filename. Switches on the other hand are arbitrary parameters to the program. By convention switches are preceded by a slash or hyphen on the command line.

Figuring out what to do with the information on the command line is called parsing the command line. Clearly if your programs are to manipulate data on the command line you've got to parse the command line within your code.

Before a command line can be parsed each item on the command line has to be separated out apart from the others. That is each word (or more properly lexeme[7]) has to be identified in the command line. Separation of lexemes on a command line is relatively easy all you've got to do is look for sequences of delimiters on the command line. Delimiters are special symbols used to separate tokens on the command line. DOS supports six different delimiter characters: space comma semicolon equal sign tab or carriage return.

Generally any number of delimiter characters may appear between two tokens on a command line. Therefore all such occurrences must be skipped when scanning the command line. The following assembly language code scans the entire command line and prints all of the tokens that appear thereon:

                include         stdlib.a
includelib      stdlib.lib

cseg            segment byte public 'CODE'
assume  cs:cseg
ds:dseg
es:dseg
ss:sseg

; Equates into command line-

CmdLnLen        equ     byte ptr es:[80h] ;Command line length
CmdLn           equ     byte ptr es:[81h] ;Command line data

tab             equ     09h

MainPgm         proc    far

; Properly set up the segment registers:

push    ds              ;Save PSP
mov     ax
seg dseg
mov     ds
ax
pop     PSP

;---------------------------------------------------------------

print
byte    cr
lf
byte    'Items on this line:'
cr
lf
lf
0

mov     es
PSP         ;Point ES at PSP
lea     bx
CmdLn       ;Point at command line
PrintLoop:      print
byte    cr
lf
'Item: '
0
call    SkipDelimiters  ;Skip over leading delimiters
PrtLoop2:       mov     al
es:[bx]     ;Get next character
call    TestDelimiter   ;Is it a delimiter?
jz      EndOfToken      ;Quit this loop if it is
putc                    ;Print char if not.
inc     bx              ;Move on to next character
jmp     PrtLoop2

EndOfToken:     cmp     al
cr          ;Carriage return?
jne     PrintLoop       ;Repeat if not end of line

print
byte    cr
lf
lf
byte    'End of command line'
cr
lf
lf
0
ExitPgm
MainPgm         endp

; The following subroutine sets the zero flag if the character in
; the AL register is one of DOS' six delimiter characters

; otherwise the zero flag is returned clear. This allows us to use
; the JE/JNE instructions afterwards to test for a delimiter.

TestDelimiter   proc    near
cmp     al
' '
jz      ItsOne
cmp     al
'
'
jz      ItsOne
cmp     al
Tab
jz      ItsOne
cmp     al
';'
jz      ItsOne
cmp     al
'='
jz      ItsOne
cmp     al
cr
ItsOne:         ret
TestDelimiter   endp

; SkipDelimiters skips over leading delimiters on the command
; line. It does not
however
skip the carriage return at the end
; of a line since this character is used as the terminator in the
; main program.

SkipDelimiters  proc    near
dec     bx              ;To offset INC BX below
SDLoop:         inc     bx              ;Move on to next character.
mov     al
es:[bx]     ;Get next character
cmp     al
0dh         ;Don't skip if CR.
jz      QuitSD
call    TestDelimiter   ;See if it's some other
jz      SDLoop          ; delimiter and repeat.
QuitSD:         ret
SkipDelimiters  endp

cseg            ends

dseg            segment byte public 'data'

PSP             word    ?               ;Program segment prefix
dseg            ends

sseg            segment byte stack 'stack'
stk             word    0ffh dup (?)
sseg            ends

zzzzzzseg       segment para public 'zzzzzz'
LastBytes       byte    16 dup (?)
zzzzzzseg       ends
end     MainPgm

Once you can scan the command line (that is separate out the lexemes) the next step is to parse it. For most programs parsing the command line is an extremely trivial process. If the program accepts only a single filename all you've got to do is grab the first lexeme on the command line slap a zero byte onto the end of it (perhaps moving it into your data segment) and use it as a filename. The following assembly language example modifies the hex dump routine presented earlier so that it gets its filename from the command line rather than hard-coding the filename into the program:

                include         stdlib.a
includelib      stdlib.lib

cseg            segment byte public 'CODE'
assume  cs:cseg
ds:dseg
es:dseg
ss:sseg

; Note CR and LF are already defined in STDLIB.A

tab             equ     09h

MainPgm         proc    far

; Properly set up the segment registers:

mov     ax
seg dseg
mov     es
ax          ;Leave DS pointing at PSP

;---------------------------------------------------------------
;
; First
parse the command line to get the filename:

mov     si
81h          ;Pointer to command line
lea     di
FileName    ;Pointer to FileName buffer
SkipDelimiters:
lodsb                   ;Get next character
call    TestDelimiter
je      SkipDelimiters

; Assume that what follows is an actual filename

dec     si              ;Point at 1st char of name
GetFName:       lodsb
cmp     al
0dh
je      GotName
call    TestDelimiter
je      GotName
stosb                   ;Save character in file name
jmp     GetFName

; We're at the end of the filename
so zero-terminate it as
; required by DOS.

GotName:        mov     byte ptr es:[di]
0
mov     ax
es          ;Point DS at DSEG
mov     ds
ax

; Now process the file

mov     ah
3dh
mov     al
0           ;Open file for reading
lea     dx
Filename    ;File to open
int     21h
jnc     GoodOpen
print
byte    'Cannot open file
aborting program...'
cr
0
jmp     PgmExit

GoodOpen:       mov     FileHandle
ax  ;Save file handle
mov     Position
0     ;Initialize file position
ReadFileLp:     mov     al
byte ptr Position
and     al
0Fh         ;Compute (Position MOD 16)
jnz     NotNewLn        ;Every 16 bytes start a line
putcr
mov     ax
Position    ;Print offset into file
xchg    al
ah
puth
xchg    al
ah
puth
print
byte    ': '
0

NotNewLn:       inc     Position        ;Increment character count
mov     bx
FileHandle
mov     cx
1           ;Read one byte
lea     dx
buffer      ;Place to store that byte
mov     ah
3Fh          ;Read operation
int     21h
jc      BadRead
cmp     ax
1           ;Reached EOF?
jnz     AtEOF
mov     al
Buffer      ;Get the character read and
puth                    ; print it in hex
mov     al
' '         ;Print a space between values
putc
jmp     ReadFileLp

BadRead:        print
byte    cr
lf
byte    'Error reading data from file
aborting.'
byte    cr
lf
0

AtEOF:          mov     bx
FileHandle  ;Close the file
mov     ah
3Eh
int     21h

;---------------------------------------------------------------

PgmExit:        ExitPgm
MainPgm         endp

TestDelimiter   proc    near
cmp     al
' '
je      xit
cmp     al
'
'
je      xit
cmp     al
Tab
je      xit
cmp     al
';'
je      xit
cmp     al
'='
xit:            ret
TestDelimiter   endp
cseg            ends

dseg            segment byte public 'data'

PSP             word    ?
Filename        byte    64 dup (0)      ;Filename to dump
FileHandle      word    ?
Buffer          byte    ?
Position        word    0

dseg            ends

sseg            segment byte stack 'stack'
stk             word    0ffh dup (?)
sseg            ends

zzzzzzseg       segment para public 'zzzzzz'
LastBytes       byte    16 dup (?)
zzzzzzseg       ends
end     MainPgm

The following example demonstrates several concepts dealing with command line parameters. This program copies one file to another. If the "/U" switch is supplied (somewhere) on the command line all of the lower case characters in the file are converted to upper case before being written to the destination file. Another feature of this code is that it will prompt the user for any missing filenames much like the MASM and LINK programs will prompt you for filename if you haven't supplied any.

                include         stdlib.a
includelib      stdlib.lib

cseg            segment byte public 'CODE'
assume  cs:cseg
ds:nothing
es:dseg
ss:sseg

; Note: The constants CR (0dh) and LF (0ah) appear within the
; stdlib.a include file.

tab             equ     09h

MainPgm         proc    far

; Properly set up the segment registers:

mov     ax
seg dseg
mov     es
ax                  ;Leave DS pointing at PSP

;---------------------------------------------------------------

; First
parse the command line to get the filename:

mov     es:GotName1
0          ;Init flags that tell us if
mov     es:GotName2
0          ; we've parsed the filenames
mov     es:ConvertLC
0          ; and the "/U" switch.

; Okay
begin scanning and parsing the command line

mov     si
81h                 ;Pointer to command line
SkipDelimiters:
lodsb                           ;Get next character
call    TestDelimiter
je      SkipDelimiters

; Determine if this is a filename or the /U switch

cmp     al
'/'
jnz     MustBeFN

; See if it's "/U" here-

lodsb
and     al
5fh                 ;Convert "u" to "U"
cmp     al
'U'
jnz     NotGoodSwitch
lodsb                           ;Make sure next char is
cmp     al
cr                  ; a delimiter of some sort
jz      GoodSwitch
call    TestDelimiter
jne     NotGoodSwitch

; Okay
it's "/U" here.

GoodSwitch:     mov     es:ConvertLC
1         ;Convert LC to UC
dec     si                      ;Back up in case it's CR
jmp     SkipDelimiters          ;Move on to next item.

; If a bad switch was found on the command line
print an error
; message and abort-

NotGoodSwitch:
print
byte    cr
lf
byte    'Illegal switch
only "/U" is allowed!'
cr
lf
byte    'Aborting program execution.'
cr
lf
0
jmp     PgmExit

; If it's not a switch
assume that it's a valid filename and
; handle it down here-

MustBeFN:       cmp     al
cr          ;See if at end of cmd line
je      EndOfCmdLn

; See if it's filename one
two
or if too many filenames have been
; specified-

cmp     es:GotName1
0
jz      Is1stName
cmp     es:GotName2
0
jz      Is2ndName

; More than two filenames have been entered
print an error message
; and abort.

print
byte    cr
lf
byte    'Too many filenames specified.'
cr
lf
byte    'Program aborting...'
cr
lf
lf
0
jmp     PgmExit

; Jump down here if this is the first filename to be processed-

Is1stName:      lea     di
FileName1
mov     es:GotName1
1
jmp     ProcessName

Is2ndName:      lea     di
FileName2
mov     es:GotName2
1
ProcessName:
stosb                   ;Store away character in name
lodsb                   ;Get next char from cmd line
cmp     al
cr
je      NameIsDone
call    TestDelimiter
jne     ProcessName

NameIsDone:     mov     al
0           ;Zero terminate filename
stosb
dec     si              ;Point back at previous char
jmp     SkipDelimiters  ;Try again.

; When the end of the command line is reached
come down here and
; see if both filenames were specified.

assume  ds:dseg

EndOfCmdLn:     mov     ax
es          ;Point DS at DSEG
mov     ds
ax

; We're at the end of the filename
so zero-terminate it as
;  required by DOS.

GotName:        mov     ax
es          ;Point DS at DSEG
mov     ds
ax

; See if the names were supplied on the command line.
; If not
prompt the user and read them from the keyboard

cmp     GotName1
0     ;Was filename #1 supplied?
jnz     HasName1
mov     al
'1'         ;Filename #1
lea     si
Filename1
call    GetName         ;Get filename #1

HasName1:       cmp     GotName2
0     ;Was filename #2 supplied?
jnz     HasName2
mov     al
'2'         ;If not
read it from kbd.
lea     si
FileName2
call    GetName

; Okay
we've got the filenames
now open the files and copy the
; source file to the destination file.

HasName2        mov     ah
3dh
mov     al
0           ;Open file for reading
lea     dx
Filename1   ;File to open
int     21h
jnc     GoodOpen1

print
byte    'Cannot open file
aborting program...'
cr
lf
0
jmp     PgmExit

; If the source file was opened successfully
save the file handle.

GoodOpen1:      mov     FileHandle1
ax ;Save file handle

; Open (CREATE
actually) the second file here.

mov     ah
3ch          ;Create file
mov     cx
0           ;Standard attributes
lea     dx
Filename2   ;File to open
int     21h
jnc     GoodCreate

; Note: the following error code relies on the fact that DOS
; automatically closes any open source files when the program
; terminates.

print
byte    cr
lf
byte    'Cannot create new file
aborting operation'
byte    cr
lf
lf
0
jmp     PgmExit

GoodCreate:     mov     FileHandle2
ax ;Save file handle

; Now process the files

CopyLoop:       mov     ah
3Fh         ;DOS read opcode
mov     bx
FileHandle1 ;Read from file #1
mov     cx
512         ;Read 512 bytes
lea     dx
buffer      ;Buffer for storage
int     21h
jc      BadRead
mov     bp
ax          ;Save # of bytes read

cmp     ConvertLC
0     ;Conversion option active?
jz      NoConversion

; Convert all LC in buffer to UC-

mov     cx
512
lea     si
Buffer
mov     di
si
ConvertLC2UC:
lodsb
cmp     al
'a'
jb      NoConv
cmp     al
'z'
ja      NoConv
and     al
5fh
NoConv:         stosb
loop    ConvertLC2UC

NoConversion:
mov     ah
40h         ;DOS write opcode
mov     bx
FileHandle2 ;Write to file #2
mov     cx
bp          ;Write however many bytes
lea     dx
buffer      ;Buffer for storage
int     21h
jc      BadWrite
cmp     ax
bp          ;Did we write all of the
jnz     jDiskFull       ; bytes?
cmp     bp
512         ;Were there 512 bytes read?
jz      CopyLoop
jmp     AtEOF
jDiskFull:      jmp     DiskFull

; Various error messages:

BadRead:        print
byte    cr
lf
byte    'Error while reading source file
aborting '
byte    'operation.'
cr
lf
0
jmp     AtEOF

BadWrite:       print
byte    cr
lf
byte    'Error while writing destination file
aborting'
byte    ' operation.'
cr
lf
0
jmp     AtEOF

DiskFull:               print
byte    cr
lf
byte    'Error
disk full.  Aborting operation.'
cr
lf
0

AtEOF:          mov     bx
FileHandle1         ;Close the first file
mov     ah
3Eh
int     21h
mov     bx
FileHandle2         ;Close the second file
mov     ah
3Eh
int     21h

PgmExit:        ExitPgm
MainPgm         endp

TestDelimiter   proc    near
cmp     al
' '
je      xit
cmp     al
'
'
je      xit
cmp     al
Tab
je      xit
cmp     al
';'
je      xit
cmp     al
'='
xit:            ret
TestDelimiter   endp

; GetName- Reads a filename from the keyboard.  On entry
AL
; contains the filename number and DI points at the buffer in ES
; where the zero-terminated filename must be stored.

GetName         proc    near
print
byte    'Enter filename #'
0
putc
mov     al
':'
putc
gets
ret
GetName         endp
cseg            ends

dseg            segment byte public 'data'

PSP             word    ?
Filename1       byte    128 dup (?)     ;Source filename
Filename2       byte    128 dup (?)     ;Destination filename
FileHandle1     word    ?
FileHandle2     word    ?
GotName1        byte    ?
GotName2        byte    ?
ConvertLC       byte    ?
Buffer          byte    512 dup (?)

dseg            ends

sseg            segment byte stack 'stack'
stk             word    0ffh dup (?)
sseg            ends

zzzzzzseg       segment para public 'zzzzzz'
LastBytes       byte    16 dup (?)
zzzzzzseg       ends
end     MainPgm

As you can see there is more effort expended processing the command line parameters than actually copying the files!

13.3.13 ARGC and ARGV

The UCR Standard Library provides two routines argc and argv which provide easy access to command line parameters. Argc (argument count) returns the number of items on the command line. Argv (argument vector) returns a pointer to a specific item in the command line.

These routines break up the command line into lexemes using the standard delimiters. As per MS-DOS convention argc and argv treat any string surrounded by quotation marks on the command line as a single command line item.

Argc will return in cx the number of command line items. Since MS-DOS does not include the program name on the command line this count does not include the program name either. Furthermore redirection operands (">filename" and "<filename") and items to the right of a pipe ("| command") do not appear on the command line either. As such argc does not count these either.

Argv returns a pointer to a string (allocated on the heap) of a specified command line item. To use argv you simply load ax with a value between one and the number returned by argc and execute the argv routine. On return es:di points at a string containing the specified command line option. If the number in ax is greater than the number of command line arguments then argv returns a pointer to an empty string (i.e. a zero byte). Since argv calls malloc to allocate storage on the heap there is the possibility that a memory allocation error will occur. Argv returns the carry set if a memory allocation error occurs. Remember to free the storage allocated to a command line parameter after you are through with it.

Example: The following code echoes the command line parameters to the screen.

                include         stdlib.a
includelib      stdlib.lib

dseg            segment para public 'data'

ArgCnt          word    0

dseg            ends

cseg            segment para public 'code'
assume  cs:cseg
ds:dseg

Main            proc
mov     ax
dseg
mov     ds
ax
mov     es
ax

; Must call the memory manager initialization routine if you use
; any routine which calls malloc!  ARGV is a good example of a
; routine which calls malloc.

meminit

argc                    ;Get the command line arg count.
jcxz    Quit            ;Quit if no cmd ln args.
mov     ArgCnt
1       ;Init Cmd Ln count.
PrintCmds:      printf                  ;Print the item.
byte    "\n%2d: "
0
dword   ArgCnt

mov     ax
ArgCnt      ;Get the next command line guy.
argv
puts
inc     ArgCnt          ;Move on to next arg.
loop    PrintCmds       ;Repeat for each arg.
putcr

Quit:           ExitPgm                 ;DOS macro to quit program.
Main            endp
cseg            ends

sseg            segment para stack 'stack'
stk             byte    1024 dup ("stack   ")
sseg            ends

;zzzzzzseg is required by the standard library routines.

zzzzzzseg       segment para public 'zzzzzz'
LastBytes       byte    16 dup (?)
zzzzzzseg       ends
end     Main

[7] Many programmers use the term "token" rather than lexeme. Technically a token is a different entity.

Chapter Thirteen (Part 8)

Table of Content

Chapter Thirteen (Part 10) 

Chapter Thirteen: MS-DOS PC-BIOS and File I/O (Part 9)
28 SEP 1996