Writing machine code utilities that interact with BASIC
=======================================================
J.G.Harston - 12-Dec-2012
Based on original unpublished Micro User article 15-Feb-1990
When writing a machine code utility to be run as a *command, you need to
decide where in memory it will run from. The standard location is the
serial/tape buffers in pages 9 and 10, and most short disk-based *commands
run there. To ensure they function correctly when a second processor is
active, the load and execution addresses are set to &FFFFxxxx so that the
code is loaded into the I/O processor.
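As a minimal sketch of how those addresses can be set, the listing below assembles a do-nothing utility and saves it with both the execution and reload addresses given as FFFF0900. The filename MyUtil, the buffer mcode% and its size are placeholder assumptions, not part of the original article.

```bbcbasic
REM Sketch only: MyUtil, mcode% and the buffer size are assumptions
DIM mcode% &100                        :REM Memory to assemble to
FOR L%=4 TO 7 STEP 3
P%=&900:O%=mcode%                      :REM Assemble at mcode% to run at &900
[OPT L%
.exec%
RTS                                    :\ Utility code goes here
]:NEXT
REM *SAVE <name> <start> <end> <exec> <reload>
OSCLI "SAVE MyUtil "+STR$~mcode%+" "+STR$~O%+" FFFF0900 FFFF0900"
```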
This is fine for utilities that perform I/O actions, such as examining files
or memory, or manipulating data in I/O memory - such as the screen. However,
utilities that interact with BASIC have to run in the same memory as
BASIC.
On initial examination, the simplest way to do this would be to set the load
and execution addresses to &0000xxxx. However, PAGE on the 6502 second
processor defaults to &800. A machine code utility that loads to &900 will
work satisfactorily when the second processor is switched off as it will
load to the tape buffer at &900, and the current BASIC program will
typically be at &E00 or higher - for example &1900 with a DFS-only BBC B.
However, with a 6502 second processor switched on that utility will load on
top of the BASIC program that will typically be in memory from &800 upwards.
The usual solution to this problem is to have two versions of the utility,
one written to load at &FFFF0900 and one written to load at a suitable
location in the second processor, often &C000 or &F600 just above BASIC in
memory.
This has two disadvantages. The first is that you have two copies of the
same code, relocated to different locations. The second is the
difficulty in deciding exactly where to load the second processor version.
&C000-&F7FF is ok when using standard BASIC, but with HiBASIC the BASIC
interpreter is at &B800-&F7FF. Also, even when using LoBASIC, program
relocation techniques can put the BASIC program or variables above BASIC at
&C000-&F7FF.
A better method is to find a memory location that is in the same place
regardless of which side of the Tube BASIC is running on.
BASIC memory workspace
----------------------
The only memory that is in the same location regardless of where BASIC is
executing is BASIC's workspace[1].
&000-&08F : Zero page workspace
&400-&47F : Integer variables
&480-&4FF : Pointers to real variables
&500-&5FF : FOR/GOSUB/REPEAT stacks
&600-&6FF : String buffer
&700-&7FF : INPUT and command buffer
Examining this memory in detail shows some areas that are unused, and some
specifically reserved for the user to use when in BASIC.
&050-&08F : 64 bytes, &070-&08F officially reserved for the user in BASIC
&46C-&47F : 20 bytes, calculator workspace, free when not evaluating an
expression
&480-&481 : 2 bytes, theoretical pointer to variables starting with @
&4B6-&4BD : 8 bytes, theoretical pointer to variables starting with [ \ ] ^
&4FA-&4FF : 6 bytes spare
&596-&5A3 : 14 bytes spare
So, a very short utility could load to &050 if it was no more than 64 bytes
long, or &46C if it was no more than 20 bytes long. For instance, the
following short bit of code displays the contents of any comment in the
first line of the program.
OSASCI=&FFE3
FOR I%=0 TO 3 STEP 3
P%=&50
[OPT I%
LDA #0:STA &8E
LDA &18:STA &8F :\ &8E/F=>PAGE
LDY #4
.loop1
LDA (&8E),Y:INY :\ Get byte from first line
CMP #&F4:BEQ loop2 :\ Check for REM token
CMP #13:BNE loop1 :\ Loop until end of line
RTS :\ No REM found, just exit
.loop2
LDA (&8E),Y:INY :\ Get byte from line
JSR OSASCI :\ Print it
CMP #13:BNE loop2 :\ Loop until end of line
RTS
]NEXT
However, for something of comparable size to the 512 bytes in the tape
buffer we need to look in more detail.
String buffer
-------------
The string buffer at &600 holds the results of string evaluations, and the
parameters for CALL. At other times it is free. Consequently, it can be used
for a machine code utility that is no more than 256 bytes long with no
impact on the rest of the BASIC environment.
The following example creates a command *AS that saves the current program
with a filename in a REM comment on the first line of the program.
REM > AS/SRC
DIM mcode% &200 :REM memory to assemble to
load%=&600 :REM Address to load to
osbyte=&FFF4
ptr=&70:cr=&0D:rem=&F4
FOR L%=4 TO 7 STEP 3
P%=load%:O%=mcode%
[OPT L%
.exec%
LDA #21:LDX #0:JSR osbyte :\ Clear keyboard buffer
LDA &18:STA ptr+1
LDY #0:STY ptr+0 :\ ptr=>PAGE
LDA (ptr),Y
CMP #13:BNE noprog :\ No program in memory
INY:LDA (ptr),Y
CMP #&FF:BEQ noprog :\ Empty program
LDY #3
.remloop
INY:LDA (ptr),Y :\ Get byte from line
CMP #cr:BEQ norem :\ Found <cr>, no REM
CMP #rem:BNE remloop :\ Loop until REM found
.spaceloop
INY:LDA (ptr),Y
CMP #32:BEQ spaceloop :\ Step past any spaces
CMP #ASC">":BEQ foundname :\ REM > filename
CMP #34:BEQ foundname :\ REM "filename
DEY :\ No prefix, step back
.foundname
TYA:PHA:LDX #0 :\ Remember offset to filename
.loop
LDY save,X:JSR osbyte138 :\ Insert characters from
INX:CPX #4:BNE loop :\ SAVE <quote>
PLA:TAY :\ Get back offset to filename
.getname
INY:TYA:PHA:LDA (ptr),Y :\ Get filename character
CMP #cr:BEQ addcr :\ End of line
CMP #34:BEQ addcr :\ Terminating quote
TAY:JSR osbyte138 :\ Insert the character
PLA:TAY:BNE getname
.addcr
PLA
LDY #34:JSR osbyte138 :\ Insert terminating quote
LDY #cr :\ Insert <cr>
.osbyte138
TXA:PHA :\ Save X
LDA #138:LDX #0:JSR osbyte :\ Insert Y into kbd buffer
PLA:TAX:RTS :\ Restore X
.noprog
BRK:BRK:EQUS "No program" :\ Error 0, ended by the BRK at norem
.norem
BRK:BRK:EQUS "No REM"
BRK
.save
EQUS "SA."""
]:NEXT
OSCLI "SAVE AS "+STR$~mcode%+" "+STR$~O%+" "+STR$~exec%+" "+STR$~load%
Checking for 6502 BASIC
-----------------------
Before we go any further, two important things must be covered.
You can issue *commands from any language, not just from BASIC, so code
should check that BASIC is the current language before trying to access
BASIC's memory. For example, it would make no sense if you were using View
and entered a *command to examine a BASIC program in memory as there
wouldn't be a BASIC program there to look at.
Usefully, BASIC is considered a special-case ROM so there is an OSBYTE
variable that holds the ROM number of the onboard BASIC to use.
Comparing this with the current language ROM number will check if BASIC is
the current language.
\ Check that BASIC is the current language
LDA #187:JSR rdbyte:STA tmp :\ Read BASIC ROM number
LDA #252:JSR rdbyte:CMP tmp :\ Read current language ROM number
BEQ basicok :\ They match
BRK:EQUB 249:EQUS "Not in BASIC":BRK
.rdbyte
LDX #0:LDY #255:JSR osbyte :\ Read an OSBYTE variable
TXA:AND #63:RTS :\ Return it from X without b7-b6
.basicok
The top two bits of the ROM number are masked out as they are used as flags.
b7 indicates that the *BASIC command is passed on as a *command instead of
entering the ROM directly. b6 indicates that any automatic ROM relocation is
suppressed.[2] Error number 249 is the "Bad language ROM" error number and is
the appropriate error number here.
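The same value can be inspected from the BASIC prompt. The following is a sketch, assuming the usual USR convention in which the X register comes back in the second byte of the result. R% is just a scratch variable chosen for illustration.

```bbcbasic
REM Sketch: read OSBYTE 187 from BASIC and split it into number and flags
A%=187:X%=0:Y%=255
R%=(USR(&FFF4) AND &FF00) DIV 256      :REM X register is returned in byte 1
PRINT "BASIC ROM number: ";R% AND 63
IF R% AND 128 THEN PRINT "b7 set: *BASIC passed on as a *command"
IF R% AND 64 THEN PRINT "b6 set: ROM relocation suppressed"
```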
In addition to checking that BASIC is the current language, the code needs
to be prevented from executing on a non-6502 second processor.
This is done by starting the code with a sideways ROM header indicating that
the code contains 6502 code which is checked by the Tube Client code. The
simplest such header is the following twelve bytes:
.exec%
JMP start :\ Entry point
BRK:BRK:BRK :\ No service entry
EQUB &42 :\ &40=Executable + &02=6502
EQUB 8 :\ Offset to copyright string
EQUB 0:EQUS "(C)" :\ Copyright string
.start
If space allows it is best to have a full code header as this allows you to
include a title and version string identifying the code.
.exec%
JMP start :\ Entry point
BRK:BRK:BRK :\ No service entry
EQUB &42 :\ &40=Executable + &02=6502
EQUB copy-exec% :\ Offset to copyright string
EQUB &00 :\ Binary version number
EQUS "Title" :\ Title string
EQUB &00
EQUS "0.00 (01 Jan 1990)" :\ Version string
.copy
EQUB 0:EQUS "(C) J.G.Harston" :\ Copyright string
EQUB 0
.start
Using all BASIC's workspace
---------------------------
If more than 256 bytes of memory are needed then either the variable
pointers at &400, the loop stacks at &500 or the command buffer at &700 will
be overwritten. In all these cases the simplest solution is to decide that
the command must return to the BASIC command prompt after executing, so it
cannot be executed multiple times within a program.
Consider the following example code:
FOR A%=1 TO 4
B=2^A%
*test
B=B+1
NEXT A%
If *test overwrites the variables at &400-&4FF then on return the
interpreter won't be able to find any variables to be able to continue the
program. If the loop stacks at &500-&5FF are overwritten then the
interpreter won't be able to complete any loops that *test is within.
If *test overwrites the command buffer at &700 then it can't be called from
the command prompt and return correctly. If you use:
>*test
or
>OSCLI "test":PRINT "done"
the rest of the line will be overwritten and the interpreter won't be able
to get to the end of the line.
In all these cases the simplest thing to do is to tell the interpreter to
execute an END to return to the command prompt. The easiest way to do this
is to point BASIC's program pointer in PTRA to a <cr><FF> sequence, which
marks the end of a program in memory.
.exit
LDA #end AND 255:STA &0B
LDA #end DIV 256:STA &0C
LDA #0:STA &0A
RTS
.end
EQUB 13:EQUB 255 :\ <cr><endmarker>
This returns to the command prompt and also clears all the loop stacks so,
for instance, typing NEXT A or RETURN at the command prompt won't
inadvertently try to resume a non-existent loop.
If the variable pointers at &480-&4FF are overwritten then the equivalent of
CLEAR also needs to be executed. The easiest way to do this is to point PTRA
to the following BASIC code instead:
EQUD 13:EQUB &D8:EQUB 13:EQUB 255 :\ <cr><00><00><00><clear><cr><endmarker>
However, if you need to abort with an error the corrupted variable pointers
will still be in place and need to be cleared manually with the following
code:
.clear
LDA #0:LDX #&7F
.clear_lp
STA &480,X:DEX:BPL clear_lp :\ Clear variables
LDA &00:STA &02 :\ Set VARTOP=LOMEM
LDA &01:STA &03
LDA #end AND 255:STA &0B :\ Point PTRA to end marker
LDA #end DIV 256:STA &0C
LDA #0:STA &0A:RTS
.end
EQUB 13:EQUB 255 :\ <cr><endmarker>
.err_escape
JSR clear
BRK:EQUB 17:EQUS "Escape":BRK
The static integer variable @% that controls print formatting is stored at
&400-&403. As it is never reset to its default except by a *BASIC command it
is best not to overwrite it, so the lowest address at which any code should
start is &404.
You must be careful that the code that clears the variables isn't itself in
the memory about to be cleared at &480-&4FF.
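One way to guard against this is an assembly-time check. The line below is a sketch, assuming the clearing routine runs from the label clear to the two-byte marker at the label end, as in the listing above. It would go after the ]NEXT that closes the assembly loop.

```bbcbasic
REM Sketch: place after the ]NEXT that ends the assembly loop
REM Assumes .clear and .end are the labels used in the clearing routine
IF clear<&500 AND end+2>&480 THEN PRINT "Clear code overlaps &480-&4FF":STOP
```

Any error block that calls the routine would need the same treatment if it could also fall within &480-&4FF.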
Command line parameters
-----------------------
When a *command is called on the Master the *command string is copied into a
buffer in private MOS workspace at &DF00. When the Tube is running it is
copied into a buffer in the I/O processor at &0700. However, on the BBC B/B+
with no Tube active the *command string is parsed in memory where it
is. A *command in a BASIC program will be somewhere above &E00. A *command
entered at the command prompt will be in the command buffer at &700. A
*command issued via OSCLI will be in the string buffer at &600.
In the latter two examples, if the *command loads to memory at &600 or &700
then it will overwrite any command parameters before it can see them.
Consequently, a *command that loads to &600 or &700 must not look for any
command parameters, and vice versa, a *command that looks for parameters must
not load to &600 or &700. A more complicated method is to test if the Tube
is inactive and the returned parameter address is &06xx or &07xx, and only
look for parameters if it isn't.
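A sketch of that test follows. It assumes OSBYTE 234 is used to read the Tube presence flag (zero when no Tube is active) and that OSARGS is at &FFDA. The routine name chkparams, the buffer addr at &70 and the carry-flag return convention are choices made for this illustration only.

```bbcbasic
REM Sketch only: chkparams, addr and load% are assumed names and values
DIM mcode% &100                        :REM Memory to assemble to
load%=&600                             :REM Address the utility loads to
osargs=&FFDA:osbyte=&FFF4
addr=&70                               :REM 5-byte buffer in zero page
FOR L%=4 TO 7 STEP 3
P%=load%:O%=mcode%
[OPT L%
.chkparams
LDA #1:LDY #0:LDX #addr:JSR osargs     :\ Read address of command line
LDA #234:LDX #0:LDY #255:JSR osbyte    :\ Read Tube presence flag to X
TXA:BNE paramsok                       :\ Tube active, line is in I/O memory
LDA addr+1                             :\ High byte of command line address
CMP #6:BEQ noparams                    :\ Line was in string buffer
CMP #7:BEQ noparams                    :\ Line was in command buffer
.paramsok
CLC:RTS                                :\ CC = parameters can be read
.noparams
SEC:RTS                                :\ CS = parameters were overwritten
]:NEXT
```

The caller would JSR chkparams and branch on the carry flag before making any attempt to read the command line.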
The command line parameters will always be in I/O memory, so they have to be
collected by calling OSWORD 5 to read them.
\ addr = 5-byte buffer in zero page
LDA #1:LDY #0:LDX #addr:JSR OSARGS :\ Read address of command line
.rdcmdlp
LDX #addr:LDY #0:LDA #5:JSR OSWORD :\ Read byte from I/O memory
INC addr+0:BNE P%+4:INC addr+1 :\ Increment command line address
LDA addr+4 :\ Get byte read from command line
\ ... do something with it
CMP #13:BNE rdcmdlp :\ Loop until <cr>
Sample programs
---------------
ShowREM - Generates code to display the contents of the first REM line.
ASsrc - Generates *AS command that saves the current BASIC program from an
embedded filename in a REM statement. Loads to string buffer at
&600, so is able to do a simple return.
VListSR - Generates *VList command that lists BASIC variables. Loads to &500
and &600 so has to return via an END, but does not disturb the
variables.
xssrc - Generates an update of the Micro User *xs checksum program. Loads
to the whole of &400-&7FF, so demonstrates clearing the
overwritten variable pointers before exiting or generating an
error.
Summary
-------
The following is a generic header that can start any code to execute in
BASIC's workspace from &404 onwards. It clears variables on exit and returns
to the command prompt.
FOR opt=4 TO 7 STEP 3
P%=&404:O%=mcode%
[OPT opt
.exec%
JMP start:BRK:BRK:BRK :\ Header identifies
EQUB &42:EQUB copy-exec% :\ this as 6502 code
EQUB &00:EQUS "Program Name"
EQUB &00:EQUS "0.00 (01 Jan 2000)"
.copy
EQUB &00:EQUS "(C) My Name":EQUB 0
:
.start
LDA #187:JSR rdbyte:STA zp
LDA #252:JSR rdbyte:CMP zp :\ Check BASIC is current language
BNE errBasic:JSR main :\ Call main code
.clear
LDX #&7F:LDA #0:STA &0A :\ Clear PTRA offset
.clear_lp
STA &480,X:DEX:BPL clear_lp :\ Clear variables
LDA &00:STA &02 :\ Set VARTOP=LOMEM
LDA &01:STA &03
LDA #end AND 255:STA &0B :\ Point PTRA to end marker
LDA #end DIV 256:STA &0C:RTS
.end
EQUB 13:EQUB 255 :\ <cr><endmarker>
.rdbyte
LDX #0:LDY #255:JSR osbyte:TXA:AND #63:RTS
.errBasic
JSR clear :\ Errors must JSR clear first
BRK:EQUB 249:EQUS "Not in BASIC":BRK
:
.main
\ Main program code goes here
RTS
References
----------
[1]http://mdfs.net/Docs/Comp/BBC/BASIC/Memory
[2]http://mdfs.net/Docs/Comp/BBC/BASIC/Osbyte187
[3]http://beebwiki.jonripley.com/Reading_command_line