An assembler is essentially a piece of software which aids the writing of machine code. Most personal computers offer only the crudest facilities for using machine code. Some only have a 'machine code monitor'. Others are even less equipped and the only way to enter a machine code program is by means of a boring and error-prone series of POKEs. It is not surprising that few owners of such machines develop a strong attraction for machine code. Of course, assemblers for most popular microcomputers are available on tape or disk but it is a sad fact of life that external software which has to be loaded, particularly from tape, is often too much trouble. Initially, it may be used with enthusiasm but the inevitable 'tape-inertia' syndrome eventually relegates the tape to its coffin.
Assemblers vary in sophistication and the facilities offered for debugging. It is unusual for a personal computer to be equipped with a resident assembler. No doubt manufacturers of machines for this market have previously assumed that few buyers, other than the completely dedicated, would be interested in any programming language other than BASIC. Acorn, conscious that interest in machine code would grow, included a resident assembler in the Atom, a practice which they have continued in the BBC machine.
The assembler used is unique because it is 'wedged' inside the BASIC chip. The position of the assembler inside the language system ensures easy transition between BASIC and machine coding. Machine code splices within a BASIC program are recognised by the large square brackets [ and ]. (These appear on the screen as left- and right-pointing arrows in Mode 7 on the BBC machine.) Although the assembler lacks many of the refinements found in traditional mainframe-oriented software, it is perfectly adequate and quite easy to use. In fact, some features of its design could be considered superior to classical assemblers. Although many readers will already be familiar with the facilities offered, it is necessary, for the sake of continuity, to devote a little space to the following overview.
Bearing in mind the function of an assembler defined at the beginning of this chapter, the foremost requirement of any assembler is the substitution of meaningful letter groups for the instruction op-codes. Thus, to transfer X to A, the hex machine code is &8A (see Appendix C1). The assembler allows us to write TXA instead. All mnemonic op-codes are three-letter groups.
Numerical values in operands are assumed to be decimal. If a number is to be interpreted as hex, it must be prefixed with the & character.
The most remarkable and useful property of the assembler is the latitude allowed with variables. The following deserves emphasis:
Any legitimate BASIC variable or expression can be used in the operand.
Examples:
100 LDA LOCK
110 LDA LOCK+1
120 LDA #ASC("Y")
130 LDA #BASE DIV 256
140 LDA #BASE MOD 256
150 LDA #SQR(4)
These are all expressions which the assembler will accept although the following common sense provisos apply:
BASIC variables and expressions can also be used in jump operands such as:
JMP START+7
This will cause a jump to the address found by adding 7 to the contents of the variable START.
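The DIV 256 and MOD 256 operands in the examples above are the usual way of splitting a 16-bit address into its high and low bytes. The arithmetic can be checked away from the machine with a few lines of Python. This is only a sketch: the helper name split_address and the value given to BASE are illustrative assumptions, not part of the BBC system.

```python
def split_address(address):
    """Return (high, low) bytes of a 16-bit address, as DIV 256 and MOD 256 would."""
    if address not in range(0x10000):
        raise ValueError("address must fit in 16 bits")
    return address // 256, address % 256


BASE = 0x0D00  # hypothetical value held in the BASIC variable BASE
high, low = split_address(BASE)
print(f"#BASE DIV 256 = &{high:02X}, #BASE MOD 256 = &{low:02X}")
```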
The previous chapter, dealing with branch instructions and relative addressing, stressed the difficulties associated with counting the number of bytes forward or backwards to reach the desired branch destination. The assembler, to a large extent, offers relief by allowing the use of labels. Thus the assembler disguises the inherent relative addressing and substitutes a straightforward 'branch to label'. For example, we can write:
BNE Sort
Here 'Sort' is the branch operand, referring to a label appearing somewhere in the program.
The label is recognised as such by the assembler because it must begin with a full-stop as shown in the following example:
CLC
.BACK LDA Number1
ADC Number2
BEQ Finish
ROR A
BNE BACK
.Finish RTS
The coding obviously has no practical value so it would be pointless to key in. Note the full-stop before the two labels and note also that there must be no full-stop when the label is an operand directive. The layout is valid but considerable latitude is allowed. For example, it could be re-written as follows:
CLC
.BACK
LDA Number1
ADC Number2
BEQ Finish
ROR A
BNE BACK
.Finish RTS
Notice that the label can stand alone or be on the same line as the instruction, providing there is at least one space separation. The label, although existing within the assembled coding, can also be considered to be a normal BASIC variable. It is, after all, an address and therefore a simple numeric. We could, for instance, discover this address (when out of the assembler and back in BASIC) by writing PRINT BACK or PRINT Finish. A problem arises, which we shall deal with later, when assembling coding in which branches are made to forward positions.
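The work hidden behind a 'branch to label' can be sketched in Python. The function below is an illustrative model, not Acorn's implementation; the name branch_offset and the example addresses are assumptions. On the 6502 the offset is counted from the instruction following the two-byte branch and must fit in a single signed byte.

```python
def branch_offset(branch_address, target_address):
    """Return the operand byte for a 6502 relative branch.

    The offset is measured from the instruction after the branch,
    and the branch itself occupies two bytes (op-code plus offset).
    """
    offset = target_address - (branch_address + 2)
    if offset not in range(-128, 128):
        raise ValueError("branch out of range")
    return offset % 256  # two's complement form held in one byte


# A backward branch at &0D0A to a label at &0D01 (hypothetical addresses)
print(f"&{branch_offset(0x0D0A, 0x0D01):02X}")
```

A backward branch gives a byte of &80 or more, a forward branch a byte below &80, which is exactly the counting the label mechanism spares us.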
Remarks, which programmers feel are necessary to explain coding, must be distinguished by either the semicolon or the back slash (\). For example:
LDA (pointer),Y \Indirect loading
We shall use the back slash because it seems a more meaningful symbol and also because the semicolon is strongly associated with the PRINT statement in BASIC. Remember that the back slash in the BBC machine Mode 7 appears on the screen as a funny '½' character.
As in BASIC, more than one assembly instruction can be placed on one line by using the full colon as a separator. For example:
[
CLC:ADC Number:ADC #4:RTS
]
It must be appreciated that the square brackets, which enclose assembly coding, have the status of an instruction, even though they are categorised as 'pseudo instructions'.
Writing many statements per line is popular in BASIC because programs appear shorter in length and fewer bytes are squandered in storing line numbers. In assembly code, although the length of the program still appears less, there is no real saving because the action of assembly produces the same final object program. Another disadvantage of multi-statements per line is the decrease in readability. A program written in assembly code is not exactly an easy thing to decipher (even for the writer!) and cramming a lot on one line increases the confusion, besides leaving less room for remarks.
When writing a BASIC program, we have no worries about where it will be stored in memory. It is left entirely to the operating system, which stores it in accordance with built-in rules. However, when a piece of assembly code is written, the assembler has no such built-in authority. It is up to the programmer to provide guidance on the desired starting location of the program in memory. Naturally, the guidance given must steer clear of the RAM space already earmarked by the operating system. There are several reasonably free areas in the BBC machine, besides space made artificially by shifting the BASIC text and variable areas around. Before discussing the details of free spaces, it is important to stress the vital importance of one of the special resident integer variables P%. As readers are already aware, the complete set of these twenty-seven special variables is labelled @%,A% ... Z%. They all occupy fixed locations and can be picked up at any time by the operating system or assembler. Returning to P%,
The contents of P% are taken as the address of the next item of machine code.
Thus, to locate a program anywhere in memory, it is simply a case of loading the starting address of the first byte into P%. After each byte is stored, P% is automatically incremented by one in readiness for the next byte. In fact, it is convenient to relate P% to the microprocessor program counter.
The setting of P% must be carried out in the BASIC part of the composite program. For example:
10 P%=&0D00
20 [
30 \
40 \
50 \Assembly coding
60 \
70 \
80 ]
90 REM rest of BASIC
This would store the first byte of assembly code (wedged between BASIC) in the hex address 0D00. It is possible to use variable names when loading P% so we could have written the top bit of the previous example as:
10 START%=&0D00
20 P%=START%
Instead of committing P% to a fixed machine address, it is possible to delegate some of the responsibility to the operating system. Page 237 of the BBC User Guide tells us that the Dimension statement can be twisted a little in order to accommodate byte arrays. For example:
DIM START% 99
P%=START%
This will reserve 100 bytes for assembly code, the first byte located in START% (START% is, of course, an arbitrary name). It is important to recognise the unusual character of the DIM statement above. There are indeed two variations from the normal BASIC statement for dimensioning arrays. The first point to notice is the absence of the brackets around the 99 although the space before 99 is mandatory. The second point, less vital but still useful to know, is that START% should not be thought of as the 'name of the array'. It is merely the labelled address of where the first byte is stored. The storage area, like the normal DIM statement, is 'dynamic'. That is to say, it moves up or down depending on the number of lines in the BASIC area. However, the actual machine address of the first byte in the example above can always be ascertained by a command, such as PRINT START%. The snag in using the DIM statement is the fact that you have to make an intelligent guess as to the number of bytes in your coding. The best way is to make a preliminary estimate and then add, say, twenty more for luck because it doesn't matter if you over-estimate. However, if you are a stickler for having things dead right, it is easy to count the bytes after the final debugging and alter the DIM number accordingly.
Other ways of finding space for assembly code are by the use of the pseudo variables TOP and PAGE. Space (for example 256 bytes) can be found below the BASIC program by using PAGE in the following manner:
100 PAGE=PAGE+256
110 P%=PAGE-256
The first line pushes PAGE upwards, and therefore the start of the BASIC program. This will reserve 256 bytes for the assembly coding which, as line 110 suggests, starts at the old PAGE position. If we wish to reserve some of the assembly area for, say, 20 data bytes, line 110 can be written as:
110 P%=PAGE-236
This leaves 20 bytes for data, the first item being at the old PAGE position and the last 19 bytes above it. The assembly program will follow, beginning 20 bytes above the old PAGE position.
If we use LOMEM, the program and/or data can be positioned between TOP (the top of the BASIC program) and LOMEM (where BASIC stores its variables). (See the memory map on page 501 of the BBC User Guide.) For example:
100 LOMEM=LOMEM+256
110 P%=TOP
This will reserve 256 bytes for assembly coding, the first byte being in TOP. The User Guide warns us not to attempt to alter LOMEM in the middle of a program or the interpreter will lose track of the variables.
There is nothing wrong with storing assembly programs and data 'dynamically' as described above. However, storing in a fixed location has, at least, a psychological advantage. You always know exactly where the program is stored and this can be comforting. The trouble with fixed storage in the BBC Micro is its scarcity.
The BBC machine has plenty of fixed free space providing that some of the normal and expansion facilities are sacrificed. For example, page &0D is perfectly safe and is actually described in the User Guide as space for 'user-supplied resident routines'. Unfortunately, this is subject to the proviso that no disk interface is used. It is also possible to use page &0C if we are prepared to sacrifice user-defined character definitions. Page &0B is yet another page available but this time we will lose the facility of programming those delightful red function keys. Use of any of these pages will therefore depend on individual needs, but it is useful to know that three contiguous pages are potentially free, representing a total memory range &0B00 to &0DFF. This is a hefty 768 bytes which, in machine code terms, is capable of supporting a complex program.
However, for those with disk or other expansion options it is advisable to steer clear of fixed locations altogether. It is safer to use the DIM statement. Any subsequent programs given which use fixed locations (to simplify explanations) can always be modified to DIM methods.
To execute a BASIC program, we use the keyword RUN. To execute a machine code program we use the keyword CALL. Both RUN and CALL are BASIC keywords, a fact which serves to emphasise the close relationship between the assembler and the BASIC interpreter. The simple word CALL is very powerful because it combines the role of parameter-passing with machine code execution. Before delving into great detail, it is useful to study the following sequence of events starting with the source code and ending with the final execution:
1. A source code listing
10 P%=&D01
20 [
30 LDA #99
40 STA &0DFF
50 ]
60 PRINT:PRINT"The contents of &0DFF is ";?&0DFF
Line 10 positions the code in page &0D, a quite arbitrary decision. Lines 30 and 40 are the assembly code which loads the accumulator with decimal 99 then stores it in &0DFF. Line 60 is BASIC and prints out the contents of this address, using the byte indirection operator.
2. Running the program
When we type RUN, the result is:
0D01
0D01 A9 63 LDA #99
0D03 8D FF 0D STA &0DFF
The contents of &0DFF is 13
This is the work of the assembler and indeed is called an assembly listing. It has produced the correct machine code from our mnemonic source code. The first column is the hex address of the first byte on that line. The second column is the hex machine code consisting of the op-code, followed by the operand. Notice that the two-byte operand is low-byte, high-byte form and appears back to front. The third column is our original source code.
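The byte order in the second column can be reproduced with a few lines of Python. This sketch hard-codes only the two op-codes seen in the listing (A9 for LDA immediate and 8D for STA absolute); the function names are illustrative and nothing here is part of the BBC assembler itself.

```python
LDA_IMMEDIATE = 0xA9
STA_ABSOLUTE = 0x8D


def lda_immediate(value):
    """Encode LDA #value: op-code followed by a one-byte operand."""
    return bytes([LDA_IMMEDIATE, value % 256])


def sta_absolute(address):
    """Encode STA address: op-code, then low byte, then high byte."""
    return bytes([STA_ABSOLUTE, address % 256, address // 256])


code = lda_immediate(99) + sta_absolute(0x0DFF)
print(code.hex(" ").upper())
```

The two-byte address &0DFF comes out as FF followed by 0D, which is the 'back to front' order noted above.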
Note carefully that we have RUN but the machine code doesn't appear to have worked because the printout is 13 instead of the expected 99. This is because the machine code has only been assembled; that is to say, it has only been converted from our source code to a pure machine code form. But we have not yet told the code to be executed! Thus, the 'answer' we got of 13 was merely garbage that happened to be in &0DFF. If you try it, you will have a different garbage number (unless the law of averages breaks down). So, we need another step.
3. Executing the machine code with CALL
CALL &0D01
Nothing visible happens but the machine code has been executed. To prove it, RUN the program again. This will produce exactly the same results as the previous RUN but with one important difference. The BASIC line at the bottom will now read:
The contents of &0DFF is 99
The steps shown above have deliberately been separated in order to emphasise the difference between the assembly and the call processes. In practice, the CALL would normally be included under a line number in the program rather than activated by a separate command. For example, the listing above could be written:
10 P%=&0D01
20 [
30 LDA #99
40 STA &0DFF
45 RTS
50 ]
55 CALL &0D01
60 PRINT:PRINT"The contents of &0DFF is ";?&0DFF
70 END
Apart from the extra CALL line, notice that RTS has been squeezed in. This should be considered the normal way to terminate machine code routines in order to ensure smooth control back to BASIC after the code has been executed. Without RTS, an error message from the assembler might (probably will) appear. Now when we type RUN, the program will automatically assemble the code, line 55 will execute it and RTS returns control back to BASIC at line 60.
The pseudo operation OPT controls the activities of the assembler. It is fully described in the User Guide but is repeated (with less detail) here.
The format is OPT pass, where pass is a variable which can be 0, 1, 2 or 3.
If pass=0, the assembly listing is suppressed and no errors are reported.
If pass=1, only assembly errors are suppressed.
If pass=2, only assembly listing is suppressed.
If pass=3, nothing is suppressed.
It is important to remember that OPT is not a BASIC keyword, consequently it is only legal within the square bracket area. If OPT is not used, it defaults to OPT 3. Since the previous examples have not used OPT, the assembly listing appeared and assembly errors might also have appeared. It is often inconvenient and purposeless to display these except during the development and debugging stage.
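The four values of pass amount to two independent on/off switches packed into one number. The Python sketch below decodes them in the way the list above describes; the function name and the dictionary keys are illustrative assumptions.

```python
def decode_opt(value):
    """Split an OPT value (0 to 3) into its two switches."""
    if value not in (0, 1, 2, 3):
        raise ValueError("OPT value must be 0, 1, 2 or 3")
    return {
        "listing": bool(value & 1),  # bit 0 set: the assembly listing is shown
        "errors": bool(value & 2),   # bit 1 set: assembly errors are reported
    }


for value in range(4):
    print(value, decode_opt(value))
```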
If a branch instruction directs control backwards to a labelled line, the assembler can cope immediately because it has picked up the address of the label on the way. However, if the branch directs to a forward address, the assembler is confused. To see why, imagine what happens in the following piece of code:
110 BNE exit
120 :
130 :
140 .exit
The main job of the assembler is to change mnemonic op-codes and operands into equivalent hex numbers and addresses. So what happens when it reaches line 110? It can easily look up its conversion table to find the hex code for BNE (it is D0). But it can't determine the hex address of the operand because it hasn't yet reached line 140. Its intelligence is just not equal to the situation, so it gives up.
The BBC assembler is not peculiar in this respect. It is a common problem in all but the most sophisticated versions. The standard way out is to give the assembler two goes at it, a trick known as two-pass assembly. The first pass collects all the labels and addresses and the second pass uses them to produce the final assembly.
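The two-pass idea itself can be modelled in Python. The toy assembler below is a sketch under stated assumptions: it knows only five instructions, takes its source as hypothetical (label, mnemonic, operand) tuples, and is not how the BBC assembler is organised internally. It does show why a second pass resolves a forward branch.

```python
# Op-code and length in bytes for the handful of instructions used here
OPCODES = {"LDA#": (0xA9, 2), "CMP#": (0xC9, 2), "BEQ": (0xF0, 2),
           "NOP": (0xEA, 1), "RTS": (0x60, 1)}


def assemble(source, origin):
    """Assemble (label, mnemonic, operand) tuples in two passes."""
    # Pass 1: walk the source, noting the address of every label
    labels, address = {}, origin
    for label, mnemonic, _ in source:
        if label:
            labels[label] = address
        address += OPCODES[mnemonic][1]

    # Pass 2: every label is now known, so emit the final bytes
    code, address = bytearray(), origin
    for _, mnemonic, operand in source:
        opcode, size = OPCODES[mnemonic]
        code.append(opcode)
        if mnemonic == "BEQ":
            offset = labels[operand] - (address + 2)
            code.append(offset % 256)
        elif size == 2:
            code.append(operand % 256)
        address += size
    return bytes(code)


source = [
    (None, "LDA#", 3),
    (None, "CMP#", 3),
    (None, "BEQ", "Finish"),  # forward reference, unknown until pass 1 ends
    (None, "NOP", None),
    ("Finish", "RTS", None),
]
print(assemble(source, 0x0D00).hex(" ").upper())
```

With only one pass, the lookup of "Finish" at the BEQ line would fail because the label has not yet been met.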
It would be boring and error-prone if the operator always had to assemble programs twice. Fortunately, the FOR/NEXT loop in BASIC can be left to do the donkey-work. The following piece of code includes a simple forward branch and illustrates the two-pass assembly technique:
100 FOR pass=0 TO 3 STEP 3
110 P%=&0D00
120 [
130 OPT pass
140 LDA #3
150 CMP #3
160 BEQ Finish
170 NOP
180 .Finish RTS
190 ]
200 NEXT pass
210 CALL &0D00
220 END
The code itself is purposeless and barely worth explaining. It is easy to see that the forward branch, (the object of the example), will always take place. Line 170 could have been any useless instruction so NOP is exceptionally well-qualified. Note the following points:
The facility to predefine memory locations with BASIC variables and use them in assembly code should be exploited to the full. Any dodge which makes assembly code look less formidable is worthwhile, even if it does squander a few BASIC lines. For example:
Number1=&70
Number2=&71
Having defined these locations, the variable names can be used in assembly operands, rather than absolute hex addresses. The following example, which adds two numbers (limited to single byte), illustrates some of the interchanges possible between BASIC and assembler:
10 DIM START% 40
20 CLS
30 INPUT"Enter first number "Number1
40 INPUT"Enter second number "Number2
50 Result=&70
60 P%=START%
70 [
80 CLC
90 LDA #Number1
100 ADC #Number2
110 STA Result
120 RTS
130 ]
140 CALL START%
150 PRINT ?Result
Just for a change, the program is stored by courtesy of DIM, instead of using fixed free space. It is easy to get mixed up with the address of data and the data itself. For example, it may not be obvious why lines 90 and 100 must have the hash mark denoting immediate addressing. Note also that line 150 prints the contents of 'Result'. If 'Result' was printed, it would be an address rather than data in that address.
A macro is short for macro-instruction, a facility offered in some assemblers, whereby a block of instructions performing some task can be defined by name, and later treated as if it were a single instruction. Superficially, this description resembles an ordinary subroutine so it is important to compare them with a view to spotting the differences.
A subroutine is written once, stored in some fixed location and called up by a special jump (JSR). A macro, on the other hand, is assembled in machine code each time it is used in the body of the program. For example, suppose the following is a macro in a fictitious (and using an equally fictitious format) machine:
Macro TDX.
DEX
DEX
DEX
End Macro
The macro is first defined and given some arbitrary name (TDX in the example). The macro is then written (three consecutive decrements in the example). The macro is then terminated. From now on, any time we use TDX, the assembler will include the three instructions in the coding. Note that there is no 'jump' action. The macro is inserted in line every time. Because there is no time wasted in calling and returning from a subroutine, a macro has a higher execution speed.
10 REM Procedures used as macros
20 DIM START% 40
30 P%=START%
40 [:LDA #1:]
50 PROCshiftleft3
60 [:STA &70:]
70 PROCshiftleft3
80 [:STA &71
90 RTS
100 ]
110 CALL START%
120 PRINT?&70
130 PRINT?&71
140 END
150 DEF PROCshiftleft3
160 [:ASL A:ASL A:ASL A:]
170 ENDPROC
>RUN
19EF
19EF A9 01 LDA #1
19F1
19F1 0A ASL A
19F2 0A ASL A
19F3 0A ASL A
19F4
19F4 85 70 STA &70
19F6
19F6 0A ASL A
19F7 0A ASL A
19F8 0A ASL A
19F9
19F9 85 71 STA &71
19FB 60 RTS
8
64
Program 4.1. Procedures used as macros.
Now for the crunch. There is no macro facility in the BBC machine, at least not directly. However, it is possible to wangle it by exploiting BBC BASIC's most powerful asset, the Procedure. Instead of naming a macro, we name a procedure. Defining the macro is replaced by DEF PROCname. Naturally, we must define the procedure in BASIC and use it in BASIC, but the assembler is undaunted providing the square brackets are used correctly.
To illustrate, Program 4.1 loads the accumulator with 1 and then uses a simple 'macro' to produce three left shifts on the accumulator. The macro is used twice, so the accumulator is shifted 6 times. After RUN, the assembly listing appears. You will notice that the three ASL instructions are indeed assembled in-line each time the 'macro' is used. This example should emphasise the difference between macros and subroutines. We have stated that a macro is faster and yet, from the assembly listing it is not too obvious why. Actually, the time it takes to assemble macros may indeed be longer than assembling the equivalent subroutine. Once assembled (which only has to be done once) however, the execution is faster with macros because no time is wasted in jumping and returning (JSR takes 6 clock cycles and so does RTS). Against this, however, it should be realised that macros use up memory for the assembly code each time they are called.
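The trade-off between speed and memory can be put into rough figures with a short Python sketch. The JSR and RTS timings are those quoted above; the sizes (3 bytes for JSR, 1 for RTS) and the figure of 1 byte and 2 cycles for ASL A are standard 6502 values that the text does not give, and the function names are illustrative.

```python
ASL_A = {"bytes": 1, "cycles": 2}
JSR = {"bytes": 3, "cycles": 6}
RTS = {"bytes": 1, "cycles": 6}


def macro_cost(body, uses):
    """Bytes and cycles when the body is assembled in-line at every use."""
    size = sum(i["bytes"] for i in body) * uses
    cycles = sum(i["cycles"] for i in body) * uses
    return size, cycles


def subroutine_cost(body, uses):
    """Bytes and cycles when a single copy of the body is reached by JSR."""
    body_bytes = sum(i["bytes"] for i in body)
    body_cycles = sum(i["cycles"] for i in body)
    size = body_bytes + RTS["bytes"] + JSR["bytes"] * uses
    cycles = (JSR["cycles"] + body_cycles + RTS["cycles"]) * uses
    return size, cycles


shift_left_3 = [ASL_A] * 3
print("macro:     ", macro_cost(shift_left_3, 2))
print("subroutine:", subroutine_cost(shift_left_3, 2))
```

For a body as short as three ASL instructions the macro wins on both counts, because each JSR is itself as long as the code it replaces. The memory penalty of macros only shows with longer bodies used many times.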
The main hazard, when using procedures to simulate macros, is failing to observe the rules of the square brackets when dodging in and out of BASIC. The brackets can be round the wrong way or in the wrong places. Program 4.1 above uses a standardised format with the brackets enclosing the procedure call and on the same line.
As with normal procedure calls and definitions, it is possible to pass formal parameters into the assembly enclosure. For example, we can define a procedure as follows:
500 DEF PROCsubtract(number1,number2)
510 [
520 SEC
530 LDA number1
540 SBC number2
550 ]
560 ENDPROC
Once such a procedure is defined, it can be used to subtract, say, contents of Tax from Gross by using:
PROCsubtract(Gross,Tax)
Naturally, such a simple procedure can only handle a single byte subtraction but this is irrelevant; it is the principle that matters.
Conditional assembly, like macros, is commonplace in professional mainframe assemblers. It simply means that certain changes can be made in assembly code without having to reassemble each time. Again like macros, it is not directly available on the BBC machine but can be simulated by a mixture of BASIC and machine code. As a simple example, it may be that during program development we would like to try the effect of X=4 or X=40 in the same program. A simple IF/THEN/ELSE spliced in the right place would do the job nicely:
IF Speed=slow THEN [ : LDX #4: ] ELSE [ : LDX #40: ]
There are various ways of passing parameters from the main program (which could be BASIC) to a machine code subroutine. The simplest, but not always the most advisable, is via registers. The three most important resident subroutines, OSRDCH, OSWRCH and OSBYTE (discussed in a later chapter) all use registers for parameter passing. OSRDCH and OSWRCH use the accumulator; OSBYTE uses the accumulator and the X and Y index registers.
Apart from the enforced use in resident subroutines, registers are not the ideal medium for passing parameters. The 6502 is not generous as far as registers are concerned, and they will often already be holding variable data although, of course, the stack can be used as a temporary store while the registers are being used. Fortunately, alternatives are available in the shape of the USR and CALL statements.
The CALL statement is far more powerful than has been indicated earlier in the chapter. For example, we can use:
CALL NAME, G%, A$, Blogs
This shows that, in addition to actually calling (executing) machine code, it is possible to pass a variety of mixed parameters (integer variables, string variables and floating point variables and even single byte numbers) to the assembly code. Essential information on the parameters is passed to a special memory block located in the BBC machine at &0600 onwards. This block contains the following information:
&0600 | number of parameters passed in CALL statement |
&0601 | low byte address of first parameter |
&0602 | high byte address of first parameter |
&0603 | code for parameter type |
&0604 | low byte address of second parameter |
&0605 | high byte address of second parameter |
&0606 | code for parameter type |
&0607 | low byte address of third parameter |
&0608 | high byte address of third parameter |
&0609 | code for parameter type |
This sequence is repeated for any further parameters. The parameter code which has been referred to above is as follows:
0 | 8-bit byte (example ?Z) |
4 | 32-bit integer variable (example Volts%) |
5 | five-byte (40-bit) floating-point number (example Blogs) |
&81 (129) | string variable (example Name$) |
&80 (128) | string at a defined address (example $Name) |
The reference above to simple variables, also applies to array variables: For example, A9%(3), C(3) or B$(5).
The parameter block format always begins with the number of variables attached to the CALL. Thereafter, three bytes of information are given for each variable. If, for example, there are five variables in all, there will be 1+ (3×5)=16 bytes of information, starting at &0600 and ending at &060F. It is worth stressing that the information given is not the data itself but the address to which the data has been transferred. These addresses are not constant and only the operating system will be aware of them. When such a situation arises (in which only addresses are given) the data can be easily obtained by the use of indirect addressing. All that needs to be done is to treat the address information as pointers which can be passed to page-zero for use by indirect address action. It is possible to avoid indirect addressing to obtain this data by a series of LDAs and STAs but it is messy and inefficient.
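The layout of the block can be modelled in Python to check the byte counting. This is a sketch only: the two function names are illustrative, the block is built in an ordinary byte string rather than at &0600, and the address &041C used in the example is a hypothetical one chosen for illustration.

```python
def build_parameter_block(parameters):
    """Lay out (address, type_code) pairs in the order CALL uses."""
    block = bytearray([len(parameters)])   # number of parameters
    for address, type_code in parameters:
        block.append(address % 256)        # low byte of the address
        block.append(address // 256)       # high byte of the address
        block.append(type_code)            # code for the parameter type
    return bytes(block)


def read_parameter_block(block):
    """Recover the (address, type_code) pairs from a parameter block."""
    count = block[0]
    return [(block[1 + 3 * i] + 256 * block[2 + 3 * i], block[3 + 3 * i])
            for i in range(count)]


one_integer = build_parameter_block([(0x041C, 4)])
print(one_integer.hex(" ").upper())
print(len(build_parameter_block([(0x041C, 4)] * 5)), "bytes for five parameters")
```

The second function is the bookkeeping a machine code routine has to perform before it can reach the data through a page-zero pointer.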
It is easy to be confused by all this so it is essential to attack the subject in gentle steps. We begin by writing a few lines of code just to test that the CALL statement is indeed operating as described above. This code is shown in Program 4.2.
To keep the first example simple, only one parameter is passed in line 20. It is spread out in hex in order to appreciate the result more readily. Since we are demonstrating CALL, there must be some bit of machine code to call, so to maintain simplicity it is sufficient to use a NOP (we are not, at this stage, interested in the particular code). Line 90 is in BASIC and calls up the code, passing G% to the system. Lines 110 to 130 print out (in hex) the contents of the parameter block. On first running the program, it stops at line 140 and displays the assembly code and the contents of the parameter block which have the following significance:
1 | number of parameters passed (just G%) |
1C | low-byte address of where the first byte of G% is stored |
4 | high-byte address |
4 | parameter code (4=four-byte integer) |
10 REM Passing variables via CALL
20 G%=&01234567
30 P%=&0D00
40 START=P%
50 [
60 NOP
70 RTS
80 ]
90 CALL START,G%
100 PRINT
110 FOR Block=&0600 TO &0603
120 PRINT~?Block
130 NEXT
140 END
150 Pointer=&041C
160 FOR Data=Pointer TO Pointer+3
170 PRINT ~Data,~?Data
180 NEXT
>RUN
0D00
0D00 EA NOP
0D01 60 RTS
1
1C
4
4
>GOTO150
41C 67
41D 45
41E 23
41F 1
Program 4.2. Passing variables via the CALL statement.
Thus, we are now in the position of knowing that our data has been stored in address &041C. You may see from this why it was necessary to stop the program at line 140. The preliminary RUN gave us the address information. A GOTO 150 then executes the bottom program which displays the contents of the four addresses &041C to &041F. The original G% has therefore been successfully recovered although in the conventional reverse order (low-byte first). It would have been possible to use the indirection operator (!) instead of the FOR/NEXT loops but the display would have packed horizontally instead of being in more readable, vertical separation steps.
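The four bytes displayed by the second part of the program recombine into the original number when read low-byte first. A short Python check, offered only as an illustration of the byte order, is:

```python
stored = bytes([0x67, 0x45, 0x23, 0x01])  # contents of &041C to &041F
value = int.from_bytes(stored, byteorder="little")
print(f"&{value:08X}")
```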
The next program (Program 4.3) goes a step further by showing indirect address pointers picking up parameter data from the call statement. The idea is to pass the address given in the parameter block at &0601, &0602 to the page zero addresses &70, &71 where they will act as the address pointer. The program is best understood by starting at the three lines of BASIC at the bottom (160 to 180 inclusive). The objective, again deliberately kept simple, is to input an integer variable (Volts%), pass it through the CALL procedure and print it out again, merely to prove the points described above. At the top of the program three preliminary assignments are made but you should particularly note line 20. It is a convention that the label, assigned to the lowest byte, is used to refer to the whole number, irrespective of the number of bytes. Lines 70 to 100 transfer the address information (where Volts% is stored) into the two zero-page addresses. Notice the economy of using POINTER and POINTER+1 for the low- and high-byte respectively. This is why only the low-byte pointer was assigned in line 20. This little dodge is useful and will appear frequently in subsequent examples.
10 REM Passing variables via CALL (2)
20 POINTER=&70
30 RESULT=&80
40 START=&0C00
50 P%=START
60 [
70 LDA &0601 \Store LB/HB
80 STA POINTER \Address of volts
90 LDA &0602 \in zero page
100 STA POINTER+1
110 LDY #0
120 LDA (POINTER),Y \indirect address
130 STA RESULT
140 RTS
150 ]
160 INPUT"ENTER INTEGER "Volts%
170 CALL START,Volts%
180 PRINT"THE INTEGER PASSED WAS "?RESULT
Program 4.3. Passing variables and use of Indirect address pointers.
Line 120 is the most important of all because it illustrates the beauty of indirect addressing. Index register Y is first cleared because it has no meaning in this context. All we want is simple indirect addressing without indexing. Although indirect indexed addressing is used, we could have substituted indexed indirect addressing providing that the X register, instead of the Y register, was cleared initially.
Proceeding another step further, Program 4.4 adds two single-byte numbers, both of them passed via the CALL statement. BASIC is used to input the two numbers into A% and B% and then passed to the system by means of CALL ADD, A%, B%. Since there are two variables, we reserve space for the address pointers in page-zero. The two assignments for this (appropriately labelled 'FIRST' and 'SECOND') are made in line 30. Space is also reserved for RESULT in line 40. Lines 90 to 120 transfer the address information of A% in &0601, &0602 to &70 and &71. Lines 130 to 160 perform a similar task for B%. After clearing Y and the carry bit, lines 190 to 210 perform the addition, again using indirect addressing.
Program 4.4. Single-byte addition.
10 REM 8-bit integer addition
20 MODE 4
30 FIRST=&70:SECOND=&72
40 RESULT=&80
50 ADD=&0C00
60 FOR PASS=0 TO 3 STEP 3
70 P%=ADD
80 [OPT PASS
90 LDA &0601
100 STA FIRST
110 LDA &0602
120 STA FIRST+1
130 LDA &0604
140 STA SECOND
150 LDA &0605
160 STA SECOND+1
170 LDY #0
180 CLC
190 LDA (FIRST),Y \Add pos integers
200 ADC (SECOND),Y \indirect address
210 STA RESULT,Y \Indexed address
220 RTS:]
230 NEXT
240 PRINT
250 PRINT"Addition of two unsigned integers ":PRINT
260 INPUT"First unsigned integer ",A%
270 INPUT"Second unsigned integer ",B%
280 CALL ADD,A%,B%
290 PRINT"Addition= ";?RESULT
Although unnecessary here (there are no forward branches in the assembly code), the two-pass assembly process has been included for the first time. It is a good habit to cultivate, even when not strictly needed, in order to maintain consistency in program layout.
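The arithmetic performed by lines 170 to 210 of Program 4.4 can be checked with a short Python model. It is a sketch only, and the function name is invented for the purpose. It brings out a point the listing does not: only one byte is stored in RESULT, so a sum greater than 255 wraps round and the overflow survives only in the carry flag.

```python
def add_low_bytes(a, b):
    """Model CLC, LDA (FIRST),Y, ADC (SECOND),Y, STA RESULT,Y with Y = 0."""
    low_a = a & 0xFF          # only the lowest byte of A% is fetched
    low_b = b & 0xFF          # only the lowest byte of B% is fetched
    total = low_a + low_b     # carry was cleared by CLC, so nothing extra is added
    carry = total > 0xFF      # state of the carry flag after ADC
    return total & 0xFF, carry


print(add_low_bytes(100, 50))
print(add_low_bytes(200, 100))
```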
Although CALL is ideal for passing parameters to the assembler, it lacks the facility for directly returning parameters from the assembler. The USR function is sometimes a more convenient, although less versatile, option. The general format of the USR function is as follows:
Result=USR(calling address)
Examples: D%=USR(START), Blogs=USR(&0D00)
Unlike CALL, the parameters to be passed must first be assigned to the four resident integer variables A%, X%, Y% and C%. When USR is used, the information in these variables is transferred to the microprocessor registers of the same name, with C% going into the carry flag. The transfer is subject to the following provisos:
10 REM 8bit INTEGER ADDITION
20 REM DEMONSTRATING USR
30 MODE4
40 STORE=&70
50 ADD=&0C00
60 FOR PASS=0 TO 3 STEP 3
70 P%=ADD
80 [OPT PASS
90 STX STORE \STORE X REG
100 LDX #0 \SET HIGH BYTE TO 0
110 CLC
120 ADC STORE
130 BCC FINISH \INCREMENT RESULT
140 INX \HIGH BYTE IF CARRY
150 .FINISH \IS SET
160 RTS:]
170 NEXT PASS
180 PRINT
190 PRINT"ADDITION OF TWO UNSIGNED 8bit INTEGERS"
200 PRINT
210 INPUT"FIRST UNSIGNED 8bit INTEGER ",A%
220 INPUT"SECOND UNSIGNED 8bit INTEGER ",X%
230 RESULT=USR(&0C00)
240 RESULT=RESULT AND &0000FFFF
250 REM MASK OUT UNWANTED BITS
260 PRINT"ADDITION= ";RESULT
Program 4.5. 8-bit integer addition called by USR
After exit from the assembly code, a four-byte integer is returned to the result variable supplied by the user. This integer is the composite contents of the four registers P, Y, X, A in that order. For example, suppose we used:
Blogs=USR(START)
Suppose also that the assembly code, on exit, left the registers with the following contents:
A=&FC, X=&67, Y=&FF and P (the processor status register)=&03
On exit, the variable Blogs would contain &03FF67FC which, you will do well to note, is in the reverse order. The fact that the contents of the P register form the most significant byte of the result is, in some respects, unfortunate. This register is normally dedicated to flag bits and some degree of fiddling is necessary if it is ever called upon (indirectly) to contribute meaningful numerical information to the result parameter.
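The way the four registers are packed into the returned integer can be expressed as a one-line calculation. The Python sketch below uses an invented function name and simply reproduces the worked figures given above; it also shows the effect of keeping only the two low-order bytes, as line 240 of Program 4.5 does.

```python
def usr_result(a, x, y, p):
    """Pack the registers as USR returns them: P highest, then Y, X, and A lowest."""
    return ((p & 0xFF) << 24) | ((y & 0xFF) << 16) | ((x & 0xFF) << 8) | (a & 0xFF)


blogs = usr_result(a=0xFC, x=0x67, y=0xFF, p=0x03)
print(hex(blogs))

# Keeping only X and A, in the manner of RESULT AND &0000FFFF:
print(hex(blogs & 0x0000FFFF))
```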
Program 4.5 illustrates the use of USR by adding two single-byte numbers in A% and X% respectively. The two numbers entered from the keyboard are placed in A% and X% by the BASIC lines 210 and 220 before calling the code with USR. When two single-byte numbers are added, the result may spill over to two bytes but never more. Therefore the two higher-order bytes of the four-byte result are so much garbage. Line 240 erases these bytes by use of the AND mask. There were no difficulties with the P register because it held one of the garbage bytes. However, many programs would require a data contribution from P. There are only two instructions, PHP and PLP, which act directly on the total contents of P. These can only push and pull to and from the stack. If we assume that the data to be placed in P initially rests in A, the following two lines of code illustrate a simple way out:
PHA \Push A to stack
PLP \pull A to P
Although the method looks simple, there is a hidden danger. It can turn out to be a hazardous business to interfere with the processor status, particularly if the program allows interrupts. Note that RTS does not itself restore the original status (only RTI pulls P back from the stack), so care may be needed to preserve P before using the above.
The respective merits of USR and CALL can be summarised as follows:
CALL:
USR:
The indirection operators form a grey area between high level language and machine code. They are similar to PEEK and POKE operations in traditional BASIC but are more versatile. However, this versatility is obtained at the expense of user-friendliness. The symbolism and format used can feel awkward for users hooked on BASIC. As far as machine code is concerned, the indirection operators are a boon because of the ease with which byte data can be pushed around memory. The definitions and format are well described in the user guides so only a brief outline (for the sake of continuity) is justified.
There are three operators and all may be taken to mean 'the contents of ...'. For example, ?&0D00 means 'the contents of address &0D00'. A 'word' is taken to mean four bytes at consecutive addresses.
Byte indirection (?)
Word indirection (!)
String indirection ($)
All three operators must be placed before the address to which they refer. Some examples follow:
Byte indirection
(a) ?X=&23 (b) ?&0D00=5 (c) ?current=&456 (d) ?208=46 (e) PRINT ?X (f) PRINT ?&0C00
Note in (c) above that only the lower-order byte (&56) is stored at the address held in 'current'; the 4 is dropped because there is no room for it in a single byte.
Word indirection
One example is sufficient:
!&0D00=&23456789
&89 goes into &0D00, &67 into &0D01, &45 into &0D02 and &23 into &0D03, that is, the least significant byte is stored first.
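The byte and word examples can be checked against a simple model of memory. The Python sketch below is illustrative only; the byte array and the function names are inventions standing in for the ? and ! operators.

```python
# Hypothetical flat model of the 64K address space.
memory = bytearray(0x10000)


def poke_byte(address, value):
    """Model ?address=value: only the low-order byte is kept."""
    memory[address] = value & 0xFF


def poke_word(address, value):
    """Model !address=value: four bytes, least significant first."""
    memory[address:address + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")


def peek_word(address):
    """Model PRINT !address."""
    return int.from_bytes(memory[address:address + 4], "little")


poke_byte(0x0070, 0x456)                 # the 4 is dropped, &56 is stored
poke_word(0x0D00, 0x23456789)
print(hex(memory[0x0070]))
print([hex(b) for b in memory[0x0D00:0x0D04]])
```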
String indirection
$&0D00="ABC" is an example worth examining in detail. The ASCII code for A (65) will be poked into &0D00, the ASCII code for B (66) into &0D01 and the ASCII code for C (67) into &0D02. Note that the dollar sign comes first to distinguish it from a normal string variable.
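The string case can be modelled in the same way. In the Python sketch below the function name is invented. One detail is added that is not stated in the text above and should be treated as an assumption to check against the user guide: BBC BASIC is understood to place a carriage return (13) after the last character to mark the end of the string.

```python
def poke_string(memory, address, text):
    """Model $address="text": one ASCII code per byte, then a terminating CR."""
    data = text.encode("ascii") + b"\r"   # CR terminator is an assumed detail
    memory[address:address + len(data)] = data
    return len(data)                      # bytes written, terminator included


memory = bytearray(0x10000)
poke_string(memory, 0x0D00, "ABC")
print(list(memory[0x0D00:0x0D04]))
```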
If the machine code is written within a BASIC program enclosed in the usual square brackets, it can be saved or loaded in the normal way (SAVE"name" or LOAD"name"). However, once the machine code has been assembled (and you know the address of the first byte), it can be saved and loaded separately from the BASIC parts. The formats are as follows:
To save machine code:
*SAVE"name" start-address end-address+1 (addresses will be assumed hex). For example, if a machine code subroutine called "sort" is located between &0D00 and &0D20 inclusive, it can be saved by:
*SAVE"sort" 0D00 0D21
There is an alternative format:
*SAVE"name" start-address +number-of-bytes
The above example would then read:
*SAVE"sort" 0D00 +21
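Since both forms take hexadecimal figures, it is easy to slip by one byte when working them out. The small Python helper below (a sketch with an invented name, not part of any system software) derives both command strings from the first and last addresses occupied by the code.

```python
def save_commands(name, first, last):
    """Return both *SAVE forms for code occupying first..last inclusive."""
    end_plus_one = last + 1        # first form wants the address after the last byte
    length = last - first + 1      # second form wants the byte count
    return (
        f'*SAVE"{name}" {first:04X} {end_plus_one:04X}',
        f'*SAVE"{name}" {first:04X} +{length:X}',
    )


for command in save_commands("sort", 0x0D00, 0x0D20):
    print(command)
```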
To load machine code:
The format is:
*LOAD"name"
The code will be loaded into the same address band as when saved. An alternative format is:
*LOAD"name" first-address
This is used (rarely) if, for some reason, the code must be loaded at an address different from the one used when it was saved. As with *SAVE, the address is assumed to be hex.
To run machine code:
The majority of machine code is likely to be called from within an outline BASIC program. If, however, the code is self-supporting it can be run by using:
*RUN"name"