
Chapter Four
Handling the Resident Assembler

The assembler format

An assembler is essentially a piece of software which aids the writing of machine code. Most personal computers offer only the crudest facilities for using machine code. Some only have a 'machine code monitor'. Others are even less equipped and the only way to enter a machine code program is by means of a boring and error-prone series of POKEs. It is not surprising that few owners of such machines develop a strong attraction for machine code. Of course, assemblers for most popular microcomputers are available on tape or disk but it is a sad fact of life that external software which has to be loaded, particularly from tape, is often too much trouble. Initially, it may be used with enthusiasm but the inevitable 'tape-inertia' syndrome eventually relegates the tape to its coffin.
Assemblers vary in sophistication and the facilities offered for debugging. It is unusual for a personal computer to be equipped with a resident assembler. No doubt manufacturers of machines for this market have previously assumed that few buyers, other than the completely dedicated, would be interested in any programming language other than BASIC. Acorn, conscious that interest in machine code would grow, included a resident assembler in the Atom, a practice which they have continued in the BBC machine.
The assembler used is unique because it is 'wedged' inside the BASIC chip. The position of the assembler inside the language system ensures easy transition between BASIC and machine coding. Machine code splices within a BASIC program are recognised by the large square brackets [ and ]. (These appear on the screen as left- and right-pointing arrows in Mode 7 on the BBC machine.) Although the assembler lacks many of the refinements found in traditional mainframe-oriented software, it is perfectly adequate and quite easy to use. In fact, some features of its design could be considered superior to classical assemblers. Although many readers will already be familiar with the facilities offered, it is necessary, for the sake of continuity, to devote a little space to the following overview.

Mnemonic op codes

Bearing in mind the function of an assembler defined at the beginning of this chapter, the foremost requirement of any assembler is the substitution of meaningful letter groups for the instruction op-codes. Thus, to transfer X to A, the hex machine code is &8A (see Appendix C1). The assembler allows us to write TXA instead. All mnemonic op-codes are three-letter groups.

Numerical operands

Numerical values in operands are assumed to be decimal. If a number is to be interpreted as hex, it must be prefixed with the & character.
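
As a quick check on the two notations, the following sketch assembles the same instruction twice, once with a decimal operand and once with hex. The address &0D00 is an arbitrary choice for illustration and assumes that page &0D is free on your machine; the role of P% and the square brackets is covered later in this chapter.

```bbcbasic
10 REM Decimal and hex operands compared
20 P%=&0D00
30 [
40 LDA #99  \decimal operand
50 LDA #&63 \the same value written in hex
60 ]
```

Both instructions should appear in the assembly listing with the same machine code, A9 63, since decimal 99 and hex 63 are the same number.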

Operand variables

The most remarkable and useful property of the assembler is the latitude allowed with variables. The following deserves emphasis:

Any legitimate BASIC variable or expression can be used in the operand.

Examples:

100 LDA LOCK

110 LDA LOCK+1

120 LDA #ASC("Y")

130 LDA #BASE DIV 256

140 LDA #BASE MOD 256

150 LDA #SQR(4)

These are all expressions which the assembler will accept although the following common sense provisos apply:

  1. The operand expression must be capable of intelligent decoding. That is to say, the resultant must be an address or data acceptable as an operand.
  2. Registers (A, in the above examples) can only hold one byte so it will be up to the programmer to ensure that the data, represented by the variables, is within this limit.
  3. It is essential that the variables be pre-defined in the BASIC program area. For example, we cannot write A=30 inside an area enclosed by the square brackets [....].
  4. Although BASIC expressions can be used, BASIC commands are most definitely illegal. The assembler would ruthlessly reject lines like PRINT A or DEF PROCsort. BASIC commands or statements must be confined to the BASIC area.
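
The DIV and MOD examples above are the usual way of splitting a two-byte address into single bytes that a register can hold. A minimal sketch in plain BASIC, using a made-up address for BASE, shows the values the assembler would receive:

```bbcbasic
10 REM Splitting a 16-bit address into two bytes
20 BASE=&0D34 :REM hypothetical address, for illustration only
30 PRINT ~(BASE MOD 256) :REM low byte, prints 34
40 PRINT ~(BASE DIV 256) :REM high byte, prints D
```

Each result fits in one byte, which satisfies the second proviso.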

BASIC variables and expressions can also be used in jump operands such as:

JMP START+7

This will cause a jump to the address found by adding 7 to the contents of the variable START.

Branch labels

The previous chapter, dealing with branch instructions and relative addressing, stressed the difficulties associated with counting the number of bytes forward or backwards to reach the desired branch destination. The assembler, to a large extent, offers relief by allowing the use of labels. Thus the assembler disguises the inherent relative addressing and substitutes a straightforward 'branch to label'. For example, we can write:

BNE Sort

Here 'Sort' is the branch operand, referring to a label appearing somewhere in the program.
The label is recognised as such by the assembler because it must begin with a full-stop as shown in the following example:

CLC

.BACK LDA Number1

ADC Number2

BEQ Finish

ROR A

BNE BACK

.Finish RTS

The coding obviously has no practical value so it would be pointless to key in. Note the full-stop before the two labels and note also that there must be no full-stop when the label is an operand directive. The layout is valid but considerable latitude is allowed. For example, it could be re-written as follows:

CLC

.BACK

LDA Number1

ADC Number2

BEQ Finish

ROR A

BNE BACK

.Finish RTS

Notice that the label can stand alone or be on the same line as the instruction, providing there is at least one space separation. The label, although existing within the assembled coding, can also be considered to be a normal BASIC variable. It is, after all, an address and therefore a simple numeric. We could, for instance, discover this address (when out of the assembler and back in BASIC) by writing PRINT BACK or PRINT Finish. A problem arises, which we shall deal with later, when assembling coding in which branches are made to forward positions.
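
Since a label is just a numeric variable holding an address, it can be inspected from BASIC once assembly is finished. The sketch below assumes a small buffer reserved with DIM (the name code% and its size are arbitrary) and uses only a backward-defined label, so a single pass is enough. P%, DIM and OPT are explained later in this chapter; OPT 2 merely suppresses the listing.

```bbcbasic
10 REM A label is an ordinary BASIC variable
20 DIM code% 20 :REM arbitrary buffer for the routine
30 P%=code%
40 [OPT 2
50 .back LDA #0
60 RTS
70 ]
80 PRINT ~code%,~back
```

The two hex values printed should be identical because .back labels the very first byte assembled.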

Remarks

Remarks, which programmers feel are necessary to explain coding, must be distinguished by either the semicolon or the back slash (\). For example:

LDA (pointer),Y \Indirect loading

We shall use the back slash because it seems a more meaningful symbol and also because the semicolon is strongly associated with the PRINT statement in BASIC. Remember that the back slash in the BBC machine Mode 7 appears on the screen as a funny '½' character.

Multi-statements

As in BASIC, more than one assembly instruction can be placed on one line by using the full colon as a separator. For example:

[

CLC:ADC Number:ADC #4:RTS

]

It must be appreciated that the square brackets, which enclose assembly coding, have the status of an instruction, even though they are categorised as 'pseudo instructions'.
Writing many statements per line is popular in BASIC because programs appear shorter in length and fewer bytes are squandered in storing line numbers. In assembly code, although the length of the program still appears less, there is no real saving because the action of assembly produces the same final object program. Another disadvantage of multi-statements per line is the decrease in readability. A program written in assembly code is not exactly an easy thing to decipher (even for the writer!) and cramming a lot on one line increases the confusion, besides leaving less room for remarks.

Storing assembly code

When writing a BASIC program, we have no worries about where it will be stored in memory. It is left entirely to the operating system, which stores it in accordance with built-in rules. However, when a piece of assembly code is written, the assembler has no such built-in authority. It is up to the programmer to provide guidance on the desired starting location of the program in memory. Naturally, the guidance given must steer clear of the RAM space already earmarked by the operating system. There are several reasonably free areas in the BBC machine, besides space made artificially by shifting the BASIC text and variable areas around. Before discussing the details of free spaces, it is important to stress the vital importance of one of the special resident integer variables, P%. As readers are already aware, the complete set of these twenty-seven special variables is labelled @%,A% ... Z%. They all occupy fixed locations and can be picked up at any time by the operating system or assembler. Returning to P%,

The contents of P% are taken as the address of the next item of machine code.

Thus, to locate a program anywhere in memory, it is simply a case of loading the starting address of the first byte into P%. After each byte is stored, P% is automatically incremented by one in readiness for the next byte. In fact, it is convenient to relate P% to the microprocessor program counter.
The setting of P% must be carried out in the BASIC part of the composite program. For example:

10 P%=&0D00

20 [

30 \

40 \

50 \Assembly coding

60 \

70 \

80 ]

90 REM rest of BASIC

This would store the first byte of assembly code (wedged between BASIC) in the hex address 0D00. It is possible to use variable names when loading P% so we could have written the top bit of the previous example as:

10 START%=&0D00

20 P%=START%

Instead of committing P% to a fixed machine address, it is possible to delegate some of the responsibility to the operating system. Page 237 of the BBC User Guide tells us that the Dimension statement can be twisted a little in order to accommodate byte arrays. For example:

DIM START% 99

P%=START%

This will reserve 100 bytes for assembly code, the first byte located at the address held in START% (START% is, of course, an arbitrary name). It is important to recognise the unusual character of this DIM statement. There are indeed two variations from the normal BASIC statement for dimensioning arrays. The first point to notice is the absence of the brackets around the 99, although the space before 99 is mandatory. The second point, less vital but still useful to know, is that START% should not be thought of as the 'name of the array'. It is merely the labelled address of where the first byte is stored. The storage area, like that of the normal DIM statement, is 'dynamic'. That is to say, it moves up or down depending on the number of lines in the BASIC area. However, the actual machine address of the first byte in the example above can always be ascertained by a command such as PRINT START%. The snag in using the DIM statement is the fact that you have to make an intelligent guess as to the number of bytes in your coding. The best way is to make a preliminary estimate and then add, say, twenty more for luck because it doesn't matter if you over-estimate. However, if you are a stickler for having things dead right, it is easy to count the bytes after the final debugging and alter the DIM number accordingly.
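
One way to take the guesswork out of the DIM figure is to let P% do the counting, since after assembly it is left pointing at the byte following the last one stored. A minimal sketch, using the same START% name and a throwaway three-instruction routine (OPT 2, described later, merely suppresses the listing):

```bbcbasic
10 REM Counting assembled bytes with P%
20 DIM START% 99
30 P%=START%
40 [OPT 2
50 LDA #99
60 STA &70
70 RTS
80 ]
90 PRINT "Bytes used: ";P%-START%
```

Here the figure printed should be 5 (two bytes for LDA #99, two for the zero-page STA and one for RTS), so DIM START% 4 would be just enough for this particular routine.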
Other ways of finding space for assembly code are by the use of the pseudo variables TOP and PAGE. Space (for example 256 bytes) can be found below the BASIC program by using PAGE in the following manner:

100 PAGE=PAGE+256

110 P%=PAGE-256

The first line pushes PAGE upwards, and therefore the start of the BASIC program. This will reserve 256 bytes for the assembly coding which, as line 110 suggests, starts at the old PAGE position. If we wish to reserve some of the assembly area for, say, 20 data bytes, line 110 can be written as:

110 P%=PAGE-236

This leaves 20 bytes for data, the first item being at the old PAGE position (now PAGE-256) and the last at PAGE-237. The assembly program will follow, beginning at PAGE-236, that is, 20 bytes above the old PAGE.
If we use LOMEM, the program and/or data can be positioned between TOP (the top of the BASIC program) and LOMEM (where BASIC stores its variables). (See the memory map on page 501 of the BBC User Guide.) For example:

100 LOMEM=LOMEM+256

110 P%=TOP

This will reserve 256 bytes for assembly coding, the first byte being in TOP. The User Guide warns us not to attempt to alter LOMEM in the middle of a program or the interpreter will lose track of the variables.

Fixed locations which are potentially free

There is nothing wrong with storing assembly programs and data 'dynamically' as described above. However, storing in a fixed location has, at least, a psychological advantage. You always know exactly where the program is stored and this can be comforting. The trouble with fixed storage in the BBC Micro is its scarcity.
The BBC machine has plenty of fixed free space providing that some of the normal and expansion facilities are sacrificed. For example, page &0D is perfectly safe and is actually described in the User Guide as space for 'user-supplied resident routines'. Unfortunately, this is subject to the proviso that no disk interface is used. It is also possible to use page &0C if we are prepared to sacrifice user-defined character definitions. Page &0B is yet another page available but this time we will lose the facility of programming those delightful red function keys. Use of any of these pages will therefore depend on individual needs, but it is useful to know that three contiguous pages are potentially free, representing a total memory range &0B00 to &0DFF. This is a hefty 768 bytes which, in machine code terms, is capable of supporting a complex program.
However, for those with disk or other expansion options it is advisable to steer clear of fixed locations altogether. It is safer to use the DIM statement. Any subsequent programs given which use fixed locations (to simplify explanations) can always be modified to DIM methods.

Operating the assembler code

To execute a BASIC program, we use the keyword RUN. To execute a machine code program we use the keyword CALL. Both RUN and CALL are BASIC keywords, a fact which serves to emphasise the close relationship between the assembler and the BASIC interpreter. The simple word CALL is very powerful because it combines the role of parameter-passing with machine code execution. Before delving into great detail, it is useful to study the following sequence of events starting with the source code and ending with the final execution:

1. A source code listing

10 P%=&0D00

20 [

30 LDA #99

40 STA &0DFF

50 ]

60 PRINT:PRINT"The contents of &0DFF is ";?&0DFF

Line 10 positions the code in page &0D, a quite arbitrary decision. Lines 30 and 40 are the assembly code which loads the accumulator with decimal 99 then stores it in &0DFF. Line 60 is BASIC and prints out the contents of this address, using the byte indirection operator.

2. Running the program

When we type RUN, the result is:

0D00

0D00 A9 63 LDA #99

0D02 8D FF 0D STA &0DFF

The contents of &0DFF is 13
This is the work of the assembler and indeed is called an assembly listing. It has produced the correct machine code from our mnemonic source code. The first column is the hex address of the first byte on that line. The second column is the hex machine code consisting of the op-code, followed by the operand. Notice that the two-byte operand is low-byte, high-byte form and appears back to front. The third column is our original source code.
Note carefully that we have RUN but the machine code doesn't appear to have worked because the printout is 13 instead of the expected 99. This is because the machine code has only been assembled; that is to say, it has only been converted from our source code to a pure machine code form. But we have not yet told the code to be executed! Thus, the 'answer' we got of 13 was merely garbage that happened to be in &0DFF. If you try it, you will have a different garbage number (unless the law of averages breaks down). So, we need another step.

3. Executing the machine code with CALL

CALL &0D00

Nothing visible happens but the machine code has been executed. To prove it, RUN the program again. This will produce exactly the same results as the previous RUN but with one important difference. The BASIC line at the bottom will now read:

The contents of &0DFF is 99

The steps shown above have deliberately been separated in order to emphasise the difference between the assembly and the call processes. In practice, the CALL would normally be included under a line number in the program rather than activated by a separate command. For example, the listing above could be written:

10 P%=&0D00

20 [

30 LDA #99

40 STA &0DFF

45 RTS

50 ]

55 CALL &0D00

60 PRINT:PRINT"The contents of &0DFF is ";?&0DFF

70 END

Apart from the extra CALL line, notice that RTS has been squeezed in. This should be considered the normal way to terminate machine code routines in order to ensure smooth control back to BASIC after the code has been executed. Without RTS, an error message from the assembler might (probably will) appear. Now when we type RUN, the program will automatically assemble the code, line 55 will execute it and RTS returns control back to BASIC at line 60.

Controlling the assembler output

The pseudo operation OPT controls the activities of the assembler. It is fully described in the User Guide but is repeated (with less detail) here.

The format is OPT pass, where pass is a variable which can be 0, 1, 2 or 3.

If pass=0, the assembly listing is suppressed and no errors are reported.
If pass=1, only assembly errors are suppressed.
If pass=2, only the assembly listing is suppressed.
If pass=3, nothing is suppressed.

It is important to remember that OPT is not a BASIC keyword, consequently it is only legal within the square bracket area. If OPT is not used, it defaults to OPT 3. Since the previous examples have not used OPT, the assembly listing appeared and assembly errors might also have appeared. It is often inconvenient and purposeless to display these except during the development and debugging stage.

The forward branch problem and 2-pass assembly

If a branch instruction directs control backwards to a labelled line, the assembler can cope immediately because it has picked up the address of the label on the way. However, if the branch directs to a forward address, the assembler is confused. To see why, imagine what happens in the following piece of code:

110 BNE exit
120 :
130 :
140 .exit

The main job of the assembler is to change mnemonic op-codes and operands into equivalent hex numbers and addresses. So what happens when it reaches line 110? It can easily look up its conversion table to find the hex code for BNE (it is D0). But it can't determine the hex address of the operand because it hasn't yet reached line 140. Its intelligence is just not equal to the situation, so it gives up.
The BBC assembler is not peculiar in this respect. It is a common problem in all but the most sophisticated versions. The standard way out is to give the assembler two goes at it, a trick known as two-pass assembly. The first pass collects all the labels and addresses and the second pass uses them to produce the final assembly.
It would be boring and error-prone if the operator always had to assemble programs twice. Fortunately, the FOR/NEXT loop in BASIC can be left to do the donkey-work. The following piece of code includes a simple forward branch and illustrates the two-pass assembly technique:

100 FOR pass=0 TO 3 STEP 3

110 P%=&0D00

120 [

130 OPT pass

140 LDA #3

150 CMP #3

160 BEQ Finish

170 NOP

180 .Finish RTS

190 ]

200 NEXT pass

210 CALL &0D00

220 END

The code itself is purposeless and barely worth explaining. It is easy to see that the forward branch, (the object of the example), will always take place. Line 170 could have been any useless instruction so NOP is exceptionally well-qualified. Note the following points:

  1. The FOR loop ensures that OPT 0 applies during the first pass. We don't want a listing and we certainly don't want the inevitable error 'unknown label' to appear.
  2. During the second pass, OPT 3 applies so the assembly listing appears and any errors (there shouldn't be any now) are reported.
  3. The assigning of P% is inside the FOR loop. This ensures that the second pass starts again on the same piece of code. If P% was assigned before the FOR statement, the second pass would try and assemble code following on from the first pass with unpleasant results.
  4. The NEXT statement which closes the FOR loop must be outside the square brackets.
  5. If assembly code is intermixed with BASIC lines, it is necessary to include an appropriate OPT each time, otherwise it would default to OPT 3.
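
Once a program is debugged, the listing on the second pass is usually unwanted. A common variation, sketched here on the same purposeless code, runs the passes with OPT 0 and OPT 2 so that errors are still reported on the second pass but nothing is listed:

```bbcbasic
100 REM Two-pass assembly without a listing
110 FOR pass=0 TO 2 STEP 2
120 P%=&0D00
130 [
140 OPT pass
150 LDA #3
160 CMP #3
170 BEQ Finish
180 NOP
190 .Finish RTS
200 ]
210 NEXT pass
220 CALL &0D00
230 END
```

While a program is still being developed, changing line 110 back to FOR pass=0 TO 3 STEP 3 restores the listing.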

Using variable names

The facility to predefine memory locations with BASIC variables and use them in assembly code should be exploited to the full. Any dodge which makes assembly code look less formidable is worthwhile, even if it does squander a few BASIC lines. For example:

Number1=&70
Number2=&71

Having defined these locations, the variable names can be used in assembly operands, rather than absolute hex addresses. The following example, which adds two numbers (limited to single byte), illustrates some of the interchanges possible between BASIC and assembler:

10 DIM START% 40

20 CLS

30 INPUT"Enter first number "Number1

40 INPUT"Enter second number "Number2

50 Result=&70

60 P%=START%

70 [

80 CLC

90 LDA #Number1

100 ADC #Number2

110 STA Result

120 RTS

130 ]

140 CALL START%

150 PRINT ?Result

Just for a change, the program is stored by courtesy of DIM, instead of using fixed free space. It is easy to get mixed up with the address of data and the data itself. For example, it may not be obvious why lines 90 and 100 must have the hash mark denoting immediate addressing. Note also that line 150 prints the contents of 'Result'. If 'Result' was printed, it would be an address rather than data in that address.
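
The distinction between an address and the data held there can be tried out directly in BASIC. This small sketch uses the same Result location, &70, and an arbitrary test value:

```bbcbasic
10 REM Address versus contents
20 Result=&70
30 ?Result=99 :REM put 99 into location &70
40 PRINT Result :REM the address: prints 112
50 PRINT ?Result :REM the contents: prints 99
```

The first figure is simply &70 in decimal; only the second is the data.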

Macros

A macro is short for macro-instruction, a facility offered in some assemblers, whereby a block of instructions performing some task can be defined by name, and later treated as if it were a single instruction. Superficially, this description resembles an ordinary subroutine so it is important to compare them with a view to spotting the differences.
A subroutine is written once, stored in some fixed location and called up by a special jump (JSR). A macro, on the other hand, is assembled in machine code each time it is used in the body of the program. For example, suppose the following is a macro in a fictitious (and using an equally fictitious format) machine:

Macro TDX

DEX

DEX

DEX

End Macro

The macro is first defined and given some arbitrary name (TDX in the example). The macro is then written (three consecutive decrements in the example). The macro is then terminated. From now on, anytime we use TDX, the assembler will include the three instructions in the coding. Note that there is no 'jump' action. The macro is inserted in line every time. Because there is no time wasted in calling and returning from a subroutine, a macro has a higher execution speed.

10 REM Procedures used as macros

20 DIM START% 40

30 P%=START%

40 [:LDA #1:]

50 PROCshiftleft3

60 [:STA &70:]

70 PROCshiftleft3

80 [:STA &71

90 RTS

100 ]

110 CALL START%

120 PRINT?&70

130 PRINT?&71

140 END

150 DEF PROCshiftleft3

160 [:ASL A:ASL A:ASL A:]

170 ENDPROC

>RUN

19EF

19EF A9 01 LDA #1

19F1

19F1 0A ASL A

19F2 0A ASL A

19F3 0A ASL A

19F4

19F4 85 70 STA &70

19F6

19F6 0A ASL A

19F7 0A ASL A

19F8 0A ASL A

19F9

19F9 85 71 STA &71

19FB 60 RTS

8

64

Program 4.1. Procedures used as macros.

Now for the crunch. There is no macro facility in the BBC machine, at least not directly. However, it is possible to wangle it by exploiting BBC BASIC's most powerful asset, the Procedure. Instead of naming a macro, we name a procedure. Defining the macro is replaced by DEF PROCname. Naturally, we must define the procedure in BASIC and use it in BASIC, but the assembler is undaunted providing the square brackets are used correctly.
To illustrate, Program 4.1 loads the accumulator with 1 and then uses a simple 'macro' to produce three shift lefts on the accumulator. The macro is used twice, so the accumulator is shifted 6 times. After RUN, the assembly listing appears. You will notice that the three ASL instructions are indeed assembled in-line each time the 'macro' is used. This example should emphasise the difference between macros and subroutines. We have stated that a macro is faster and yet, from the assembly listing, it is not too obvious why. In fact, the time it takes to assemble macros may well be longer than assembling the equivalent subroutine. Once assembled (which only has to be done once), however, the execution is faster with macros because no time is wasted in jumping and returning (JSR takes 6 clock cycles and so does RTS). Against this, however, it should be realised that macros use up memory for the assembly code each time they are called.
The main hazard, when using procedures to simulate macros, is failing to observe the rules of the square brackets when dodging in and out of BASIC. The brackets can be round the wrong way or in the wrong places. Program 4.1 above uses a standardised format with the brackets enclosing the procedure call and on the same line.
As with normal procedure calls and definitions, it is possible to pass formal parameters into the assembly enclosure. For example, we can define a procedure as follows:

500 DEF PROCsubtract(number1,number2)

510 [

520 SEC

530 LDA number1

540 SBC number2

550 ]

560 ENDPROC

Once such a procedure is defined, it can be used to subtract, say, contents of Tax from Gross by using:

PROCsubtract(Gross,Tax)

Naturally, such a simple procedure can only handle a single byte subtraction but this is irrelevant; it is the principle that matters.
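
To see the parameterised 'macro' in action, it can be wrapped in a complete program. The following is only a sketch: the zero-page locations chosen for Gross, Tax and Net, the test values and the DIM size are all assumptions made for illustration, and OPT 2 has been added to each bracketed section to suppress the listing.

```bbcbasic
10 REM Using PROCsubtract as a macro
20 DIM START% 20
30 Gross=&70:Tax=&71:Net=&72 :REM hypothetical locations
40 ?Gross=50:?Tax=20 :REM arbitrary test data
50 P%=START%
60 PROCsubtract(Gross,Tax)
70 [OPT 2:STA Net:RTS:]
80 CALL START%
90 PRINT ?Net
100 END
500 DEF PROCsubtract(number1,number2)
510 [OPT 2
520 SEC
530 LDA number1
540 SBC number2
550 ]
560 ENDPROC
```

With the test data shown, the figure printed should be 30. Note that P% carries on from the end of the macro into line 70, so the STA and RTS follow the subtraction directly in memory.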

Conditional assembly

This technique, like macros, is commonplace in professional mainframe assemblers. It simply means that certain changes can be made in assembly code without having to reassemble each time. Again like macros, it is not directly available on the BBC machine but can be simulated by a mixture of BASIC and machine code. As a simple example, it may be that during program development we would like to try the effect of X=4 or X=40 in the same program. A simple IF/THEN/ELSE spliced in the right place would do the job nicely:

IF Speed=slow THEN [ : LDX #4: ] ELSE [ : LDX #40: ]
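
The line above can be tried in a complete program along the following lines. This is a sketch only: the values given to slow and fast, the DIM size and the use of &70 to hold the result are all assumptions for illustration, and OPT 2 is included in each bracketed section to suppress the listing.

```bbcbasic
10 REM Conditional assembly with IF/THEN/ELSE
20 slow=0:fast=1 :REM hypothetical speed codes
30 Speed=slow
40 DIM START% 10
50 P%=START%
60 IF Speed=slow THEN [OPT 2:LDX #4:] ELSE [OPT 2:LDX #40:]
70 [OPT 2:STX &70:RTS:]
80 CALL START%
90 PRINT ?&70
```

With Speed set to slow, the program should print 4; changing line 30 to Speed=fast and running again should give 40. Only one of the two LDX instructions is ever assembled into the object code.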

Passing parameters via registers

There are various ways of passing parameters from the main program (which could be BASIC) to a machine code subroutine. The simplest, but not always the most advisable, is via registers. The three most important resident subroutines, OSRDCH, OSWRCH and OSBYTE (discussed in a later chapter) all use registers for parameter passing. OSRDCH and OSWRCH use the accumulator; OSBYTE uses the accumulator and the X and Y index registers.
Apart from the enforced use in resident subroutines, registers are not the ideal medium for passing parameters. The 6502 is not generous as far as registers are concerned, and they will often already be holding variable data although, of course, the stack can be used as a temporary store while the registers are being used. Fortunately, alternatives are available in the shape of the JSR and CALL statements.

Passing parameters with CALL

The CALL statement is far more powerful than has been indicated earlier in the chapter. For example, we can use:

CALL NAME, G%, A$, Blogs

This shows that, in addition to actually calling (executing) machine code, it is possible to pass a variety of mixed parameters (integer variables, string variables and floating point variables and even single byte numbers) to the assembly code. Essential information on the parameters is passed to a special memory block located in the BBC machine at &0600 onwards. This block contains the following information:

&0600 number of parameters passed in CALL statement
&0601 low byte address of first parameter
&0602 high byte address of first parameter
&0603 code for parameter type
&0604 low byte address of second parameter
&0605 high byte address of second parameter
&0606 code for parameter type
&0607 low byte address of third parameter
&0608 high byte address of third parameter
&0609 code for parameter type

This sequence is repeated for any further parameters. The parameter code which has been referred to above is as follows:

0 8-bit byte (example ?Z)
4 32-bit integer variable (example Volts%)
5 5-byte floating-point number (example Blogs)
&81 string variable (example Name$)
&80 string at a defined address (example $Name)

The reference above to simple variables also applies to array variables, for example A9%(3), C(3) or B$(5).
The parameter block format always begins with the number of variables attached to the CALL. Thereafter, three bytes of information are given for each variable. If, for example, there are five variables in all, there will be 1+ (3×5)=16 bytes of information, starting at &0600 and ending at &060F. It is worth stressing that the information given is not the data itself but the address to which the data has been transferred. These addresses are not constant and only the operating system will be aware of them. When such a situation arises (in which only addresses are given) the data can be easily obtained by the use of indirect addressing. All that needs to be done is to treat the address information as pointers which can be passed to page-zero for use by indirect address action. It is possible to avoid indirect addressing to obtain this data by a series of LDAs and STAs but it is messy and inefficient.
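
The arithmetic for the size of the parameter block can be checked with a few lines of BASIC. The variable names here are arbitrary:

```bbcbasic
10 REM Size of the CALL parameter block
20 N%=5 :REM number of parameters in the CALL
30 size%=1+3*N%
40 PRINT size% :REM prints 16
50 PRINT ~(&0600+size%-1) :REM last address used, prints 60F
```

Changing N% shows how far the block extends for any other number of parameters.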
It is easy to be confused by all this so it is essential to attack the subject in gentle steps. We begin by writing a few lines of code just to test that the CALL statement is indeed operating as described above. This code is shown in Program 4.2.
To keep the first example simple, only one parameter is passed in line 20. It is spread out in hex in order to appreciate the result more readily. Since we are demonstrating CALL, there must be some bit of machine code to call, so to maintain simplicity it is sufficient to use a NOP (we are not, at this stage, interested in the particular code). Line 90 is in BASIC and calls up the code, passing G% to the system. Lines 110 to 130 print out (in hex) the contents of the parameter block. On first running the program, it stops at line 140 and displays the assembly code and the contents of the parameter block which have the following significance:

1 number of parameters passed (just G%)
1C low-byte address of where the first byte of G% is stored
4 high-byte address
4 parameter code (4=four-byte integer)

10 REM Passing variables via CALL

20 G%=&01234567

30 P%=&0D00

40 START=P%

50 [

60 NOP

70 RTS

80 ]

90 CALL START,G%

100 PRINT

110 FOR Block=&0600 TO &0603

120 PRINT~?Block

130 NEXT

140 END

150 Pointer=&041C

160 FOR Data=Pointer TO Pointer+3

170 PRINT ~Data,~?Data

180 NEXT

>RUN

0D00

0D00 EA NOP

0D01 60 RTS

1

1C

4

4

>GOTO150

41C 67

41D 45

41E 23

41F 1

Program 4.2. Passing variables via the CALL statement.

Thus, we are now in the position of knowing that our data has been stored in address &041C. You may see from this why it was necessary to stop the program at line 140. The preliminary RUN gave us the address information. A GOTO 150 then executes the bottom program which displays the contents of the four addresses &041C to &041F. The original G% has therefore been successfully recovered although in the conventional reverse order (low-byte first). It would have been possible to use the indirection operator (!) instead of the FOR/NEXT loops but the display would have been packed horizontally instead of being in more readable, vertically separated steps.
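
For comparison, the word indirection operator mentioned above reads all four bytes in one go. This sketch assumes, as Program 4.2 found, that G% is stored at &041C:

```bbcbasic
10 REM Reading all four bytes of G% at once
20 G%=&01234567
30 PRINT ~!&041C :REM prints 1234567
```

The operator reassembles the bytes into their natural order, so the low-byte-first storage is hidden from view (the leading zero is simply not printed).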
The next program (Program 4.3) goes a step further by showing indirect address pointers picking up parameter data from the CALL statement. The idea is to pass the address given in the parameter block at &0601, &0602 to the page zero addresses &70, &71 where they will act as the address pointer. The program is best understood by starting at the three lines of BASIC at the bottom (160 to 180 inclusive). The objective, again deliberately kept simple, is to input an integer variable (Volts%), pass it through the CALL procedure and print it out again, merely to prove the points described above. At the top of the program three preliminary assignments are made but you should particularly note line 20. It is a convention that the label, assigned to the lowest byte, is used to refer to the whole number, irrespective of the number of bytes. Lines 70 to 100 transfer the address information (where Volts% is stored) into the two zero-page addresses. Notice the economy of using POINTER and POINTER+1 for the low- and high-byte respectively. This is why only the low-byte pointer was assigned in line 20. This little dodge is useful and will appear frequently in subsequent examples.

10 REM Passing variables via CALL (2)

20 POINTER=&70

30 RESULT=&80

40 START=&0C00

50 P%=START

60 [

70 LDA &0601 \Store LB/HB

80 STA POINTER \Address of volts

90 LDA &0602 \in zero page

100 STA POINTER+1

110 LDY #0 \Clear Y (no indexing)

120 LDA (POINTER),Y \indirect address

130 STA RESULT

140 RTS

150 ]

160 INPUT"ENTER INTEGER "Volts%

170 CALL START,Volts%

180 PRINT"THE INTEGER PASSED WAS "?RESULT

Program 4.3. Passing variables and use of Indirect address pointers.

Line 120 is the most important of all because it illustrates the beauty of indirect addressing. Index register Y is first cleared because it has no meaning in this context. All we want is simple indirect addressing without indexing. Although indirect indexed addressing is used, we could have substituted indexed indirect addressing, provided that the X register, instead of the Y register, was cleared initially. Because only one byte is fetched and stored in RESULT, the value printed is the low byte of Volts%, so test values should be kept in the range 0 to 255.
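A sketch of that substitution, assuming the rest of Program 4.3 is left unchanged and only lines 110 and 120 are replaced:

```
110 LDX #0 \Clear X so that no offset is added
120 LDA (POINTER,X) \indexed indirect address
```

With X at zero, both forms fetch the byte whose address is held in POINTER and POINTER+1.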
Proceeding a step further, Program 4.4 adds two single-byte numbers, both of them passed via the CALL statement. BASIC is used to input the two numbers into A% and B%, which are then passed to the machine code by means of CALL ADD,A%,B%. Since there are two variables, we reserve space for two address pointers in page zero. The two assignments for this (appropriately labelled 'FIRST' and 'SECOND') are made in line 30. Space is also reserved for RESULT in line 40. Lines 90 to 120 transfer the address of A%, held in &0601 and &0602, to &70 and &71. Lines 130 to 160 perform a similar task for B%, whose address is held in &0604 and &0605 (&0603 holds the type code of the first parameter). After clearing Y and the carry bit, lines 190 to 210 perform the addition, again using indirect addressing.

Program 4.4. Single-byte addition.

10 REM 8-bit integer addition

20 MODE 4

30 FIRST=&70:SECOND=&72

40 RESULT=&80

50 ADD=&0C00

60 FOR PASS=0 TO 3 STEP 3

70 P%=ADD

80 [OPT PASS

90 LDA &0601

100 STA FIRST

110 LDA &0602

120 STA FIRST+1

130 LDA &0604

140 STA SECOND

150 LDA &0605

160 STA SECOND+1

170 LDY #0

180 CLC

190 LDA (FIRST),Y \Add pos integers

200 ADC (SECOND),Y \indirect address

210 STA RESULT,Y \Indexed address

220 RTS:]

230 NEXT

240 PRINT

250 PRINT"Addition of two unsigned integers ":PRINT

260 INPUT"First unsigned integer ",A%

270 INPUT"Second unsigned integer ",B%

280 CALL ADD,A%,B%

290 PRINT"Addition= ";?RESULT

Although unnecessary (there are no forward branches in the assembly code), the two-pass assembly process has been included for the first time. It is a good habit to cultivate, even when unnecessary, in order to maintain consistency in program layout.
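The addresses &0601, &0602, &0604 and &0605 used in Program 4.4 follow from the layout of the parameter block: a count, then an address low byte, an address high byte and a type code for each parameter. The following sketch, which is not part of the original program, makes that layout visible from BASIC. It pokes a lone RTS (&60) into one reserved byte so that the CALL returns immediately, and it copies the block into variables before printing anything, on the cautious assumption that BASIC may reuse that area of memory as workspace. The variable names are arbitrary.

```
10 DIM code% 0
20 ?code%=&60:REM a lone RTS
30 A%=5:B%=7
40 CALL code%,A%,B%
50 N%=?&0600
60 F%=?&0601+256*?&0602:T%=?&0603
70 S%=?&0604+256*?&0605:U%=?&0606
80 PRINT N%,~F%,T%,~S%,U%
```

The first figure should be 2 (the number of parameters) and both type codes should be 4 (four-byte integers); the two hex values are the addresses at which A% and B% are stored.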

Passing variables by means of USR

Although CALL is ideal for passing parameters to the assembler, it lacks the facility for directly returning parameters back from the assembler. The USR function is sometimes a more convenient, although less versatile, option. The general format of the USR function is as follows:

Result=USR(calling address)
Examples: D%=USR(START), Blogs=USR(&0D00)

Unlike CALL, the parameters to be passed must first be assigned to the four resident integer variables A%, X%, Y% and C%. When USR is used, the information in these variables is transferred to the microprocessor registers of the same name, with C% going into the carry flag. The transfer is subject to the following provisos:

  1. Only the low order bytes of A%, X% and Y% can be passed to A, X and Y registers.
  2. Only the least significant bit of C% can be passed to the C flag in the processor status register.

10 REM 8bit INTEGER ADDITION

20 REM DEMONSTRATING USR

30 MODE4

40 STORE=&70

50 ADD=&0C00

60 FOR PASS=0 TO 3 STEP 3

70 P%=ADD

80 [OPT PASS

90 STX STORE \STORE X REG

100 LDX #0 \SET HIGH BYTE TO 0

110 CLC

120 ADC STORE

130 BCC FINISH \INCREMENT RESULT

140 INX \HIGH BYTE IF CARRY

150 .FINISH \IS SET

160 RTS:]

170 NEXT PASS

180 PRINT

190 PRINT"ADDITION OF TWO UNSIGNED 8bit INTEGERS"

200 PRINT

210 INPUT"FIRST UNSIGNED 8bit INTEGER ",A%

220 INPUT"SECOND UNSIGNED 8bit INTEGER ",X%

230 RESULT=USR(&0C00)

240 RESULT=RESULT AND &0000FFFF

250 REM MASK OUT UNWANTED BITS

260 PRINT"ADDITION= ";RESULT

Program 4.5. 8-bit integer addition called by USR

After exit from the assembly code, a four-byte integer is returned to the result variable supplied by the user. This integer is the composite contents of the four registers P, Y, X, A, in that order. For example, suppose we used:

Blogs=USR(START)

Suppose also that the assembly code, on exit, left the registers with the following contents:

A=&FC, X=&67, Y=&FF and P (the processor status register)=&03

On exit, the variable Blogs would contain &03FF67FC which, you will do well to note, is in the reverse order. The fact that the contents of the P register form the most significant byte of the result is, in some respects, unfortunate. This register is normally dedicated to flag bits and some degree of fiddling is necessary if it is ever called upon (indirectly) to contribute meaningful numerical information to the result parameter.
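One way to pick the four register values out of such a result, sketched here with the figures from the example above rather than a real USR call, is to store the word in memory and read it back a byte at a time. Blogs% and the scratch addresses &70 to &73 are choices made for this illustration only.

```
10 Blogs%=&03FF67FC
20 !&70=Blogs%
30 PRINT "A=&";~?&70
40 PRINT "X=&";~?&71
50 PRINT "Y=&";~?&72
60 PRINT "P=&";~?&73
```

This should display FC, 67, FF and 3 for A, X, Y and P respectively.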
Program 4.5 illustrates the use of USR by adding two single-byte numbers held in A% and X% respectively. The two numbers entered from the keyboard are placed in A% and X% by the BASIC lines 210 and 220 before calling the code with USR. When two single-byte numbers are added, the result may spill over to two bytes but never more. Therefore the two higher order bytes of the four-byte result are so much garbage. Line 240 erases these bytes by use of the AND mask. There were no difficulties with the P register because it held one of the garbage bytes. However, many programs would require a data contribution from P. There are only two instructions, PHP and PLP, which act directly on the total contents of P. These can only push and pull to and from the stack. If we assume that the data to be placed in P initially rests in A, the following two lines of code illustrate a simple way out:

PHA \Push A to stack

PLP \pull A to P

Although the method looks simple, there is a hidden danger. It can turn out to be a hazardous business to interfere with the processor status, particularly if the program allows interrupts. Note that RTS does not restore the original status (it pulls only the return address from the stack), so care may be needed to preserve P, for example with a PHP beforehand and a matching PLP afterwards, if the original flags are still wanted.

USR or CALL?

The respective merits of USR and CALL can be summarised as follows:

CALL:

  1. CALL can pass any number of parameters, limited only by the available space from &0600 onwards.
  2. The parameters are not restricted to integers. They can be any of the data forms, including string array variables.
  3. The magnitude of the variables passed is not restricted to single bytes.
  4. There is no provision in the CALL format for directly passing a result parameter (if there is one) back to the calling program. It must be arranged within the coding.

USR:

  1. Only three single-byte integers and one isolated bit can be passed as parameters to the subroutine.
  2. A four-byte result can be directly returned.

Indirection operators

The indirection operators form a grey area between high level language and machine code. They are similar to PEEK and POKE operations in traditional BASIC but are more versatile. However, this versatility is obtained at the expense of user-friendliness. The symbolism and format used can feel awkward for users hooked on BASIC. As far as machine code is concerned, the indirection operators are a boon because of the ease with which byte data can be pushed around memory. The definitions and format are well described in the user guides so only a brief outline (for the sake of continuity) is justified.
There are three operators and all may be taken to mean 'the contents of ...'. For example, ?&0D00 means 'the contents of address &0D00'. A 'word' is taken to mean four bytes at consecutive addresses.

Byte indirection (?)
Word indirection (!)
String indirection ($)

All three operators must be placed before the address to which they refer. Some examples follow:

Byte indirection

(a) ?X=&23 (b) ?&0D00=5 (c) ?current=&456 (d) ?208=46 (e) PRINT ?X (f) PRINT ~?&0C00
Note in (c) above that only the low-order byte (&56) is stored at the address held in 'current'; the 4 is dropped because there is no room for it in a single byte.

Word indirection

One example is sufficient:
!&0D00=&23456789
89 goes into &0D00, 67 into &0D01, 45 into &0D02 and 23 into &0D03.

String indirection

$&0D00="ABC" is an example worth examining in detail. The ASCII code for A (65) will be poked into &0D00, ASCII for B (66) into &0D01 and ASCII for C (67) into &0D02. Note that the dollar sign comes first to distinguish it from a normal string variable.
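The byte layout can be checked with a short loop. The sketch below is illustrative only (the line numbers and the variable I% are arbitrary) and reuses the address &0D00 from the example above. It also displays the byte after the last character, where BBC BASIC places a carriage return (&0D) to mark the end of the string.

```
10 $&0D00="ABC"
20 FOR I%=0 TO 3
30 PRINT ~&0D00+I%,~?(&0D00+I%)
40 NEXT
50 PRINT $&0D00
```

This should list 41, 42, 43 and D against addresses D00 to D03, followed by the string ABC.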

Saving and loading machine code

If the machine code is written within a BASIC program enclosed in the usual square brackets, it can be saved or loaded in the normal way (SAVE"name" or LOAD"name"). However, once the machine code has been assembled (and you know the address of the first byte), it can be saved and loaded separately from the BASIC parts. The formats are as follows:

To save machine code:

*SAVE"name" start-address end-address+1 (addresses will be assumed hex). For example, if a machine code subroutine called "sort" is located between &0D00 and &0D20 inclusive, it can be saved by:

*SAVE"sort" 0D00 0D21

There is an alternative format:

*SAVE"sort" starting-address +number-of-bytes

The above example would then read (the length, like the addresses, is taken as hex, so +21 means &21, or 33, bytes):

*SAVE"sort" 0D00 +21
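Because P% is left pointing one byte beyond the last instruction assembled, it already holds the 'end-address+1' that *SAVE expects. The sketch below builds the command from the variables rather than typing the addresses by hand. It assumes BASIC II (the OSCLI keyword is not available in the earlier BASIC I), reuses the start address &0C00 from the programs above, and assembles a token two-instruction routine purely as a placeholder. Running it writes a file called "sort" to the current filing system.

```
10 START=&0C00
20 FOR PASS=0 TO 2 STEP 2
30 P%=START
40 [OPT PASS
50 NOP
60 RTS
70 ]
80 NEXT
90 REM P% is now one beyond the last byte assembled
100 OSCLI "SAVE sort "+STR$~START+" "+STR$~P%
```

STR$~ converts each number to a hex string, which matches the assumption that *SAVE addresses are given in hex.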

To load machine code:

The format is:

*LOAD"name"

The code will be loaded into the same address band as when saved. An alternative format is:

*LOAD"name" first-address

This is used (rarely) if, for some reason, the code must be loaded at an address (given in hex) different from that used when the code was saved.

To run machine code:

The majority of machine code is likely to be called from within an outline BASIC program. If, however, the code is self-supporting it can be run by using:

*RUN "name"

Summary

  1. Assembly code within BASIC must be enclosed within [ ].
  2. Hex op-codes are replaced by three-letter mnemonic groups.
  3. All numerics are assumed decimal unless preceded by &.
  4. Operands can be absolute numeric or any legitimate BASIC variable expression but not BASIC commands.
  5. Registers can only hold one byte so any excess high order operand digits are dropped.
  6. BASIC variables cannot be assigned within assembly code.
  7. Conditional branch destination can be to a symbolic label.
  8. A branch destination label must be preceded by the full-stop. There must be no full-stop when the label is in the operand position.
  9. Each statement on the same line must have the full colon as a delimiter.
  10. P% is special. The contents will be the address of the first byte of the assembly code.
  11. Assembly code can be located by either absolute addressing, by modified DIM statement, below BASIC by using PAGE or between the end of BASIC program and the start of the 'variable space' by use of LOMEM.
  12. The BASIC keyword RUN assembles any machine code present but only CALL or USR can execute it.
  13. CALL can be a statement within BASIC or a separate command.
  14. OPT is a pseudo operation (not a BASIC keyword) so is valid only within the square brackets.
  15. OPT defaults to OPT 3 which instructs the assembler to produce a listing and to report any errors.
  16. A fully debugged, tested program would normally be run under OPT 0.
  17. Assembly code which contains a branch to a forward (higher) address requires two passes through the assembler.
  18. A P% assignment must be inside the FOR loop for two-pass assembly.
  19. Macros, although not directly available in the assembler, can be implemented by using procedures.
  20. The square brackets denoting assembly code are legal within DEF PROC.
  21. Macros, implemented by procedures, require a temporary exit from the assembler.
  22. Conditional assembly is not directly available but can be implemented by the IF THEN ELSE structure.
  23. Parameters can be passed to assembly subroutines by CALL or USR.
  24. In the BBC machine, the address band beginning at &0600 is the parameter-block used by CALL.
  25. The parameter-block contains only the addresses of the parameters, not the parameters themselves.
  26. The parameter-block begins with the number of parameters. Three items for each parameter then follow.
  27. The first and second items give the low- and high-byte address and the third gives the parameter type.
  28. The parameter type code is 0 for single byte, 4 for four-byte integer, 5 for five-byte floating point variable.
  29. The parameter type code &80 is for defined address strings and &81 is for normal string variables.
  30. Because the CALL parameter block contains only addresses of data, the data is easily recovered by using indirect addressing.
  31. Before using indirect addressing, Y (or X) is usually zeroed and the parameter addresses transferred to page zero.
  32. Parameters passed by USR are by A%, X%, Y% into the same named 6502 registers. The lsb of C% is passed to the carry bit in P (processor status register).
  33. The result of USR action (if any) is passed to any designated integer variable.
  34. The USR result is the combined contents of P, Y, X and A as they were on exit from the subroutine.
  35. The most significant byte of the USR result is that contributed by P.
  36. Indirection operators ?, !, $ before a variable refer to single-byte, four-byte and string respectively.
  37. The presence of * before SAVE or LOAD indicates that the actions refer to assembly code programs.
  38. Provision exists within *LOAD for loading assembly programs into a different address block from that used when *SAVE was used.
  39. When using *SAVE, either the last address+1 of the block is stated or the number of bytes.

Self Test

  1. What is the error in the following segment of code?
    LDA Number:.Label CLC:ADC #&20:BNE .Label
  2. RUN, CALL and OPT. Which of these are pseudo-operations?
  3. What is the default number in OPT n?
  4. If the only branch instruction in a code section was BNE &83, would it require two-pass assembly?
  5. Some code is called with CALL name,F%,Blogs. State:
    (a) The hex number stored in address &0600
    (b) The contents of address &0603.
    (c) The contents of address &0606
  6. The result variable, returned by a USR call, was &12345678.
    (a) What number was in the X register?
    (b) What number was in the processor status register?
  7. Using the word indirection operator, write a BASIC statement which will print out in hex the contents of address 34587.
