
Chapter Three
The 6502 Instructions and Addressing Modes

Initial terms and definitions

Some readers will be aware of the following points but repetition is not always valueless. In any case, the terms used to describe aspects of machine code are far from standardised. The previous book, Discovering BBC Micro Machine Code, examined each separate instruction in reasonable detail and it would be pointless to go over the same ground here. Instead, the complete instruction set is relegated to Appendix C, which should be consulted frequently while reading this and subsequent chapters. When programming in a higher level language such as BASIC, an individual order to the computer is called a statement. For example, Energy=Mass*C^2 is a statement.
In machine code, orders given to the computer are by means of instructions. Instructions are primitive and many are needed to form the familiar high level statements. An instruction will normally consist of an op-code to indicate the required action and an operand to indicate where the data is to be found. Sometimes, the location of the data will be obvious from the op-code but, in the general case, an operand is required.
There are several ways in which the operand can specify the location of the data. They are known as addressing modes and there are thirteen of them in the 6502, although not all of these are available to every instruction. Because one byte is used for the op-code it would be possible to have 256 different ones. However, 105 of the possible combinations are reserved for 'future expansion' (illegal in other words). This leaves 151 valid op-codes to choose from. The task of selecting the most suitable op-code is less bewildering than it appears from the figures. There are only 56 completely different instructions. It is the available addressing modes for each instruction which multiply the choice.
The op-codes are specified by means of a pair of hex digits. There is a different op-code for every variation of addressing mode. However, the hex coding is really of academic interest because all machine code on the BBC machine will be entered by means of the resident assembler. The details of the assembler will be discussed fully in Chapter 4. The most useful property of an assembler is the facility to enter op-codes in three-letter mnemonic form. The desired addressing mode is indicated by the form in which the operand is written. The repertoire of instructions is set out formally in Appendix C. Consequently, the purpose of this chapter will be to explain the symbols, to define the addressing modes and to offer guidelines on the choice of a particular instruction and the most suitable method of addressing.

Factors influencing choice

It is not easy to give a specific answer to the question 'What is the correct instruction to use here?' The choice is very often a compromise between execution speed, memory economy and the demands of structure. Newcomers to machine code may be quite satisfied if their subroutine works at all but it soon becomes apparent that there are good and not so good variants. It is popularly supposed that a program written in machine code will always be much faster and take less memory than the BASIC version. This is a reasonable generalisation but not a universal truth. A poorly written machine code program could be slower than the BASIC equivalent. Even if it is faster, it is well to remember that a speed advantage, to have any real meaning, must be assessed on human, rather than machine, time scales. If a BASIC version runs in one second and the machine code version runs in a millisecond, the advantage is academic rather than visible. The items of information needed to assess the merits of each instruction are as follows:

  1. What does it do? This information is conveyed by a three-letter mnemonic such as LDA or ADC. Although the mnemonic itself conveys a reasonable idea of what the instruction does, it is primarily intended as an aid to the interpretation of a listing. It cannot cover all the subtleties. It is necessary to augment the mnemonic by either a verbal definition or by a loosely standardised format known as operational symbols (discussed later).
  2. What addressing modes are available?
  3. What flags in the processor status register are altered (updated)? Ignorance or confusion in this area is the cause of many an intractable bug.
  4. How many clock cycles does it consume? The BBC machine runs at 2 MHz so each clock cycle is half a microsecond. The number of clock cycles is influenced more by the addressing mode than the actual instruction. Clock cycle time is particularly critical if the instruction is within a loop which revolves many times. Outside a loop, it is seldom important enough to influence choice.
  5. How many bytes are in the instruction? All instructions take at least one byte because they all have an op-code. The operand, however, can be absent altogether, one byte long, or two bytes long. Knowledge of the number of bytes required can be helpful. For example, it can be a matter of doubt in certain circumstances whether to write &004B or &4B in the operand. They are mathematically the same but an incorrect choice can cause havoc to the program.
  6. What is the hex op-code? Programming will always be performed with the aid of the assembler which uses mnemonic op-codes. However, it is still necessary at times to be aware of the hex coding for every instruction because the assembled machine code program will include it. It is easy to use an incorrect addressing mode by mistake when writing the operand but the hex code, which is specific for the addressing mode, might highlight the error during debugging. It is interesting, but not particularly rewarding, to write out the hex code in binary. It gives an insight into the mind of the microprocessor designer because some intriguing patterns emerge which can give a clue to the microprogram within the chip.
  7. What is the correct syntax for the operand? This depends on the addressing mode and the rules are rigid, more so than in BASIC. The assembler does its best but it would be foolish to add user-friendliness to its list of virtues. Make a mistake and you are on your own!

Operational symbols

Universities have traditionally considered computing and data processing subjects to be the prerogative of the mathematics department. The computer is useful as a tool in mathematics so it was considered only natural that computing should be taught by mathematicians. Whether this has helped or hindered progress may be arguable. There is no denying that a mathematical brain was behind the establishment of operational symbols.
How do we describe exactly what an instruction will do, bearing in mind that there must be one, and only one, interpretation? Normal language is one way; perhaps the obvious way. But, to a mathematician, normal language lacks precision and is difficult to formulate concisely without using a lot of ifs and buts. Operational symbols are concise and unequivocal. They explain what the instruction does but make no attempt to explain the meaning of the operand. This is understandable because the meaning of an operand depends only on the addressing mode chosen. For example, the instruction LDA will have the same operational symbols whether it is using immediate, zero page, absolute, indexed or indirect addressing. The general pattern of operational symbols is of the form:

Action → Result

The arrow denotes the direction of data transfer and is preferable to the equals sign sometimes used. The abbreviations used for the registers are those already used but M is used to represent the data specified by the operand.
As a simple example, the instruction STA could be described as follows:

A → M

This means 'Store a copy of the contents of the accumulator in the address specified by the operand'. Note that the arrow points from the source to the destination and only the destination contents are over-written by the new data; the source data is preserved.
To take a little more complex example, the instruction ADC could be described concisely as follows:

A + M + C → A

This means 'Add together the present contents of the accumulator, the data specified by the operand, and the carry bit, then place the result in the accumulator'.
The shift and rotate instructions are fearsome looking. For example, the instruction ASL (which is Arithmetic Shift Left) has the operational symbolism:

C ← (7...0) ← 0

The bracketed expression indicates the bits within a byte numbered 0 to 7. The action shows that a zero enters from the right and overspill from bit 7 goes into the carry.

Classification of instructions

There are many ways of classifying instructions. Appendix C simply lists them in alphabetical order by mnemonic group. This is useful as a quick reference but is by no means a scientific classification. Appendix C2 classifies them according to the flags affected in the processor status register and can be quite useful. Appendix C4 is an attempt to classify them according to 'popularity'. It is undeniable that some instructions out of the 56 are used a lot, some are used at times and a few are used spasmodically. Unfortunately, the choice of instructions to perform a given task is very much an individual affair. Some programmers have a particular liking for a certain subset. Indeed, it is often possible to recognise a friend's handiwork from the listing which can be almost a fingerprint. Because of the individual character, Appendix C4 can be no more than the author's personal choice although it might help those who are initially bewildered.
In this chapter, the instructions will be introduced (rather than classified), according to need. No account will yet be taken of the various addressing modes under each mnemonic.

Finding temporary homes for data

Due to the single accumulator in the 6502, it is often necessary to find a temporary home for existing data. There are several choices:

  1. Transfer A to another register by the use of TAX or TAY and later restore by TXA or TYA. This is the simple and speedy solution because they are both single-byte instructions, taking only two clock cycles. The trouble is that existing data in the X and Y registers may also be valuable and must not be overwritten. X and Y are often totally committed for indexing or loop counting.
  2. Push A to stack by using PHA and retrieve later by PLA. These are single-byte instructions but PHA takes three clock cycles and PLA takes four. It is important to bear in mind the LIFO (last in first out) nature of the stack. Mistakes in the order of retrieval could result in false data entering A. Another danger, of course, is stack overflow although this should be a comparatively rare event.
  3. Store A in a memory location by use of STA and retrieve it with LDA. Each will take three clock cycles if the location is on page zero and four on any other page (indexing and indirect addressing can take five or six cycles).

Performing arithmetic

There are only two direct arithmetical instructions, ADC and SBC for addition and subtraction respectively. The carry is always involved and, to avoid introducing garbage carries left over from a previous operation, it is important to be aware of the following rules:

  1. Before using ADC, the carry should normally be cleared with CLC.
  2. Before using SBC, the carry should normally be set with SEC.

Although in some circumstances the carry can be treated as the 'ninth bit', it should be borne in mind that this is purely a way of looking at it. Obviously, this ninth bit is not transferred by STA, TAX or TAY.
Addition and subtraction of single byte numbers are, of course, severely limited in the range of the result (0 to 255 in unsigned binary, or -128 to +127 in two's complement binary). Fortunately, the carry bit allows double or multiple byte numbers to be added or subtracted because it can act as the continuity element between the msb of one byte and the lsb of the next. Thus, the carry is only cleared before the two lower order bytes are added. The higher order byte additions will include the carry over (if any) from the previous addition so it would be fatal to clear the carry first.
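The role of the carry as the continuity element can be tried out away from the machine. The following Python sketch is not 6502 code; it merely imitates ADC in binary mode (D = 0) on 8-bit values. The names adc and add16 are invented for this illustration.

```python
def adc(a, m, carry):
    """Model of ADC in binary mode: A + M + C -> A. Returns (result, carry out)."""
    total = a + m + carry
    return total & 0xFF, 1 if total > 0xFF else 0


def add16(x, y):
    """Add two 16-bit numbers a byte at a time, as a 6502 program would."""
    carry = 0                                      # CLC, before the low order bytes only
    low, carry = adc(x & 0xFF, y & 0xFF, carry)    # ADC on the low order bytes
    high, carry = adc(x >> 8, y >> 8, carry)       # ADC on the high order bytes, carry NOT cleared
    return (high << 8) | low, carry


result, carry = add16(0x12FF, 0x0001)
print(hex(result), carry)
```

If the carry were cleared again before the second adc call, the overspill from the low order byte would be thrown away and the example would give &1200 instead of &1300.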
It is important not to forget that there are two arithmetic modes depending on the D flag being set or cleared. The default condition is D = 0, which is the normal two's complement binary arithmetic mode. It is wise, though, to ensure the default condition by initialising with CLD at the head of a program. On the rare occasions when decimal (BCD) mode is required then the initialisation begins with SED, but remember this mode continues until cancelled again.
Multiplication and division are possible by a tongue-in-cheek method using ASL and LSR respectively. The operations are limited to integral powers of two. Watch must be kept on overspill from the msb in multiplication and the lsb in division.
Subject to overspill into the carry, shifting left by ASL will multiply by two each time so four consecutive ASL operations will multiply the existing data by 16. Division by two is achieved by LSR although we must remember that the overspill from the right (from the lsb) goes into the carry. As a matter of interest, this is why LSR is named Logical Shift Right. It is arithmetically absurd for carry status to be in the lsb position, hence it is deemed to be a 'logical' shift. This is in contrast to ASL (Arithmetic Shift Left) where the carry action is at the msb end. Unless the programmer is sure, from previous knowledge of the data, that no overspill can occur, multiplication and division by these instructions must check for the presence of a carry after each use. There will be exceptions, of course, such as when multiple-byte precision is used. In these circumstances, the carry will be providing continuity between the component bytes when used in conjunction with ROL or ROR.

Clearing memory and registers

There are no instructions in the 6502 which can clear any of the registers or memory locations to zero. The usual way to clear a register is to load zero into it. To clear memory locations, a previously zeroed register can be stored in them. Those who are fascinated by novelty may be attracted by the following little snippet:

Exclusive-oring data with itself always results in all zeros.

For example, if A contains &9D and we write EOR #&9D, the accumulator result is &00. (To confirm, write out the example in binary form.)

Up-counting and down-counting

Counting is essentially an adding-by-one operation and implies 'up-counting'. It is also called incrementing. Down-counting is subtracting by one. It is also called decrementing. The X and Y registers can be counted up or down by the single byte instructions INX, INY, DEX and DEY, each taking only two clock cycles. Data in memory can be incremented or decremented by means of INC or DEC but not economically. They each take five to seven cycles depending on the addressing mode in use.
The accumulator is left out in the cold, lacking an increment or decrement instruction. It can, of course, be done by adding or subtracting 1 which, like DEX or INX, only takes two clock cycles, but it requires two bytes even for the immediate addressing mode. There is also the possibility that the carry might have to be cleared (or set) first which, if forgotten, could lead to a mystery bug. An alternative is some roundabout method such as TAX then INX then TXA, providing of course, that X (or Y) is free.
Counting is an essential part of loop control. The number of loop revs can be achieved either by starting with N and counting down to zero or starting with 1 and counting up to N. The advantage of the count down method is that testing for loop exit can be achieved with BNE or BPL. Unfortunately, it is very easy to be 'one out' in the count down. If we count up to N, an extra comparison instruction such as CPX, CPY or CMP is required to check the exit condition but the method may have the advantage of seeming more 'natural' and errors by one are less likely.

Processing particular bits

There will be times when it will be required to operate on one or more particular bits within a byte, rather than on the entire byte. We may wish to ensure, say, that bit 3 is set to 1 without altering the remaining bits. The possible operations fall into three main groups, clearing bits to zero, setting bits to 1 and finally, changing bits. This is achieved by using one of the three 'logical' instructions AND, ORA and EOR in conjunction with the appropriate mask word in the operand. The action is always on the accumulator.

To clear selected bits:

Use AND with an operand mask as follows: '1's in the mask will leave corresponding bits unchanged. '0's in the mask will ensure that corresponding bits are 0.

To set selected bits:

Use ORA with an operand mask as follows: '0's in the mask will leave corresponding bits unchanged. '1's in the mask will ensure that corresponding bits are 1.

To change selected bits:

Use EOR with operand mask as follows: '0's in the mask will leave corresponding bits unchanged. '1's in the mask will ensure that corresponding bits are changed.

The explanation for the above behaviour can be found in Appendix A under the heading Logic Gates. However, the following examples may help in understanding how to work out the correct mask:

  1. To ensure that bit 5 in the accumulator is a 0, use AND #&DF (the mask in binary is 1101 1111).
  2. To ensure that bits 2 and 6 in the accumulator are '1's, use ORA #&44 (the mask in binary is 0100 0100).
  3. To ensure that bit 3 in the accumulator is changed, use EOR #&08 (the mask in binary is 0000 1000).
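The three masks in the examples above can be checked with a few lines of Python, whose &, | and ^ operators behave like AND, ORA and EOR on a single byte. This is only a desk-check sketch; the starting value in the variable a is an arbitrary choice, not something from the text.

```python
a = 0b10110101                # arbitrary accumulator contents (&B5)

cleared = a & 0xDF            # AND #&DF : bit 5 forced to 0, others unchanged
set_bits = a | 0x44           # ORA #&44 : bits 2 and 6 forced to 1, others unchanged
flipped = a ^ 0x08            # EOR #&08 : bit 3 changed, others unchanged

for name, value in (("AND", cleared), ("ORA", set_bits), ("EOR", flipped)):
    print(name, format(value, "08b"))
```

Printing the results in binary makes it easy to see that only the bits selected by the mask have been disturbed.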

One's complement of accumulator

It is sometimes necessary to flip all the bits in a byte (i.e. produce the one's complement). Assuming the data is already in the accumulator, this can be done by exclusive-oring as follows:

EOR #&FF or EOR #255

Two's complement of accumulator

The two's complement is obtained by adding 1 to the above. Unfortunately, we can't add the 1 by incrementing because the result is in the accumulator. The only way is to follow with ADC #1, making sure to clear the carry first. The coding is as follows:

EOR #&FF
CLC
ADC #1

Since the two's complement of X is 0-X, an alternative method is simply to subtract the number from zero. This is, by definition, the two's complement but would entail storing the data first before loading the accumulator with 0.
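That the two routes give the same answer can be confirmed for every possible byte with a short Python sketch. It models the arithmetic only, not the 6502 itself, and both function names are invented for the illustration.

```python
def twos_complement(a):
    """Model of EOR #&FF : CLC : ADC #1 on an 8-bit value."""
    ones = a ^ 0xFF               # EOR #&FF gives the one's complement
    return (ones + 1) & 0xFF      # CLC then ADC #1, keeping only eight bits


def subtract_from_zero(a):
    """The alternative route: 0 - a, again kept to eight bits."""
    return (0 - a) & 0xFF


# Compare the two methods over the whole range of a byte.
mismatches = [a for a in range(256) if twos_complement(a) != subtract_from_zero(a)]
print(len(mismatches))
```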

Finding the state of a particular bit

It is sometimes important, particularly in peripheral control, to find out the state of one particular bit within a byte. This can be done by loading the byte into the accumulator, erasing all bits except the one of interest, then testing for zero. If the result is non-zero, the bit must have been a 1. For example, suppose we are interested in bit 3, the coding could be:

LDA data
AND #&08 (the mask in binary is 0000 1000)
BNE etc.

An alternative method, which only works if bit 6 or bit 7 is involved, is the BIT test. For example, we can start by writing:

BIT data ('data' is an arbitrary address)

This copies bit 6 and bit 7 of the data into the V and N bits respectively. This can be followed by BVS or BMI as required. The BIT instruction takes 3 clock cycles if data is on page zero but otherwise 4 cycles. As a bonus, the BIT test also logically ANDs the data with the accumulator and sets the Z flag according to the result. The result itself is discarded, so neither the accumulator nor the data is altered. Even so, BIT is not a commonly used instruction.
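Both tests can be modelled in a few lines of Python. The sketch below covers only the mask-and-branch decision and the copying of bits 6 and 7 into V and N; the function names are invented here and nothing else about the instructions is modelled.

```python
def bit3_is_set(data):
    """Model of LDA data : AND #&08 : BNE. True means the BNE branch is taken."""
    return (data & 0x08) != 0


def bit_flags(data):
    """Model of the N and V flags after BIT data: bit 7 goes to N, bit 6 to V."""
    n = (data >> 7) & 1
    v = (data >> 6) & 1
    return n, v


print(bit3_is_set(0x0C), bit_flags(0x40))
```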
Besides the three logical instructions AND, ORA and EOR, the shift and rotate instructions LSR, ASL, ROR and ROL are also used to play around with bits. LSR and ASL should be thought of as 'open-loop' operations because a bit shifted out survives only in the carry and is lost at the next shift. In contrast, ROR and ROL are 'closed-loop' because the bit pattern circulates through the carry. They can all play an important role in peripheral work and some off-beat requirements. The shift and rotate instructions are unique in having 'accumulator' addressing. Thus, they can act on the accumulator or a memory location. If the action is required on the accumulator, the mnemonic must be followed by A. For example, to shift the accumulator right, we must write LSR A. When using accumulator addressing, no operand is necessary (the 'A' is not a true operand and does not consume a byte). It is a common mistake to forget that the shift and rotate instructions must have either an operand or an 'A'. For example, a naked LSR is illegal.

Double byte multiplication

This provides a useful exercise in shift and rotate operations. Although ASL and ROL both multiply by two, the carry can be a problem if they are not chosen wisely. No carry must be allowed to enter the lower order byte from the right so ASL is appropriate. On the other hand, the higher order byte must take into consideration the carry from the right so ROL must be used. Assuming the data is in two bytes of memory, the coding would be:

ASL low-byte
ROL high-byte

Double byte division

The opposite is required here. Thus, the higher order byte must be attacked first and a carry must not be allowed to enter from the left. This suggests LSR as the first step. The lower order byte must receive a carry (if any) from the left so the correct instruction here is ROR. Assuming that the data is in two bytes of memory, the coding is therefore:

LSR high-byte
ROR low-byte
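The way the carry links the two bytes in both the multiplication and the division can be followed in a Python model. The sketch below is not 6502 code and all the function names are invented for the illustration. Each shift function takes a byte in the range 0 to 255 and returns the new byte together with the carry that falls out.

```python
def asl(byte):
    """ASL: zero enters bit 0, bit 7 falls into the carry."""
    return (byte << 1) & 0xFF, byte >> 7


def rol(byte, carry):
    """ROL: the old carry enters bit 0, bit 7 falls into the carry."""
    return ((byte << 1) | carry) & 0xFF, byte >> 7


def lsr(byte):
    """LSR: zero enters bit 7, bit 0 falls into the carry."""
    return byte >> 1, byte & 1


def ror(byte, carry):
    """ROR: the old carry enters bit 7, bit 0 falls into the carry."""
    return (carry << 7) | (byte >> 1), byte & 1


def double16(low, high):
    """ASL low-byte then ROL high-byte."""
    low, carry = asl(low)
    high, carry = rol(high, carry)
    return low, high, carry


def halve16(low, high):
    """LSR high-byte then ROR low-byte."""
    high, carry = lsr(high)
    low, carry = ror(low, carry)
    return low, high, carry


print(double16(0x80, 0x01), halve16(0x00, 0x03))
```

The final carry returned by double16 is the overspill from the whole 16-bit number, and the one returned by halve16 is the remainder of the division by two.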

Branching techniques

The equivalent of the dreaded GOTO in BASIC is JMP. The jump to a new part of the program is unconditional and, because JMP has a two-byte operand, can reach any part of the 64K memory map. Appendix C lists eight conditional branch instructions. A common cause of a programming bug is an incorrectly used branch test allowing an unexpected loophole. The following points are worth emphasising:

  1. Branch instructions themselves have no effect on the processor status register. Thus, two different branch instructions can follow one another so the original data can be tested for two conditions.
  2. BMI or BPL should only be used if data is represented in two's complement binary. They are meaningless in unsigned binary because there is no differentiation into positive or negative sets.
  3. Before using a branch, make certain that the last operation actually updates the bits you are testing. In other words, check up on Appendix C2 which includes a classification of all instructions according to their effect on the processor flag bits. For example, it may be pointless to use BCC after DEX because only the N and Z flag bits are updated.

The limits of +127 bytes forward or -128 bytes backwards have been covered elsewhere. If the branch is beyond range (which should not be often) the customary solution is to combine the branch with a JMP. For example, suppose the branch is to be BNE LOOP and the label 'LOOP' is out of range. The conventional way out is as follows:

BEQ SKIP
JMP LOOP
.SKIP

Note that the opposite test (BEQ) is used instead of BNE so the jump is leap-frogged to the label SKIP.

Comparisons

It is often required to compare two numbers in order to set the status flags without altering the contents of the register. There are three instructions which perform this task, all of which set the N, Z and C flags:

CMP, which compares memory with the contents of the accumulator.
CPX, which compares memory with the contents of the X register.
CPY, which compares memory with the contents of the Y register.

The comparisons are done by subtracting the memory data from a copy of the register in question. The operational symbolism is therefore A-M, X-M or Y-M respectively. It is easy to get mixed up with the direction of the subtraction, so note carefully that the subtraction is from the register. A suitable branch instruction must follow a comparison (otherwise there would be no point in asking for the comparison). It is possible to get in some funny mix-ups. The following examples may help in choosing the correct branch:

  1. To check if the register is less than memory, follow with BCC.
  2. To check if the register is equal to memory, follow with BEQ.
  3. To check if the register is greater than memory, follow with BEQ first (to dispose of the equal case), then BCS.
  4. To check if the register is greater than or equal to memory, follow with BCS.
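The choice of branch in the four cases above can be rehearsed with a Python model of the comparison. The sketch assumes both values are unsigned bytes (0 to 255), which is the interpretation under which the BCC and BCS rules hold. The names compare and relation are invented for the illustration.

```python
def compare(register, memory):
    """Model of CMP, CPX or CPY: flags from register - memory, register untouched."""
    diff = (register - memory) & 0xFF
    return {
        "C": 1 if register >= memory else 0,   # carry set when no borrow was needed
        "Z": 1 if diff == 0 else 0,            # zero set when the two are equal
        "N": diff >> 7,                        # bit 7 of the difference
    }


def relation(register, memory):
    """Apply the branch tests in the order suggested in the text."""
    flags = compare(register, memory)
    if flags["C"] == 0:       # BCC would branch: register less than memory
        return "less"
    if flags["Z"] == 1:       # BEQ would branch: register equal to memory
        return "equal"
    return "greater"          # BCS taken after BEQ has failed


print(relation(5, 9), relation(9, 9), relation(200, 9))
```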

Addressing modes

Commencing with a definition, an addressing mode is the significance to be attached to the operand part of the instruction. Addressing modes available on the 6502 can be conveniently divided into three groups: non-indexed, simple indexed, and indirect indexed. Most of these modes may already be familiar to most readers, especially those who have read Discovering BBC Micro Machine Code. However, some revision or restatements are advisable, if only to maintain continuity during the lead-up to the rather nasty (nasty to grasp, that is) indirect addressing modes. Appendix C3 classifies instructions according to the addressing modes available.

Implied addressing

This is the simplest addressing mode in the repertoire because memory is not involved, neither is an operand required. They are all single byte instructions, conveying full information by the op-code alone. They all refer to internal operations on the 6502 registers. Because most of them only take two clock cycles, they are, or should be, the popular choice wherever possible.
Instructions which allow implied addressing and consume only two clock cycles are: CLC, CLD, CLI, CLV, DEX, DEY, INX, INY, NOP, SEC, SED, SEI, TAX, TAY, TSX, TXA, TXS and TYA.
The following take more than two clock cycles: BRK, PHA, PHP, PLA, PLP, RTI, and RTS.

Immediate addressing

Memory is not involved because the operand is the data. All instructions using immediate addressing consume two bytes: one for the op-code and one for the operand. The standard assembler prefix to denote this mode is the symbol '#'. For example:

LDA #32 or LDA #&20

Both are using immediate addressing. The first example is loading the decimal number 32 into the accumulator while the second example loads hex 20. Whether to use hex or decimal is optional but the guiding rule is to choose the more natural form for the purpose in use. For normal numerical work, decimal would be the preferred notation but for AND, EOR or ORA masks, hexadecimal has more meaning. Although it may seem to be stating the obvious, the largest numerical operand is 255 or &FF because immediate addressing only allows a single byte operand. Risking another obvious statement, the assembler would be very unhappy if we tried to load negative numbers in the form LDA #-32.
Immediate addressing is used for constants, particularly in conjunction with comparison instructions at the end of a loop as, for example, CMP #20. The constant must, of course, be known to the programmer at the time of writing. In BASIC, we are usually exhorted to avoid constants within the body of the program, the advice being to assign them to a variable at the head of the program. Such advice is not necessarily sound when applied to machine code because this would mean a trip to memory to obtain the data. The power of immediate addressing lies in the fact that memory is not involved: the data is immediately available in the instruction, providing, as said before, the programmer knows it at the time of writing.
There are eleven instructions which allow immediate addressing: ADC, AND, CMP, CPX, CPY, EOR, LDA, LDX, LDY, ORA and SBC.

Absolute addressing

We should begin by sorting out some of the confusing terms used by different authorities. The term 'direct' addressing is often used loosely when the operand refers to the address of data, rather than the data itself. Thus, the instruction LDA &0034 is an example of 'direct' addressing (note there is no '#' prefix). The instruction causes the contents of address &0034 to be placed in the accumulator. However, bearing in mind that the 6502 has a 64K memory map, it will be evident that addresses between &0000 and &00FF would result in an inefficient use of memory space if the full four-hex digit address were mandatory. Since the data bus is only eight bits wide, the microprocessor would need to make two trips down the address bus and up the data bus to collect the full operand. The two leading zeros are useless passengers.
To improve the efficiency, the address space is broken down into two domains. As mentioned in an earlier chapter, addresses within the range &0000 to &00FF are designated the page zero domain, to distinguish them from all other addresses &0100 to &FFFF. With regard to the terms used, the Motorola 6800 (the ancestor of the 6502) used the term 'direct' addressing instead of zero-page addressing and 'extended' addressing to cover the rest. Many machine code programmers, brought up on the 6800, had to readjust to the change in terminology. Returning to the 6502, the term 'absolute' addressing is applied to addresses anywhere in the 64K memory map. In other words, absolute addressing requires four hex digits, while zero-page addressing only requires two. Instructions using absolute addressing require three bytes, one for the op-code and two for the operand.
There are 23 instructions which allow absolute addressing. These are: ADC, AND, ASL, BIT, CMP, CPX, CPY, DEC, EOR, INC, JMP, JSR, LDA, LDX, LDY, LSR, ORA, ROL, ROR, SBC, STA, STX and STY.

Zero-page addressing

The concept of zero-page (sometimes called page-zero) is so important that it justifies emphasising the boundaries once again.

Zero-page is the address range &00 to &FF or 0 to 255

There are reasons why this page deserves special treatment. There are obvious speed advantages, due to the single byte operand. This also leads to a saving in program memory space. Another reason is that the more complex addressing modes (to be dealt with later) require address pointers which must be in zero-page. Perhaps the most disappointing aspect is the scarcity of available space. The operating system, not surprisingly, occupies the vast majority of zero-page. In fact there are only thirty-two locations guaranteed free under all circumstances in the BBC machine:

Free space in zero-page is between &70 and &8F inclusive

Because of the restricted space, it is essential, before planning any ambitious machine code systems, to choose zero-page locations with care. The apparent speed advantage is not, in itself, sufficient to justify squandering locations. In fact, it is sound philosophy to treat zero-page locations in the same light as registers, as valuable and scarce commodities. A good rule is to use zero-page for the most frequently used variable data. Sometimes, it may be wise to use zero-page for data within a loop, even if it means temporarily transferring it from an absolute address and then back again. The advantage of this approach may be appreciated more readily if we examine a few figures. Suppose a variable data item, located in an absolute address, is in the middle of a long loop which revolves 10000 times. Suppose we then transfer it temporarily to zero-page before entering the loop by using:

LDA &xxxx (absolute, 4 clock cycles)
STA &xx (zero-page, 3 clock cycles)

After the loop ends, the status quo can be regained with:

LDA &xx (zero-page, 3 clock cycles)
STA &xxxx (absolute, 4 clock cycles)

The four extra instructions for the complete transfer take a total of 14 clock cycles and consume an extra 10 bytes of programming space. The saving within the loop, however, would be 1 cycle per rev, leading to a total saving of 10000 - 14 = 9986 clock cycles. We shall see later that indirect address pointers in zero-page will take two bytes each and many of these may be required in a program of even moderate complexity.
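The same sum can be repeated for other loop lengths to see where the transfer stops being worthwhile. The Python sketch below simply restates the arithmetic above; the function name and its default values (1 cycle saved per rev, 14 cycles for the transfer) are taken from this example and are not general figures.

```python
def net_saving(revs, saving_per_rev=1, transfer_cost=14):
    """Clock cycles saved overall by moving the data to zero-page for the loop."""
    return revs * saving_per_rev - transfer_cost


for revs in (10, 14, 100, 10000):
    print(revs, net_saving(revs))
```

On these figures the transfer only breaks even at 14 revs, so it is not worth the 10 extra bytes for short loops.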

Relative addressing

The intimate details of relative addressing are only of vital importance if the only method of entry is via a machine code monitor. Since the BBC machine has the advantage of assembler input, it is not necessary to spend quite so much time on the subject. It would not be wise, however, to skip the subject altogether. If we did, the hex columns of the assembler output could often look mysterious.
Relative addressing is only used with branch instructions. In fact, forgetting the assembler for a moment, relative addressing is the only method possible in branch instructions. Using hex machine code as an example,

BEQ &04

The literal meaning is 'If equal to zero, branch 4 bytes forward'. The term 'relative' refers to the program counter. If the branch conditions are satisfied, the program counter (which always contains the address of the next program byte) has 04 added to it. This causes the next byte executed to be 04 bytes ahead, relative to the previous position. In other words, the operand indicates the number of bytes to be skipped. To branch backward, the two's complement is required (see Appendix A) so, to branch 04 bytes back, the instruction would be BEQ &FC. Clearly, the calculation of the correct operand is an error-prone exercise. The assembler takes all the drudgery out of relative addressing by allowing the operand to be a label instead of a relative address. We can use,

BEQ Loop

This works, subject to the proviso that the line, to which we wish to branch, is prefixed with the 'Loop' label (naturally, the choice of label is arbitrary). The assembler is hiding from us the fact that relative addressing is being used. Instead, it appears as a simple 'branch to label' operation which is far less error-prone than grappling with relative addressing.
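The arithmetic which the assembler performs on our behalf can be modelled in Python, where 0x plays the part of the & prefix. This is only a sketch of the calculation: the function names and the addresses in the usage note are invented for illustration, and the offset is taken relative to the address following the two-byte branch instruction, as described above.

```python
def branch_operand(branch_address, target_address):
    """Return the one-byte operand for a branch located at branch_address.

    The branch occupies two bytes, so the program counter already points
    at branch_address + 2 when the offset is added.
    """
    offset = target_address - (branch_address + 2)
    if not -128 <= offset <= 127:
        raise ValueError("branch out of range; combine with JMP")
    return offset & 0xFF  # two's complement held in a single byte


def branch_target(branch_address, operand):
    """Inverse of branch_operand: the address where a taken branch lands."""
    offset = operand - 0x100 if operand & 0x80 else operand
    return (branch_address + 2 + offset) & 0xFFFF
```

For a branch assumed to sit at &2000, branch_operand(0x2000, 0x2006) gives &04 and branch_operand(0x2000, 0x1FFE) gives &FC, the two operands used in the examples above.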
As for the timings of relative addressing, these will depend on whether or not the branch is taken. If taken, a branch takes 3 clock cycles or, if across a page boundary, 4 clock cycles. If it is not taken, the branch takes 2 clock cycles.
The extra cycle when a page boundary is crossed is due to the alteration to the high- as well as the low-byte of the address. If speed is very critical, a programmer should watch the hexadecimal assembly listing closely to see if a page boundary is crossed. For example, suppose the program counter was showing &05FC prior to a branch. If the relative branch is &04 ahead, the new program counter reading would be &0600, so there has been a boundary crossing from page 5 to page 6 which consumes an extra clock cycle. If such a branch was in the middle of a loop which revolves N times, it would be sensible to manipulate the coding, or alternatively relocate, so that the branch range is limited to the same page, thereby saving N clock cycles. It is surprising how attention to such small details can result in a material gain in execution speed. Although terribly wasteful in terms of memory, it is better to cut loops out altogether and resort to straight-in-line coding if speed is absolutely vital. In most cases, this will be little more than an idealistic solution.
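The timing rules for branches can be expressed as a short Python function. It is a sketch of the rules as stated in this section, not a description of the processor's internals, and the name branch_cycles is invented for illustration.

```python
def branch_cycles(program_counter, operand, taken):
    """Clock cycles used by a branch instruction.

    program_counter is the address of the byte following the branch,
    which is what the program counter holds when the offset is added.
    """
    if not taken:
        return 2
    offset = operand - 0x100 if operand & 0x80 else operand
    target = (program_counter + offset) & 0xFFFF
    page_crossed = (target >> 8) != (program_counter >> 8)
    return 4 if page_crossed else 3


print(branch_cycles(0x05FC, 0x04, taken=True))   # 4: page 5 to page 6
print(branch_cycles(0x05F0, 0x04, taken=True))   # 3: stays within page 5
print(branch_cycles(0x05FC, 0x04, taken=False))  # 2
```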

Indexed addressing

Although briefly discussed elsewhere, the concept of indexed addressing deserves detailed treatment. As far as the BBC assembler is concerned, the indexing mode is denoted by a comma following the operand, followed in turn by X or Y. For example:

LDA &2356,X or LDA &75,Y

Both are examples of indexed addressing but the first is using an absolute address and the second is using a zero-page address. The contents of the X (or Y) register are automatically added to the operand address before the instruction operates on the resultant address. It is well to recap on the terms used in indexed addressing:

  1. The base address is the address as stated in the operand.
  2. The relative address is the contents of index register (X or Y).
  3. The absolute address is the sum of the base and relative addresses.

As an example, assume that X contains 3 and that the instruction LDA &34,X is written. The base address is &34, the relative address is 3 and the absolute address is &34+3=&37. Alternatively, the term 'effective' is often used in place of 'absolute'.
Two forms of indexed addressing are recognised:

  1. Absolute indexed, when the operand is any address in the 64K memory map. The instructions allowing X as the index register are ADC, AND, ASL, CMP, DEC, EOR, INC, LDA, LDY, LSR, ORA, ROL, ROR, SBC and STA. The Y index register can be used in ADC, AND, CMP, EOR, LDA, LDX, ORA, SBC and STA.
  2. Zero-page indexed, when the operand is on page-zero. The instructions which allow X as the index register are ADC, AND, ASL, CMP, DEC, EOR, INC, LDA, LDY, LSR, ORA, ROL, ROR, SBC, STA and STY. There are only two instructions which allow the Y register for indexing. They are LDX and STX.

A mysterious bug can occur when using zero-page indexed addressing if the contents of X plus the operand address come to more than 255 or &FF. Clearly the single byte operand cannot hold numbers of this value so a wrap-around takes place. For example, if the instruction is LDA &FE,X and X contains 2, the arithmetical sum would be &100. The wrap-around action, however, will mean that the first hex digit is dropped and the absolute address will be &00 instead of &100.
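The difference between the two forms, including the wrap-around, is easily seen if the address calculation is modelled in Python. The function names are invented for illustration; only the arithmetic is taken from the text.

```python
def absolute_indexed(base, index):
    """Effective address for a form such as LDA &2356,X (two-byte base)."""
    return (base + index) & 0xFFFF


def zero_page_indexed(base, index):
    """Effective address for a form such as LDA &34,X (one-byte base).

    The sum is truncated to eight bits, so it never leaves page zero.
    """
    return (base + index) & 0xFF


print(hex(zero_page_indexed(0x34, 3)))    # 0x37
print(hex(zero_page_indexed(0xFE, 2)))    # 0x0, not 0x100
print(hex(absolute_indexed(0x2356, 2)))   # 0x2358
```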
Indexing allows any item in a block of data to be addressed by suitable adjustment of the index register. The operand of an indexed instruction (the base address) can be the address of the first item in the block or the last, depending on convenience or the programmer's whim. For example, if the base address is to be the start of the block, the index register can be incremented (by INX) within the loop until the last item is reached. On the other hand, it may be more convenient to choose the end of the data as the base address, in which case the index register is decremented (by DEX) until the first item is reached. Decrementation of the index register towards zero is generally recognised to be the more efficient method because the end-of-loop test can be carried out by a simple branch, such as BNE. The incrementation method demands a comparison (CPX or CPY) before the branch test. However, program legibility is sometimes more important than speed. There is a natural inclination to count up towards a finite limit rather than to count down towards zero and there is less chance of being 1 out in the count.
Besides accessing a data block sequentially, indexing is useful for lookup tables. For example, imagine a table of sines (or other mathematical functions) between, say, 0 and 89 degrees to be stored in a data block and the base address is where sin(zero) is located. The table can be accessed by

LDA base, X

If the required angle is in X, the sine of the angle will be in the accumulator. The limitation of 8 bits for each sine will only give an accuracy to about two decimal places unless multi-byte working is used. Also, the programmer must take account of the decimal point when interpreting the result. Obviously, it would be absurd to use this method in place of the resident BASIC trig functions unless high speed access is vital.
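The remarks on accuracy and on the position of the decimal point can be illustrated by building such a table in Python. This is a sketch under an assumed convention, namely that each byte holds the sine multiplied by 256 (with the value capped at 255 so that it fits in one byte); the text does not prescribe any particular scaling and the names used are invented.

```python
import math

# Assumed scaling: a stored byte b represents the fraction b/256.
SCALE = 256
sine_table = [min(255, round(math.sin(math.radians(d)) * SCALE))
              for d in range(90)]


def lookup_sine(angle_degrees):
    """Model of LDA base,X with the angle in X; returns the stored byte."""
    return sine_table[angle_degrees]


def as_fraction(byte_value):
    """Put the 'decimal point' back by interpreting the byte as value/256."""
    return byte_value / SCALE


worst_error = max(abs(as_fraction(sine_table[d]) - math.sin(math.radians(d)))
                  for d in range(90))
print(lookup_sine(30), as_fraction(lookup_sine(30)))  # 128 0.5
print(worst_error < 0.005)                            # True
```

With this scaling the worst error is a little under 0.004, which agrees with the estimate of roughly two decimal places.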
Indexing is really address modification made easy. Besides being interesting, it is worth examining an alternative method (which was, historically, used before index registers were thought of) involving direct modification of the operand. This consists of loading the operand of an instruction into the accumulator (or other register), changing its value and then returning it to the previous location. To see how this works, consider the following line:

STA blogs

The operand has an arbitrary symbolic address. If this were in a loop and we wished to store the next item in blogs+1 without using indexing, it could be achieved as follows:

.Modify STA blogs
INC Modify+1

Note that the original line has now been given an arbitrary label 'Modify', which is where the op-code STA is stored, so the operand (the address blogs) must be held in the bytes which follow, with its low-byte at Modify+1. The next line increments the contents of Modify+1, so the STA now refers to blogs+1 and we have achieved 'address modification' by a roundabout method. If the change is to be more than just a simple increment, say adding 7, the coding could be as follows:

.Modify STA blogs
LDA Modify+1
CLC \ clear carry before adding
ADC #7
STA Modify+1

Such direct alteration of an operand by the program itself is sometimes useful, but it is not a practice to be recommended. Listings of machine code are never easy to follow and these sorts of tricks can only add to the general confusion. It is worth emphasising that the primary function of an index register lies in the ability to alter the effect of an operand without altering the operand itself. One disadvantage of the 6502, which soon becomes evident in the early stages of programming, is that the index registers are limited to 8 bits. This, of course, restricts the range of addresses which can be scanned by indexing to 256 bytes, even when absolute indexing is used.
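What happens to the bytes of the instruction during direct modification can be followed in a small Python model of memory. The addresses chosen for Modify and blogs are hypothetical, as are the function names; the only fact relied upon is that an absolute STA is stored as an op-code followed by the low-byte and then the high-byte of its operand.

```python
STA_ABSOLUTE = 0x8D   # op-code for STA with a two-byte operand

memory = bytearray(0x10000)
MODIFY = 0x2000       # hypothetical address of the label .Modify
BLOGS = 0x3010        # hypothetical address of the symbol blogs

# "STA blogs" as assembled at Modify: op-code, operand low-byte, operand high-byte.
memory[MODIFY] = STA_ABSOLUTE
memory[MODIFY + 1] = BLOGS & 0xFF
memory[MODIFY + 2] = BLOGS >> 8


def operand_address():
    """Address to which the STA at Modify will currently store."""
    return memory[MODIFY + 1] | (memory[MODIFY + 2] << 8)


def inc_modify_plus_1():
    """Model of INC Modify+1: bump the low operand byte, wrapping at &FF."""
    memory[MODIFY + 1] = (memory[MODIFY + 1] + 1) & 0xFF
```

After one call of inc_modify_plus_1() the operand reads &3011. Note that only the low-byte is altered, so in this model a block which straddles a page boundary would also need the high-byte at Modify+2 adjusting.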

Indirect addressing

Mastering any subject consists of systematically overcoming the various intellectual hurdles which appear during a course of study. Student dropouts may occur when a hurdle is reached which is just too high. In machine code programming, there are many hurdles to overcome but the one which is responsible for the greatest student drop-out ratio is the concept of indirect addressing. Indexing is relatively easy to grasp once the advantages of address modification are realised but the following definition may help in understanding why difficulties arise in indirect addressing.

An indirect address is the address of an address.

In assembly language, indirect addressing is indicated by enclosing the operand in parentheses as follows:

LDA (operand)

Note that although the operand is indeed an address, it is where the computer must go to find the address of the data. We shall continue for the moment to use LDA in examples, but it should be mentioned that simple indirect addressing as described above is only available with one instruction, JMP. Providing this is borne in mind, there is no harm in continuing with LDA in the initial stages. Consider the instruction:

LDA (&70)

Because of the parentheses, &70 is an indirect address, directing the computer to the double-byte address held in &70 (low-byte) and &71 (high-byte). This double-byte address is known as the address pointer because it 'points' to where the required data is located. Continuing with the example, suppose that address &70 contains &35 and address &71 contains &0D. Returning now to the original instruction, LDA (&70), it should now be apparent that the contents of address &0D35 will be loaded into the accumulator. We will further assume that &0D35 contains &56.
Let us recap, using this example to illustrate the terms once more:

The instruction was LDA (&70).
The indirect address is &70.
The address pointer is &0D35.
The data pointed to and finally loaded into the accumulator is &56.

Figure 3.1 may help in the understanding of the above example.

Fig 3.1. Data flow in indirect addressing.
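The same example can be traced in Python by treating memory as a table of addresses and contents. This is a sketch of the (fictitious) LDA (&70) only; the function names are invented and the memory contents are those assumed in the text.

```python
# Memory contents assumed in the example above.
memory = {0x70: 0x35, 0x71: 0x0D, 0x0D35: 0x56}


def read_pointer(memory, indirect_address):
    """Assemble the two-byte address pointer, low-byte first."""
    low = memory[indirect_address]
    high = memory[indirect_address + 1]
    return (high << 8) | low


def load_indirect(memory, indirect_address):
    """Model of LDA (operand): fetch the data the pointer points to."""
    return memory[read_pointer(memory, indirect_address)]


print(hex(read_pointer(memory, 0x70)))   # 0xd35
print(hex(load_indirect(memory, 0x70)))  # 0x56
```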

When first introduced to the idea of indirect addressing, it is difficult to grasp the use of it. It appears to be a complicated and tortuous path to follow, merely to place data in the accumulator. For instance, it is understandable, and pertinent, to ask why the line in the above example couldn't have been written in the simpler absolute addressing form:

LDA &0D35

After all, it may be argued, both forms would have identical effects. They would both load the same item of data into the accumulator, but the second form would not be wasting a valuable location (&70) in zero-page and would certainly be quicker to execute. The answer to this lies in the ability of indirect addressing to alter the effect of an operand without altering the operand itself. You will remember that this quality was the fundamental justification for the use of indexed addressing. If the address pointer is changed in any instruction using indirect addressing, the effect of the instruction acts on a different location. This has far-reaching advantages, particularly when writing general purpose machine code subroutines. Clearly, when writing a subroutine intended to act on a block of data, it would be restrictive to force the writer of the program using the subroutine to always place the data in a fixed memory block. However, with indirect addressing, all that is necessary is for the main program to know where the address pointer is (&70 and &71 in our previous example) and load it with the starting address of the data block. This flexibility means that the writer of the machine code subroutine need have no knowledge of the whereabouts of the eventual data block.
Before proceeding further, it should be remembered that the descriptions so far have been simplified by assuming that a 6502 has the instruction LDA (operand). Apart from the single instruction, JMP, simple indirect addressing is not supported. Instead, we have the added benefit (and unfortunately, the added complication) of indirect addressing combined with indexing. In fact, there are two forms to choose from, called 'indirect indexed' and 'indexed indirect'.

Indirect indexed addressing

This is the form most often required. Only the Y index register is allowed in this mode. The assembler form is:

LDA (operand), Y

The operand is single byte and therefore can only refer to a zero-page address.
The only difference between this mode and simple indirect addressing is the addition of the Y register contents to the address pointer. That is to say, the operand still defines where a double-byte pointer is located but the pointer is modified by the addition of the Y register contents. As an example, assume that the following line is written:

LDA (&70),Y

Also assume that the contents of address &70 contains &35, address &71 contains &0D and the Y register contains &02. The effective address pointer will be &0D35 + &02 = &0D37. The effect of the instruction is therefore to load the contents of address &0D37 into the accumulator. The example figures can be used to define a few more terms connected with indirect indexing:

The instruction was LDA (&70),Y.
The base address pointer was &0D35.
The relative offset in Y was &02.
The effective address pointer was &0D37.

Figure 3.2 illustrates the example.

Fig 3.2. Data flow in indirect indexed addressing.
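A Python model of the calculation may make the order of events clearer: the pointer is fetched first and Y is added afterwards. The function name is invented for illustration, and the second call uses a value of Y chosen simply to show a page boundary being crossed.

```python
def indirect_indexed(memory, operand, y):
    """Effective address for a form such as LDA (operand),Y.

    Returns the effective address and whether adding Y crossed a page
    boundary.
    """
    low = memory[operand]
    high = memory[(operand + 1) & 0xFF]
    base_pointer = (high << 8) | low
    effective = (base_pointer + y) & 0xFFFF
    page_crossed = (effective >> 8) != (base_pointer >> 8)
    return effective, page_crossed


memory = {0x70: 0x35, 0x71: 0x0D}
print(indirect_indexed(memory, 0x70, 0x02))  # (3383, False), i.e. &0D37
print(indirect_indexed(memory, 0x70, 0xCB))  # (3584, True), i.e. &0E00
```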

Indirect indexed addressing allows the effect of the operand to be altered in either of two ways: by changing the base address pointer or by altering the contents of the Y register (or both). The index register should be looked upon as an optional extra because there is no need to use it actively. For example, if Y is reset to &00, the instruction,

LDA (&70),Y

has the same effect as the simple (but fictitious) indirect addressing example given earlier:

LDA (&70)

However, an obvious use of indirect indexing lies in sequencing through a block of data items by incrementing or decrementing the Y register. It is helpful to distinguish simple indexed loops from indirect indexed loops by considering under what circumstances they would be used:

  1. Use simple indexing if the base address is known and constant.
  2. Use indirect indexing if the base address is not known at the time of writing or is liable to require changing.

One advantage of indirect addressing not yet mentioned is the ability to reach any part of the 64K memory map by use of a single-byte operand. This is because the address pointer in zero-page is double-byte (16 bits).
The following example is outline coding to perform a process on a block of memory with just sufficient detail to illustrate indirect indexed addressing. Assume that the address of the first data item has previously been assigned to the address pointer in &70 (low-byte) and &71 (high-byte) and that the length of the block minus 1 is 20.

LDY #20
.data LDA (&70),Y
.
.
process
.
.
DEY
BPL data
.
rest of program

The example should require little explanation, except perhaps to note that the indexing proceeds downwards towards zero, so the processing begins with the last data item and finishes with the first. As mentioned earlier, a downwards scan enables the end of the loop to be tested without the use of a CPY.
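The order in which the items are visited can be confirmed with a Python model of the loop. The block address &3000 and its contents are hypothetical, as is the name scan_block; the comments show which line of the outline coding each statement stands for.

```python
def scan_block(memory, pointer_address, last_index, process):
    """Model of the outline loop: visit items last_index down to 0."""
    base = memory[pointer_address] | (memory[pointer_address + 1] << 8)
    y = last_index                              # LDY #20
    while True:
        process(memory[(base + y) & 0xFFFF])    # LDA (&70),Y then the process
        y = (y - 1) & 0xFF                      # DEY
        if y & 0x80:                            # BPL data fails once bit 7 of Y is set
            break


# Hypothetical set-up: a 21-byte block at &3000 holding the values 0 to 20.
memory = bytearray(0x10000)
memory[0x70], memory[0x71] = 0x00, 0x30
for i in range(21):
    memory[0x3000 + i] = i

visited = []
scan_block(memory, 0x70, 20, visited.append)
print(visited[0], visited[-1], len(visited))    # 20 0 21
```

Because BPL tests bit 7 of the result, the model also shows that this form of loop is only suitable when the starting value of Y is 127 or less; with a larger value the first DEY leaves bit 7 set and the loop ends after a single pass.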
Some variations in the jargon exist. The alternative name for indirect indexed (and in some ways more informative) is 'post-indexed' indirect addressing because the indexing is done after the indirect address has been found. Also, address pointers are sometimes called address vectors.
Indirect indexed addressing is available with ADC, AND, CMP, EOR, LDA, ORA, SBC, and STA. They all take 5 clock cycles, plus an extra clock cycle if a page boundary is crossed, except STA which always takes 6.

Indexed indirect addressing

This mode doesn't enjoy quite the same measure of popularity as indirect indexed. The assembler form is:

LDA (operand, X)

Note carefully the position of the parentheses, that X is inside instead of outside and only X is allowed for indexing. As before, the operand must be single-byte so can only refer to a zero-page address.
X is shown within parentheses to emphasise the manner in which indexing is carried out. The behaviour of indexed indirect addressing is as follows:

The address of the pointer in indexed indirect addressing is the sum of the operand and the contents of X.

This definition may explain why an alternative name of this mode is 'pre-indexed' indirect addressing. To aid understanding, first study the following numerical example:

LDA (&70,X)

In the first instance, assume that X is zero. The pointer is then the double byte address which happens to be in &70 (low-byte) and &71 (high-byte). However, if we assume that X contains &02, the address pointer is located at the double-byte address &72 and &73. Proceeding with this example, suppose that &72 contains &35 and &73 contains &0D, the instruction would load the accumulator with the contents of address &0D35. The example is illustrated in Fig. 3.3.

Fig. 3.3. Data flow in indexed indirect addressing.
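In a Python model of this mode the indexing is carried out before the pointer is fetched, which is the reverse of the indirect indexed case. The function name is invented, and the contents given to &70 and &71 are hypothetical since the text does not state them.

```python
def indexed_indirect(memory, operand, x):
    """Return (location of the pointer, address it points to) for (operand,X)."""
    location = (operand + x) & 0xFF          # index first, staying in page zero
    low = memory[location]
    high = memory[(location + 1) & 0xFF]
    return location, (high << 8) | low


# &72 and &73 as in the text; &70 and &71 hold an arbitrary pointer, &2000.
memory = {0x70: 0x00, 0x71: 0x20, 0x72: 0x35, 0x73: 0x0D}

print(indexed_indirect(memory, 0x70, 0x02))  # (114, 3381), i.e. &72 and &0D35
print(indexed_indirect(memory, 0x70, 0x00))  # (112, 8192), i.e. &70 and &2000
```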

Until familiarity is gained, it is easy to get mixed up with the two indirect modes because of the relatively superficial differences in the assembler form. In order to emphasise the difference in form and effect, it is worth viewing the two side by side:

Indirect indexed (post-indexed indirect) addressing keeps the pointer at a constant location but uses Y indexing to modify the pointer value.

Indexed indirect (pre-indexed indirect) addressing uses X indexing to modify the operand, and hence, the location of the address pointer.

As hinted earlier, indexed indirect addressing is not a commonly used mode. One area in which it is valuable is in handling peripheral interrupts. The course of a program can often depend on the particular peripheral which has requested the interrupt. For example, the data sent to a printer will originate from a different area than the data sent to a digital-to-analogue converter. Assuming there are two peripherals on line, then we can arrange to have two separate address pointers to service them, located in zero-page. Suppose these double-byte addresses occupy the four locations &72, &73 and &74, &75 and consider the following line:

STA (&70,X)

The value placed in X must be that which modifies the operand to locate the desired address pointer. Care should be taken when calculating the value of X. The indirect address pointer is a two-byte address, so X must be changed by two at a time, otherwise the instruction above will define the high-byte instead of the low-byte. For example, if X is initially 2, the address pointer selected is located at &72, &73. If X is then incremented only once, there is a foul-up because the address pointer is taken to be &73, &74 which is the high-byte of the first pointer and the low-byte of the second.
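The effect of stepping X by one instead of two can be demonstrated in Python. The two buffer addresses are hypothetical and the name pointer_selected is invented; the pointer locations are those of the example.

```python
def pointer_selected(memory, operand, x):
    """Pointer value fetched by an instruction of the form STA (operand,X)."""
    location = (operand + x) & 0xFF
    return memory[location] | (memory[(location + 1) & 0xFF] << 8)


# Hypothetical buffers: printer data at &3000, converter data at &4100.
memory = bytearray(256)
memory[0x72], memory[0x73] = 0x00, 0x30   # first pointer, low-byte then high-byte
memory[0x74], memory[0x75] = 0x00, 0x41   # second pointer

print(hex(pointer_selected(memory, 0x70, 2)))  # 0x3000
print(hex(pointer_selected(memory, 0x70, 4)))  # 0x4100
print(hex(pointer_selected(memory, 0x70, 3)))  # 0x30, a meaningless mixture
```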
Apart from handling peripherals, indexed indirect addressing can be used to simulate the CASE statement found in some of the structured languages or the ON GOTO in BASIC. Control can be switched to separate machine code processes, each switched by a unique address pointer. The value in X determines which process is activated.
Indexed indirect addressing is available with ADC, AND, CMP, EOR, LDA, ORA, SBC and STA.

Summary

  1. A machine code instruction always has an op-code but not all have operands.
  2. The op-code defines the required action; the operand indicates where data is to be found.
  3. Addressing modes are various ways in which operands express location of data.
  4. The computer recognises only binary op-codes expressed as two hex digits but the resident assembler allows three-letter mnemonic groups.
  5. The precise effect of an instruction is more concise if written in operational symbolism rather than words.
  6. During transfers, source data remains intact but old data at the destination is overwritten.
  7. In normal use, the carry is cleared before adding but set before subtracting.
  8. In double or multiple byte arithmetic, clear carry only before adding the lowest order bytes and set carry only before subtracting the lowest order bytes.
  9. Memory or registers are cleared by a load zero.
  10. There are no CLR instructions. There are no instructions to increment or decrement A.
  11. Use AND to clear, ORA to set and EOR to change selected bits within a byte.
  12. To flip over all bits, exclusive or with &FF.
  13. To produce two's complement, flip first and then add 1.
  14. To find the state of a single bit, mask out uninteresting bits using AND and test for zero.
  15. The BIT test copies bits 6 and 7 of the data into the V and N bits respectively and ANDs the data with A to set the Z flag; A itself is unchanged.
  16. LSR has the carry bit at the lsb end; ASL has the carry bit at the msb end.
  17. Only shift and rotate instructions have accumulator-addressing.
  18. In double-byte multiplication, use ASL for low-order and ROL for high-order byte.
  19. In double-byte division, use LSR first for the high-order then ROR for the low-order byte.
  20. The current state of the process status register determines whether or not a branch takes place.
  21. Branch instructions themselves do not affect the process status register.
  22. BMI and BPL are only useful if two's complement binary is used.
  23. If the branch is out of range, combine with JMP.
  24. In comparisons (CMP, CPX or CPY), the data is subtracted from the register in order to set flags but the original contents are restored.
  25. To check if the register is less, use BCC; to check if equal use BEQ; to check if greater, use BEQ first then BCS.
  26. Implied addressing has no operand.
  27. Immediate addressing is when the operand, which must be single byte, is the data.
  28. Absolute addressing is when the operand, which must be double-byte, is the address of the data.
  29. Zero-page addressing is when the operand, which must be single byte, is the page-zero address of the data.
  30. There are only 32 addresses guaranteed left free by the operating system, &70 to &8F inclusive.
  31. Relative addressing, used only with branch instructions, is when the operand signifies how many bytes away is the next instruction.
  32. Two's complement arithmetic is used to cover forward and backward branches. With the assembler, branch-to-label is possible.
  33. Absolute indexed addressing is when the operand (which must be double byte) plus the index register, is the address of the data.
  34. Zero-page indexed addressing is when the operand (which must be single byte) plus the index register, is the address of the data.
  35. In an indexed instruction, the operand defines the base address, the index register the relative address. The sum of the two is the absolute or 'effective' address.
  36. The operand in simple indirect addressing is the address of the lower order byte of a two-byte address pointer. Only JMP offers this mode.
  37. JMP excepted, address pointers can only reside in zero-page (page zero).
  38. Indirect indexed addressing modifies the address pointer by the addition of Y. The assembler operand format is (operand),Y.
  39. Indexed indirect addressing modifies the address of the address pointer by the addition of the X register. The assembler operand format is (operand,X).
  40. Address pointers are also called vectors.

Self test

  1. Using three lines, multiply data in the accumulator by 3.
  2. Write the instruction to clear bit 5 in the accumulator.
  3. Write the instruction to change bits 3 and 6 in the accumulator.
  4. Write the instruction to set bit 2 in the accumulator.
  5. If the accumulator initially contains 17, what will it contain (in hex) after EOR #&FF?
  6. What is wrong with LDA #&23DF?
  7. Which 6502 register is involved in relative addressing?
  8. In the BBC machine, where are the 32 free locations in page-zero (answer in decimal address range)?
  9. If the effective address in the instruction EOR &73,X is &84, what is the relative address in hex?
  10. Name the one instruction in the 6502 which offers non-indexed indirect addressing.
  11. Which index register is allowed in indexed indirect addressing?
  12. What addressing mode is being used in STA (&75),Y?
  13. In the instruction ADC (&73),Y where is the high-byte of the address pointer located if Y contains 6?
  14. In the instruction ADC (&73,X) where is the low-byte of the address pointer located if X contains 6?
  15. What is an alternative name for indexed indirect addressing?
