PAGE | &D -- 'return' |
PAGE + 1 | MSB of line number |
PAGE + 2 | LSB of line number |
PAGE + 3 | Length of line |
....... | Text of line |
PAGE + N | &D -- 'return' |
PAGE + N + 1 | MSB of line number |
PAGE + N + 2 | LSB of line number |
PAGE + N + 3 | Length of line |
PAGE + N + 4 | Start of text of next line |
etc. . . |
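The layout above can be walked mechanically. The following is a minimal sketch in Python (used here rather than BBC BASIC so that it can be run away from the machine). It assumes that the high byte of the line number comes first, as the memory dump later in this section shows, that the length byte counts from one &D to the next, and that a high byte of &FF marks the end of the program. The function name `split_lines` and the sample bytes are made up for illustration.

```python
def split_lines(image):
    """Split a stored program image into (line_number, text_bytes) pairs."""
    lines = []
    pos = 0
    while pos + 3 < len(image):
        if image[pos] != 0x0D:
            raise ValueError("expected &0D at offset %d" % pos)
        high = image[pos + 1]
        if high == 0xFF:
            # Assumed end-of-program marker: &0D followed by &FF.
            break
        low = image[pos + 2]
        length = image[pos + 3]  # distance from this &0D to the next one
        if length < 4:
            raise ValueError("line length too small at offset %d" % pos)
        text = bytes(image[pos + 4:pos + length])
        lines.append((high * 256 + low, text))
        pos += length
    return lines


# Illustrative image: line 10 and line 12345, then the assumed end marker.
SAMPLE = bytes([
    0x0D, 0x00, 0x0A, 0x0B, 0x20, 0xE5, 0x20, 0x8D, 0x54, 0x79, 0x70,
    0x0D, 0x30, 0x39, 0x06, 0x20, 0xF4,
    0x0D, 0xFF,
])

for number, text in split_lines(SAMPLE):
    print(number, " ".join("%02X" % b for b in text))
```

Because each line carries its own length, the walk never has to inspect the text itself to find the next line.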
The text of the lines is stored in normal ASCII codes, except for a few special cases:
-- All keywords are stored as tokens. These are single byte abbreviations.
-- The line numbers in GOSUB/GOTO/RESTORE/ON . . . GOTO/ON . . . GOSUB are stored in special binary format.
The tokens used are listed in the User Guide. A point to watch is that certain keywords are not totally tokenised. For example, TOP is tokenised as the keyword 'TO', as in FOR, followed by the ASCII letter 'P'.
The format used following a GOTO or GOSUB is particularly involved:
The line number is replaced by a byte 141, followed by three bytes of code:
Bits -- | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Byte 1 | 0 | 1 | 128s | NOT 64s | 0 | NOT 16384s | 0 | 0 |
Byte 2 | 0 | 1 | 32s | 16s | 8s | 4s | 2s | 1s |
Byte 3 | 0 | 1 | 8192s | 4096s | 2048s | 1024s | 512s | 256s |
Those bits marked NOT are one if the line number does not contain the value, and zero if it does. The format is thus basically binary, except that the order of the bits has been altered.
As an example, the line GOTO 12345 will be 'hand tokenised': the code of GOTO is &E5, so this will be the first byte of the line.
A space follows, so the next byte is &20.
Then we get the code 141, or &8D (oddly enough, the double height code in teletext graphics).
The number 12345 in binary is "0011000000111001".
This can be better expressed as:
1 unit
0 twos
0 fours
1 eight
1 sixteen
1 thirty-two
0 sixty-fours
0 one-hundred-and-twenty-eights
0 two-hundred-and-fifty-sixes
0 five-hundred-and-twelves
0 one-thousand-and-twenty-fours
0 two-thousand-and-forty-eights
1 four-thousand-and-ninety-six
1 eight-thousand-one-hundred-and-ninety-two
0 sixteen-thousand-three-hundred-and-eighty-fours
Thus the binary format for the next three bytes is as follows:
Bits -- | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Byte 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 |
Byte 2 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 |
Byte 3 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
In hexadecimal, these bytes are:
Byte 1 | &54 |
Byte 2 | &79 |
Byte 3 | &70 |
10 GOTO 12345
12345 FOR T%=PAGE TO PAGE+20
12346 PRINT ~T%,~?T%
12347 NEXT T%
RUN
E00 D
E01 0
E02 A
E03 20
E05 E5
E06 20
E07 8D
E08 54
E09 79
E0A 70
E0B D
E0C 30
E0D 39
E0E 12
E0F 20
E10 E3
E11 20
E12 54
E13 25
E14 3D
The bytes I have described start at &E05.
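The hand tokenisation above can be checked with a short Python sketch. It packs a line number into the byte 141 followed by the three coded bytes, and unpacks it again. The function names are hypothetical. The XOR with &54 is my way of expressing the fixed one in bit 6 of byte 1 together with the two inverted bits (64s and 16384s) that the worked example implies. It is not taken from the ROM.

```python
def encode_line_number(n):
    """Return the four bytes (141 plus three coded bytes) for line number n."""
    if not 0 <= n <= 32767:
        raise ValueError("line number out of range")
    low, high = n & 0xFF, n >> 8
    # Byte 1 holds the top two bits of each half; XOR &54 sets bit 6
    # and inverts the 64s and 16384s bits.
    byte1 = (((low & 0xC0) >> 2) | ((high & 0xC0) >> 4)) ^ 0x54
    byte2 = (low & 0x3F) | 0x40   # 32s down to 1s
    byte3 = (high & 0x3F) | 0x40  # 8192s down to 256s
    return bytes([0x8D, byte1, byte2, byte3])


def decode_line_number(code):
    """Recover the line number from the four bytes made by encode_line_number."""
    if len(code) != 4 or code[0] != 0x8D:
        raise ValueError("expected byte 141 followed by three bytes")
    top = code[1] ^ 0x54
    low = (code[2] & 0x3F) | ((top & 0x30) << 2)
    high = (code[3] & 0x3F) | ((top & 0x0C) << 4)
    return high * 256 + low


print(" ".join("%02X" % b for b in encode_line_number(12345)))  # 8D 54 79 70
```

Every line number from 0 to 32767 produces exactly four bytes, which is the property the next paragraph relies on.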
The idea of using this peculiar code is to speed up various operations on statements like GOTO/GOSUB/RESTORE/ON ... GOTO. The most obvious advantage is that GOTO 1 occupies the same space as GOTO 32767. Thus the RENUMBER command need only alter these three bytes, and the two bytes holding each line number, to renumber the whole program. In practice it renumbers the lines first, and then searches for every byte of value 141; when it finds one, it rewrites the three bytes that follow. On other computers, the whole program text may need to be moved about to accommodate the differing lengths of program lines as the GOTO and GOSUB destinations are altered.
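The in-place rewrite can be sketched in Python as follows. This is an illustration of the idea, not the ROM's RENUMBER routine. It works on the text of a single line (after the four header bytes), treats every byte 141 as the start of a coded line number, and takes the old-to-new mapping as a dictionary. The names `renumber_references`, `_encode` and `_decode` are hypothetical.

```python
def _encode(n):
    """Three coded bytes for line number n (bit layout as tabulated earlier)."""
    low, high = n & 0xFF, n >> 8
    return bytes([
        (((low & 0xC0) >> 2) | ((high & 0xC0) >> 4)) ^ 0x54,
        (low & 0x3F) | 0x40,
        (high & 0x3F) | 0x40,
    ])


def _decode(code):
    """Line number held in three coded bytes."""
    top = code[0] ^ 0x54
    low = (code[1] & 0x3F) | ((top & 0x30) << 2)
    high = (code[2] & 0x3F) | ((top & 0x0C) << 4)
    return high * 256 + low


def renumber_references(line_text, mapping):
    """Rewrite each coded line number in line_text using mapping (old -> new)."""
    out = bytearray(line_text)
    i = 0
    while i + 3 < len(out):
        if out[i] == 0x8D:
            old = _decode(out[i + 1:i + 4])
            # Destinations missing from the mapping are left unchanged.
            out[i + 1:i + 4] = _encode(mapping.get(old, old))
            i += 4
        else:
            i += 1
    return bytes(out)


before = bytes([0x20, 0xE5, 0x20, 0x8D, 0x54, 0x79, 0x70])  # " GOTO 12345"
after = renumber_references(before, {12345: 10})
print(len(before) == len(after), " ".join("%02X" % b for b in after))
```

The line keeps its length, so neither the length byte nor any following line has to move.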
The other advantage occurs when the line is being interpreted -- the computer need not convert a string of ASCII digits into binary before acting on the command -- it has them in a form of binary already.
Note that, of these statements, the only one of use to a good programmer is RESTORE followed by a line number.
There is a table starting at address &806D in the BASIC ROM which contains all the keywords in ASCII, followed by their tokens. The table ends at address &8358.
The format of the table is: ASCII characters, token, spare byte, and so on for each keyword. The ASCII characters of a keyword end at the first byte greater than 127, since all tokens are &80 or greater. The spare byte records certain properties of the keyword, which need not concern us here.
The program which follows prints out all legal keywords and their tokens, by accessing the table. I have included a sample run:
10 VDU 14
20 T%=&806D:REM &8071 for Basic 2
30 REPEAT
40 REPEAT
50 PRINT CHR$(?T%);
60 T%=T%+1
70 UNTIL ?T%>127
80 PRINT STRING$(20-POS,".");~?T%
90 T%=T%+2
100 UNTIL T%>&8358:REM &8366 for Basic 2
110 VDU 15
RUN
AND.................80
ABS.................94
ACS.................95
ADVAL...............96
ASC.................97
ASN.................98
ATN.................99
AUTO................C6
BGET................9A
BPUT................D5
COLOUR..............FB
CALL................D6
CHAIN...............D7
CHR$................BD
CLEAR...............D8
CLOSE...............D9
CLG.................DA
CLS.................DB
COS.................9B
COUNT...............9C
DATA................DC
DEG.................9D
DEF.................DD
DELETE..............C7
DIV.................81
DIM.................DE
DRAW................DF
ENDPROC.............E1
END.................E0
ENVELOPE............E2
ELSE................8B
EVAL................A0
ERL.................9E
ERROR...............85
EOF.................C5
EOR.................82
ERR.................9F
EXP.................A1
EXT.................A2
FOR.................E3
FALSE...............A3
FN..................A4
GOTO................E5
GET$................BE
GET.................A5
GOSUB...............E4
GCOL................E6
HIMEM...............93
INPUT...............E8
IF..................E7
INKEY$..............BF
INKEY...............A6
INT.................A8
INSTR(..............A7
LIST................C9
LINE................86
LOAD................C8
LOMEM...............92
LOCAL...............EA
LEFT$(..............C0
LEN.................A9
LET.................E9
LOG.................AB
LN..................AA
MID$(...............C1
MODE................EB
MOD.................83
MOVE................EC
NEXT................ED
NEW.................CA
NOT.................AC
OLD.................CB
ON..................EE
OFF.................87
OR..................84
OPENIN..............8E
OPENOUT.............AE
OPENUP..............AD
OSCLI...............FF
PRINT...............F1
PAGE................90
PTR.................8F
PI..................AF
PLOT................F0
POINT(..............B0
PROC................F2
POS.................B1
RETURN..............F8
REPEAT..............F5
REPORT..............F6
READ................F3
REM.................F4
RUN.................F9
RAD.................B2
RESTORE.............F7
RIGHT$(.............C2
RND.................B3
RENUMBER............CC
STEP................88
SAVE................CD
SGN.................B4
SIN.................B5
SQR.................B6
SPC.................89
STR$................C3
STRING$(............C4
SOUND...............D4
STOP................FA
TAN.................B7
THEN................8C
TO..................B8
TAB(................8A
TRACE...............FC
TIME................91
TRUE................B9
UNTIL...............FD
USR.................BA
VDU.................EF
VAL.................BB
VPOS................BC
WIDTH...............FE
PAGE................D0
PTR.................CF
TIME................D1
LOMEM...............D2
HIMEM...............D3
Notice that only those functions which take two or more arguments include the bracket in the token. This is because functions taking a single argument may have the brackets omitted. At the end of the table, the pseudo-variables appear again. Their tokens here are used when the variable appears on the left-hand side of an assignment statement, that is, when a value is being assigned to it. You can see how this works in the list of keywords in the manual.
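For readers without the machine to hand, the table format (ASCII characters, then the token, then one spare byte) can be tried out on a copy of the bytes. The Python sketch below walks such a copy in the same way as the BASIC program. The names `read_keywords` and `format_entry` are hypothetical. The sample fragment uses tokens from the run above, but its spare bytes are placeholder zeros, not the values held in the ROM.

```python
def read_keywords(table):
    """Return (name, token, spare_byte) for each entry in a keyword table."""
    entries = []
    pos = 0
    while pos < len(table):
        start = pos
        # The name ends at the first byte above 127, which is the token.
        while table[pos] < 0x80:
            pos += 1
        name = table[start:pos].decode("ascii")
        entries.append((name, table[pos], table[pos + 1]))
        pos += 2  # skip the token and the spare byte
    return entries


def format_entry(name, token):
    """Lay out one entry as the BASIC program does: dots to column 20, hex token."""
    return name.ljust(20, ".") + "%X" % token


# Illustrative fragment; the spare bytes here are placeholders.
SAMPLE_TABLE = (
    b"AND" + bytes([0x80, 0x00])
    + b"ABS" + bytes([0x94, 0x00])
    + b"PAGE" + bytes([0x90, 0x00])
    + b"PAGE" + bytes([0xD0, 0x00])
)

for name, token, _spare in read_keywords(SAMPLE_TABLE):
    print(format_entry(name, token))
```

In the run above, each pseudo-variable's second token is &40 higher than its first (PAGE is &90 and &D0, PTR is &8F and &CF, and so on).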
On the subject of pseudo variables, here is a list of the locations where TOP, HIMEM, PAGE, and LOMEM can be found:
Name | LSB | MSB |
TOP | &12 | &13 |
PAGE | &1D | |
HIMEM | &6 | &7 |
LOMEM | &0 | &1 |