Before C
The venerable BCPL procedural structured programming language is fast to compile, is reliable and efficient, offers a wide range of software libraries and system functions, and is available on several platforms, including the Raspberry Pi.
In the 1960s, the main high-level programming languages were Fortran, Basic, Algol 60, and COBOL. To optimize code or to provide low-level operations, assembler programming offered the only means to access registers and execute specific machine instructions. BCPL, which was used as a teaching language in many universities, provided a language with a rich syntax, addressed the scoping limitations of the other languages, and had low-level operations such as bit manipulation and computation of variable addresses.
Where BCPL differs from the other languages is that it is typeless; all variables are considered to be a word, typically 16 or 32 bits. Programmers can access individual bits and bytes of a word, perform both arithmetic and logical operations on words, compute the address of a word, or use a word as a pointer to another word. One further novel aspect of BCPL is that the compiler is small and written in BCPL, producing intermediate code for a virtual machine and simplifying the development of the compiler for a wide range of computers. BCPL was used on mainframe computers and minicomputers in the 1970s and microprocessors in the 1980s.
The early developers of Unix were influenced by, and many aspects of C were adopted directly from, BCPL. Although BCPL also supported characters and bytes, the lack of richer types was addressed in C, which became the programming language of choice for Unix (and subsequently Linux), leaving BCPL mostly for academic applications. Several groups developed compilers, operating systems, software utilities, commercial packages, and even flight simulation software in BCPL, but for the most part, BCPL has been forgotten.
The demise of BCPL in both academia and industry is disappointing, particularly because it is a powerful teaching language, introducing students to algorithms, software design, and compiler design. Later, languages such as Pascal and Modula-2 became popular languages to introduce concepts in computer science but have been superseded by Java, Python, and C++. Whereas the learning curve for BCPL is small, enabling students to become productive in a short time, the complexity of languages such as C++ can be a barrier to students learning their first programming language.
The BCPL Language
The example in Listing 1 of a small BCPL program computes factorial values from 1! to 5!. Because C was developed from BCPL, the syntax of both languages is similar. The include
directive in C is a GET
directive in BCPL, the assignment operator =
in C is :=
in BCPL, and the fences (curly brackets) {
and }
are identical. In C the address of a variable a
is denoted by &a
, whereas in BCPL it is given by @a
. Indirection, or the use of pointers, is given by *a
in C or !a
in BCPL. Arrays are organized so that a!b
in BCPL corresponds to a[b]
in C.
Listing 1: 1! to 5! in BCPL
01 GET "libhdr"
02
03 LET start() = VALOF
04 {
05 FOR i = 1 TO 5 DO
06 writef("fact(%n) = %i4*n", i, fact(i))
07 RESULTIS 0
08 }
09
10 AND fact(n) = n=0 ‑> 1, n*fact(n‑1)
The GET
directive includes the common procedures and definitions needed in the compilation of a program. The procedure start
is similar to main
in C, where the VALOF
keyword denotes that start
is a function with the result returned by the RESULTIS
keyword. The variable i
, a local variable of the procedure start
, is implicitly defined at the start of the FOR
loop, which is executed five times. The writef
function is similar to printf
in C. The recursive function fact
tests whether n
is zero and returns either 1
or n*(n-1)!
, where the parameter n
is a local variable of the procedure fact
.
In BCPL, a variable is defined as a word that can represent an integer, a bit pattern, a character, a pointer to a string of characters, a floating-point number, or an address. A programmer can apply arithmetic operators, logical operators, shift operators, an address operator, or indirection to a variable – the compiler assumes that the programmer knows what they are doing and, subject to syntactic and sematic compilation checks, places very few constraints on programming constructions. Arguably, C and BCPL fall into the category of languages that provide almost unlimited power for a programmer with very few checks on their intention.
Both C and BCPL allow sections of a program to be compiled separately (e.g., to provide a library of functions). Global variables and procedures in BCPL, which are similar to external variables and functions in C, can be accessed by all sections of a program, whereas static variables are only accessible from the section in which they are declared. The other category of variables is local or dynamic variables, which are declared and used in the same way as C. When a local variable is declared, space is allocated on a stack, which grows and shrinks dynamically, typically on entry to and exit from a procedure, respectively, enabling procedures to be called recursively.
Portability
BCPL was developed by Martin Richards in the Computer Laboratory at the University of Cambridge. His more recent Cintcode implementation is extensive and provides numerous examples of coding, mathematical algorithms, and even operating system functions. The advantages of this implementation are considerable: It is fast to compile, is reliable and efficient, and offers a wide range of software libraries and system functions. It is also available on several platforms, including the PC and the Raspberry Pi. The only drawback is the loss of speed from interpreting the compiled code.
I refer you to Martin Richard’s textbook , and his website which includes a version of Cintcode, that is straightforward to download and implement on an RPi. Also, a guide directed at young people programming a Raspberry Pi provides an extensive description of BCPL and the Cintcode implementation and numerous examples of BCPL programs.
For the programmer intending to write applications in BCPL that exploit the processing power of the ARM cores of a Raspberry Pi, a BCPL compiler generating ARM instructions directly is likely to produce code which runs considerably faster than interpreted code. For other users less concerned with processing speed, the tools and support provided by the Cintcode implementation of BCPL offer a stable and reliable platform.
BCPL for the Raspberry Pi
The arrival of the Raspberry Pi with its ARM cores, network connection, sound and video outputs, USB ports, and I/O interface running under the Linux operating system has encouraged the development of a range of programming languages for this platform. A code generator for BCPL that I developed compiles BCPL directly to ARM machine code, which can be linked with the standard Linux gcc
toolset. The compiler (7,000 lines) compiles itself in less than 0.2 seconds on a Raspberry Pi 4B.
This 32-bit implementation of BCPL compiles a BCPL program prog.b
to prog.o
, where prog.o
is a Linux object module linked with two libraries – blib.o
and alib.o
– by the gcc
linker to produce an executable ELF module, prog
. The library blib.b
is written in BCPL and contains the common BCPL library functions. A small library alib.s
is written in Linux assembler and contains low-level functions to access the Linux runtime environment.
Although the gcc
linker builds the executable program, the object code produced by the compiler contains only blocks of position-independent code, requiring no relocation. At runtime, alib
initializes the BCPL environment, setting up the workspace for the stack and global and static variables. Strictly, gcc
is only used to generate a Linux-compatible module that can be loaded, whereas the linking of a BCPL program and libraries is performed by alib
.
Notes for Developers
The compiler uses registers r0 to r9 for arithmetic operations, logic operations, and procedure calls. The code generator attempts to optimize the code by keeping variables in registers, minimizing the number of memory accesses.
Register rg points to the global vector, and register rp is the BCPL stack pointer or frame pointer. Procedure linkage, procedure arguments, and local variables are allocated space in the current frame. Stack space is claimed on entry to a procedure and released on return from a procedure. The link register lr holds the return address on entry to a procedure and can also be used as a temporary register within a procedure. The system stack pointer sp is not used by the BCPL compiler, so it can be used to push and pop temporary variables. The compiler uses the BCPL stack for procedure linkage and the storage of local variables. It should be noted that the ARM core is a pipelined processor and reference to pc during an instruction implies the address of the current instruction+8 for most instructions. The program counter pc is used in the code generation of relative addresses used for procedure calls and branches and also in switchon
expressions in BCPL.
Although Linux libraries are not explicitly linked, the libc library is available to BCPL programs. Fortunately, the register calling mechanisms of the GNU gcc
tool chain and BCPL are distinct and independent. The BCPL stack grows upward, with no access or modification to the system stack. In C, the stack grows downward, and local variables are stored relative to the system stack pointer sp. Consequently, it is possible to call C functions from BCPL.
In the ARM Procedure Call Standard (APCS), the first four arguments are loaded into registers r0, r1, r2, and r3, respectively, and a result is returned in register r0. The address of the procedure is computed, and the procedure is called by an appropriate branch and link (bl) instruction or a branch, link, and exchange instruction (blx).
However, C and BCPL have two important differences: (1) BCPL strings are defined by the string size in the first byte followed by the 8-bit characters of the string, whereas strings in C are arrays of 8-bit characters terminated with a zero byte. BCPL strings must be converted to C strings, if calling C. (2) Addresses of variables, vectors, and strings in BCPL are word addresses, whereas they are machine addresses in C. Passing an address from BCPL to C requires a logical left shift of two places, and passing an address from C to BCPL requires a logical right shift of two places. Care is needed with strings in C because they are not necessary aligned on 32-bit word boundaries.
In both C and BCPL, the registers r0-r9 are not preserved across procedure calls. Additionally, the BCPL registers rp, rg, and lr cannot be guaranteed to be preserved in C, and it is advisable to store these registers before calling a C procedure. In practice, they can be pushed onto the system stack and popped on return by:
push {rg, rp, lr}
pop {rg, rp, lr}
The code produced by the code generator for the factorial example is shown in Listing 2 with comments to explain specific instructions. Note that register r0 is reloaded at location 0x38 because it is reached by code from locations 0x34 and 0x74; consequently, the content of register r0 is not assured. Additionally, the reference to the string
"fact(%n) = %i4*n"
is not known at location 0x4C when the instruction is generated; therefore, a full static reference is generated with the offset 0x00000028 stored at location 0x90.
Register |
Name |
Function |
0 |
r0 |
Data register 0 |
1 |
r1 |
Data register 1 |
2 |
r2 |
Data register 2 |
3 |
r3 |
Data register 3 |
4 |
r4 |
Data register 4 |
5 |
r5 |
Data register 5 |
6 |
r6 |
Data register 6 |
7 |
r7 |
Data register 7 |
8 |
r8 |
Data register 8 |
9 |
r9 |
Data register 9 |
10 |
rg |
Global vector |
11 |
rp |
BCPL stack |
12 |
ip |
Unused |
13 |
lr |
Link register |
14 |
sp |
System stack pointer |
15 |
pc |
Program counter |
Listing 2: Code Generator Output
0: 0000003c data Section size (words)
4: 0000fddf data Section identifier
8: 6361660b data Section name “fact”
c: 20202074 data
10: 20202020 data
14: 0000dfdf data Entry identifier
18: 6174730b data Procedure name “start”
1c: 20207472 data
20: 20202020 data
24: e8a4c800 stmia r4!,{fp,lr,pc} Standard procedure entry
28: e884000f stm r4,{r0,r1,r2,r3}
2c: e244b00c sub fp,r4,#12
30: e3a00001 mov r0,#1 Initial value i
34: e58b000c str r0,[fp,#12] Save i
38: e59b000c ldr r0,[fp,#12] Load i
3c: e28b4024 add r4,fp,#36 Set new stack frame
40: eb000017 bl 0xa4 Call f(i)
44: e1a02000 mov r2,r0 Arg 3 = f(i)
48: e59b100c ldr r1,[fp,#12] Arg 2 = i
4c: e59fe03c ldr lr,[pc,#60] Arg 1 = “fact(%n) = %i4*n”
50: e08f000e add r0,pc,lr pc offset
54: e1a00120 lsr r0,r0,#2 BCPL address
58: e28b4010 add r4,fp,#16 Set new stack frame
5c: e59ae178 ldr lr,[sl,#376] Global writef
60: e12fff3e blx lr Call writef()
64: e59b000c ldr r0,[fp,#12] Load i
68: e2800001 add r0,r0,#1 Increment by 1
6c: e58b000c str r0,[fp,#12] Store i
70: e3500005 cmp r0,#5 Check end of for-loop
74: daffffef ble 0x38 Continue for-loop
78: e3a00000 mov r0,#0 Return 0
7c: e89b8800 ldm fp,{fp,pc} Standard procedure return
80: 6361660f data String “fact(%n) = %i4*n”
84: 6e252874 data
88: 203d2029 data
8c: 0a346925 data
90: 00000028 data
94: 0000dfdf data Entry identifier
98: 6361660b data String “fact”
9c: 20202074 data
a0: 20202020 data
a4: e8a4c800 stmia r4!,{fp,lr,pc} Standard procedure entry
a8: e884000f stm r4,{r0,r1,r2,r3}
ac: e244b00c sub fp,r4,#12
b0: e3500000 cmp r0,#0 Test n=0
b4: 1a000001 bne 0xc0 Skip if not
b8: e3a00001 mov r0,#1 Return 1
bc: e89b8800 ldm fp,{fp,pc} Standard procedure return
c0: e59b000c ldr r0,[fp,#12] Load n
c4: e2400001 sub r0,r0,#1 Decrement n
c8: e28b4010 add r4,fp,#16 Set new stack frame
cc: ebfffff4 bl 0xa4 Call f(n-1)
d0: e59b100c ldr r1,[fp,#12] Get n
d4: e0000190 mul r0,r0,r1 Return n*(n-1)
d8: e89b8800 ldm fp,{fp,pc} Standard procedure return
dc: 00000000 data No statics
e0: 00000000 data Start of global vector
e4: 00000001 data Global 1 (start)
e8: 00000024 data Offset to global 1
ec: 0000005e data Maximum global of the section
Installation
The file bcpl_distribution
contains the files shown in Table 2. The object files bcpl.o
and blib.o
each contain a block of position-independent code. The assembler module leader.s
provides a means of identifying the start of a BCPL program. The runtime library alib.s
is written in assembler code and includes data regions for the global variables and static variables and is linked to the GNU C runtime library libc
. Note that the files bcpl.b
and bcplfecg.h
are only needed to rebuild the compiler and are not required for user applications.
File Name |
Function |
alib.s |
A runtime library written in GNU ARM assembler |
blib.b |
The BCPL runtime library, written in BCPL |
blib.o |
A precompiled version of the BCPL runtime library blib.b |
bcpl.b |
The BCPL compiler and code generator to run under Linux |
bcpl.o |
A precompiled version of the BCPL compiler and code generator |
bcplcg.b |
The code generator used by the BCPL compiler for the ARM processor |
bcplfecg.h |
A header file used by the code generator |
leader.s |
A small assembler program only used to locate the start of a BCPL program |
libhdr.h |
The standard BCPL header |
The distribution also includes several BCPL examples and a user guide (Table 3). The programs queens.b
and primes.b
are described in Martin Richard’s excellent notes to young people interested in programming the Raspberry Pi [3].
File |
Content |
bench.b |
A small program to time the execution of a small fragment of BCPL |
fact.b |
A small program to print the factorial numbers from 1! to 5! |
primes.b |
A small program to print the prime numbers less than 1,000 |
queens.b |
An implementation of the “Queens” problem for 1 to 16 pieces |
guide.pdf |
A guide to BCPL for the Raspberry Pi, including installation notes |
To install BCPL on Raspberry Pi Model 3 or 4, create a new directory and copy the distribution files in bcpl-distribution
to this directory. Alternatively, to install BCPL on a Raspberry Pi Model 2, copy the distribution files in bcpl-distribution-rpi2. In a terminal shell, enter the commands
>unzip bcpl-distribution.zip
>as leader.s -o leader.o
>as alib.s -o alib.o
>gcc leader.o bcpl.o blib.o alib.o -o bcpl
to build and test the compiler (>
denotes the Linux prompt).
For a first compiler test, compile and run the program fact.b
, which prints the factorial numbers from 1! to 5!:
>./bcpl fact.b -o fact
>./fact
Further confidence tests rebuild the BCPL compiler bcpl.b
with the BCPL compiler and build the library blib.b
:
>./bcpl bcpl.b -o bcpl
>./bcpl -c blib.b
The BCPL library files and the compiler can then be copied to the appropriate Linux shared directories:
>sudo mkdir /usr/include/BCPL
>sudo cp libhdr.h /usr/include/BCPL/
>sudo cp bcpl /usr/bin/
>sudo cp leader.o /usr/lib/
>sudo cp blib.o /usr/lib/
>sudo cp alib.o /usr/lib/
The remaining BCPL programs can now be compiled and run with the command bcpl
rather than ./bcpl
. The compiler searches for library files in the working directory before searching the directories /usr/include/BCPL
and /usr/lib
.
Nostalgia
The influence of BCPL on the development of C and its later variants cannot be overstated. The availability of BCPL for the Raspberry Pi allows old computer science students to dust off copies of their programs, which should run directly on the Raspberry Pi. BCPL was used extensively in many UK university computer science departments. The portable multi-tasking operating system Tripos was written entirely in BCPL in the Computer Laboratory at the University of Cambridge and used in early versions of the Commodore Amiga, in the automotive industry, and in financial applications.
The logic simulator HILO-2 (the forerunner of Verilog) was developed in BCPL. Numerous utilities, including the early word processor roff
were written in BCPL. Before the availability of floating-point hardware, I adapted BCPL compilers for the Motorola 6809 and 68000 processors to use scaled fixed-point arithmetic in real-time flight simulation. nnn