Bare Metal Programming for ARM VersatilePB I : Getting Started

Goal :

  1. To understand Bare Metal Programming.
  2. To create, compile, and run a bare metal program, for versatilePB board using Qemu.

Requirements :

  1. qemu,
  2. Ubuntu,
  3. Cross compilation toolchain

What is Bare metal Programming??

From :

bare metal: n.

1. [common] New computer hardware, unadorned with such snares and 
    delusions as an operating system, an HLL, or even assembler. 
    Commonly used in the phrase programming on the bare metal, which
    refers to the arduous work of bit bashing needed to create 
    these basic tools for a new machine. Real bare-metal programming
    involves things like building boot proms and BIOS chips, 
    implementing basic monitors used to test device drivers, 
    and writing the assemblers that will be used to write the 
    compiler back ends that will give the new machine a real 
    development environment.

 2. “Programming on the bare metal” is also used to describe a style
 of hand-hacking that relies on bit-level peculiarities of a particular
 hardware design, esp. tricks for speed and space optimization that 
 rely on crocks such as overlapping instructions (or, as in the famous
 case described in The Story of Mel' (in Appendix A), interleaving of
 opcodes on a magnetic drum to minimize fetch delays due to the device's 
 rotational latency). This sort of thing has become rare as the relative
 costs of programming time and machine resources have changed, but is 
 still found in heavily constrained environments such as industrial
 embedded systems. See Real Programmer.

So, BMP, is a kind of programming, which is direclty linked to low level hardware interaction with minimalistic support of external tools customized to launch the application on the silicon.

Bare Metal == Bare silicon

From :

Bare metal programs run without an operating system beneath;

coding on bare metal is useful to

deeply understand how a hardware architecture works and

what happens in the lowest levels of an operating system.

Why C For Baremetal Programming??

From :

 Bare Metal Programming

Like assembly, in C you are forced to work directly with memory, 
pointers, and the underlying arrangement of the bits and bytes 
that make up your data structures and their layout (packed 
structures, byte ordering, data types and conversions, etc.) 
No garbage collection, no forgiveness for buffer overruns, no 
regular expressions. C is for bare metal programming — which 
typically require safe coding practices, tight program design, 
and a good understanding of the toolchain and the target environment.

But when you want the underlying elements of the data structures 
to be visible and your algorithms to be working with these bare
metal details, C is a great choice for turning out programs that
interact with hardware, networks, sensors, and peripherals, and 
that are small (compiled size), fast (execution time), and have 
a small footprint (few dependencies, low loading overhead).

With the aid of bare bones compilation modes like
gcc -S %1.c -fno-exceptions -s -Os, and of working syntax 
converters like A2I, whenever you wish to see the kind the
assembly code that C is generating for you, you can.

Embrace C for systems level development, and you’ll retain 
the power of assembly but from a more efficient point of view.

So, Let me know what does a Executable file contain? [ Example a.out]

Okay. Enough of this boredom. Let me get my dirty hand on code….

Simplest Bare metal code..

Contents of test.c

int c_entry() {
return 0;

Contents of starup.s

.section INTERRUPT_VECTOR, "x"
.global _Reset
B Reset_Handler /* Reset */
B . /* Undefined */
B . /* SWI */
B . /* Prefetch Abort */
B . /* Data Abort */
B . /* reserved */
B . /* IRQ */
B . /* FIQ */
LDR sp, =stack_top
BL c_entry
B .

Contents of test.ld

. = 0x0;
.text : {
.data : { *(.data) }
.bss : { *(.bss COMMON) }
. = ALIGN(8);
. = . + 0x1000; /* 4kB of stack memory */
stack_top = .;

Explanation of these contents :

1 : functiona named c_entry taking no arguments and returning int
2 : returns 0
3:  function ends

1: generates a section named INTERRUPT_VECTOR containing executable 
   (“x”) code. Later we can see we use this in linker script.
2: exports the name _Reset to the linker in order to set the program 
   entry point.
3: to 11 is the interrupt vector table that contains a series of branches. 
   The notation “B .” means that the code branches on itself and  stays 
   there forever like an endless for(;;);
14:initializes the stack pointer, that is necessary when calling C 
   functions. The top of the stack (stack_top) will be defined during
15:calls the c_entry function, and saves the return address in the link 
   register (lr).

1 : Enter / Start execution at this symbol inside binary
2 - 14 : Defines section layout
  3 : Begining of section definition
  4 : Start the address at 0x0
  5 : Place the .text segment next
     6 : Place startup.o in text segment.
	     It puts INTERUPT_VECTOR at 0x0
	 7,8 : End of text segment
  9 : Start .data segment here.
 10 : Start .bss Segment following data segment
 11 : Memory allignment is 8 bits
 12 : Allocate 4Kb of stack memory starting from here
 13 : Point the top of stack at this location
14 : Section layout definition ends here

 .txt  : Code segment
 .data : Initialized data
 .bss  : Uninitialized data

Compile them ..

raj@19:31:46:$ ls
startup.s  test.c  test.ld
raj@19:31:47:$ arm-none-eabi-as -mcpu=arm926ej-s -g startup.s -o startup.o
raj@19:32:43:$ ls
startup.o  startup.s  test.c  test.ld
raj@19:32:44:$ arm-none-eabi-gcc -c -mcpu=arm926ej-s -g test.c -o test.o
raj@19:32:53:$ ls
startup.o  startup.s  test.c  test.ld  test.o

Main Function??

Ohh.. okay.. but the simplest program I know is Hello world which has a Main function.. Where is the main here?? And what is this startup.s??

Yeah.. Yeah.. You are right.. Before pre judging something, please have a look at here to understand what is a Main function??

So now, You know Main function is not the starting point and before that C Run time library comes into picture..

But we dont have any support for crt0 in our ‘Bare silicon’. So we dont have main and we replace them with startup.s

Lets Link them..

raj@19:32:54:$ arm-none-eabi-ld -T test.ld test.o startup.o -o test.elf
raj@19:32:54:$  ls
startup.o  startup.s  test.c  test.elf  test.ld  test.o

Now, we got an ELF file, which is the executable

Okay. How do I run this now??

As you can see in the code, we do nothing but we just return 0. So we can not verify the output if we run this.. Sob..Sob..

So all of what I did so far is utter waste??

No!! You can sill verify your work and this is the foundation you have laid..

Let us examine your binaries and try to find out what it has..

We can use nm command to see symbols in a object files..

From man nm :
nm - list symbols from object files

Lets start with test.o which is our compiled c code

raj@20:05:18:$ ls
startup.o  startup.s  test.c  test.elf  test.ld  test.o
raj@20:05:55:$ nm test.o 
00000000 T c_entry

Here you can see the c_entry symbol listed against ‘00000000’ and ‘T’

‘T’ means its in Text segment.

Next, lets see our startup.o

 raj@20:15:54:$nm startup.o
00000020 t Reset_Handler
00000000 T _Reset
         U c_entry
         U stack_top

Here, we can witness our Reset_Handler as ‘t’ at location 20.

This 20 before Reset_Handler is occupied by the _Reset vector,

which we put at first.

Here you can see we have two cases of ‘T’ and ‘t’.

Might be ‘T’ indicates its global.

Okay. We linked it by our linker script test.ld..

Lets see the result of that..

raj@20:15:54:$nm test.elf 
00000020 t Reset_Handler
00000000 T _Reset
00000030 T c_entry
00001050 A stack_top

Good.. We have our data alligned as we have defined.

_Reset at start followed by Reset_Handler, followed by c_entry.

All these are text segment..

Then we can see our stack code..

Yeah.. fine.. But where is my data and .bss segments went??

In above, we cant see this because we have not declared any global data. So obviously we cant see them.. In case if you want to see them, you can declare them and recompile and check.

Buddha said, Dont belive untill you have chcked it.. So, I decided to check it.. I declared three variables as below.

 Contents of test.c 
int global_init_var = 10;
int global_unint_var;
int c_entry() {
    int local_uinit;
    return 0;

And here is the corresponding output

00000020 t Reset_Handler
00000000 T _Reset
00000030 T c_entry
0000005c B global_unint_var
00000058 D global_init_var
00001060 A stack_top

0000005c B global_unint_var – uninitialized Global variable which is put

into bss segment..

00000058 D global_init_var – initialized Global variable which is put

into data segment..

Want to explore some more??

Balducci has explained something about using gdb to debug symbols.. Please have the look at them too..


Author: Shingu

Search in pursuit. Help me if you can.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s