DigiKey Coffee Cup Non-SDK Baremetal RISC-V Assembly & Machine Language for Raspberry Pico 2 (Linux)

The Raspberry Pico 2 already has a Software Development Kit (SDK) that supports ARM assembly language development. This post illustrates how to develop a true bare-metal assembly language program for the RISC-V cores without using the Raspberry Pico 2 SDK. Here is the block diagram that illustrates the two architecture options (ARM or RISC-V) available to the programmer:

The block diagram shows two independent RISC-V cores (dual-core RISC-V); the dual-core ARM option is not used in this article. By the RISC-V standard, a component is termed a core if it contains an independent instruction fetch unit.

To program the Raspberry Pico 2 in assembly language for the RISC-V option without the SDK, follow the steps below. First, install the RISC-V GCC toolchain (which provides the assembler) on Linux:

sudo apt-get -y install gcc-riscv64-linux-gnu

Here is the baseline assembly language program that blinks the LED connected to GPIO25 on the Raspberry Pico 2. This RISC-V assembly template, which we call digikey_coffee_cup_blink.s, can be used as a starting point for RISC-V assembly language development on the Raspberry Pico 2:

digikey_coffee_cup_blink.s


# ------------------------------------------
#  Digikey Coffee Cup RISC Assembly Language
#  Without Raspberry Pi SDK 
#  Blink LED in RP2350 Pi Pico 2
# ------------------------------------------

.option norelax
.option rvc

# Memory map:

# 0x10000000                      XIP Base
# 0x20000000 - 0x2007FFFF: 512 KiB SRAM
# 0x20080000 - 0x20081FFF:   8 KiB SRAM, too

.equ RESETS_BASE, 0x40020000

.equ CLOCKS_BASE, 0x40010000
.equ CLK_PERI_CTRL,  CLOCKS_BASE + 12 * 6

.equ IO_BANK0_BASE, 0x40028000
.equ GPIO_25_STATUS, IO_BANK0_BASE + (8 * 25)
.equ GPIO_25_CTRL,   IO_BANK0_BASE + (8 * 25) + 4

.equ PADS_BANK0_BASE, 0x40038000
.equ GPIO_25_PAD,    PADS_BANK0_BASE + 0x68

.equ SIO_BASE, 0xd0000000
.equ GPIO_IN,        SIO_BASE + 0x004  # Input value for GPIO pins
.equ GPIO_OUT,       SIO_BASE + 0x010  # GPIO output value
.equ GPIO_OE,        SIO_BASE + 0x030  # GPIO output enable

# --------------------------------------------------
#   Execution starts here, in XIP from SPI flash
# --------------------------------------------------

.text

  # Take care: We are executing at 0x10000000 currently.
  # Copy code from SPI Flash to RAM for execution, mirroring at 0

  auipc x8, 0       # auipc rd, imm: rd = pc + (imm << 12); x8 = address of this instruction in flash
  li x9, 0x20000000 # li rd, imm (pseudo-instruction): rd = imm; x9 = start of SRAM
  li x10, 0x100 # Just copy 256 bytes for this example...

1:lw x11, 0(x8) # lw rd, imm(rs1): rd = mem[rs1 + imm]
  sw x11, 0(x9)

  addi x8, x8, 4
  addi x9, x9, 4
  addi x10, x10, -4

  bnez x10, 1b

  # Long absolute jump into RAM now:

  lui x8, %hi(Reset)
  jalr zero, x8, %lo(Reset)

# -----------------------------------------------------------------------------
#  RAM start
# -----------------------------------------------------------------------------

Reset:

  # Remove reset of all subsystems
  li x10, RESETS_BASE
  sw zero, 0(x10)

  # Enable peripheral clock
  li x10, CLK_PERI_CTRL
  li x11, 0x800
  sw x11, 0(x10)

  # Set GPIO[25] function to single-cycle I/O: Function 5 SIO
  li x10, GPIO_25_CTRL
  li x11, 5
  sw x11, 0(x10)

  # Remove pad isolation control bit and select drive strength to 12 mA
  li x10, GPIO_25_PAD
  li x11, 0x34
  sw x11, 0(x10)

  # Set GPIO[25] output enable
  li x10, GPIO_OE
  li x11, 1<<25
  sw x11, 0(x10)
  
  li x8, GPIO_OUT                # LED output register

# -----------------------------------------------------------------------------
loop: # Blink LED
# -----------------------------------------------------------------------------

  # Register usage:

  # x8  : Initialised with IO address for GPIO
  # x13 : Scratch

  li  x13, 1<<25 
  sw  x13, 0(x8)               # Set LED
  
  # Delay using two registers, addition and bne instruction
  li t0, 1000000
  li t1, 0
.D_timer1:
  addi t1, t1, 1
  bne t0, t1, .D_timer1
    
  li  x13, 0<<25 
  sw  x13, 0(x8)               # Unset LED
  
  # Delay using two registers, addition and bne instruction
  li t0, 1000000
  li t1, 0
.D_timer2:
  addi t1, t1, 1
  bne t0, t1, .D_timer2
  
  j loop

# -----------------------------------------------------------------------------
.p2align 2 # This special signature must appear within the first 4 kb of
image_def: # the memory image to be recognised as a valid RISC-V binary.
# -----------------------------------------------------------------------------

  .word 0xffffded3
  .word 0x11010142
  .word 0x000001ff
  .word 0x00000000
  .word 0xab123579
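
The register addresses in the listing are plain base-plus-offset arithmetic, so they are easy to double-check outside the assembler. The following Python sketch recomputes them from the .equ lines; the helper split_hi_lo is a hypothetical name introduced here, and it only illustrates how a 32-bit constant divides into an upper 20-bit part and a signed lower 12-bit part (the assembler may pick other, shorter encodings for a given li).

```python
# Base addresses copied from the .equ lines of the listing.
CLOCKS_BASE = 0x40010000
IO_BANK0_BASE = 0x40028000
PADS_BANK0_BASE = 0x40038000
SIO_BASE = 0xD0000000

# The same arithmetic the assembler performs for each derived symbol.
ADDRESSES = {
    "CLK_PERI_CTRL": CLOCKS_BASE + 12 * 6,
    "GPIO_25_STATUS": IO_BANK0_BASE + 8 * 25,
    "GPIO_25_CTRL": IO_BANK0_BASE + 8 * 25 + 4,
    "GPIO_25_PAD": PADS_BANK0_BASE + 0x68,
    "GPIO_OUT": SIO_BASE + 0x010,
    "GPIO_OE": SIO_BASE + 0x030,
}


def split_hi_lo(value):
    """Split a 32-bit constant into an upper 20-bit part and a signed 12-bit part."""
    lo = ((value & 0xFFF) ^ 0x800) - 0x800   # sign-extend the low 12 bits
    hi = ((value - lo) >> 12) & 0xFFFFF      # what remains goes in the upper 20 bits
    return hi, lo


for name, addr in ADDRESSES.items():
    hi, lo = split_hi_lo(addr)
    print("{:15s} 0x{:08x}  hi=0x{:05x} lo={:d}".format(name, addr, hi, lo))
```

For CLK_PERI_CTRL this gives 0x40010048, split into 0x40010 and 0x48, which corresponds to the words 40010537 and 04850513 in the binary dump shown later in this article.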

Now run the assembler on the digikey_coffee_cup_blink.s file:

riscv64-linux-gnu-as digikey_coffee_cup_blink.s -o digikey_coffee_cup_blink.o -march=rv32imac

This produces an object file called digikey_coffee_cup_blink.o, which the linker uses together with the following linker script, called memmap:

memmap

MEMORY
{
   rom(RX)   : ORIGIN = 0x20000000, LENGTH = 0x0100
}

SECTIONS
{
   .text : { *(.text*) } > rom
}

Executing the linker as follows produces digikey_coffee_cup_blink.elf:

riscv64-linux-gnu-ld -o digikey_coffee_cup_blink.elf -T memmap digikey_coffee_cup_blink.o -m elf32lriscv

Now disassemble the digikey_coffee_cup_blink.elf file as follows:

riscv64-linux-gnu-objdump -Mnumeric -d digikey_coffee_cup_blink.elf > digikey_coffee_cup_blink.list 

This creates a listing file called digikey_coffee_cup_blink.list, which shows the generated object code and the corresponding addresses. Finally, to create the digikey_coffee_cup_blink.bin binary file, perform this step:

riscv64-linux-gnu-objcopy digikey_coffee_cup_blink.elf digikey_coffee_cup_blink.bin -O binary 

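This creates the digikey_coffee_cup_blink.bin binary file (machine language). A utility written in Python called uf2conv.py transforms this file into the UF2 form suitable for loading into the Raspberry Pico 2. Note that the script reads a uf2families.json file from its own directory at startup, so that file must be present next to it: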

#!/usr/bin/env python3
import sys
import struct
import subprocess
import re
import os
import os.path
import argparse
import json
from time import sleep


UF2_MAGIC_START0 = 0x0A324655 # "UF2\n"
UF2_MAGIC_START1 = 0x9E5D5157 # Randomly selected
UF2_MAGIC_END    = 0x0AB16F30 # Ditto

INFO_FILE = "/INFO_UF2.TXT"

appstartaddr = 0x2000
familyid = 0x0


def is_uf2(buf):
    w = struct.unpack("<II", buf[0:8])
    return w[0] == UF2_MAGIC_START0 and w[1] == UF2_MAGIC_START1

def is_hex(buf):
    try:
        w = buf[0:30].decode("utf-8")
    except UnicodeDecodeError:
        return False
    if w[0] == ':' and re.match(rb"^[:0-9a-fA-F\r\n]+$", buf):
        return True
    return False

def convert_from_uf2(buf):
    global appstartaddr
    global familyid
    numblocks = len(buf) // 512
    curraddr = None
    currfamilyid = None
    families_found = {}
    prev_flag = None
    all_flags_same = True
    outp = []
    for blockno in range(numblocks):
        ptr = blockno * 512
        block = buf[ptr:ptr + 512]
        hd = struct.unpack(b"<IIIIIIII", block[0:32])
        if hd[0] != UF2_MAGIC_START0 or hd[1] != UF2_MAGIC_START1:
            print("Skipping block at " + str(ptr) + "; bad magic")
            continue
        if hd[2] & 1:
            # NO-flash flag set; skip block
            continue
        datalen = hd[4]
        if datalen > 476:
            assert False, "Invalid UF2 data size at " + str(ptr)
        newaddr = hd[3]
        if (hd[2] & 0x2000) and (currfamilyid == None):
            currfamilyid = hd[7]
        if curraddr == None or ((hd[2] & 0x2000) and hd[7] != currfamilyid):
            currfamilyid = hd[7]
            curraddr = newaddr
            if familyid == 0x0 or familyid == hd[7]:
                appstartaddr = newaddr
        padding = newaddr - curraddr
        if padding < 0:
            assert False, "Block out of order at " + str(ptr)
        if padding > 10*1024*1024:
            assert False, "More than 10M of padding needed at " + str(ptr)
        if padding % 4 != 0:
            assert False, "Non-word padding size at " + str(ptr)
        while padding > 0:
            padding -= 4
            outp.append(b"\x00\x00\x00\x00")
        if familyid == 0x0 or ((hd[2] & 0x2000) and familyid == hd[7]):
            outp.append(block[32 : 32 + datalen])
        curraddr = newaddr + datalen
        if hd[2] & 0x2000:
            if hd[7] in families_found.keys():
                if families_found[hd[7]] > newaddr:
                    families_found[hd[7]] = newaddr
            else:
                families_found[hd[7]] = newaddr
        if prev_flag == None:
            prev_flag = hd[2]
        if prev_flag != hd[2]:
            all_flags_same = False
        if blockno == (numblocks - 1):
            print("--- UF2 File Header Info ---")
            families = load_families()
            for family_hex in families_found.keys():
                family_short_name = ""
                for name, value in families.items():
                    if value == family_hex:
                        family_short_name = name
                print("Family ID is {:s}, hex value is 0x{:08x}".format(family_short_name,family_hex))
                print("Target Address is 0x{:08x}".format(families_found[family_hex]))
            if all_flags_same:
                print("All block flag values consistent, 0x{:04x}".format(hd[2]))
            else:
                print("Flags were not all the same")
            print("----------------------------")
            if len(families_found) > 1 and familyid == 0x0:
                outp = []
                appstartaddr = 0x0
    return b"".join(outp)

def convert_to_carray(file_content):
    outp = "const unsigned long bindata_len = %d;\n" % len(file_content)
    outp += "const unsigned char bindata[] __attribute__((aligned(16))) = {"
    for i in range(len(file_content)):
        if i % 16 == 0:
            outp += "\n"
        outp += "0x%02x, " % file_content[i]
    outp += "\n};\n"
    return bytes(outp, "utf-8")

def convert_to_uf2(file_content):
    global familyid
    datapadding = b""
    while len(datapadding) < 512 - 256 - 32 - 4:
        datapadding += b"\x00\x00\x00\x00"
    numblocks = (len(file_content) + 255) // 256
    outp = []
    for blockno in range(numblocks):
        ptr = 256 * blockno
        chunk = file_content[ptr:ptr + 256]
        flags = 0x0
        if familyid:
            flags |= 0x2000
        hd = struct.pack(b"<IIIIIIII",
            UF2_MAGIC_START0, UF2_MAGIC_START1,
            flags, ptr + appstartaddr, 256, blockno, numblocks, familyid)
        while len(chunk) < 256:
            chunk += b"\x00"
        block = hd + chunk + datapadding + struct.pack(b"<I", UF2_MAGIC_END)
        assert len(block) == 512
        outp.append(block)
    return b"".join(outp)

class Block:
    def __init__(self, addr):
        self.addr = addr
        self.bytes = bytearray(256)

    def encode(self, blockno, numblocks):
        global familyid
        flags = 0x0
        if familyid:
            flags |= 0x2000
        hd = struct.pack("<IIIIIIII",
            UF2_MAGIC_START0, UF2_MAGIC_START1,
            flags, self.addr, 256, blockno, numblocks, familyid)
        hd += self.bytes[0:256]
        while len(hd) < 512 - 4:
            hd += b"\x00"
        hd += struct.pack("<I", UF2_MAGIC_END)
        return hd

def convert_from_hex_to_uf2(buf):
    global appstartaddr
    appstartaddr = None
    upper = 0
    currblock = None
    blocks = []
    for line in buf.split('\n'):
        if line[0] != ":":
            continue
        i = 1
        rec = []
        while i < len(line) - 1:
            rec.append(int(line[i:i+2], 16))
            i += 2
        tp = rec[3]
        if tp == 4:
            upper = ((rec[4] << 8) | rec[5]) << 16
        elif tp == 2:
            upper = ((rec[4] << 8) | rec[5]) << 4
        elif tp == 1:
            break
        elif tp == 0:
            addr = upper + ((rec[1] << 8) | rec[2])
            if appstartaddr == None:
                appstartaddr = addr
            i = 4
            while i < len(rec) - 1:
                if not currblock or currblock.addr & ~0xff != addr & ~0xff:
                    currblock = Block(addr & ~0xff)
                    blocks.append(currblock)
                currblock.bytes[addr & 0xff] = rec[i]
                addr += 1
                i += 1
    numblocks = len(blocks)
    resfile = b""
    for i in range(0, numblocks):
        resfile += blocks[i].encode(i, numblocks)
    return resfile

def to_str(b):
    return b.decode("utf-8")

def get_drives():
    drives = []
    if sys.platform == "win32":
        r = subprocess.check_output(["wmic", "PATH", "Win32_LogicalDisk",
                                     "get", "DeviceID,", "VolumeName,",
                                     "FileSystem,", "DriveType"])
        for line in to_str(r).split('\n'):
            words = re.split(r'\s+', line)
            if len(words) >= 3 and words[1] == "2" and words[2] == "FAT":
                drives.append(words[0])
    else:
        searchpaths = ["/media"]
        if sys.platform == "darwin":
            searchpaths = ["/Volumes"]
        elif sys.platform == "linux":
            searchpaths += ["/media/" + os.environ["USER"], '/run/media/' + os.environ["USER"]]

        for rootpath in searchpaths:
            if os.path.isdir(rootpath):
                for d in os.listdir(rootpath):
                    if os.path.isdir(os.path.join(rootpath, d)):
                        drives.append(os.path.join(rootpath, d))


    def has_info(d):
        try:
            return os.path.isfile(d + INFO_FILE)
        except:
            return False

    return list(filter(has_info, drives))


def board_id(path):
    with open(path + INFO_FILE, mode='r') as file:
        file_content = file.read()
    return re.search(r"Board-ID: ([^\r\n]*)", file_content).group(1)


def list_drives():
    for d in get_drives():
        print(d, board_id(d))


def write_file(name, buf):
    with open(name, "wb") as f:
        f.write(buf)
    print("Wrote %d bytes to %s" % (len(buf), name))


def load_families():
    # The expectation is that the `uf2families.json` file is in the same
    # directory as this script. Make a path that works using `__file__`
    # which contains the full path to this script.
    filename = "uf2families.json"
    pathname = os.path.join(os.path.dirname(os.path.abspath(__file__)), filename)
    with open(pathname) as f:
        raw_families = json.load(f)

    families = {}
    for family in raw_families:
        families[family["short_name"]] = int(family["id"], 0)

    return families


def main():
    global appstartaddr, familyid
    def error(msg):
        print(msg, file=sys.stderr)
        sys.exit(1)
    parser = argparse.ArgumentParser(description='Convert to UF2 or flash directly.')
    parser.add_argument('input', metavar='INPUT', type=str, nargs='?',
                        help='input file (HEX, BIN or UF2)')
    parser.add_argument('-b', '--base', dest='base', type=str,
                        default="0x2000",
                        help='set base address of application for BIN format (default: 0x2000)')
    parser.add_argument('-f', '--family', dest='family', type=str,
                        default="0x0",
                        help='specify familyID - number or name (default: 0x0)')
    parser.add_argument('-o', '--output', metavar="FILE", dest='output', type=str,
                        help='write output to named file; defaults to "flash.uf2" or "flash.bin" where sensible')
    parser.add_argument('-d', '--device', dest="device_path",
                        help='select a device path to flash')
    parser.add_argument('-l', '--list', action='store_true',
                        help='list connected devices')
    parser.add_argument('-c', '--convert', action='store_true',
                        help='do not flash, just convert')
    parser.add_argument('-D', '--deploy', action='store_true',
                        help='just flash, do not convert')
    parser.add_argument('-w', '--wait', action='store_true',
                        help='wait for device to flash')
    parser.add_argument('-C', '--carray', action='store_true',
                        help='convert binary file to a C array, not UF2')
    parser.add_argument('-i', '--info', action='store_true',
                        help='display header information from UF2, do not convert')
    args = parser.parse_args()
    appstartaddr = int(args.base, 0)

    families = load_families()

    if args.family.upper() in families:
        familyid = families[args.family.upper()]
    else:
        try:
            familyid = int(args.family, 0)
        except ValueError:
            error("Family ID needs to be a number or one of: " + ", ".join(families.keys()))

    if args.list:
        list_drives()
    else:
        if not args.input:
            error("Need input file")
        with open(args.input, mode='rb') as f:
            inpbuf = f.read()
        from_uf2 = is_uf2(inpbuf)
        ext = "uf2"
        if args.deploy:
            outbuf = inpbuf
        elif from_uf2 and not args.info:
            outbuf = convert_from_uf2(inpbuf)
            ext = "bin"
        elif from_uf2 and args.info:
            outbuf = ""
            convert_from_uf2(inpbuf)
        elif is_hex(inpbuf):
            outbuf = convert_from_hex_to_uf2(inpbuf.decode("utf-8"))
        elif args.carray:
            outbuf = convert_to_carray(inpbuf)
            ext = "h"
        else:
            outbuf = convert_to_uf2(inpbuf)
        if not args.deploy and not args.info:
            print("Converted to %s, output size: %d, start address: 0x%x" %
                  (ext, len(outbuf), appstartaddr))
        if args.convert or ext != "uf2":
            if args.output == None:
                args.output = "flash." + ext
        if args.output:
            write_file(args.output, outbuf)
        if ext == "uf2" and not args.convert and not args.info:
            drives = get_drives()
            if len(drives) == 0:
                if args.wait:
                    print("Waiting for drive to deploy...")
                    while len(drives) == 0:
                        sleep(0.1)
                        drives = get_drives()
                elif not args.output:
                    error("No drive to deploy.")
            for d in drives:
                print("Flashing %s (%s)" % (d, board_id(d)))
                write_file(d + "/NEW.UF2", outbuf)


if __name__ == "__main__":
    main()

The binary file digikey_coffee_cup_blink.bin looks like this when dumped as 32-bit little-endian words:

 00000417
 200004b7
 10000513
 c08c400c
 04910411
 f97d1571
 20000437
 02040067
 40020537
 00052023
 40010537
 04850513
 85936585
 c10c8005
 40028537
 0cc50513
 c10c4595
 40038537
 06850513
 03400593
 0537c10c
 0513d000
 05b70305
 c10c0200
 d0000437
 06b70441
 c0140200
 000f42b7
 24028293
 03054301
 fe629fe3
 c0144681
 000f42b7
 24028293
 03054301
 fe629fe3
 0001bfd9
 ffffded3
 11010142
 000001ff
 00000000
 ab123579
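
A dump in this form can be produced with a few lines of Python. The sketch below reads bytes as little-endian 32-bit words and also checks the size against 256 bytes, since both the copy loop in the startup code and the LENGTH in memmap are 0x100. The function names words_le and fits_in_copy_window are hypothetical, and the sample bytes are simply the first three words of the dump above.

```python
import struct

COPY_BYTES = 0x100  # bytes copied to SRAM by the startup loop; also memmap LENGTH


def words_le(data):
    """Interpret data as little-endian 32-bit words, zero-padding a short tail."""
    padded = data + b"\x00" * (-len(data) % 4)
    return [word for (word,) in struct.iter_unpack("<I", padded)]


def fits_in_copy_window(data, limit=COPY_BYTES):
    """True if the whole image is covered by the flash-to-SRAM copy."""
    return len(data) <= limit


# First three words of the dump, as they are stored byte by byte in the file.
sample = bytes.fromhex("17040000" "b7040020" "13050010")
for word in words_le(sample):
    print(" {:08x}".format(word))
print("fits in 256 bytes:", fits_in_copy_window(sample))
```

To dump the real file, read it with open("digikey_coffee_cup_blink.bin", "rb").read() and pass the result to words_le. The 42 words shown above amount to 168 bytes, so the program fits in the 256-byte window; a longer program would need a larger copy count and a larger LENGTH in memmap.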

Now convert the digikey_coffee_cup_blink.bin file to the UF2 format, producing digikey_coffee_cup_blink.uf2, as follows:

./uf2conv.py --family 0xE48BFF57 --base 0x10000000 digikey_coffee_cup_blink.bin -o digikey_coffee_cup_blink.uf2
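
The commands used so far can be collected into one small script so that the whole build is repeatable. This is only a sketch: build_steps and run_steps are hypothetical helper names, the flags are copied unchanged from the commands in this article, and it assumes that memmap, uf2conv.py and uf2families.json sit in the current directory. As written it only prints the commands.

```python
import shlex
import subprocess

PREFIX = "riscv64-linux-gnu-"


def build_steps(stem):
    """Return (argv, stdout_file) pairs mirroring the commands in this article."""
    return [
        ([PREFIX + "as", stem + ".s", "-o", stem + ".o", "-march=rv32imac"], None),
        ([PREFIX + "ld", "-o", stem + ".elf", "-T", "memmap", stem + ".o",
          "-m", "elf32lriscv"], None),
        ([PREFIX + "objdump", "-Mnumeric", "-d", stem + ".elf"], stem + ".list"),
        ([PREFIX + "objcopy", stem + ".elf", stem + ".bin", "-O", "binary"], None),
        (["./uf2conv.py", "--family", "0xE48BFF57", "--base", "0x10000000",
          stem + ".bin", "-o", stem + ".uf2"], None),
    ]


def run_steps(steps):
    """Run each step in order, stopping at the first one that fails."""
    for argv, stdout_file in steps:
        if stdout_file is None:
            subprocess.run(argv, check=True)
        else:
            # objdump writes to stdout, so redirect it into the listing file.
            with open(stdout_file, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


# Print the commands; call run_steps(...) instead to actually execute them.
for argv, stdout_file in build_steps("digikey_coffee_cup_blink"):
    line = shlex.join(argv)
    if stdout_file:
        line += " > " + stdout_file
    print(line)
```

Replacing the final loop with run_steps(build_steps("digikey_coffee_cup_blink")) runs the assembler, linker, disassembler, objcopy and the UF2 conversion in sequence.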

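The digikey_coffee_cup_blink.uf2 file now looks like this. A 32-byte UF2 header comes first, followed by the program padded with zeros to 256 bytes, further zero padding, and a final magic word that closes the 512-byte block: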

 0a324655
 9e5d5157
 00002000
 10000000
 00000100
 00000000
 00000001
 e48bff57
 00000417
 200004b7
 10000513
 c08c400c
 04910411
 f97d1571
 20000437
 02040067
 40020537
 00052023
 40010537
 04850513
 85936585
 c10c8005
 40028537
 0cc50513
 c10c4595
 40038537
 06850513
 03400593
 0537c10c
 0513d000
 05b70305
 c10c0200
 d0000437
 06b70441
 c0140200
 000f42b7
 24028293
 03054301
 fe629fe3
 c0144681
 000f42b7
 24028293
 03054301
 fe629fe3
 0001bfd9
 ffffded3
 11010142
 000001ff
 00000000
 ab123579
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 00000000
 0ab16f30
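
The first eight words of this dump are the UF2 block header, in the same order that uf2conv.py packs them. A minimal Python sketch for reading them back is shown below; parse_uf2_block and the field names are labels chosen here for illustration, and the sample block is rebuilt from the header values in the dump rather than read from a file.

```python
import struct

UF2_MAGIC_START0 = 0x0A324655
UF2_MAGIC_START1 = 0x9E5D5157
UF2_MAGIC_END = 0x0AB16F30
FIELDS = ("flags", "target_addr", "payload_size",
          "block_no", "num_blocks", "family_id")


def parse_uf2_block(block):
    """Decode one 512-byte UF2 block into a dict of header fields plus payload."""
    if len(block) != 512:
        raise ValueError("a UF2 block is exactly 512 bytes")
    header = struct.unpack("<IIIIIIII", block[:32])
    (magic_end,) = struct.unpack("<I", block[508:])
    if (header[0] != UF2_MAGIC_START0 or header[1] != UF2_MAGIC_START1
            or magic_end != UF2_MAGIC_END):
        raise ValueError("bad UF2 magic")
    info = dict(zip(FIELDS, header[2:]))
    info["payload"] = block[32:32 + info["payload_size"]]
    return info


# Rebuild a block with the header values from the dump above (payload zeroed).
sample = (struct.pack("<IIIIIIII", UF2_MAGIC_START0, UF2_MAGIC_START1,
                      0x2000, 0x10000000, 0x100, 0, 1, 0xE48BFF57)
          + bytes(476) + struct.pack("<I", UF2_MAGIC_END))
info = parse_uf2_block(sample)
for name in FIELDS:
    print("{:12s} 0x{:08x}".format(name, info[name]))
```

Read this way, the dump says: flags 0x2000 (family ID present), target address 0x10000000, payload size 0x100, block 0 of 1, family ID 0xe48bff57.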

Now we can program the machine language code into the Raspberry Pico 2 as follows (hold down the BOOTSEL button on the Raspberry Pico 2 board while connecting it over USB, so that it starts in bootloader mode):

picotool load digikey_coffee_cup_blink.uf2

Family ID 'absolute' can be downloaded in absolute space:
  00000000->02000000
Loading into Flash:   [==============================]  100%

At this point the LED on the Raspberry Pico 2 will blink.

If we want to program directly in machine language, we can place the machine code words into an assembly file using the .word directive. Save the following file under the same name, digikey_coffee_cup_blink.s, repeat the process described above, and the LED will blink:

# ------------------------------------------
#  Digikey Coffee Cup RISC Machine Language
#  Without Raspberry Pi SDK 
#  Blink LED in RP2350 Pi Pico 2
# ------------------------------------------

  .word 0x00000417
  .word 0x200004b7
  .word 0x10000513
  .word 0xc08c400c
  .word 0x04910411
  .word 0xf97d1571
  .word 0x20000437
  .word 0x02040067
  .word 0x40020537
  .word 0x00052023
  .word 0x40010537
  .word 0x04850513
  .word 0x85936585
  .word 0xc10c8005
  .word 0x40028537
  .word 0x0cc50513
  .word 0xc10c4595
  .word 0x40038537
  .word 0x06850513
  .word 0x03400593
  .word 0x0537c10c
  .word 0x0513d000
  .word 0x05b70305
  .word 0xc10c0200
  .word 0xd0000437
  .word 0x06b70441
  .word 0xc0140200
  .word 0x000f42b7
  .word 0x24028293
  .word 0x03054301
  .word 0xfe629fe3
  .word 0xc0144681
  .word 0x000f42b7
  .word 0x24028293
  .word 0x03054301
  .word 0xfe629fe3
  .word 0x0001bfd9
  .word 0xffffded3
  .word 0x11010142
  .word 0x000001ff
  .word 0x00000000
  .word 0xab123579
  
#image_def: # the memory image to be recognised as a valid RISC-V binary.
  .word 0xffffded3
  .word 0x11010142
  .word 0x000001ff
  .word 0x00000000
  .word 0xab123579
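
To see how such a word maps back to an instruction, the following Python sketch decodes the RISC-V U-type format used by lui and auipc: the opcode is in bits 6..0, rd in bits 11..7, and the 20-bit immediate in bits 31..12. The function name decode_u_type is a hypothetical helper. It handles only these two instructions, and because the listing also contains 16-bit compressed instructions, not every word in it is a single 32-bit instruction.

```python
U_TYPE_OPCODES = {0x37: "lui", 0x17: "auipc"}


def decode_u_type(word):
    """Decode a 32-bit U-type instruction into (mnemonic, rd, imm20)."""
    opcode = word & 0x7F           # bits 6..0
    rd = (word >> 7) & 0x1F        # bits 11..7
    imm20 = (word >> 12) & 0xFFFFF # bits 31..12
    if opcode not in U_TYPE_OPCODES:
        raise ValueError("not a lui/auipc instruction: 0x{:08x}".format(word))
    return U_TYPE_OPCODES[opcode], rd, imm20


for word in (0x00000417, 0x200004B7, 0x40020537):
    mnemonic, rd, imm20 = decode_u_type(word)
    print("0x{:08x}  {} x{}, 0x{:x}".format(word, mnemonic, rd, imm20))
```

The first word of the program, 0x00000417, decodes to auipc x8, 0, and the second, 0x200004b7, to lui x9, 0x20000, matching the first two instructions of the assembly version.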

The last 5 words identify the image as a RISC-V machine language binary.
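
Before flashing, it can be useful to confirm that these five words really are present in the .bin file. The Python sketch below searches the first 4 kB of an image for the word-aligned signature, as the comment in the assembly listing requires. The name find_image_def is a hypothetical helper, and the sample image is synthetic: 37 filler words followed by the signature, which mirrors where the signature sits in the 42-word binary shown earlier.

```python
import struct

IMAGE_DEF_WORDS = (0xFFFFDED3, 0x11010142, 0x000001FF, 0x00000000, 0xAB123579)
IMAGE_DEF_BYTES = struct.pack("<5I", *IMAGE_DEF_WORDS)
SEARCH_WINDOW = 4096  # the signature must lie within the first 4 kB


def find_image_def(image):
    """Return the byte offset of the word-aligned signature, or None if absent."""
    start = 0
    while True:
        offset = image.find(IMAGE_DEF_BYTES, start, SEARCH_WINDOW)
        if offset < 0:
            return None
        if offset % 4 == 0:
            return offset
        start = offset + 1  # unaligned hit: keep looking


# Synthetic image: 37 filler words (0x00000013), then the signature.
sample = struct.pack("<I", 0x00000013) * 37 + IMAGE_DEF_BYTES
print(find_image_def(sample))
```

For a real check, pass the contents of digikey_coffee_cup_blink.bin instead of the synthetic sample; a result of None means the signature is missing or lies beyond the first 4 kB.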

This article described the steps for RISC-V non-SDK assembly language and machine language programming on the Raspberry Pico 2, available at DigiKey. The RISC-V programmer's view offers a compact instruction set, and simulators are also available for firmware development. I hope this provides an option for those who need to develop true bare-metal applications on these devices.
Have a nice day!

This article is also available in Spanish here.

