Implementing a Robust Microcontroller to FPGA SPI Interface: Part 3 - FPGA Top-Level Modules

This installment continues our exploration of a microcontroller (uC) to Field Programmable Gate Array (FPGA) interface.

  • Part 1 introduces a Verilog design philosophy that guides the development of larger systems. This is a critical piece introducing Register Transfer Level (RTL) design guidelines such as clock boundaries, use of strobes, and the necessity of double buffers.

  • Part 2 presents the SPI protocol. Recall that the chosen protocol is adapted from the 802.3 Ethernet frame with concepts such as variable payload length and a Cyclic Redundancy Check (CRC) to provide a measure of data integrity.

The highlight of the previous installment is the command and response frame, repeated here as Figure 1. The command frame describes the protocol for information flowing from the uC to the FPGA. The response frame describes the data flowing from the FPGA to the uC. Recall that the protocol is designed for full-duplex communications using SPI. This requirement demands careful attention to the timing of the FPGA hardware, because the response frame must be generated in real time as the command frame is received. For example, while the uC is sending the length byte to the FPGA, the FPGA is simultaneously sending the user-defined status flags. Likewise, while the uC is sending the CRC for the command frame, the FPGA is sending the CRC for the response frame.

This real-time streaming requirement complicates the design, so much so that this installment is dedicated to describing the top-level FPGA implementation in block diagram format. Future installments will delve into the individual FPGA blocks, starting with an in-depth analysis of the double buffer module.

Figure 1: Command and response frames that form the foundation of the uC to FPGA SPI protocol.

Viewing the uC to FPGA Data Transfer From 10,000 Feet

The high-level FPGA hardware view is presented as Figure 2. Let’s start with a right to left overview to better understand the hardware’s operation. If you have not already done so, you may want to review Part 1, which points out some of the challenges associated with system-level design, especially the section on double buffers.

In the upper right corner, we find a Pulse Width Modulator (PWM) module. It’s important to notice that this module has a 16-bit interface. At the same time, there is an implicit expectation that data are transferred via SPI in an 8-bit format as shown in Figure 1. This is problematic, as unexpected operation will occur if the PWM is updated one byte at a time. Instead, all 16 bits must be registered synchronously so that they are presented to the PWM on a single clock cycle.

This simultaneous update is performed by the double buffer. This module receives the data from the SPI hardware and the various buffer and control modules in the middle of the block diagram. The double buffer waits until both bytes have been received and then simultaneously updates the PWM. The PWM module will then implement the change at a known starting position such as the beginning of the next PWM duty cycle.

Each double buffer is instantiated with two parameters: the byte width and the address, as depicted in Figure 3. The byte width is selected to match the associated hardware such as the 16-bit PWM or the 32-bit Direct Digital Synthesizer (DDS). The address relates back to the command frame presented in Figure 1. For example, to set the PWM duty cycle, a representative minimal command frame would be: 0x07, 0x00, 0x00, 0x02, 0x00, 0xAA, 0x55, 0xFD, 0x07. In this example, we write the value 0xAA55 to the 16-bit PWM, which is located at address 0x0200. The read address, set here to 0x0000, is not important in this example. Note that the CRC module will be explored in a future installment.
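The field layout of this example frame can be made concrete with a short behavioral sketch in Python (a model for illustration, not RTL). The layout is inferred from the example above: a one-byte length that counts everything except the CRC, a two-byte read address, a two-byte write address, the payload, and a two-byte CRC, all most significant byte first. The function name and dictionary keys are hypothetical, and the CRC is left as opaque bytes because its algorithm is not covered in this installment.

```python
def parse_command_frame(frame: bytes) -> dict:
    """Split a raw command frame into fields (layout inferred from the example)."""
    length = frame[0]  # assumed to count every byte except the 2-byte CRC
    if len(frame) != length + 2:
        raise ValueError("frame size does not match the length field")
    return {
        "length": length,
        "read_addr": int.from_bytes(frame[1:3], "big"),
        "write_addr": int.from_bytes(frame[3:5], "big"),
        "payload": frame[5:length],
        "crc": frame[length:length + 2],  # kept opaque; algorithm not modeled
    }


example = bytes([0x07, 0x00, 0x00, 0x02, 0x00, 0xAA, 0x55, 0xFD, 0x07])
fields = parse_command_frame(example)
print(hex(fields["write_addr"]), fields["payload"].hex())  # prints: 0x200 aa55
```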

Figure 2: Top-level FPGA block diagram showing the FPGA’s data flow.

Continuing our right to left journey, we encounter the message writer, which is fed by the frame buffer and a CRC validator, which in turn are fed by the SPI interface. The message writer, also called the double buffer writer in this article, is triggered by the end-of-frame strobe from the CRC validator. This strobe indicates that the command frame shown in Figure 1 has been received and verified by comparing the received CRC with the CRC calculated by the CRC validator. Upon activation, the double buffer writer transfers the contents of the frame buffer to the double buffers associated with the various FPGA hardware modules such as the PWM, DAC, and DDS.

The frame buffer is a critical component of the data integrity verification process. First, the frame buffer collects the data as it is streamed in over the SPI interface. At the same time, the CRC validator calculates the CRC for the streamed data. As seen in Figure 1, the CRC is appended to the frame. Consequently, the integrity of the frame is unknown until the entire frame has been received. The frame buffer allows time for the CRC to be completed, so the data can be verified before the FPGA hardware acts on it. The buffer eliminates the need to unwind changes made by a corrupt frame.
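The buffer-then-verify idea can be sketched as a small Python model. This is an illustration under stated assumptions, not the FPGA implementation: the class and function names are hypothetical, the CRC is checked in one step at the end of the frame rather than being accumulated byte by byte as the hardware does, and `binascii.crc_hqx` is only a stand-in 16-bit CRC, since the actual CRC used by this protocol is deferred to a later installment.

```python
import binascii


def placeholder_crc(data: bytes) -> int:
    # ASSUMPTION: stand-in CRC-16; the protocol's real CRC may differ.
    return binascii.crc_hqx(data, 0xFFFF)


class FrameBuffer:
    """Collects streamed bytes and releases them only if the trailing CRC matches."""

    def __init__(self):
        self.data = bytearray()

    def push(self, byte: int) -> None:
        self.data.append(byte)

    def end_of_frame(self):
        body = bytes(self.data[:-2])
        received = int.from_bytes(self.data[-2:], "big")
        self.data.clear()
        # Nothing downstream sees the body unless the CRC checks out.
        return body if placeholder_crc(body) == received else None


def send(buf: FrameBuffer, body: bytes, corrupt: bool = False):
    frame = bytearray(body + placeholder_crc(body).to_bytes(2, "big"))
    if corrupt:
        frame[5] ^= 0x01  # flip one payload bit in transit
    for b in frame:
        buf.push(b)
    return buf.end_of_frame()


body = bytes([0x07, 0x00, 0x00, 0x02, 0x00, 0xAA, 0x55])
buf = FrameBuffer()
good = send(buf, body)
bad = send(buf, body, corrupt=True)
print(good == body, bad is None)
```

Because the corrupted frame never leaves the buffer, no peripheral has to be rolled back.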

As implied by the frame protocol in Figure 1 and the block diagram in Figure 2, the length, read address, write address, and payload fields are all processed by the double buffer writer. When the CRC has been validated, the double buffer writer will assert the base address along with the first payload byte. It will then send a write strobe, activating the first byte-wide buffer within the double buffer. The double buffer writer will then move to the next data byte, repeating the process N times, where N is the frame length field minus 5 bytes for the header fields. This process may be better understood by examining the structure of the double buffer as shown in Figure 3.
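The writer's sequence can be modeled in a few lines of Python. This is a behavioral sketch only: the function name is hypothetical, and it assumes that the write address increments by one for each payload byte and that the address fields are sent most significant byte first, which matches the earlier example frame but is not spelled out in the text.

```python
HEADER_BYTES = 5  # length (1) + read address (2) + write address (2)


def double_buffer_writes(frame_buffer: bytes):
    """Yield one (address, data) pair per write strobe."""
    length = frame_buffer[0]
    base_addr = int.from_bytes(frame_buffer[3:5], "big")
    n = length - HEADER_BYTES  # number of payload bytes
    for i in range(n):
        # ASSUMPTION: consecutive payload bytes go to consecutive addresses.
        yield base_addr + i, frame_buffer[HEADER_BYTES + i]


verified = bytes([0x07, 0x00, 0x00, 0x02, 0x00, 0xAA, 0x55])  # CRC already stripped
writes = list(double_buffer_writes(verified))
print([(hex(a), hex(d)) for a, d in writes])
# prints: [('0x200', '0xaa'), ('0x201', '0x55')]
```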

Figure 3: Block diagram representation of the double buffer.

Recall that each double buffer is instantiated with a base address and a length. It has a controlled sequence of operation. After all of the byte-wide first stage buffers within the double buffer are filled, the double buffer will automatically transfer the information to the second stage. This auto transfer simplifies the design of the message writer as it does not need to control the double buffer’s operation. Instead, it is free to simply transfer the frame payload one consecutive byte after another.
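A minimal Python model of this behavior is shown below. It is a sketch under assumptions rather than the actual module: the class and attribute names are hypothetical, the transfer is triggered once every first-stage byte has been written, and the first byte written is treated as the most significant. The real module will be examined in a later installment.

```python
class DoubleBuffer:
    """Behavioral model: byte-wide first stage, full-width second stage."""

    def __init__(self, base_addr: int, width_bytes: int):
        self.base_addr = base_addr
        self.width_bytes = width_bytes
        self.stage1 = [0] * width_bytes
        self.filled = [False] * width_bytes
        self.output = 0  # second stage, the value the peripheral sees

    def write(self, addr: int, data: int) -> None:
        offset = addr - self.base_addr
        if not 0 <= offset < self.width_bytes:
            return  # address belongs to a different double buffer
        self.stage1[offset] = data & 0xFF
        self.filled[offset] = True
        if all(self.filled):
            # Automatic transfer: every byte reaches the output together.
            self.output = int.from_bytes(bytes(self.stage1), "big")
            self.filled = [False] * self.width_bytes


pwm = DoubleBuffer(base_addr=0x0200, width_bytes=2)
pwm.write(0x0200, 0xAA)
after_first = pwm.output   # unchanged, only one byte has arrived
pwm.write(0x0201, 0x55)
after_second = pwm.output  # both bytes appear at once
print(hex(after_first), hex(after_second))  # prints: 0x0 0xaa55
```

The writer never has to tell the double buffer when to update; it only presents consecutive bytes.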

Before moving on to the FPGA to uC data stream, we conclude with a comment about the speed of the double buffer writer. Recall that all items in Figure 2 are synchronous within one clock domain. Upon receipt of the CRC validator’s strobe, the double buffer writer will write to the double buffers at a rate limited only by the FPGA’s 100 MHz clock. Consequently, a 100% full frame buffer is read in approximately 5 µs.

The buffer is read from position 8'h05 to 8'hFF. At the same time, the uC may start another frame, filling the frame buffer as it is being emptied. Interference is not a problem because the FPGA message writer empties the buffer faster than the uC can fill it. This should remain true even if the SPI module were replaced with a quad or octal SPI.
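The arithmetic behind these figures can be checked with a short Python calculation. Two numbers here are assumptions and not from the article: two FPGA clocks per byte (one to present address and data, one for the write strobe), which reconciles the 100 MHz clock with the approximately 5 µs figure for positions 0x05 through 0xFF, and a hypothetical 20 MHz SPI clock used only for comparison.

```python
FPGA_CLOCK_MHZ = 100             # from the article
FIRST_POS, LAST_POS = 0x05, 0xFF  # frame buffer positions read by the writer
CLOCKS_PER_BYTE = 2              # ASSUMPTION: present address/data, then strobe
SPI_CLOCK_MHZ = 20               # ASSUMPTION: hypothetical SPI clock

fpga_period_ns = 1000 // FPGA_CLOCK_MHZ  # 10 ns
spi_period_ns = 1000 // SPI_CLOCK_MHZ    # 50 ns

n_bytes = LAST_POS - FIRST_POS + 1                     # 251 bytes
drain_ns_per_byte = CLOCKS_PER_BYTE * fpga_period_ns   # 20 ns
drain_total_us = n_bytes * drain_ns_per_byte / 1000    # 5.02 us

# Time for the uC to deliver one byte over 1, 4, and 8 data lanes.
fill_ns_per_byte = {lanes: (8 // lanes) * spi_period_ns for lanes in (1, 4, 8)}

for lanes, fill_ns in fill_ns_per_byte.items():
    print(f"{lanes}-lane SPI: fill {fill_ns} ns/byte, drain {drain_ns_per_byte} ns/byte")
print(f"full buffer drained in {drain_total_us} us")
```

Under these assumptions the writer stays ahead in every case, although the margin narrows from 20x with standard SPI to 2.5x with octal SPI, so a faster SPI clock would deserve a second look.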

Tech Tip: Ironically, the uC to FPGA interface described in this post could benefit from an FPGA-based soft-core processor. This involves instantiating a dedicated uC in the FPGA fabric to handle the uC to FPGA interface. For Xilinx FPGAs, a controller such as PicoBlaze would be a good match. This small state machine could replace everything except the SPI module, the various double buffers, and their associated peripherals. Depending on your ability to code in assembler, it may still be an elegant solution. There are other powerful options such as MicroBlaze with traditional C programming. You could even move the uC into the FPGA using Xilinx Zynq. Similar options are available for most FPGA platforms.

Viewing the FPGA to uC Data Transfer From 10,000 Feet

The SPI interface described in this series of articles is full-duplex, with independent read and write addresses as shown in Figure 1. This is a streaming process where data are clocked into the FPGA on the MOSI line while data are simultaneously sent to the uC on the MISO line. We will now examine the RTL FPGA transmit process outlined in Figure 2 using a right to left signal flow.

As described in Part 1 of this series, the double buffer process is a fundamental requirement. This is most apparent for the entities shown in the lower left side of Figure 2. Observe that the FPGA’s selector switches, the 16-bit DAC, the user-defined status flags, and the error message count are all registered. Data are captured at the start of each SPI frame. This registering ensures that the data values will not change as they are being streamed back to the uC. This eliminates the hazard associated with updating one byte of a multi-byte value.

All data streaming from the FPGA to the uC are sent to the multiplexer shown in the lower right-hand corner of Figure 2. The mux selection is determined by the command frame’s read address as shown in Figure 1. As the SPI interface clocks in a byte, the control machinery advances the mux to the next consecutive read address. This operation allows full-duplex communications with independent read and write addressing. For example, the FPGA slide switches can be read at the same time as writing to a PWM. For simplicity, the mux in Figure 2 is drawn with an implicit N-byte input and a byte-wide output.
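The capture-then-stream behavior of the read path can be illustrated with a small Python model. This is a sketch with hypothetical names and addresses: the `ReadMux` class, the 0x0100 base address, and the choice to return 0x00 for unmapped addresses are all assumptions made for the example.

```python
class ReadMux:
    """Behavioral model of the registered sources feeding the read mux."""

    def __init__(self, sources: dict):
        # Maps a base read address to a callable returning the live bytes.
        self.sources = sources
        self.snapshot = {}

    def start_of_frame(self) -> None:
        # Register every source once, at the start of the SPI frame.
        self.snapshot = {}
        for base, read_live in self.sources.items():
            for offset, byte in enumerate(read_live()):
                self.snapshot[base + offset] = byte

    def read(self, addr: int) -> int:
        # ASSUMPTION: unmapped addresses read back as 0x00.
        return self.snapshot.get(addr, 0x00)


live = {"sample": 0x12FF}  # a 16-bit value that can change at any time
mux = ReadMux({0x0100: lambda: live["sample"].to_bytes(2, "big")})

mux.start_of_frame()
high = mux.read(0x0100)    # first byte streamed to the uC
live["sample"] = 0x1300    # the live value rolls over mid-frame
low = mux.read(0x0101)     # next consecutive read address
print(hex((high << 8) | low))  # prints: 0x12ff
```

Without the capture step, the uC would have received 0x1200, a value the source never held.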

Observe that there are two instantiations of the CRC Validator in Figure 2. As described earlier, the upper module is involved with the uC to FPGA data while the lower is involved with the FPGA to uC data stream.

Continuing the right to left data flow for the FPGA to uC path, we see that the output of the larger mux is presented to the CRC validator. It also passes through a smaller mux on its way to the SPI interface and ultimately the MISO line. Recall that the CRC validator module constructs a CRC on streamed data. The smaller mux allows the CRC to be appended to the FPGA to uC message. This switching action is based on the MSG Reader Control’s understanding of Num_bytes as held in the frame buffer.

Part 4 has been posted.

Your comments and suggestions are welcome. Further discussion about high-level RTL system design methodology is especially welcome.

Best Wishes,

APDahlen