How to set up the STM32F4 FSMC for fast display control on the STM32F4 Discovery board?

i.amniels · Jan 22, 2017 · Viewed 8.2k times

I'm connecting an ILI9341 display controller to an STM32F407VG microcontroller (STM32F4 Discovery board). The display is connected to the STM32 over a 16-bit parallel data bus.

To achieve high data rates I use the FSMC of the STM32. The FSMC is configured as a static RAM controller. I don't use chip select or read. The interface works and I can send data to the display, but it is slow.

I tried writing to the LCD with a for loop and with DMA in memory-to-memory mode. I tried writing data from flash as well as from RAM, and I tuned various DMA settings. None of these changes affected the speed at all, so there seems to be a huge bottleneck somewhere.

The figure below shows a measurement of a 16-bit word transfer (only the first 8 data lines are measured). As you can see, the display's WR line toggles at only 558 kHz.

[Figure: logic analyzer capture, FSMC WR line toggling at 558 kHz]

The figure below shows the FSMC timing as explained in the reference manual. NWE (write enable) is WR in my measurement. A16 is D/C.

[Figure: FSMC write timing diagram from the reference manual]

ADDSET and DATAST are specified in HCLK (AHB clock) cycles. The AHB clock is configured at its maximum speed of 168 MHz. ADDSET and DATAST are set to 0 and 1 respectively, so I configured a write rate of 84 MHz (2 HCLK cycles per write). I don't expect to achieve 84 MHz, because the DMA controller is slower (see below), but I would at least expect to reach the DMA speed.
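The arithmetic behind the 84 MHz figure can be sketched as follows. It assumes the mode A write timing from the reference manual figure, where one write occupies ADDSET + DATAST + 1 HCLK cycles, and it ignores any idle cycles the FSMC or the bus matrix may insert between consecutive accesses. The function names are hypothetical helpers, not part of HAL or CMSIS.

```c
#include <assert.h>
#include <stdint.h>

/* HCLK cycles for one asynchronous mode A write, assuming the access
 * takes ADDSET + DATAST + 1 cycles (the extra cycle is the hold phase
 * after NWE rises). Hypothetical helper, not a HAL function. */
static uint32_t fsmc_write_cycles(uint32_t addset, uint32_t datast)
{
    assert(datast > 0u); /* DATAST = 0 is a reserved setting */
    return addset + datast + 1u;
}

/* Upper bound on the write rate in Hz for a given HCLK frequency. */
static uint32_t fsmc_max_write_hz(uint32_t hclk_hz, uint32_t addset,
                                  uint32_t datast)
{
    return hclk_hz / fsmc_write_cycles(addset, datast);
}
```

With HCLK at 168 MHz, ADDSET = 0 and DATAST = 1, fsmc_max_write_hz(168000000u, 0u, 1u) evaluates to 84000000, i.e. two HCLK cycles per write. This is only a ceiling set by the timing register, not a prediction of what the bus actually delivers.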

With ST's HAL v1.6.0.0 library I set the clock to the maximum speed:

void SystemClock_Config(void)
{
  RCC_OscInitTypeDef RCC_OscInitStruct;
  RCC_ClkInitTypeDef RCC_ClkInitStruct;

  __HAL_RCC_PWR_CLK_ENABLE();

  __HAL_PWR_VOLTAGESCALING_CONFIG(PWR_REGULATOR_VOLTAGE_SCALE1);

  /* HSI (16 MHz) / PLLM 16 * PLLN 336 / PLLP 2 = 168 MHz SYSCLK */
  RCC_OscInitStruct.OscillatorType = RCC_OSCILLATORTYPE_HSI;
  RCC_OscInitStruct.HSIState = RCC_HSI_ON;
  RCC_OscInitStruct.HSICalibrationValue = 16;
  RCC_OscInitStruct.PLL.PLLState = RCC_PLL_ON;
  RCC_OscInitStruct.PLL.PLLSource = RCC_PLLSOURCE_HSI;
  RCC_OscInitStruct.PLL.PLLM = 16;
  RCC_OscInitStruct.PLL.PLLN = 336;
  RCC_OscInitStruct.PLL.PLLP = RCC_PLLP_DIV2;
  RCC_OscInitStruct.PLL.PLLQ = 7;
  if (HAL_RCC_OscConfig(&RCC_OscInitStruct) != HAL_OK)
  {
    Error_Handler();
  }

  RCC_ClkInitStruct.ClockType = RCC_CLOCKTYPE_HCLK|RCC_CLOCKTYPE_SYSCLK
                              |RCC_CLOCKTYPE_PCLK1|RCC_CLOCKTYPE_PCLK2;
  RCC_ClkInitStruct.SYSCLKSource = RCC_SYSCLKSOURCE_PLLCLK;
  RCC_ClkInitStruct.AHBCLKDivider = RCC_SYSCLK_DIV1;
  RCC_ClkInitStruct.APB1CLKDivider = RCC_HCLK_DIV4;
  RCC_ClkInitStruct.APB2CLKDivider = RCC_HCLK_DIV2;
  if (HAL_RCC_ClockConfig(&RCC_ClkInitStruct, FLASH_LATENCY_5) != HAL_OK)
  {
    Error_Handler();
  }

  HAL_SYSTICK_Config(HAL_RCC_GetHCLKFreq()/1000);

  HAL_SYSTICK_CLKSourceConfig(SYSTICK_CLKSOURCE_HCLK);

  /* SysTick_IRQn interrupt configuration */
  HAL_NVIC_SetPriority(SysTick_IRQn, 0, 0);
}

I initialize FSMC:

void init_fsmc(void){
  SRAM_HandleTypeDef sram_init_struct;
  FSMC_NORSRAM_TimingTypeDef fsmc_norsram_timing_struct = {0};

  sram_init_struct.Instance = FSMC_NORSRAM_DEVICE;
  sram_init_struct.Extended = FSMC_NORSRAM_EXTENDED_DEVICE;

  fsmc_norsram_timing_struct.AddressSetupTime       = 0;
  fsmc_norsram_timing_struct.AddressHoldTime        = 1; // n/a for SRAM mode A
  fsmc_norsram_timing_struct.DataSetupTime          = 1;
  fsmc_norsram_timing_struct.BusTurnAroundDuration  = 0;
  fsmc_norsram_timing_struct.CLKDivision            = 2; // n/a for SRAM mode A
  fsmc_norsram_timing_struct.DataLatency            = 2; // n/a for SRAM mode A
  fsmc_norsram_timing_struct.AccessMode             = FSMC_ACCESS_MODE_A;

  sram_init_struct.Init.NSBank             = FSMC_NORSRAM_BANK4;
  sram_init_struct.Init.DataAddressMux     = FSMC_DATA_ADDRESS_MUX_DISABLE;
  sram_init_struct.Init.MemoryType         = FSMC_MEMORY_TYPE_SRAM;
  sram_init_struct.Init.MemoryDataWidth    = FSMC_NORSRAM_MEM_BUS_WIDTH_16;
  sram_init_struct.Init.BurstAccessMode    = FSMC_BURST_ACCESS_MODE_DISABLE;
  sram_init_struct.Init.WaitSignalPolarity = FSMC_WAIT_SIGNAL_POLARITY_LOW;
  sram_init_struct.Init.WrapMode           = FSMC_WRAP_MODE_DISABLE;
  sram_init_struct.Init.WaitSignalActive   = FSMC_WAIT_TIMING_BEFORE_WS;
  sram_init_struct.Init.WriteOperation     = FSMC_WRITE_OPERATION_ENABLE;
  sram_init_struct.Init.WaitSignal         = FSMC_WAIT_SIGNAL_DISABLE;
  sram_init_struct.Init.ExtendedMode       = FSMC_EXTENDED_MODE_DISABLE; // maybe enable?
  sram_init_struct.Init.AsynchronousWait   = FSMC_ASYNCHRONOUS_WAIT_DISABLE;
  sram_init_struct.Init.WriteBurst         = FSMC_WRITE_BURST_DISABLE;

  __HAL_RCC_FSMC_CLK_ENABLE();

  HAL_SRAM_Init(&sram_init_struct, &fsmc_norsram_timing_struct, &fsmc_norsram_timing_struct);
}

I configure DMA:

void init_dma(void){
  __HAL_RCC_DMA2_CLK_ENABLE();

  /*##-2- Select the DMA functional Parameters ###############################*/
  dma_handle.Init.Channel = DMA_CHANNEL_0;
  dma_handle.Init.Direction = DMA_MEMORY_TO_MEMORY;
  dma_handle.Init.PeriphInc = DMA_PINC_DISABLE;               /* Peripheral increment mode */
  dma_handle.Init.MemInc = DMA_MINC_DISABLE;                  /* Memory increment mode */
  dma_handle.Init.PeriphDataAlignment = DMA_PDATAALIGN_HALFWORD; /* Peripheral data alignment */
  dma_handle.Init.MemDataAlignment = DMA_MDATAALIGN_HALFWORD;    /* memory data alignment */
  dma_handle.Init.Mode = DMA_NORMAL;                         /* Normal DMA mode */
  dma_handle.Init.Priority = DMA_PRIORITY_HIGH;              /* priority level */
  dma_handle.Init.FIFOMode = DMA_FIFOMODE_DISABLE;           /* FIFO mode disabled */
  dma_handle.Init.FIFOThreshold = DMA_FIFO_THRESHOLD_FULL;
  dma_handle.Init.MemBurst = DMA_MBURST_SINGLE;              /* Memory burst */
  dma_handle.Init.PeriphBurst = DMA_PBURST_SINGLE;           /* Peripheral burst */

  dma_handle.Instance = DMA2_Stream0;

  if(HAL_DMA_Init(&dma_handle) != HAL_OK)
  {
    // @todo proper error handling.
    return;
  }

  HAL_DMA_RegisterCallback(&dma_handle, HAL_DMA_XFER_CPLT_CB_ID, dma_transfer_complete);
  // @todo proper error handling
  HAL_DMA_RegisterCallback(&dma_handle, HAL_DMA_XFER_ERROR_CB_ID, dma_transfer_error);

  /*##-6- Configure NVIC for DMA transfer complete/error interrupts ##########*/
  /* Set Interrupt Group Priority */
  HAL_NVIC_SetPriority(DMA2_Stream0_IRQn, 1, 0);

  /* Enable the DMA STREAM global Interrupt */
  HAL_NVIC_EnableIRQ(DMA2_Stream0_IRQn);
}

And this is how I start the transaction:

HAL_DMA_Start_IT(&dma_handle, (uint32_t)&data_buffer, (uint32_t)&LCD_RAM, pixelCount);
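The post does not show how LCD_RAM is defined. The following is a minimal sketch of one common way to derive such an address, assuming a 16-bit bus width (where FSMC address line A[n] is driven from internal byte address bit n + 1) and D/C on A16 as described above. The sub-bank number is a parameter because it must match whichever bank is actually configured and wired; lcd_addr and the two macros are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Start of the FSMC NOR/SRAM region and size of each NEx sub-bank. */
#define LCD_FSMC_BASE      0x60000000u
#define LCD_SUBBANK_SIZE   0x04000000u  /* 64 MB per sub-bank */

/* Byte address that drives FSMC line A[addr_line] high (or low) on a
 * 16-bit bus. With a 16-bit memory width the FSMC outputs HADDR[25:1]
 * on A[24:0], so A[n] corresponds to byte address bit n + 1. */
static uint32_t lcd_addr(unsigned bank, unsigned addr_line, int line_high)
{
    assert(bank >= 1u && bank <= 4u);
    assert(addr_line <= 24u);
    uint32_t base = LCD_FSMC_BASE + (bank - 1u) * LCD_SUBBANK_SIZE;
    return line_high ? base + (1u << (addr_line + 1u)) : base;
}
```

For sub-bank 1 with D/C on A16, lcd_addr(1, 16, 1) evaluates to 0x60020000 (D/C high, pixel data) and lcd_addr(1, 16, 0) to 0x60000000 (D/C low, command). On the target, LCD_RAM could then be defined as something like *(volatile uint16_t *)0x60020000.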

When I perform a DMA transfer from SRAM1 to SRAM2 with this DMA configuration, I achieve a transfer speed of ~38 MHz. So this is the speed I would expect on the FSMC.

What is holding back the FSMC?

Answer

i.amniels · Jan 24, 2017

I found a very short piece of code that shows how to configure the FSMC using only CMSIS.

With one line of HAL code to activate the clock, this piece of code looks like this:

__HAL_RCC_FSMC_CLK_ENABLE();

int FSMC_Bank = 0;
FSMC_Bank1->BTCR[FSMC_Bank+1] = FSMC_BTR1_ADDSET_1 | FSMC_BTR1_DATAST_1;

// Bank1 NOR/SRAM control register configuration
FSMC_Bank1->BTCR[FSMC_Bank] = FSMC_BCR1_MWID_0 | FSMC_BCR1_WREN | FSMC_BCR1_MBKEN;

I tried this code and now I achieve a speed of ~27MHz. This is in the range of what I expected.
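To see which timings the register write above actually selects, the bit masks can be decoded on a host machine. The sketch below copies the mask values as they appear in the CMSIS device header (treat them as assumptions and check your own stm32f4xx.h). In BTR, ADDSET occupies bits 3:0 and DATAST bits 15:8. BTR_VALUE and the helper functions are hypothetical names, and the cycle count assumes a write takes ADDSET + DATAST + 1 HCLK cycles.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in the CMSIS device header (assumed; verify locally). */
#define FSMC_BTR1_ADDSET_1  0x00000002u  /* bit 1 of ADDSET[3:0] */
#define FSMC_BTR1_DATAST_1  0x00000200u  /* bit 1 of DATAST[7:0] */

/* The timing value written to BTCR[FSMC_Bank+1] in the snippet above. */
#define BTR_VALUE (FSMC_BTR1_ADDSET_1 | FSMC_BTR1_DATAST_1)

/* Extract the ADDSET field (bits 3:0) from a BTR value. */
static uint32_t btr_addset(uint32_t btr) { return btr & 0xFu; }

/* Extract the DATAST field (bits 15:8) from a BTR value. */
static uint32_t btr_datast(uint32_t btr) { return (btr >> 8) & 0xFFu; }

/* HCLK cycles per write, assuming ADDSET + DATAST + 1. */
static uint32_t btr_write_cycles(uint32_t btr)
{
    assert(btr_datast(btr) > 0u); /* DATAST = 0 is a reserved setting */
    return btr_addset(btr) + btr_datast(btr) + 1u;
}
```

Both btr_addset(BTR_VALUE) and btr_datast(BTR_VALUE) yield 2, so the _1 suffix names a bit position rather than the value 1. Under the stated assumption that is 5 HCLK cycles per write, or 33.6 MHz at a 168 MHz HCLK, an upper bound that sits somewhat above the ~27 MHz reported above.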

For now I will continue using this code; the root cause analysis is something for later.

BTW, if you use the FSMC on the STM32F4 Discovery board, desolder resistor R50. This 0 Ohm resistor connects the FSMC's NWE pin to a circuit with a pull-up, which distorts the transfers.