FreeRTOS
Exploit the full power of your microcontroller with the FreeRTOS multitasking operating system.
My desktop computer uses an operating system (OS) – Linux, of course – but “operating system” is a very loose term, often describing everything that makes a computer work, from launching programs, communicating over a network, and managing filesystems to presenting the user with a sophisticated graphical user interface. Much of this functionality lies outside of the kernel of the OS, and many computers are used in such a way that they don’t need some of these facilities: specifically, servers that generally run headless with administrative and user access performed over some form of network.
In the world of embedded computers, a real-time operating system (RTOS) is much more focused on being a kernel. The simplest microcontroller might require no OS at all, which is known as bare metal programming. Many microcontroller applications are written in this way, and with the judicious use of timers and interrupts, a version of multitasking can be obtained. At some point, however, this cooperative multitasking can lead to spaghetti code that is difficult to understand, debug, and maintain.
Somewhere in the spectrum of applications, from a humble microcontroller sensing when to pop the toast out of your toaster to a complex navigation and control system for a robot, a point is reached wherein some sort of operating system is desirable or even necessary. This type of operating system, RTOS, does not at first glance bear too much resemblance to a server or desktop OS.
To arrive at a useful definition of the term RTOS, you need to understand what is required at the point where bare metal programming runs out of steam. Your application has to read input from sensors or other forms of I/O and do some processing before producing output. Data can arrive unexpectedly and asynchronously and must be organized and processed.
My definition of RTOS starts at the point where it’s clear that unrelated work would be better dealt with in separate tasks: for instance, handling packets arriving over a network connection whilst computing a Fourier transform of analog-to-digital converter results. If you try to do this with cooperative multitasking you risk missing packets or being late (whatever the definition of “late” is in application terms) while occupied with the computation. The bare metal approach would deal with this problem by breaking the computation into smaller chunks to aid switching between different tasks. This approach is certainly one way to consider, but it comes at the cost of added complexity.
My definition, then, is a system that can start separate, unrelated tasks to deal with portions of the application workload, while allowing the programmer to write these tasks as though they had uninterrupted use of the processor. To make this definition useful, however, the tasks must be able to communicate and cooperate, which implies such things as flags, semaphores, mutexes, and queues (visited in detail later).
What is specifically excluded are things such as network stacks, filesystems, and GUIs: They might be required by the application but are excluded from the core RTOS definition, although where required, they may well make use of RTOS features. The official definition of RTOS (e.g., on Wikipedia) is somewhat more complex and lays emphasis on predictable reaction times to external stimuli.
In this article, I use FreeRTOS, a real-time operating system kernel for embedded devices. To make the description of the facilities provided by FreeRTOS less abstract, I use extracts from an example application I have been working on to provide working examples. The system is a radio message concentrator (Figure 1) receiving radio messages from a number of remote, battery-powered sensors. It has a 32-bit microcontroller, a WiFi interface, an 868MHz low-power long-range (LoRa) radio receiver, and a flash-based filesystem in littlefs [2].

FreeRTOS creates tasks to look after these components and synchronizes and communicates between them. The wolfSSL secure sockets library has been ported to the system, allowing the host computer to log in over an SSH connection to interact securely with the various parts of the system and to retrieve data. The system microcontroller has 128KB of RAM. The FreeRTOS kernel uses less than 4KB of the available 1MB.
APIs
The ARM port of FreeRTOS ships with two application programming interfaces (APIs): its own native API and CMSIS, an API specifically designed for Cortex-M microcontrollers. The two APIs are very similar, but I find CMSIS to be cleaner, more intuitive, and more consistent, and it is built on top of FreeRTOS, hiding some detail in favor of a simpler API.
The CMSIS routine names follow a consistent pattern (e.g., osThreadNew()). I’ll use this API throughout this article, but my explanations are just as relevant to the native FreeRTOS API (e.g., xTaskCreate()). The terms “thread” and “task” are used interchangeably here: FreeRTOS prefers “task,” whereas the CMSIS-RTOS API uses “thread.”
Memory Management
In an embedded system without an RTOS, the stack is generally located by the linker above the area in RAM used for the global variables required by the application. The stack can grow until it reaches the end of the allocated area, at which point the results are system dependent. An interrupt may be generated, allowing a more or less graceful recovery, or the stack may silently overwrite global variables, I/O registers, or interrupt vector tables. This situation is probably not recoverable and will require a system reboot. In simple systems the stack size can be estimated and sufficient headroom provided to prevent such an occurrence.
In the case of dynamic memory allocation, such systems generally avoid heaps; that is, they don’t use malloc() and free(), relying instead on statically declared variables so that the RAM layout is known at compile time. This design can add complications with allocating memory and precludes the use of libraries that expect these facilities to be available. Some non-RTOS systems implement their own malloc() and free() by reserving a chunk of RAM for use as a heap and providing their own memory management routines to manage that heap.
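As a minimal sketch of that last approach, the following allocator hands out blocks from a statically reserved array and never frees them, which is roughly the strategy of the simplest FreeRTOS heap scheme (heap_1.c). The names simple_alloc(), simple_free_bytes(), POOL_SIZE, and POOL_ALIGN are invented for this illustration and are not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE  1024u   /* hypothetical heap size in bytes */
#define POOL_ALIGN 8u      /* assumed alignment; must be a power of two */

/* The union member force_align keeps the pool suitably aligned. */
static union {
    uint8_t  bytes[POOL_SIZE];
    uint64_t force_align;
} pool;

static size_t pool_used = 0;   /* offset of the first unallocated byte */

/* Hand out the next aligned block, or NULL if the pool is exhausted. */
void *simple_alloc(size_t size)
{
    if (size == 0 || size > POOL_SIZE) {
        return NULL;
    }

    /* round the request up to a multiple of the alignment */
    size_t rounded = (size + (POOL_ALIGN - 1u)) & ~(size_t)(POOL_ALIGN - 1u);

    if (rounded > POOL_SIZE - pool_used) {
        return NULL;           /* the caller must cope with failure */
    }

    void *block = &pool.bytes[pool_used];
    pool_used += rounded;
    return block;
}

/* Bytes still available, similar in spirit to xPortGetFreeHeapSize(). */
size_t simple_free_bytes(void)
{
    return POOL_SIZE - pool_used;
}
```

Because nothing is ever freed, this scheme cannot fragment, which is why it suits systems that allocate everything once at startup.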
Stacks and Heaps
In a system with several tasks, each must have its own stack, and a mechanism must be provided to protect against stack overruns. If dynamic memory allocation is required, the programmer must be prepared for malloc() failures if the heap is exhausted or becomes too fragmented to allocate blocks of the desired size.
FreeRTOS requires a chunk of system RAM to be allocated as a heap. Task stacks are also allocated from this heap by default, so in a small microcontroller most of the RAM may be given over to this heap. FreeRTOS implements a memory management system for this heap, and if configured correctly, malloc() and free() will end up calling the FreeRTOS memory allocation routines. FreeRTOS is configured with a set of #define directives in its configuration file, FreeRTOSConfig.h. Allocation of the heap is done like this:
#define configTOTAL_HEAP_SIZE ((size_t)100000) // allocate 100,000 bytes to the heap
During development, it is essential to know whether a stack has overflowed or heap allocation has failed. The following directives should be enabled (stack overflow checking accepts 1 or 2, with 2 selecting the more thorough check):
#define configCHECK_FOR_STACK_OVERFLOW 2
#define configUSE_MALLOC_FAILED_HOOK 1
This causes the routines in Listing 1 to be called (located in freertos.c). You add your own system-specific implementation. My implementation, as will be noted, prints an error message to my debug stream and then halts. In production code, a hardware watchdog would generate a system reboot.
Listing 1: Handlers for Stack Overflow and malloc Failures
/* USER CODE BEGIN 4 */
#include <stdio.h>
void vApplicationStackOverflowHook(xTaskHandle xTask, signed char *pcTaskName)
{
   /* Run time stack overflow checking is performed if configCHECK_FOR_STACK_OVERFLOW is defined to 1 or 2. This hook function is called if a stack overflow is detected. */
  fprintf(stderr,"**** STACK OVERFLOW ****\n");
  while(1);
}
/* USER CODE END 4 */
/* USER CODE BEGIN 5 */
void vApplicationMallocFailedHook(void)
{
   /* vApplicationMallocFailedHook() will only be called if configUSE_MALLOC_FAILED_HOOK is set to 1 in FreeRTOSConfig.h. It is a hook function that will get called if a call to pvPortMalloc() fails. pvPortMalloc() is called internally by the kernel whenever a task, queue, timer or semaphore is created. It is also called by various parts of the demo application. If heap_1.c or heap_2.c are used, then the size of the heap available to pvPortMalloc() is defined by configTOTAL_HEAP_SIZE in FreeRTOSConfig.h, and the xPortGetFreeHeapSize() API function can be used to query the size of free heap space that remains (although it does not provide information on how the remaining heap might be fragmented). */
  fprintf(stderr,"**** MALLOC FAILED ****\n");
  while(1);
}
If you want to be able to call malloc() and free() and use the FreeRTOS memory allocator, you need to override the versions provided by the default standard library. My implementation, in a file called heap.c, simply calls the FreeRTOS memory allocation routines (Listing 2). You can then look at your linker script and set the default heap to a small value to avoid wasting that memory.
Listing 2: malloc and free Calls to FreeRTOS
#include <stdio.h>
#include "FreeRTOS.h"
/** */
void *malloc(size_t size)
{
    void *ptr = NULL;
    if (size > 0)
    {
        ptr = pvPortMalloc(size);
    }
    return ptr;
}
/** */
void free(void *ptr)
{
    if (ptr)
    {
        vPortFree(ptr);
    }
}
This convenient method ensures that any library that calls malloc() and free() without providing any macros to ensure portability will still call the FreeRTOS memory allocator. At an early stage of development, it’s a good idea to step into a malloc() call with a debugger and make sure your implementation has in fact been linked as expected. The same effect could be achieved with macros:
#define malloc(x) pvPortMalloc(x)
#define free(x) vPortFree(x)
When you initially set up FreeRTOS, I recommend allocating as much RAM as you can to the FreeRTOS heap with configTOTAL_HEAP_SIZE so that you can make the task stacks as large as possible (Listing 3). (I show how task stacks are allocated later in the article.) Bear in mind that any variables you declare globally or as static will take away memory available to be allocated to FreeRTOS.
Listing 3: The Heap is Simply an Array of Bytes
/* Allocate the memory for the heap. */
#if( configAPPLICATION_ALLOCATED_HEAP == 1 )
  /* The application writer has already defined the array used for the RTOS heap - probably so it can be placed in a special segment or address. */
  extern uint8_t ucHeap[ configTOTAL_HEAP_SIZE ];
#else
  static uint8_t ucHeap[ configTOTAL_HEAP_SIZE ];
#endif
The stack required for the default task used for system initialization, which ends up in main(), need be no larger than required for that purpose, because main() calls osKernelStart(), which never returns; control is handed over to the tasks previously created (Listing 6). Your linker should provide, with the correct prompting, a memory map file. The FreeRTOS heap is declared in the FreeRTOS file heap_4.c (Listing 3) as a byte array.
As an aside, the linker divides memory into sections such as .text for program code and .bss (block started by symbol) for uninitialized variables. In my ARM/GCC compiler, at least, the .bss section is zeroed out in the startup code before main() is called. The .data section is global data, initialized by copying the values from an image in .text generated by the compiler. If you search for ucHeap in the map file, you will see something like Listing 4 in the .bss section. The second column is the address of the heap, the third the size, and the fourth the object file in which it is defined.
Listing 4: Memory Information Utility
.bss.ucHeap    0x0000000020002654    0x186a0 heap_4.o
.bss.xStart    0x000000002001acf4    0x8     heap_4.o
.bss.pxEnd     0x000000002001acfc    0x4     heap_4.o
...
.text.main     0x000000000800318c    0xa0    main.o
...
.data.desc     0x0000000020000328    0x12    usbd_desc.o
Once you have a configuration that compiles, links, and runs, you can build some debug utilities to see how much memory is used. My system has a simple shell accessed over SSH, and typing meminfo results in something like the output in Figure 2. The code that produces this output is shown in Listing 5.

Listing 5: Code that Produces the Output in Figure 2
size_t heap_free = xPortGetFreeHeapSize();
size_t heap_min_free = xPortGetMinimumEverFreeHeapSize();
size_t main_stack_size = main_task_attributes.stack_size;
size_t wifi_stack_size = wifi_task_attributes.stack_size;
size_t lora_stack_size = lora_task_attributes.stack_size;
// scale words to bytes (assumes this port's osThreadGetStackSpace() reports 32-bit words)
size_t main_stack_space = osThreadGetStackSpace(main_taskHandle) * sizeof(uint32_t);
size_t wifi_stack_space = osThreadGetStackSpace(wifi_taskHandle) * sizeof(uint32_t);
size_t lora_stack_space = osThreadGetStackSpace(lora_taskHandle) * sizeof(uint32_t);
// cast to unsigned so that the arguments match the %u conversions
printf("heap size           %6u bytes\n", (unsigned)configTOTAL_HEAP_SIZE);
printf("heap free           %6u bytes\n", (unsigned)heap_free);
printf("heap min free       %6u bytes\n", (unsigned)heap_min_free);
printf("\n");
printf("main task stack size/free %6u/%6u bytes\n", (unsigned)main_stack_size, (unsigned)main_stack_space);
printf("wifi task stack size/free %6u/%6u bytes\n", (unsigned)wifi_stack_size, (unsigned)wifi_stack_space);
printf("lora task stack size/free %6u/%6u bytes\n", (unsigned)lora_stack_size, (unsigned)lora_stack_space);
FreeRTOS has a high-water mark system for stacks, so the figures reported are the worst-case (minimum) free stack seen so far. This is implemented by filling the stack memory with a known byte value when the task is created and, when a figure for free space is requested, scanning from the unused end of the stack to find the first point at which that value has been overwritten (i.e., the deepest point the stack has reached).
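The high-water mark technique is easy to model in portable C. The sketch below is my own illustration, not the FreeRTOS source: stack_paint() and stack_min_free() are invented names, and the stack is assumed to grow downward (as on Cortex-M), so the lowest addresses are the last to be touched. The fill value 0xA5 is the one FreeRTOS uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILL_BYTE 0xA5u   /* known pattern written over the whole stack */

/* Paint the stack area with the fill byte before the task first runs. */
void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, FILL_BYTE, size);
}

/* Count the bytes at the low (unused) end that still hold the pattern.
 * With a descending stack, this is the minimum free space ever seen. */
size_t stack_min_free(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;

    while (untouched < size && stack[untouched] == FILL_BYTE) {
        untouched++;
    }
    return untouched;
}
```

The result is a lower bound on safety rather than a guarantee: a task that happens to write the fill value itself, or that skips over part of the stack without writing to it, can make the free space look larger than it really is.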
Because each task requires its own stack, clearly, the more tasks created in the application, the more memory required. Tasks with functions that consume a lot of stack (looking at you, printf()) can entail significant overhead. It is important, therefore, to think carefully about memory usage when planning the architecture of your application. If possible, prototype each task separately and measure its stack usage.
Another important facility when setting up FreeRTOS is assert(). In FreeRTOSConfig.h, find the definition and change it to something like:
#include <stdio.h>
#define configASSERT( x ) if ((x) == 0) { printf("FreeRTOS assert failure\n"); taskDISABLE_INTERRUPTS(); for( ;; );}
If you see this message and are running with a debugger, you should be able to pause execution and examine the stack trace to see where the assert failure occurred.
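If no debugger is attached, it helps if the message itself says where the failure happened. The following sketch shows one way to do that with the standard __FILE__ and __LINE__ macros. APP_ASSERT and assert_report() are hypothetical names chosen so that the fragment compiles on its own; in FreeRTOSConfig.h the macro would be called configASSERT, and on the target the handler would disable interrupts and spin rather than return.

```c
#include <stdio.h>

/* Location of the most recent failure, kept for inspection in a debugger. */
static const char *last_assert_file = NULL;
static int last_assert_line = 0;

/* Hypothetical handler: record and print where the assertion failed. */
void assert_report(const char *file, int line)
{
    last_assert_file = file;
    last_assert_line = line;
    fprintf(stderr, "assert failure at %s:%d\n", file, line);
    /* on the target: disable interrupts and loop forever here */
}

/* Evaluates x once; reports the source location if it is false. */
#define APP_ASSERT(x) \
    do { if ((x) == 0) { assert_report(__FILE__, __LINE__); } } while (0)
```

The do/while wrapper is a common idiom that lets the macro be used safely as a single statement, for example inside an if/else without braces.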
Task Creation
Tasks are the basic unit of multitasking: The microcontroller’s time is shared between tasks, which are suspended preemptively at regular intervals by the scheduler to allow other tasks to run. This process is known as time slicing, and each switch from one task to another is a context switch: Each task gets a slice of the overall execution time. If any part of your task polls for some event (although it should probably be using an RTOS flag, explained later), it should call osThreadYield() or osDelay() to signal to the scheduler that other tasks can run. Most other RTOS calls can result in a context switch if another task at the same or higher priority is ready to run.
Listing 6 shows a simplified version of the main() routine in my system. Three tasks run continually in the system, and attributes are defined in the three structure definitions. The StartMainTask() and other Start routines are provided so that if the tasks do exit they land in a safe place.
Listing 6: main()
#include <stdio.h>
#include "cmsis_os.h"

/* Task bodies, defined elsewhere in the application */
void main_task(void);
void wifi_task(void);
void lora_task(void);

/* Definitions for main_task */
osThreadId_t main_taskHandle;

const osThreadAttr_t main_task_attributes =
{
  .name = "main_task",
  .stack_size = 4192 * 4,
  .priority = (osPriority_t) osPriorityNormal,
};

/* Definitions for wifi_task */
osThreadId_t wifi_taskHandle;

const osThreadAttr_t wifi_task_attributes =
{
  .name = "wifi_task",
  .stack_size = 256 * 4,
  .priority = (osPriority_t) osPriorityNormal,
};

/* Definitions for lora_task */
osThreadId_t lora_taskHandle;

const osThreadAttr_t lora_task_attributes =
{
  .name = "lora_task",
  .stack_size = 256 * 4,
  .priority = (osPriority_t) osPriorityNormal,
};

/* Wrapper for main_task: a safe landing place if the task ever returns */
void StartMainTask(void *argument)
{
  main_task();

  fprintf(stderr,"**** MAIN TASK EXITED ****\n");
  while(1)
  {
    osDelay(1);
  }
}

/* Wrapper for wifi_task */
void StartWifiTask(void *argument)
{
  wifi_task();

  fprintf(stderr,"**** WIFI TASK EXITED ****\n");
  while(1)
  {
    osDelay(1);
  }
}

/* Wrapper for lora_task */
void StartLoraTask(void *argument)
{
  lora_task();

  fprintf(stderr,"**** LORA TASK EXITED ****\n");
  while(1)
  {
    osDelay(1);
  }
}

int main(void)
{
  osKernelInitialize();

  /* creation of main_task */
  main_taskHandle = osThreadNew(StartMainTask, NULL, &main_task_attributes);

  /* creation of wifi_task */
  wifi_taskHandle = osThreadNew(StartWifiTask, NULL, &wifi_task_attributes);

  /* creation of lora_task */
  lora_taskHandle = osThreadNew(StartLoraTask, NULL, &lora_task_attributes);

  /* Start scheduler */
  osKernelStart();

  /* We should never get here as control is now taken by the scheduler */
  fprintf(stderr,"**** KERNEL EXITED ****\n");
  while (1);
}
Finally, the main() function is the entry point to the system (after hardware initialization); it initializes the RTOS kernel and creates the three permanent tasks before starting the scheduler. The osKernelStart() routine never returns, so no code placed after that call will execute. Note that the stack sizes are in bytes, so the *4 converts the desired stack size from 32-bit words to bytes.
Task Control
Although most tasks run forever in some type of loop, probably spending most of the time waiting on a signal from another thread or interrupt service routine (ISR), a task can exit at any time by calling osThreadExit(). Conversely, any task can create another by calling osThreadNew() at any time: Just be aware of the memory implications of creating a task stack. This ability is useful for tasks that do some calculation in the background. In this case, the thread can set a flag to indicate it has done its thing and then exit. Alternatively, another task can wait for it to exit by calling osThreadJoin(), which blocks until the monitored task has exited. If you want to know whether a task has exited without blocking, you can call osThreadGetState() and look for the response osThreadTerminated. Similarly, one task can control another by calling osThreadSuspend(), osThreadResume(), or osThreadTerminate().
You can set the priority of a task by calling osThreadSetPriority() or assign a priority on creation. I strongly recommend setting all task priorities to osPriorityNormal to begin. When your application is stable, you can adjust priorities and observe the results. You can easily starve a task of execution time and effectively create a deadlock, so proceed with caution and test carefully. If you assign a high priority to a task that does not regularly wait on an RTOS flag/mutex/semaphore/etc., or that does not call osThreadYield() or osDelay(), there’s a good chance it will hog the processor.
The function osThreadYield() simply gives the remainder of the task’s time slice to the next waiting task (i.e., it causes a context switch) and returns the next time the task is scheduled, whereas osDelay() causes an immediate context switch, and task execution resumes once the specified number of RTOS timer ticks has elapsed. By default the timer ticks occur at 1ms intervals, so osDelay(1000) will delay execution of the calling task by one second.
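Because osDelay() counts ticks rather than milliseconds, code that hard-codes millisecond values silently changes its timing if the tick rate is ever reconfigured. A small conversion helper avoids that. The sketch below is only an illustration: ms_to_ticks() is an invented name, and rounding up (so that a delay is never shorter than requested) is my own design choice. On the target, the tick frequency would come from osKernelGetTickFreq().

```c
#include <stdint.h>

/* Convert milliseconds to kernel ticks for a tick rate given in Hz.
 * Rounds up so that a nonzero delay never collapses to fewer ticks
 * than requested. The 64-bit intermediate avoids overflow. */
uint32_t ms_to_ticks(uint32_t ms, uint32_t tick_hz)
{
    return (uint32_t)(((uint64_t)ms * tick_hz + 999u) / 1000u);
}
```

With the default 1kHz tick, ms_to_ticks(1000, 1000) is 1000, so a call such as osDelay(ms_to_ticks(1000, osKernelGetTickFreq())) still delays for one second; at a 100Hz tick, the same call would pass 100.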
Event and Thread Flags
As noted, a task can poll hardware or a global variable for events and call osThreadYield() or osDelay() in the polling loop, but FreeRTOS provides a superior method for task communication in the form of flags and semaphores.
Flags are used to signal between tasks or between an ISR and a task. In the simplest case, an interrupt occurs and incoming data is buffered; however, an associated task must be woken to deal with that data so that the ISR can exit and allow other tasks to run. Generally, it is a good idea to do as little as possible in an ISR because it runs at high priority and blocks other tasks. Any action from an ISR that might trigger another interrupt, for instance, risks deadlocking the system.
The solution is a flag that the ISR uses to signal to its associated task that further processing is required. The task is typically asleep, waiting on the flag to be set, at which point it wakes and goes about its business of processing the incoming data. In the context of a task, calling routines that can generate interrupts, such as buffered I/O, will not cause deadlock.
By default, the flags a task was waiting on are cleared automatically when osEventFlagsWait() returns, so they are ready to be set again by the next interrupt; setting osFlagsNoClear in the wait options suppresses this behavior. A single event flags object (osEventFlagsId_t) holds a set of flags, each occupying one bit of a 32-bit word, although not all 32 bits are usable: The top bit is reserved for error reporting, and the FreeRTOS-based implementation limits the usable flags to 24. The next example uses only one flag. In a more advanced use case, the task can wait on a selection of flags; osEventFlagsWait() returns the flags that were set, so the task can act appropriately.
Because each flag is a single bit, the flag values are powers of two (i.e., 1, 2, 4, 8, etc.), and you can use a suitable enum or #define to improve code clarity.
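As an illustration of naming flags and decoding the value returned by a wait call, consider the following sketch. The flag names are hypothetical and do not come from my application; FLAGS_ERROR_BIT has the same value as the CMSIS-RTOS2 constant osFlagsError and is redefined here only so that the fragment compiles without the RTOS headers.

```c
#include <stdint.h>

/* Hypothetical flag assignments for a WiFi task: one bit per event. */
enum wifi_flags {
    WIFI_FLAG_RX_DATA   = 1u << 0,
    WIFI_FLAG_TX_DONE   = 1u << 1,
    WIFI_FLAG_LINK_DOWN = 1u << 2,
    WIFI_FLAG_ALL       = WIFI_FLAG_RX_DATA | WIFI_FLAG_TX_DONE | WIFI_FLAG_LINK_DOWN
};

/* Error codes returned by the wait routines have the top bit set. */
#define FLAGS_ERROR_BIT 0x80000000u

/* Return 1 if `flag` is set in a value returned by a wait call,
 * 0 if it is not set or the value is an error code. */
int flag_is_set(uint32_t returned, uint32_t flag)
{
    if (returned & FLAGS_ERROR_BIT) {
        return 0;
    }
    return (returned & flag) != 0;
}
```

A task would wait on WIFI_FLAG_ALL and then test each flag in turn with flag_is_set(). Testing individual bits is safer than comparing the whole return value with a single flag, because several flags can be set at once.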
In my system, the WiFi module generates interrupts for things like incoming data, and this data is handled by a dedicated task. The ISR wakes that task, as shown in Listing 7.
Listing 7: Waking a Task with an ISR
#include <stdbool.h>
#include <stdint.h>
#include "cmsis_os.h"

static osEventFlagsId_t wifi_event_flags_id;
static const uint32_t wifi_interrupt_flag = 1;

/** ISR for the WiFi module: do the minimum and hand over to the task */
void wifi_interrupt_service_routine(void)
{
  // tell the wifi task data is waiting
  osEventFlagsSet(wifi_event_flags_id,wifi_interrupt_flag);
}

/** WiFi task: sleeps until the ISR sets the flag */
void wifi_task(void)
{
  wifi_event_flags_id = osEventFlagsNew(NULL);

  while(true)
  {
    osEventFlagsWait(wifi_event_flags_id,
                     wifi_interrupt_flag,
                     osFlagsWaitAny,osWaitForever);

    // run the wifi event handling code
  }
}
If you want to poll a flag rather than suspend while waiting for it to be set, you can set the last parameter in osEventFlagsWait() to a timeout in kernel ticks (milliseconds at the default tick rate); the call returns the flags that were set or, if the timeout expires first, the error code osFlagsErrorTimeout (Listing 8).
Listing 8: Polling a Flag
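uint32_t flags = osEventFlagsWait(my_flags_id,my_flag,osFlagsWaitAny,my_timeout);

// error codes have the top bit (osFlagsError) set; otherwise test the flag bit
if((flags & osFlagsError) == 0 && (flags & my_flag) != 0)
{
  // do something based on flag being set
}
else
{
  // timeout (or another error) occurred, flag not set
}
A Complex Use Case of Flags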
In my port of the TCP/IP stack for the WiFi chip, actions such as connect() generate callback events when they complete (e.g., when the connection is established). The TCP/IP driver supports up to six sockets, so you could have up to six sockets waiting for a connect event. To distinguish between them, the socket handle itself (a number between 0 and 5) selects the bit used as the event flag, so socket n uses flag 1 << n. A separate flag outside this range (0x8000) indicates failure (Listing 9).
Listing 9: Multiple Tasks Waiting on Events with Socket Handle as Flag Value
#include <stdbool.h>
#include <stdint.h>
#include "cmsis_os.h"
#include "socket.h"

static osEventFlagsId_t tcp_event_flags_id;

/** called by the TCP/IP stack when a socket operation completes */
static void socket_event_callback(SOCKET callback_socket, uint8_t message, void *data)
{
  // set the flag based on socket handle
  uint32_t done = 1U << callback_socket;
  uint32_t fail = 0x8000;

  switch(message)
  {

  case SOCKET_MSG_CONNECT:
  {
    osEventFlagsSet(tcp_event_flags_id, done);
    return;
  }

  // other messages
  }
}

/** start a connection and block until it succeeds or fails */
bool tcp_connect(SOCKET client_socket, struct sockaddr *addr)
{
  // set the flag based on socket handle
  uint32_t done = 1U << client_socket;
  uint32_t fail = 0x8000;

  // if connect fails, set the fail flag so the wait falls straight through
  if (connect(client_socket, addr, sizeof(struct sockaddr_in)) != SOCK_ERR_NO_ERROR)
  {
    osEventFlagsSet(tcp_event_flags_id, fail);
  }

  // wait on either flag; the return value can include flags belonging to
  // other sockets, so test the bits rather than comparing the whole value
  uint32_t flags = osEventFlagsWait(tcp_event_flags_id, fail | done, osFlagsWaitAny, osWaitForever);

  return (flags & osFlagsError) == 0 && (flags & fail) == 0 && (flags & done) != 0;
}

void tcp_init(void)
{
  // the tcp/ip stack calls this when connect (or other message) completes
  registerSocketEventCallback(socket_event_callback);
  tcp_event_flags_id = osEventFlagsNew(NULL);
}
Thread flags are a more specialized version of event flags. Whereas event flags can be used to signal a number of threads globally, thread flags are sent to a single, specific thread. Every thread instance can receive thread flags without any additional allocation of a thread flag object. The API consists of
uint32_t osThreadFlagsSet(osThreadId_t thread_id, uint32_t flags)
uint32_t osThreadFlagsClear (uint32_t flags)
uint32_t osThreadFlagsGet (void)
uint32_t osThreadFlagsWait (uint32_t flags, uint32_t options, uint32_t timeout)
with no flag object to create or initialize. A thread waiting on a flag wakes only if osThreadFlagsSet() is called with that thread’s ID. Of these routines, only osThreadFlagsSet() can be called from an ISR, which is the typical use case: signaling from an ISR that an event has occurred of which a specific task needs to be aware.
Mutexes
Mutex (mutual exclusion) describes a mechanism to prevent simultaneous access to a resource by multiple tasks. In other words, mutexes serialize access to a resource to ensure only one task can access that resource at a time. This method ensures the resource does not end up in an inconsistent state because several tasks try to change its state in a conflicting manner. A task tries to acquire the mutex, and if the mutex is already in use, the task either sleeps or goes off to do something else and checks back later. Once a task acquires a mutex, it knows it has exclusive access to the resource and can safely modify its state.
In my system, a mutex serializes access to the flash filesystem, littlefs, for the reasons just given. Porting littlefs to my system involved providing a few system-specific routines, such as low-level access to flash memory, and initializing a structure with pointers to these routines. Two of those routines, aquire_mutex() and release_mutex(), handle the locking.
The mutex itself is created in the initialization routine and stored in the context member of the same structure as a void pointer. I’ve included a snippet from the littlefs source (Listing 10) to show how the mutexes are used to protect the unmount routine. When the routines are called, the context is passed in and is then cast back to a mutex ID type.
Listing 10: Configuring littlefs to use FreeRTOS Mutexes
#include "cmsis_os.h"
#include "lfs.h"

/** lock callback for littlefs */
static int aquire_mutex(const struct lfs_config *c)
{
  // the context passed in is the mutex handle; a timeout of 0 means
  // return at once with an error if the mutex is already held
  return osMutexAcquire((osMutexId_t)c->context, 0U);
}

/** unlock callback for littlefs */
static int release_mutex(const struct lfs_config *c)
{
  // the context passed in is the mutex handle
  return osMutexRelease((osMutexId_t)c->context);
}

struct lfs_config lfs_cfg =
{
  .lock = aquire_mutex,
  .unlock = release_mutex,
  // other initialization
};

/** */
void fs_init(void)
{
  // create the mutex and save its handle as the context
  // (fs_lock_mutex_attr is defined elsewhere, not shown)
  lfs_cfg.context = (void *)osMutexNew(&fs_lock_mutex_attr);

  // other initialization
}

// extracts from littlefs source:

// macros for lock and unlock
#define LFS_LOCK(cfg)   cfg->lock(cfg)
#define LFS_UNLOCK(cfg) cfg->unlock(cfg)

// the littlefs unmount routine
int lfs_unmount(lfs_t *lfs)
{
  int err = LFS_LOCK(lfs->cfg);  // lock here
  if (err)
  {
    return err;
  }

  LFS_TRACE("lfs_unmount(%p)", (void*)lfs);

  err = lfs_rawunmount(lfs);

  LFS_TRACE("lfs_unmount -> %d", err);
  LFS_UNLOCK(lfs->cfg);          // unlock here
  return err;
}
Semaphores
Semaphores are a more generalized form of the mutex used to control access to shared resources from a number of tasks (Figure 3). An example of such shared resources might be a fixed number of direct memory access (DMA) channels or radio frequencies. If the number of running tasks that require a resource exceeds the number of resources available, a means to control access must be found. The osSemaphoreNew() function creates the semaphore, specifying the maximum number of resources and how many of them are initially available. A task then calls osSemaphoreAcquire() when it needs one of the resources and osSemaphoreRelease() when it has finished with it. Once all the resources are in use, further calls to osSemaphoreAcquire() block (subject to the timeout given) until another task calls osSemaphoreRelease().
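The counting behind a semaphore can be modeled in a few lines of plain C. This sketch is a single-threaded model of the bookkeeping only: The model_sem_ names are invented, nothing here blocks, and a real RTOS additionally makes each update atomic and suspends the caller when no resource is free.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of a counting semaphore: `count` is the number of free resources. */
typedef struct {
    uint32_t max_count;
    uint32_t count;
} model_semaphore_t;

/* Set the limits; the initial count is clamped to the maximum. */
void model_sem_init(model_semaphore_t *s, uint32_t max_count, uint32_t initial)
{
    s->max_count = max_count;
    s->count = (initial > max_count) ? max_count : initial;
}

/* Take one resource if any is free (a real acquire could block instead). */
bool model_sem_try_acquire(model_semaphore_t *s)
{
    if (s->count == 0) {
        return false;
    }
    s->count--;
    return true;
}

/* Give one resource back; fails if all are already free. */
bool model_sem_release(model_semaphore_t *s)
{
    if (s->count >= s->max_count) {
        return false;
    }
    s->count++;
    return true;
}
```

With two DMA channels, for example, the semaphore would be initialized with a maximum and initial count of 2: The first two acquisitions succeed, the third fails (or, in the RTOS, blocks) until one of the holders releases its channel. A mutex behaves like the special case with a maximum count of 1, with the added notion of an owner.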

Queues
Queues are a first in, first out (FIFO) facility that buffers blocks of data (all of the same size) until they are retrieved. Generally, one task pushes data onto the queue, and another task retrieves it later. Queues have a defined depth, specified when the queue is created, so if the data is not retrieved before the queue fills, further writes fail and new data is lost unless the sender is prepared to wait.
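The behavior of a fixed-depth queue, including what happens when it fills, can be seen in the following ring buffer model. It is a simplified, single-threaded sketch with invented names (model_queue_put(), model_queue_get()) and arbitrary sizes; the RTOS version adds blocking with timeouts and protection against concurrent access.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_DEPTH 4u   /* hypothetical number of slots */
#define ITEM_SIZE   8u   /* hypothetical size of each item in bytes */

typedef struct {
    unsigned char storage[QUEUE_DEPTH][ITEM_SIZE];
    size_t head;    /* index of the oldest item */
    size_t count;   /* number of items currently queued */
} model_queue_t;

/* Copy one item into the queue; returns false if the queue is full. */
bool model_queue_put(model_queue_t *q, const void *item)
{
    if (q->count == QUEUE_DEPTH) {
        return false;
    }
    size_t tail = (q->head + q->count) % QUEUE_DEPTH;
    memcpy(q->storage[tail], item, ITEM_SIZE);
    q->count++;
    return true;
}

/* Copy the oldest item out; returns false if the queue is empty. */
bool model_queue_get(model_queue_t *q, void *item)
{
    if (q->count == 0) {
        return false;
    }
    memcpy(item, q->storage[q->head], ITEM_SIZE);
    q->head = (q->head + 1u) % QUEUE_DEPTH;
    q->count--;
    return true;
}
```

Items are copied in and out by value, which is also how the RTOS message queue works: The sender's buffer can be reused as soon as the put call returns.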
In my example, sensor messages arrive by radio to be forwarded on over a network connection (Listing 11). The host computer polls the network every few minutes for sensor messages, so some sort of buffering is required. The radio task creates a queue and places messages into the queue as they arrive. When the host computer logs in over an SSH session, it checks the queue and waiting messages are returned to the host. You will see one way in which the queue might be emptied in the following sections.
Listing 11: Queueing Radio Messages in a FreeRTOS Queue
#include <stdbool.h>
#include <stdint.h>
#include <time.h>
#include "cmsis_os.h"

// structure to hold sensor data and a timestamp of when the data arrived
typedef struct
{
  sensor_data_t sensor_data;
  time_t timestamp;
  ...
}
sensor_message_t;

// global queue handle
static osMessageQueueId_t sensor_queue;

/** number of messages waiting in the queue */
uint32_t radio_message_count(void)
{
  return osMessageQueueGetCount(sensor_queue);
}

/** fetch the oldest message without blocking; false if the queue is empty */
bool radio_get_sensor_message(sensor_message_t *sensor_message)
{
  return osMessageQueueGet(sensor_queue, sensor_message, NULL, 0) == osOK;
}

/** add a message without blocking; false if the queue is full */
bool radio_put_sensor_message(sensor_message_t *sensor_message)
{
  return osMessageQueuePut(sensor_queue, sensor_message, 0, 0) == osOK;
}

/** */
void radio_task(void)
{
  // create a queue to put sensor messages on
  sensor_queue = osMessageQueueNew(50, sizeof(sensor_message_t), NULL);

  while (true)
  {
    // wait for radio messages
    if(osEventFlagsWait(radio_flags_id,
                        rx_flag,
                        osFlagsWaitAny,
                        osWaitForever) == rx_flag)
    {
      sensor_message_t sensor_message = radio_get_message();

      // add reception time
      sensor_message.timestamp = utc_timestamp();

      // put the message on the queue (the helper returns false on failure)
      if(!radio_put_sensor_message(&sensor_message))
      {
        print_log("radio queue full\n");
      }
    }
  }

  // other task processing
}
Timers
Many microcontrollers provide hardware timers of varying levels of sophistication, some able to generate pulse-width modulated (PWM) waveforms and the like. To use a hardware timer to generate a simple delay involves a number of steps: setting up the timer, providing an ISR to indicate the time delay has been reached, and sending some form of signal, such as an RTOS flag, which the waiting code can check. (Remember, you should not do much processing in an interrupt context.) In some cases, a provided RTOS timer is easier to use.
FreeRTOS provides osTimerNew() to create a timer, which can be made periodic (i.e., repeating) by specifying osTimerPeriodic or one-shot by specifying osTimerOnce. You provide a callback function, which is called when the desired delay is reached, and call osTimerStart() to start the timer with the specified delay or period.
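One detail worth keeping in mind is that the interval passed to osTimerStart() is a count of kernel ticks rather than milliseconds, so a literal such as 10*1000U means 10 seconds only when the tick rate is 1kHz. The following minimal sketch shows the conversion. The name ms_to_ticks() and its tick_rate_hz parameter are hypothetical and chosen purely for illustration; code written against the native FreeRTOS API would normally use the pdMS_TO_TICKS() macro, which takes the rate from configTICK_RATE_HZ.

```c
#include <stdint.h>

/* Hypothetical helper: convert milliseconds to kernel ticks for a given
 * tick rate. Like pdMS_TO_TICKS(), it rounds down. The 64-bit intermediate
 * keeps the multiplication from overflowing for long intervals. */
uint32_t ms_to_ticks(uint32_t ms, uint32_t tick_rate_hz)
{
  return (uint32_t)(((uint64_t)ms * tick_rate_hz) / 1000U);
}
```

With a 100Hz tick, for example, ms_to_ticks(10000, 100) yields 1000 ticks, and any interval shorter than one tick period rounds down to zero.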
In Listing 12, a periodic timer calls the periodic_callback() routine to examine a queue (shown later) and forward a pending message over WiFi. (This method could be described as polling on steroids.) The periodic timer itself is set up in the main() routine when the application starts, after which the callback function is called at the specified interval. You can stop the timer at any time by calling osTimerStop().
Listing 12: A Periodic Timer Empties a Queue and Forwards the Messages Over WiFi
/** callback to check for new radio data */
void periodic_callback(void *argument)
{
  (void)argument; // unused

  if (radio_message_count())
  {
    sensor_message_t sensor_message;

    if (!radio_get_sensor_message(&sensor_message))
    {
      return; // nothing to send after all
    }

    uint8_t recv_buffer[32];
    uint32_t recv_length = 0;

    // send the message to the host
    if (wifi_transact(_htonl(inet_addr(DEFAULT_HOST)), DEFAULT_PORT,
                      (uint8_t *)&sensor_message,
                      sizeof(sensor_message),
                      recv_buffer, &recv_length))
    {
      print_log("msg forwarded\n");

      // check the reply for an acknowledgment
      if (recv_length >= 3 && strncmp((const char *)recv_buffer, "ACK", 3) == 0)
      {
        print_log("msg acknowledged\n");
      }
    }
    else
    {
      print_log("msg NOT forwarded\n");
    }
  }
}

/** */
void main(void)
{
  // .. other initialization

  osTimerId_t periodic_id = osTimerNew(periodic_callback, osTimerPeriodic, NULL, NULL);

  if (periodic_id)
  {
    // once every 10 seconds (the interval is in kernel ticks; assumes a 1ms tick)
    if (osTimerStart(periodic_id, 10 * 1000U) != osOK)
    {
      print_log("could not start periodic timer");
    }
  }
  else
  {
    print_log("failed to create periodic timer");
  }

  // .. etc...
}
Finally, you should note that timer callbacks are called from a dedicated timer task, and this task must be configured in FreeRTOSConfig.h (Listing 13). This task, of course, requires its own stack, which is another consideration when deciding whether or not to use software timers. If you don't use them, set the first #define to 0 to save some memory.
Listing 13: Configuring FreeRTOS to Include the Timer Task
/* Software timer definitions. */
#define configUSE_TIMERS                         1
#define configTIMER_TASK_PRIORITY                ( 2 )
#define configTIMER_QUEUE_LENGTH                 10
#define configTIMER_TASK_STACK_DEPTH             512
My earlier comments about task priorities apply here, too. If you have multiple timers, be aware that lengthy processing in one callback may delay another being called. The FreeRTOS documentation specifically states: “Timer callback functions execute in the context of the timer service task. It is therefore essential that timer callback functions never attempt to block. For example, a timer callback function must not call vTaskDelay(), vTaskDelayUntil(), or specify a non zero block time when accessing a queue or a semaphore”.
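One way to keep a callback short is to have it do nothing more than record that the period has elapsed and leave the real work to an ordinary task. The sketch below illustrates the idea under stated assumptions: the function names are hypothetical, and a C11 atomic counter stands in for the RTOS primitive so that the code has no kernel dependency. On a real target, you would more likely signal the worker with an event flag or a queue put that uses a zero timeout, which also lets the worker sleep until there is something to do.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Timer periods that have elapsed but have not yet been handled. */
static atomic_uint pending_periods;

/* Timer callback (hypothetical): notes that a period elapsed and returns
 * immediately. It never blocks and does no lengthy processing. */
void periodic_callback_deferred(void *argument)
{
  (void)argument; /* unused */
  atomic_fetch_add(&pending_periods, 1U);
}

/* Called from an ordinary task (hypothetical): claims every pending period
 * in one atomic step and returns how many there were. The caller can then
 * do the slow work, such as network I/O, in its own context. */
unsigned int claim_pending_periods(void)
{
  return atomic_exchange(&pending_periods, 0U);
}
```

Because the counter is read and cleared in a single atomic exchange, a period that elapses while the worker is busy is not lost; it is simply picked up on the next call.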
Portability
FreeRTOS is designed to be portable (i.e., to be used on a variety of different hardware platforms), and supported devices are listed on the FreeRTOS website. The platform-specific parts of FreeRTOS are located in the portable subdirectory of the FreeRTOS source tree. In these files you’ll find platform-specific implementations of things such as task switching, system tick handling, and memory allocation. If you are using an existing port of FreeRTOS, you do not need to look at these files except for academic interest, but if you are porting to unsupported hardware, these files contain the parts you would have to implement yourself. As you will see if you do take a look, some of the low-level code requires the use of inline assembly.
Debugging and Logging
Although not restricted to FreeRTOS, debugging in a multitasking environment can be tricky. Subtle timing effects between tasks can be difficult to pinpoint, and some debugging methods can be intrusive and change the behavior of a system, making it even more difficult to track down bugs. I suspect everyone has their own preferences when it comes to debugging, so here I present some of the methods I use, together with potential pitfalls. They may or may not work for you, depending on many factors, not least the available hardware and software tools.
If you can use a debugger that has support built into the target microcontroller, it is certainly very useful to be able to place breakpoints in your code and examine variable values and so on. You might, however, miss real-time events while the processor is halted at a breakpoint, so you need to understand whether your breakpoint pauses only the current task or the entire processor.
Also very useful is the ability to write messages to a log of some type. All of the components of my system that I have ported have some form of logging that can be configured as part of the porting process. Generally, logging levels can be selected with the appropriate #define, with a way to redirect the output to some form of I/O, generally a serial port. The logging itself takes the form of printf() calls placed at strategic points in the code, often surrounded by a conditional compilation directive. Macros are often used to reduce the amount of typing required, as shown in the example from the littlefs source code (Listing 14), which is called with
LFS_DEBUG("Bad block at 0x%"PRIx32, dir->pair[1]);
Listing 14: The littlefs LFS_DEBUG Macro
#ifndef LFS_DEBUG
#ifndef LFS_NO_DEBUG
#define LFS_DEBUG_(fmt, ...) printf("%s:%d:debug: " fmt "%s\n", __FILE__, __LINE__, __VA_ARGS__)
#define LFS_DEBUG(...) LFS_DEBUG_(__VA_ARGS__, "")
#else
#define LFS_DEBUG(...)
#endif
#endif
As you can see, you can suppress logging at compile time. How the printf() output is redirected is system dependent: Usually you can override a low-level output routine. In the case of STM32 microcontrollers, you can override the following to send characters to a serial port, for example:
extern int __io_putchar(int ch) __attribute__((weak));
The standard library for an embedded system should have this marked with a weak attribute, so if you provide your own implementation, it will be linked in preference to the default.
Whatever approach you choose to implement here, it is important that it neither generates interrupts nor calls any FreeRTOS routines. Because you don’t generally know the context from which these logging functions are being called, doing so can cause deadlocks and other mysterious behavior, which can be very confusing at the very point you are trying to identify an already puzzling bug. The usual implementation is to send the supplied character to a serial port, use a polled output routine, and set the baud rate as high as you can to prevent printf() from causing excessive delays.
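A polled output routine along those lines might look like the following sketch. The register layout is entirely hypothetical (fake_uart_t and UART_TX_READY are invented names) and is simulated in RAM so that the code can be compiled and run on a desktop machine; on real hardware, the register block and its bit definitions come from the vendor's device header.

```c
#include <stdint.h>

/* Hypothetical UART register block, for illustration only. */
typedef struct
{
  volatile uint32_t status; /* bit 0 set when the transmitter can take a byte */
  volatile uint32_t data;   /* writing a byte here transmits it */
} fake_uart_t;

#define UART_TX_READY (1U << 0)

/* Simulated in RAM and always ready; on a target this would be a pointer
 * to a fixed peripheral address. */
static fake_uart_t fake_uart = { UART_TX_READY, 0U };
static fake_uart_t *const debug_uart = &fake_uart;

/* Replaces the weak default. Busy-waits until the transmitter is free and
 * then writes one character: no interrupts and no RTOS calls. */
int __io_putchar(int ch)
{
  while ((debug_uart->status & UART_TX_READY) == 0U)
  {
    /* spin until the previous character has been sent */
  }

  debug_uart->data = (uint32_t)(unsigned char)ch;
  return ch;
}
```

The busy-wait is the price of keeping the routine free of interrupts and kernel calls, which is why a high baud rate matters.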
Notwithstanding my assertion about the undesirability of calling RTOS routines in debugging facilities, in a system with more than one task, you might want to protect each line of output with a mutex to avoid messages from different threads getting mixed up. Just be aware that acquiring a mutex could cause a context switch. It’s also important to be able to compile such debug features out of production code conditionally.
To debug time-critical code, where adding printf() could disturb timing and adding breakpoints might cause your code to miss events, I would recommend the use of GPIO pins and an oscilloscope or logic analyzer. You can use this approach to verify that a certain part of your code is executing, as well as when it executes relative to other parts of the code. A line of code to toggle a GPIO line adds very little overhead and can be very illuminating. You can precisely time execution of segments of code and examine the juxtaposition of events in different threads.
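As a minimal sketch of the technique, the code below raises a pin on entry to a section of code and lowers it on exit, so the pulse width seen on the oscilloscope is the execution time. Everything hardware-related here is an assumption: fake_gpio_out is a RAM variable standing in for a real output register, and DEBUG_PIN is an arbitrarily chosen bit.

```c
#include <stdint.h>

/* Hypothetical output register, simulated in RAM for illustration. */
static volatile uint32_t fake_gpio_out;

#define DEBUG_PIN        (1U << 5) /* assumed spare pin wired to the probe */
#define DEBUG_PIN_HIGH() (fake_gpio_out |= DEBUG_PIN)
#define DEBUG_PIN_LOW()  (fake_gpio_out &= ~DEBUG_PIN)

/* Stand-in for the code under investigation: sums 0..n-1. */
static uint32_t work(uint32_t n)
{
  uint32_t sum = 0U;

  for (uint32_t i = 0U; i < n; i++)
  {
    sum += i;
  }
  return sum;
}

uint32_t timed_work(uint32_t n)
{
  DEBUG_PIN_HIGH();          /* rising edge marks entry */
  uint32_t result = work(n);
  DEBUG_PIN_LOW();           /* falling edge marks exit */
  return result;
}
```

Note that the read-modify-write used here is not atomic; if several tasks drive pins on the same port, a dedicated set/clear register, where the hardware offers one, is a safer choice.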
Wrap-Up
I’ve only scratched the surface of FreeRTOS in this article. Each of the facilities I’ve described has more depth than I can go into in an article such as this: You are encouraged to read the excellent documentation available online and experiment. A number of microcontroller development boards are available at modest prices, and the major manufacturers’ development tools and IDEs have built-in support for FreeRTOS.
My main experience is with the STMicroelectronics STM32 series, whose STM32CubeIDE will add a pre-ported, preconfigured version of FreeRTOS to your project at the click of a mouse. FreeRTOS also has additional modules such as TCP/IP stacks, command-line and HTTP interfaces, and MQTT handlers that can be added to the core. Because these modules are written with FreeRTOS in mind, they might be easier to integrate with your FreeRTOS project. I have built microcontroller systems with and without an RTOS, and I think there is room for both approaches, although as your application becomes more complex and has to handle more disparate types of work, the RTOS approach begins to look more attractive.
An RTOS comes at a price, however. The memory demands are higher both in terms of data RAM for stacks and heaps and the ROM footprint of the kernel itself. There is a learning curve associated with the use of an RTOS, and debugging can be more challenging. As a result, I would say it is important to decide at the earliest possible point whether you intend to use an RTOS in your project, because it will influence the choice of microcontroller.
You also might want to check that any libraries you intend to use are suitable for a multitasking environment. If not, you might need to protect each call to the library with a mutex, unless you are sure that the call does not change the state of the library (i.e., it has no side effects). Of course, if you confine your use of a library to a single task, this step might be unnecessary.
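The wrapper pattern can be sketched as follows. To keep the example runnable on a desktop machine, a POSIX mutex stands in for an RTOS mutex (in a CMSIS-RTOS2 project, the equivalent calls would be osMutexAcquire() and osMutexRelease()), and the C library's rand(), which is not guaranteed to be thread safe, plays the part of the library being protected. The name locked_rand() is hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* One lock guards every call into the non-thread-safe library. */
static pthread_mutex_t rand_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical wrapper: tasks call this instead of rand(), so that only
 * one of them is inside the library at any time. */
int locked_rand(void)
{
  pthread_mutex_lock(&rand_lock);
  int value = rand();
  pthread_mutex_unlock(&rand_lock);
  return value;
}
```

The important point is that every task goes through the wrapper; a single direct call to the library elsewhere defeats the protection.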
The full project (Figure 4) used in these examples can be found online at my GitHub site.

 
                     
            