\title{Essay: Applying Contemporary C++ in Enviroments Without Free-Store} \documentclass[11pt]{article} \usepackage{graphicx} %\usepackage{xcolor} \usepackage{fancyhdr} \usepackage{listings} \usepackage{subfig} \usepackage{biblatex} \addbibresource{references.bib} \renewcommand{\floatpagefraction}{.8}% %\renewcommand{\thesubfigure}{Figure \arabic{subfigure}} \captionsetup[subfigure]{labelformat=simple, labelsep=colon} \pagestyle{fancy} \author{Bent Bisballe Nyeng University of Aarhus} \begin{document} \maketitle \begin{abstract} %\section*{Abstract} In this essay I want to examine to which extend it is possible to use free-store allocating constructs from the standard template library (STL) and C++ core-language in enviroments without access to a free-store. \end{abstract} \section{Introduction} C++ contains a lot of helpful constructs that can be widely used, including in environments without a free-store, such as \texttt{concepts}, \texttt{module}s, \texttt{template}s in general, and functions from \texttt{algorithm} in particular but some parts of the language and the STL is off-limits when building applications in environments without free-store such as the perhaps obvious, but useful \texttt{std::vector} or \texttt{std::string}, but also the less obvious co-routines\cite{belson} or storing lambdas in \texttt{std::function}s\cite{elbeno}. This also inherently means that RAII cannot be used for managing memory allocations (such as smart-pointers), but can still be used for managing other types of resources, such as locks or hardware peripheral access. \subsection{Dynamic Memory Allocation}\label{dyn} There can be many reasons for not allocating on the free-store, either by convention; ``no allocations allowed after the engines has started'', or because the hardware or operating system doesn't have a virtual memory abstraction, ie. doesn't have a memory management unit (MMU)\cite{tannenbaum}, and therefore, over time, is at risk at fragmenting the memory available ultimately leading to memory depletion\cite{weis}. In the case of memory fragmentation one might argue that it is not the allocation that is the problem but rather the free'ing since this is when the fragmentation happens. The problem is shown in figure \ref{frag}, which might be possible to circumvent in singular concrete cases, but cannot be solved in general without the page indirections of the virtual memory\cite{weis}. \begin{figure} \makebox[\textwidth][c]{% \includegraphics[scale=0.8]{fragmentation.pdf}} \caption{\textit{(a) visualizes the full, free, memory of a system. Then, in (b), 4 equal-sized chunks of memory has been allocated filling up the whole memory. In (c) chunk 2 and 4 has been free'd and finally, in (d), a chunk which can fit in the total amount of free memory, is being allocated but fails because of memory fragmentation.}} \label{frag} \end{figure} This can to some degree be prevented by monotonic allocations where each allocation always has the same size and therefore can be re-used directly after being freed. This might work for some special usecases, where objects of similar size are being stored in a pool may not be applicable in software in general and certainly not for the dynamic allocations in the STL or the core-language. In particular, a lambda stored in a \texttt{std::function} might allocate memory on the free-store if the lambda captures exceeds the size of the (compiler dependent) small-buffer optimization (SBO) buffer inside the \texttt{std::functions}\cite{elbeno}. In much the same way as the \texttt{std::string} has its small string optimization (SSO) which are also compiler dependent in size. \subsection{Free-Standing} Work is being done to modify the ``free-standing C++'' towards, among other things, making it run on systems without free-store by isolating the parts of the STL that can be used entirely without allocating along with not supporting exceptions and run-time type information\cite{craig}. Working with the resulting small sub-set of the available components, however, is not well suited for making contemporary C++ applications; too many core components are missing so be able to really call it contemporary. The ideal solution would be to find ways to be able to use all (or at least more) features, but with a potential known set of restictions or limitations. \section{Method} In the following, 3 methods for managing memory allocations will be investigated, and their suitability for real-life applications be evaluated: \begin{itemize} \item Support from the compiler to fail compilation if an unintentional \texttt{new} or \texttt{delete} is being called, at least preventing accidental allocations. \item Using Custom Allocators for the STL components that supports it. \item Overloading \texttt{new}/\texttt{delete} to use stack allocated memory instead of the free-store for all allocation. \end{itemize} \noindent{}Each of the three will be evaluated with \texttt{std::string}, \texttt{std::vector}, \texttt{std::function} and co-routines.\\ The experiments are done on a linux PC using the gcc-11.2 compiler. \section{Experiments} This section describes the work done during the three experiments along with any numerical or analytical results found. \subsection{Detecting Allocations}\label{detect} The most common way of addressing the problem of allocation with no free-store is to simply only use stack allocations, or store all objects as static globals. But in certain areas of the C++ language dynamic allocation might occur without the developer knowing about it. %\texttt{std::string}s of sizes that doesn't fit in the SSO buffer is %one example, but even more devious is the capture clause of a %lambda, which might allocate extra memory, if more than $N$ members %are captured, where $N$ is compiler dependent. % Searches on The Internet\texttrademark{} has identified no ways of instructing the compiler that ``no allocations allowed, fail if one is made''. The reworked free-standing specification might make this possible in the future, similarly to how exceptions and run-time type information can be disabled with \texttt{-fno-exceptions} and \texttt{-fno-rtti}. There exists a \texttt{-ffreestanding} argument, but this follows the old (current) definition of free-standing and as such allows dynamic allocation so it cannot be used. As mentiond in section \ref{dyn}, some constructs have built-in optimizations that make them store their data in local members, instead of on the free-store through a pointer, until som limit is reached making it possible to use for example \texttt{std::string}s or \texttt{std::function}s as long as they don't require more than $N$ bytes, where $N$ is a compiler dependent. To detect if or when a free-store allocation is done a simple overload of \texttt{new} is made, which simply throws an exeption if called. This will lead to a run-time error and not a compile-time one, which would otherwise have been the ideal solution, but at least it can assists in finding the $N$s for a specific compiler. The code could simply look something like this: \footnotesize\begin{lstlisting}[language=C++] void* operator new(std::size_t) { throw std::bad_alloc(); } \end{lstlisting}\normalsize The complete code for this experiment can be found in the \texttt{noalloc.cc} file. The following table shows the impirically deduced sizes of $N$ wherever SSO or SBO is available along with comments about the general component behaviour. The sizes were simply increased until the compiled program started throwing the \texttt{std::bad\_alloc} exception.\\ \noindent\begin{tabular}{| l | c | l |} \hline Component & $N$ & Comments \\ \hline \texttt{std::string} & $16$ & Exception propagates to caller. \\ \texttt{std::vector} & N/A & No SBO, exception propagates to caller. \\ \texttt{std::function} & $16$ & Exception doesn't seem to propagate to caller\\ & & (possible gcc bug?). \\ co-routine & N/A & No SBO, exception propagates to caller. \\ \hline \end{tabular} \subsection{Custom Allocator} Writing a custom allocator is only a solution to a sub-set of the allocations in an application, for example if all allocations are guaranteed to always be of the same size, in which can no fragmentation will occur. But for most applications (or at least most parts on an application) this is not the case, and therefore other means need to be taken into use. Write allocator that uses stack-buffer and fails when depleted and try it out (monotonic allocations) \noindent{}The following code was heavily inspired the answer to this article \cite{mapo}, heavily modified to take a stack size as a template argument: \footnotesize\begin{lstlisting}[language=C++] template struct StackAllocator { ... pointer allocate(size_type n) { if(n > S) throw std::bad_alloc(); return buf; } void deallocate(void*, size_type) {} private: T buf[S]; }; \end{lstlisting}\normalsize This allocator can now be used in the following ways: \footnotesize\begin{lstlisting}[language=C++] std::basic_string, StackAllocator> str(31, 'a'); std::vector> vec{42}; // Pre-c++17 syntax (not verified) //std::function, int()> f; \end{lstlisting}\normalsize These experiments can be found in the \texttt{custom.cc} file. \texttt{std::string} works but need to use the \texttt{std::basic\_string} for the instantiation, which is a bit clumsy. This can be made a bit easier to read, using a templated type indirection only exposing the buffer size as the template argument: \footnotesize\begin{lstlisting}[language=C++] template using String = std::basic_string, StackAllocator>; \end{lstlisting}\normalsize \texttt{std::vector} on the other hand works works beautifully. The syntax is clean and easy to understand. In C++17 the custom allocator support in \texttt{std::function}s were removed - so this is no longer supported\cite{P0302R0}. Co-routines doesn't support custom allocator directly but they can be made to use a similar construct by adding a \texttt{operator new} implementation inside the \texttt{promise\_type}\cite{coroutines}. This was done by modifying the \texttt{Generator} class, adding the operator along with a template argument specifying the size. This new implementation resides in \texttt{generator\_stack.h} and the main change is these few lines: \footnotesize\begin{lstlisting}[language=C++] template struct Generator { ... struct promise_type // required { ... void* operator new(std::size_t n) { static char buf[S]; if(n < S) throw std::bad_alloc(); return buf; } }; }; \end{lstlisting}\normalsize The allocator, and therefore the buffer it carries, is stored within the object, so the lifetime of the buffer is guaranteed to live as long as the object using it. The pitfall, though, is that the developer need to guarantee that no re-allocations are ever done with any of the objects. Otherwise the same memory would be re-used leading to undefined behaviour. A more clever custom allocator could be written that can keep track of how much of the buffer has actually been used and make new allocation in the style of the monotonic allocator, but then the same problems as with the original free-store fragmentation will apply to the stack buffer itself. \subsection{Use \texttt{new} with Stack Buffer} Expanding on the overloaded \texttt{new} and \texttt{delete} from section \ref{detect} a mechanism for using a stack-buffer for exactly one call to \texttt{new} is made, looking something like this: \footnotesize\begin{lstlisting}[language=C++] namespace memory { void* ptr{}; std::size_t n{}; } void* operator new(std::size_t n)// throw(std::bad_alloc) { if(memory::ptr == nullptr) throw std::bad_alloc(); if(n > memory::n) throw std::bad_alloc(); auto ptr = memory::ptr; memory::ptr = nullptr; // use only once memory::n = 0; return ptr; } \end{lstlisting}\normalsize The \texttt{ptr} pointer and accompanying size \texttt{n} is put in a namespace to avoid name clashes but is otherwise left ``naked''. The pointer is used once and reset to \texttt{nullptr} to avoid being used multiple times. The stack object itself can then be made with the following type: \footnotesize\begin{lstlisting}[language=C++] template class StackNew { public: StackNew() { memory::ptr = buf; memory::n = S * sizeof(T); } private: T buf[S]; }; \end{lstlisting}\normalsize The code for these experiments can be found in the \texttt{stack\_new.cc} file. It both owns the memory and registers its pointer and size in the constructor which will be used in the firstcoming call to \texttt{new}. This means that objects can be made with the following syntax: \footnotesize\begin{lstlisting}[language=C++] StackNew<10, int> buffer; // stack buffer for 10 integers std::vector vec{1,2,3,4,5,6,7,8,9,10}; \end{lstlisting}\normalsize The upside that it works for all allocating object types, including \texttt{std::string}, \texttt{std::vector}, \texttt{std::function} and co-routines because the \texttt{new} operator is overridden for the entire application. The dangers of the mechanism is that it is up to the developer to ensure that the stack object outlives the objects that use it. The compiler cannot help with this so it must be done by convention. \section{Summing Up} Can be done, but is by no means elegant and can in some solution be downright dangerous. But if this is the price to pay to be able to use co-routines in microcontrollers perhaps it is worth the price? \printbibliography \end{document}