a6/au_BentBisballeNyeng_A6.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366

\title{Essay: Applying Contemporary C++ in Enviroments Without
Free-Store}
\documentclass[11pt]{article}
\usepackage{graphicx}
%\usepackage{xcolor}
\usepackage{fancyhdr}
\usepackage{listings}
\usepackage{subfig}
\usepackage{biblatex}
\addbibresource{references.bib}
\renewcommand{\floatpagefraction}{.8}%
%\renewcommand{\thesubfigure}{Figure \arabic{subfigure}}
\captionsetup[subfigure]{labelformat=simple, labelsep=colon}
\pagestyle{fancy}
\author{Bent Bisballe Nyeng <deva@aasimon.org> University of Aarhus}
\begin{document}
\maketitle

\begin{abstract}
%\section*{Abstract}

In this essay I want to examine to which extend it is possible to use
free-store allocating constructs from the standard template library
(STL) and C++ core-language in enviroments without access to a
free-store.
\end{abstract}

\section{Introduction}
C++ contains a lot of helpful constructs that can be widely used,
including in environments without a free-store, such
as \texttt{concepts}, \texttt{module}s, \texttt{template}s in general,
and functions from \texttt{algorithm} in particular but some parts of
the language and the STL is off-limits when building applications in
environments without free-store such as the perhaps obvious, but
useful \texttt{std::vector} or \texttt{std::string}, but also the less
obvious co-routines\cite{belson} or storing lambdas
in \texttt{std::function}s\cite{elbeno}.
This also inherently means that RAII cannot be used for managing memory
allocations (such as smart-pointers), but can still be used for
managing other types of resources, such as locks or hardware
peripheral access.

\subsection{Dynamic Memory Allocation}

There can be many reasons for not allocating on the free-store, either
by convention; ``no allocations allowed after the engines has
started'', or because the hardware or operating system doesn't
have a virtual memory abstraction, ie. doesn't have a memory
management unit (MMU)\cite{tannenbaum}, and therefore, over time, is
at risk at fragmenting the memory available ultimately leading to
memory depletion\cite{weis}.

In the case of memory fragmentation one might argue that it is not the
allocation that is the problem but rather the free'ing since this is
when the fragmentation happens. This problem is shown in
figure \ref{frag}, which might be possible to circumvent in singular
concrete cases, but cannot be solved in general without the page
indirections of the virtual memory\cite{weis}.

\begin{figure}
\makebox[\textwidth][c]{%
\includegraphics[scale=0.8]{fragmentation.pdf}}
\caption{\textit{(a) visualizes the full, free, memory of a system.
Then, in (b), 4 equal-sized chunks of memory has been allocated
filling up the whole memory. In (c) chunk 2 and 4 has been free'd and
finally, in (d), a chunk which can fit in the total amount of
free memory, is being allocated but fails because of memory
fragmentation.}}
\label{frag}
\end{figure}

This can to some degree be prevented by monotonic allocations which
might work for  not very practical in real-world
software and certainly not for the dynamic allocations in the STL or
the core-language.

In particular, a lambda stored in a \texttt{std::function} might
allocate memory on the free-store if the lambda exceeds the size of
the (compiler dependent) small-buffer optimization (SBO) buffer inside
the \texttt{std::functions}\cite{elbeno}. In much the same way as
the \texttt{std::string} has its small string optimization (also
compiler dependent).

\subsection{Free-Standing}

Work is being done to modify the ``free-standing C++'' towards, among
other things, making it run on systems without free-store by isolating
the parts of the STL that can be used entirely without allocating
along with not supporting exceptions and run-time type
information\cite{craig}.

Working with the resulting small sub-set of the available components,
however, is not well suited for making contemporary C++ applications;
too many core components are missing so be able to really call it
contemporary.
The ideal solution would be to find ways to be able to use all (or at
least more) features, but with a potential known set of restictions or
limitations.

\section{Method}
In the following, 3 methods for managing memory allocations will be
investigated, and their suitability for real-life applications be
evaluated:
\begin{itemize}
\item Support from the compiler to fail compilation if an
      unintentional \texttt{new} or \texttt{delete} is being called,
      at least preventing accidental allocations.
\item Using Custom Allocators for the STL components that supports it.
\item Overloading \texttt{new}/\texttt{delete} to use stack allocated
      memory instead of the free-store for all allocation.
\end{itemize}

\noindent{}Each of the three will be evaluated with
\texttt{std::string}, \texttt{std::vector}, \texttt{std::function} and
co-routines.\\
The experiments are done on a linux PC using the gcc-11.2 compiler.

\section{Experiments}

This section describes the work done during the three experiments
along with any numerical or analytical results found.

\subsection{Detecting Allocations}\label{detect}

The most common way of addressing the problem of allocation with no
free-store is to simply only use stack allocations, or store all
objects as static globals.
But in certain areas of the C++ language dynamic allocation might
occur without the developer knowing about it.

%\texttt{std::string}s of sizes that doesn't fit in the SSO buffer is
%one example, but even more devious is the capture clause of a
%lambda, which might allocate extra memory, if more than $N$ members
%are captured, where $N$ is compiler dependent.
%
Searches on The Internet\texttrademark{} has identified no ways of instructing
the compiler that ``no allocations allowed, fail if one is made''.
The reworked free-standing specification might make this possible in
the future, similarly to how exceptions and run-time type information
can be disabled with \texttt{-fno-exceptions} and \texttt{-fno-rtti}.
There exists a \texttt{-ffreestanding} argument, but this follows the
old (current) definition of free-standing and as such allows
dynamic allocation so it cannot be used.

Some constructs have built-in optimizations that make them store their
data in local members, instead of on the free-store through a pointer,
until som limit is reached making it possible to use for
example \texttt{std::string}s or \texttt{std::function}s as long as
they don't require more than $N$ bytes, where $N$ is a compiler
dependent.

To detect if a free-store allocation is done a simple overload
of \texttt{new} is made, which simply throws an exeption if called.
This will lead to a run-time error and not a compile-time one as would
have been the ideal solution, but at least it can assists in finding
the $N$ for a specific compiler.

The code could simply look something like this:

\footnotesize\begin{lstlisting}[language=C++]
void* operator new(std::size_t)
{
  throw std::bad_alloc();
}
\end{lstlisting}\normalsize

The code for this experiment can be found in the \texttt{noalloc.cc}
file.
The following table shows the impirically deduced sizes of $N$ wherever
SSO or SBO is available along with comments about the general component
behaviour.\\

\noindent\begin{tabular}{| l | c | l |}
\hline
Component & $N$ & Comments \\
\hline
\texttt{std::string} & $16$ & Exception propagates to caller. \\
\texttt{std::vector} & N/A  & No SBO, exception propagates to caller. \\
\texttt{std::function} & $16$ & Exception doesn't seem to propagate to caller\\
                       &      & (possible gcc bug?). \\
co-routine & N/A & No SBO, exception propagates to caller. \\
\hline
\end{tabular}

\subsection{Custom Allocator}

Writing a custom allocator is only a solution to a sub-set of the
allocations in an application, for example if all allocations are
guaranteed to always be of the same size, in which can no
fragmentation will occur.

But for most applications (or at least most parts on an application)
this is not the case, and therefore other means need to be taken into
use.

Write allocator that uses stack-buffer and fails when depleted and try
it out (monotonic allocations)

\noindent{}The following code was heavily inspired the answer to this
article \cite{mapo}, heavily modified to take a stack size as a template
argument:

\footnotesize\begin{lstlisting}[language=C++]
template <typename T, std::size_t S>
struct StackAllocator
{
  ...
  pointer allocate(size_type n) {
    if(n > S) throw std::bad_alloc();
    return buf;
  }

  void deallocate(void*, size_type) {}

private:
  T buf[S];
};
\end{lstlisting}\normalsize

This allocator can now be used in the following ways:

\footnotesize\begin{lstlisting}[language=C++]
std::basic_string<char, std::char_traits<char>,
                  StackAllocator<char, 32>> str(31, 'a');

std::vector<int, StackAllocator<int, 10>> vec{42};

// Pre-c++17 syntax (not verified)
//std::function<std::allocator_arg, StackAllocator<int, 10>, int()> f;
\end{lstlisting}\normalsize

These experiments can be found in the \texttt{custom.cc} file.
\texttt{std::string} works but need to use
the \texttt{std::basic\_string} for the instantiation, which is a bit
clumsy. This can be made a bit easier to read, using a templated type
indirection only exposing the buffer size as the template argument:

\footnotesize\begin{lstlisting}[language=C++]
template<std::size_t S>
using String = std::basic_string<char, std::char_traits<char>,
                                 StackAllocator<char, S>>;
\end{lstlisting}\normalsize

\texttt{std::vector} on the other hand works works beautifully. The
syntax is clean and easy to understand.

In C++17 the custom allocator support in \texttt{std::function}s were
removed - so this is no longer supported\cite{P0302R0}.

Co-routines doesn't support custom allocator directly but they can be
made to use a similar construct by adding a \texttt{operator new}
implementation inside the \texttt{promise\_type}\cite{coroutines}.
This was done by modifying the \texttt{Generator} class, adding the
operator along with a template argument specifying the size.
This new implementation resides in \texttt{generator\_stack.h} and the
main change is these few lines:

\footnotesize\begin{lstlisting}[language=C++]
template <typename T, std::size_t S>
struct Generator
{
  ...
  struct promise_type // required
  {
    ...
    void* operator new(std::size_t n)
    {
      static char buf[S];
      if(n < S) throw std::bad_alloc();
      return buf;
    }
  };
};
\end{lstlisting}\normalsize

The allocator, and therefore the buffer it carries, is stored within
the object, so the lifetime of the buffer is guaranteed to live as
long as the object using it.
The pitfall, though, is that the developer need to guarantee that no
re-allocations are ever done with any of the objects. Otherwise the
same memory would be re-used leading to undefined behaviour.
A more clever custom allocator could be written that can keep track of how
much of the buffer has actually been used and make new allocation in
the style of the monotonic allocator, but then the same problems
as with the original free-store fragmentation will apply to the stack
buffer itself.

\subsection{Use \texttt{new} with Stack Buffer}

Expanding on the overloaded \texttt{new} and \texttt{delete} from
section \ref{detect} a mechanism for using a stack-buffer for exactly
one call to \texttt{new} is made, looking something like this:

\footnotesize\begin{lstlisting}[language=C++]
namespace memory
{
  void* ptr{};
  std::size_t n{};
}

void* operator new(std::size_t n)// throw(std::bad_alloc)
{
  if(memory::ptr == nullptr) throw std::bad_alloc();
  if(n > memory::n) throw std::bad_alloc();

  auto ptr = memory::ptr;
  memory::ptr = nullptr; // use only once
  memory::n = 0;

  return ptr;
}
\end{lstlisting}\normalsize

The \texttt{ptr} pointer and accompanying size \texttt{n} is put in a
namespace to avoid name clashes but is otherwise left ``naked''. The
pointer is used once and reset to \texttt{nullptr} to avoid being used
multiple times.

The stack object itself can then be made with the following type:

\footnotesize\begin{lstlisting}[language=C++]
template<std::size_t S, typename T = char>
class StackNew
{
public:
  StackNew()
  {
    memory::ptr = buf;
    memory::n = S * sizeof(T);
  }

private:
  T buf[S];
};
\end{lstlisting}\normalsize

The code for these experiments can be found in the \texttt{stack\_new.cc} file.

It both owns the memory and registers its pointer and size in the
constructor which will be used in the firstcoming call
to \texttt{new}.
This means that objects can be made with the following syntax:

\footnotesize\begin{lstlisting}[language=C++]
StackNew<10, int> buffer; // stack buffer for 10 integers
std::vector<int> vec{1,2,3,4,5,6,7,8,9,10};
\end{lstlisting}\normalsize

The upside that it works for all allocating object types, including
\texttt{std::string}, \texttt{std::vector}, \texttt{std::function} and
co-routines because the \texttt{new} operator is overridden for the
entire application.
The dangers of the mechanism is that it is up to the developer to
ensure that the stack object outlives the objects that use it.
The compiler cannot help with this so it must be done by convention.

\section{Summing Up}

Can be done, but is by no means elegant and can in some solution be
downright dangerous.
But if this is the price to pay to be able to use co-routines in
microcontrollers perhaps it is worth the price?

\printbibliography

\end{document}