\title{A3: Concurrency} \input{preamble.tex}

In this assignment, experiments with multithreaded searches in containers are performed, measured and evaluated. A few attempts at using concepts for some generalized utility functions were made, and any comments on these (their success or lack thereof) are highly welcome. The utility functions are the following:
\begin{itemize}
\item The \texttt{measure} function and \texttt{Measure} class from the previous assignment, which perform time measurement and print the result.
\item \texttt{get\_random\_vector}, a templated function that generates and returns a \texttt{std::vector} of a given size with random values in a specified range.
\item \texttt{get\_random\_strings}, a function that generates a \texttt{std::vector} of \texttt{std::string}s of a specified length (all the same length) with random characters picked from a specified range.
\end{itemize}
The two generator functions are likely far more generalized than necessary for this exercise, but the idea is to be able to re-use them in future exercises, to keep the code in the \texttt{main} function readable, and to play around a bit with concepts.

\bigskip

The first task is to implement a \texttt{find\_all} function that returns a \texttt{std::vector} of the keys found in the supplied input container. As the wording later in the assignment is ``How large must the \textbf{array} be,\ldots'', I decided to make the function work on any container type by using the range concept instead of hard-coding it to \texttt{std::vector}. Any comments on how this has been done are more than welcome. At the top of the \texttt{main} function there are two (commented-out) examples of other container types I tried using with the functions: \texttt{std::array} and a C-style array.

\bigskip

The next task was to write a 10-way parallel \texttt{find\_all} with the same functionality as the first one, but executing 10 threads, each working on a tenth of the input container.
I generalized this to \textit{n} threads in my first implementation (I hadn't seen that this was a requirement further down in the assignment). The threads are started through \texttt{std::async(std::launch::async, ...)}, and each thread initially blocks on a \texttt{std::condition\_variable} so that the execution time can be measured only after all the threads have been spawned. Each thread then simply executes the initial non-threaded implementation of \texttt{find\_all} on its respective sub-range and returns its result in its own \texttt{std::vector} through a \texttt{std::future}. Finally, all sub-range results are accumulated in one \texttt{std::vector} and returned to the \texttt{main} function.

\bigskip

Vectors of 10 million elements, of both \texttt{int}s and \texttt{std::string}s, are used for the tests. Each of the two is tested with the single-threaded \texttt{find\_all} and with the multithreaded \texttt{find\_all} using 1 through 16 threads. Each run is timed and printed to the console; since the output is rather large, it is attached in the \texttt{console.log} file alongside this report.

\bigskip

The machine used for the tests is an Intel i7 with 4 cores, each hyper-threaded, giving 8 ``virtual'' cores, each sharing parts of the cache with one of the others.

\bigskip

Looking at the results for \texttt{int}: the non-threaded search took about 100\,ms and (not surprisingly) so did the 1-threaded search. From there, performance improved with each added thread before flattening out.
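The gated \textit{n}-thread scheme described above can be sketched as follows. This is a simplified illustration fixed to \texttt{int}, with names assumed here; the condition-variable gate is what lets the measurement start only after all tasks have been launched:

```cpp
#include <condition_variable>
#include <cstddef>
#include <future>
#include <mutex>
#include <vector>

// Sketch (assumed names): split `data` into n_threads sub-ranges, search
// each in its own std::async task, and merge the per-thread results.
// Every task first blocks on a condition_variable until `go` is set,
// so a timer started just before notify_all() excludes thread spawning.
std::vector<int> parallel_find_all(const std::vector<int>& data, int key,
                                   unsigned n_threads)
{
    if (n_threads == 0) n_threads = 1; // guard against division by zero

    std::mutex m;
    std::condition_variable cv;
    bool go = false;

    std::vector<std::future<std::vector<int>>> futures;
    const std::size_t chunk = data.size() / n_threads;

    for (unsigned i = 0; i < n_threads; ++i) {
        const std::size_t begin = i * chunk;
        const std::size_t end =
            (i + 1 == n_threads) ? data.size() : begin + chunk;
        futures.push_back(std::async(std::launch::async, [&, begin, end] {
            { // block until the start signal is given
                std::unique_lock lock(m);
                cv.wait(lock, [&] { return go; });
            }
            std::vector<int> found; // per-thread sub-result
            for (std::size_t j = begin; j < end; ++j)
                if (data[j] == key)
                    found.push_back(data[j]);
            return found;
        }));
    }

    { // release all threads at once; the measurement would start here
        std::lock_guard lock(m);
        go = true;
    }
    cv.notify_all();

    std::vector<int> result; // accumulate all sub-range results
    for (auto& f : futures)
        for (int v : f.get())
            result.push_back(v);
    return result;
}
```

In the assignment code each task instead calls the generic \texttt{find\_all} on its sub-range; the inline loop above stands in for that call to keep the sketch self-contained.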
\begin{figure}
\caption{\label{ints} Execution times vs.\ number of threads when searching for \texttt{int}s.}
\includegraphics[scale=0.55]{a3/int.pdf}
\end{figure}
\begin{figure}
\caption{\label{strings} Execution times vs.\ number of threads when searching for \texttt{std::string}s.}
\includegraphics[scale=0.55]{a3/string.pdf}
\end{figure}
As seen in figure \ref{ints}, the performance increase stagnates already at around 4 threads, suggesting that the memory pipeline/cache is a bottleneck that makes the threads spend much of their time waiting for data rather than executing instructions.

\bigskip

Similar results are seen for the \texttt{std::string} searches, although there seems to be more computation going on, so the memory/cache bottleneck is less dominant. As can be seen in figure \ref{strings}, the graph flattens out a bit at around 4 threads but keeps improving until all 8 hyperthreads are active.

\end{document}