From b0b2eb6ff181ae84d7a83807f47c3cec25451969 Mon Sep 17 00:00:00 2001
From: Brett Weiland
Date: Fri, 12 Apr 2024 15:41:55 -0500
Subject: need to re-read and spellcheck

---
 report/report.tex | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 82 insertions(+), 14 deletions(-)

(limited to 'report/report.tex')

diff --git a/report/report.tex b/report/report.tex
index afa7680..23fc494 100644
--- a/report/report.tex
+++ b/report/report.tex
@@ -8,6 +8,12 @@
 \usepackage{float}
 \usepackage{listings}
 \usepackage{xcolor}
+\usepackage{caption}
+\usepackage{subcaption}
+\usepackage{layouts}
+\usepackage{pdflscape}
+\usepackage{rotating}
+\usepackage{longtable}
 \usepackage{pgf}
@@ -53,7 +59,19 @@ Bit 1 & Bit 0 & Action \\
 \bottomrule
 \end{tabular}
 \end{table}
-After all iterations are complete, the result is arithmaticlly shifted once to the left, and the process repeats for the number of bits in an operand.\\
+At the end of each iteration, the result is arithmetically shifted once to the right, and the process repeats for the number of bits in an operand; a final right shift removes the extra bit. Pseudocode for this algorithm is given below:\\
+\begin{verbatim}
+Booth:
+  result = multiplier << 1
+  loop (operand length) times:
+    if last two bits are 01:
+      result(upper half) += multiplicand
+    if last two bits are 10:
+      result(upper half) += twos_comp(multiplicand)
+    remove extra bits from result
+    arithmetic shift result right
+  result >> 1
+\end{verbatim}
 Modified Booth's algorithm functions similarly to Booth's algorithm, but checks the last \textit{three} bits instead. As such, there is a larger selection of actions for each iteration:
 \begin{table}[H]
@@ -74,17 +92,51 @@ Bit 2 & Bit 1 & Bit 0 & Action \\
 \end{tabular}
 \end{table}
 Because some operations require doubling the multiplicand, an extra bit is added to the most significant side of the workspace to avoid overflow. After each iteration, the result is arithmetically shifted right twice. The number of iterations is only half the length of the operands. After all iterations, the workspace is shifted right once, and the second most significant bit is set to the first most significant bit, as the result register does not include the extra bit.
+Pseudocode for this algorithm is listed below:
+\begin{verbatim}
+Modified Booth:
+  multiplicand(MSB) = multiplicand(second MSB)
+  result = multiplier << 1
+  loop (operand length / 2) times:
+    if last three bits are 001 or 010:
+      result(upper half) += multiplicand
+    if last three bits are 011:
+      result(upper half) += multiplicand * 2
+    if last three bits are 100:
+      result(upper half) += twos_comp(multiplicand) * 2
+    if last three bits are 101 or 110:
+      result(upper half) += twos_comp(multiplicand)
+    remove extra bits from result
+    arithmetic shift result right twice
+  result >> 1
+  result(second MSB) = result(MSB)
+  result(MSB) = 0
+\end{verbatim}
+
 \par
 \section*{Simulation Implementation}
 Both algorithms were simulated in Python in an attempt to utilize its high-level nature for rapid development. The decision table for Booth's algorithm was implemented with a simple if-then chain, while a switch case was used for modified Booth's algorithm. Simple integers were used to represent registers.
 \par
 One objective of this paper is to analyze and compare the performance of these two algorithms for various operand lengths. As such, the length of the operands had to be constantly accounted for. Arithmetic bitwise operations, including finding the two's complement, were all implemented using functions that took the operand length as an input, as in the sketch below.
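+As an illustration, a minimal Python sketch of such length-parameterized
+helpers, together with the Booth loop that uses them, is shown below. This
+sketch is for exposition only: the identifiers (\texttt{mask},
+\texttt{twos\_comp}, \texttt{asr}, \texttt{booth}) are illustrative and are
+not necessarily the simulator's actual names.
+\begin{verbatim}
+# Illustrative sketch: fixed-width helpers that take the operand
+# length n, so results stay confined to an n-bit register.
+
+def mask(n):
+    # All-ones mask for an n-bit register.
+    return (1 << n) - 1
+
+def twos_comp(x, n):
+    # Two's complement (negation) of x within an n-bit register.
+    return (~x + 1) & mask(n)
+
+def asr(x, n):
+    # Arithmetic shift right by one within an n-bit register:
+    # the sign (most significant) bit is replicated.
+    return (x >> 1) | (x & (1 << (n - 1)))
+
+def booth(multiplicand, multiplier, n):
+    # Radix-2 Booth multiply of two n-bit two's-complement values.
+    # The workspace is 2n+1 bits: upper half, multiplier, extra bit.
+    result = multiplier << 1
+    for _ in range(n):
+        pair = result & 0b11
+        if pair == 0b01:    # 01: add multiplicand to the upper half
+            result += multiplicand << (n + 1)
+        elif pair == 0b10:  # 10: add its two's complement instead
+            result += twos_comp(multiplicand, n) << (n + 1)
+        result &= mask(2 * n + 1)        # remove extra (overflow) bits
+        result = asr(result, 2 * n + 1)  # arithmetic shift right once
+    return (result >> 1) & mask(2 * n)   # drop the extra bit
+
+print(booth(0b0011, 0b0101, 4))  # 3 * 5 = 15
+\end{verbatim}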
 Furthermore, extra bits were cleared after each iteration.
 \par
-To track down issues and test the validity of the multipliers, a debug function was written. To allow Python to natively work with the operands, each value is calculated from its two's compliment format. The converted numbers are then multiplied, and the result is compared to both Booth's Algorithim and Modified Booth's Algorithim. To ensure that the debugging function itself doesn't malfunction, all converted operands and expected results are put into a single large table for checking. The exported version of this table can be seen in table X. % TODO
+To track down issues and test the validity of the multipliers, a debug function was written. To allow Python to work natively with the operands, each value is decoded from its two's-complement format. The decoded numbers are then multiplied, and the product is compared against the output of both Booth's algorithm and modified Booth's algorithm. To ensure that the debugging function itself doesn't malfunction, all decoded operands and expected results are put into a single large table for checking. The exported version of this table can be seen on the last page, in table \ref{debug_table}.
 \section*{Analysis}
-Modified Booth's algorithim only requires half the iterations as Booth's algorithim. As such, it can be expected that the benifit of modified Booth's algorithim increases two fold with bit length. This can be shown by comparing the two curves in figure X.
+Modified Booth's algorithm requires only half as many iterations as Booth's algorithm. As such, the advantage of modified Booth's algorithm can be expected to grow with bit length, approaching a factor of two. This is shown by comparing the two curves in figure \ref{igraph}.
+\begin{figure}[H]
+  \centering
+  \input{iterations.pgf}\\
+  \captionof{figure}{Iterations required for various operand lengths}
+  \label{igraph}
+\end{figure}
+
 \par
-Despite this, the nature of both algorithims dictate that modified booth's algorithim is not explicitly faster. Iteration count translates to the \textit{maxiumum} number of additions and subtractions. Figure X shows the performance of the two algorithims given different input lengths, while table x shows the actual data made to generate the plot. There are some interesting things to note. When operands contain repeating zeros or ones, both operations preform similarly, as only shifting is required. Operands containing entirely ones or zeros result in idential preformance. On the contrary, alternating bits within operands demonstrate where the two algorithims differ, as almost no bits can be skipped over. Operands made entirely of alternating bits result in the maximum performance diffrence, in which modified booth's algorithim is potentially two times faster.
+Despite this, the nature of both algorithms dictates that modified Booth's algorithm is not strictly faster. Iteration count corresponds only to the \textit{maximum} number of additions and subtractions. Figure \ref{pgraph} shows the performance of the two algorithms for different input lengths, while table \ref{speed_table} shows the data used to generate the plot. There are some interesting things to note. When operands contain runs of repeated zeros or ones, both algorithms perform similarly, as only shifting is required; operands consisting entirely of ones or zeros result in identical performance. On the contrary, alternating bits within operands demonstrate where the two algorithms differ, as almost no bits can be skipped over. Operands made entirely of alternating bits produce the maximum performance difference, in which modified Booth's algorithm is potentially twice as fast; the recoding sketch below illustrates why.
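+As an illustration of these cases, below is a minimal Python sketch (for
+exposition only; it is not the simulator's code) that counts the add and
+subtract operations each algorithm needs for a given multiplier, by
+recoding overlapping two-bit windows (Booth) versus three-bit windows
+(modified Booth):
+\begin{verbatim}
+# Illustrative sketch: count add/sub operations implied by Booth
+# (radix-2) and modified Booth (radix-4) recoding of an n-bit
+# multiplier. A recoded digit of 0 needs only a shift; any other
+# digit costs one addition or subtraction.
+
+def booth_ops(multiplier, n):
+    bits = (multiplier << 1) & ((1 << (n + 1)) - 1)  # implicit 0 appended
+    ops = 0
+    for _ in range(n):  # examine overlapping bit pairs
+        if (bits & 0b11) in (0b01, 0b10):
+            ops += 1
+        bits >>= 1
+    return ops
+
+def modified_booth_ops(multiplier, n):
+    bits = (multiplier << 1) & ((1 << (n + 1)) - 1)
+    ops = 0
+    for _ in range(n // 2):  # examine overlapping bit triples
+        if (bits & 0b111) not in (0b000, 0b111):
+            ops += 1
+        bits >>= 2
+    return ops
+
+# Alternating bits are the worst case: radix-2 needs n operations,
+# radix-4 only n/2; all-zero or all-one operands need at most one.
+print(booth_ops(0b01010101, 8), modified_booth_ops(0b01010101, 8))  # 8 4
+\end{verbatim}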
+\begin{figure}[H]
+  \centering
+  \input{performance.pgf}\\
+  \captionof{figure}{Add and subtract operations for various operand lengths}
+  \label{pgraph}
+\end{figure}
 \par
 All of this needs to be considered when designing an ALU. Modified Booth's algorithm may improve speed, but requires substantially more hardware to implement. One must consider whether die space should be allocated to optimizing multiplication. In many applications, fast multiplication is unnecessary; many early single-chip processors and microcontrollers didn't implement multiplication at all, as they were intended for simple embedded applications.
 \section*{Conclusion}
@@ -96,6 +148,21 @@ Hardware multipliers can help accellerate applications in which multiplication i
 \newpage
 \section*{Appendix}
+% efficiency gets comparatively better with operand length
+% not much difference for the smaller operands
+% long runs of repeated 1s and 0s are very good for both algorithms
+\begin{figure}[h]
+  \centering
+  \input{speed_table.tex}
+  \captionof{table}{Number of additions and subtractions for various inputs}
+  \label{speed_table}
+\end{figure}
+\begin{figure}[H]
+  \input{result_table.tex}
+  \captionof{table}{Results of multiplication according to the simulated multipliers}
+  \label{result_table}
+\end{figure}
+
 \lstdefinestyle{mystyle}{
   backgroundcolor=\color{backcolour},
   commentstyle=\color{codegreen},
   keywordstyle=\color{magenta},
   numberstyle=\tiny\color{codegray},
   stringstyle=\color{codepurple},
   basicstyle=\ttfamily\footnotesize,
   breakatwhitespace=false,
   breaklines=true,
   captionpos=b,
   keepspaces=true,
   numbers=left,
   numbersep=5pt,
   showspaces=false,
   showstringspaces=false,
   showtabs=false,
   tabsize=2
 }
+
 \lstset{style=mystyle}
+\newpage
+\subsection*{Code listing}
 \lstinputlisting[language=Python]{../booth_multiplier.py}
-% efficiency gets comparitively better over length
-% not much for the smaller operands
-% lots of repeated 1s and 0s very good for both algorithims
-\begin{center}
-  \input{performance.pgf}\\
-  \input{iterations.pgf}\\
-  \input{speed_table.tex}\\
-  \input{result_table.tex}\\
-\end{center}
-
+\begin{sidewaysfigure}
+  \captionof{table}{Simulator self-checking results}
+  \label{debug_table}
+  \input{debug_table.tex}
+\end{sidewaysfigure}
+
+
+
 \end{document}
--
cgit v1.2.3