archive/Compression/Algorithm Details.tex

126 lines
7.8 KiB
TeX

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[margin=1in]{geometry}
\usepackage{amsfonts, amsmath, amssymb}
\usepackage[none]{hyphenat}
\usepackage{fancyhdr}
\usepackage{titling}
\usepackage{changepage}
\usepackage{graphicx}
\usepackage{wrapfig}
\renewcommand\maketitlehooka{\null\mbox{}\vfill}
\renewcommand\maketitlehookd{\vfill\null}
\pagestyle{fancy}
\fancyhead[L]{\slshape Algorithm Details}
\fancyhead[R]{Kenneth Jao}
\fancyfoot[C]{\thepage}
\title{Lossy Compression through Multi-dimensional Fourier Transforms }
\author{Kenneth Jao}
\date{February 2019}
\begin{document}
\maketitle
\pagebreak
\section{Introduction}
This document covers the details of the mathematics of the compression algorithm. Each main mathematical application will be explained in detail. Firstly an overview of the algorithm as a whole will be given.
\subsection{Notation}
\begin{adjustwidth}{.5in}{}
Data Matrix: \[ \left[ {\begin{array}{cccc} A & B & C & \dots \end{array}} \right] or \left[\ A,\ B,\ C,\ \dots \ \right]\]
Will mean that the dimensions 0, 1, 2, \dots will correspond to A, B, C, \dots and represents an array of data, not a mathematical matrix.\\
\newline
N-Dimensional Polynomial:\\
\begin{adjustwidth}{.5in}{}
$\vec{e_i}=(e_1,\ e_2,\ e_3, \dots e_n) \in \mathbb{N}^n$ will represent the i-th permutation of $\vec{e}$.\\
$\vec{x}^{\vec{e}} = x_1^{e_1}\cdot x_2^{e_2}\cdot \dots x_n^{e_n}$ where $\vec{x}=(x_1,\ x_2,\ x_3, \dots x_n) \in \mathbb{R}^n$\\
An arbitrary n-dimensional polynomial of degree m $P(\ x_1,\ x_2,\ x_3,\dots,\ x_n)$ will equal: \[ a_1x^{\vec{e_1}} + a_2x^{\vec{e_2}} + a_3x^{\vec{e_3}} + \dots + a_{max}x^{\vec{e_{max}}}\]
Where $max=(m+1)^n$
\end{adjustwidth}
\end{adjustwidth}
\subsection{Algorithm Overview}
\begin{adjustwidth}{.5in}{}
A set of video data in given in the format: \[ \left[ {\begin{array}{cccc} Frame & Y Pixel & X Pixel & RGB \end{array}} \right] \]
After receiving this video data, it will be processed into square blocks of dimension $N$, denoted by $Block_{w_1,w_2,w_3,...}$ where ${w_1, w_2, w_3,...}$ represents the sorting of $N$ dimensional blocks in a format that correspond to the location of the data in the ordered video. For instance, given a data in format $[\ x,\ y,\ z\ ]$, square blocks of dimension $N$ can be extracted such that the new data structure is now in the form $[\frac{x}{N},\frac{y}{N},\frac{z}{N}, N, N, N]$, and thus, in this case, $w_1 = \frac{x}{N},\ w_2 = \frac{y}{N},\ w_3 = \frac{z}{N}$. Afterwards, there are two possible steps.
\begin{enumerate}
\item The first option is to purely fit a polynomial to the data of each block. That is, find a polynomial of degree $N-1$ and of dimension $N$ where each $Block_{w_1,w_2,w_3,...}[\ x_1,\ x_2,\ x_3,\ \dots \ ]$ will correspond to some $P(\ x_1,\ x_2,\ x_3,\ \dots,\ x_n)$. The output will be the coefficients of the new polynomial.
\item The second option is to attempt to create a polynomial of degree $N-1$ and of dimension $N$ where its coefficients match first $N$ terms of the Taylor Polynomial of $cos(x_1+x_2+x_3+\dots +x_n)$. The coefficients of the new polynomial will attempt to as close to the Taylor Polynomial. The output will be the coefficients of the new polynomial along with a separate function, its usage explained later in the methodology.
\end{enumerate}
\noindent An $N$ dimensional Discrete Fourier Transform (DFT) will now be performed on these blocks. The output is the first $k$ coefficients, where $k$ is determined when the sum of a sufficient $k$ coefficients is greater than 90\% of the convergence of the $N$ dimensional Fourier Series. The 'first' components will be determined by a counting method discussed later.\\
\newline
Now, these coefficients corresponding to each $Block_{w_1,w_2,w_3,...}$ will be flattened to 2 dimensions. A Discrete Cosine Transform (DCT) will now be applied onto these coefficients, with the rounding threshold higher, for the sake of precision on coefficients.\\
\newline
The algorithm is now complete. The final products will include: a 2 dimensional DCT of the coefficients of an $N$ dimensional DFT, function for each block. The raveling process will be standard for a given dimensional size, so this needs not be stored.
\end{adjustwidth}
\pagebreak
\section{Changing Dimensions}
Firstly, the video is turned into a basic 3 dimensions. This is done by stacking the Frame and RGB dimensions and swapping the Y Pixel and X Pixel to make it more intuitively oriented. The new form becomes: \[ \left[ {\begin{array}{ccc} X Pixel & Y Pixel & Frame+RGB \end{array}} \right] \]
That is to say, $[\ x,\ y,\ 3f\ ]$, $[\ x,\ y,\ 3f+1\ ]$, $[\ x,\ y,\ 3f+2\ ]$ would lie in frame $f$, and represent the R, G, and B values, respectively. From here, we can start slicing the data in certain ways to obtain our desired block dimension.
\begin{figure}[h]
\centering
\includegraphics[width=2in]{images/RGB.png}
\caption{Visualization of flattened video data.}
\end{figure}
\subsection{3-Dimensional Block}
\begin{adjustwidth}{.5in}{}
To obtain 3-Dimensional blocks, in this case, we simply can just extract cubes from the already flattened video data. (See Figure 2.) This arises in a data format of $[\frac{x}{N},\frac{y}{N},\frac{z}{N}, N, N, N]$. In this algorithm, $N=6$ was chosen.
\begin{figure}[h]
\centering
\includegraphics[width=3in]{images/3DBlocks.png}
\caption{3D Blocked Video Data}
\end{figure}
\end{adjustwidth}
\subsection{4-Dimensional Block}
To obtain 4-Dimensional blocks, in this case, we extract a $NxNxN$ cube and then take $N$ cubes downward. (See Figure 3.) This arises in a data format of $[\frac{x}{N},\frac{y}{N^2},\frac{z}{N}, N, N, N, N]$. In this algorithm, $N=6$ was chosen.
\begin{figure}[h]
\centering
\includegraphics[width=3in]{images/4DBlocks.png}
\caption{4D Blocked Video Data}
\end{figure}
\section{Polynomial Fitting}
There are two potential outputs of polynomial fitting. FINISH
\subsection{Pure Polynomial Fitting}
The given data $D$ is in format: \[ \left[ \begin{array}{ccccc} x_1 & x_2 & x_3 & \dots & x_n \end{array} \right] \]
Where its size is $(m+1,\ m+1,\, m+1,\ \dots,\ m+1)$. To find an n-dimensional polynomial of degree m, all that is necessary is find a solution to this system of linear equations: \
\[ \left[ \begin{array}{ccccc}
\vec{e_1}^{\vec{e_1}} & \vec{e_1}^{\vec{e_2}} & \vec{e_1}^{\vec{e_3}} & \dots & \vec{e_1}^{\vec{e_{max}}}\\
\vec{e_2}^{\vec{e_1}} & \vec{e_2}^{\vec{e_2}} & \vec{e_2}^{\vec{e_3}} & \dots & \vec{e_2}^{\vec{e_{max}}}\\
\vec{e_3}^{\vec{e_1}} & \vec{e_3}^{\vec{e_2}} & \vec{e_3}^{\vec{e_3}} & \dots & \vec{e_3}^{\vec{e_{max}}}\\
\vdots & \vdots & \vdots & \vdots & \vdots \\
\vec{e_{max}}^{\vec{e_1}} & \vec{e_{max}}^{\vec{e_2}} & \vec{e_{max}}^{\vec{e_3}} & \dots & \vec{e_{max}}^{\vec{e_{max}}}
\end{array} \right]
\left[ \begin{array}{c} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_{max} \end{array} \right] =
\left[ \begin{array}{c} D[\vec{e_1}] \\ D[\vec{e_2}] \\ D[\vec{e_3}] \\ \vdots \\ D[\vec{e_{max}}] \end{array} \right]\]
Thus, to find the coefficients, all that is necessary is to compute:
\[ {\left[ \begin{array}{ccccc}
\vec{e_1}^{\vec{e_1}} & \vec{e_1}^{\vec{e_2}} & \vec{e_1}^{\vec{e_3}} & \dots & \vec{e_1}^{\vec{e_{max}}}\\
\vec{e_2}^{\vec{e_1}} & \vec{e_2}^{\vec{e_2}} & \vec{e_2}^{\vec{e_3}} & \dots & \vec{e_2}^{\vec{e_{max}}}\\
\vec{e_3}^{\vec{e_1}} & \vec{e_3}^{\vec{e_2}} & \vec{e_3}^{\vec{e_3}} & \dots & \vec{e_3}^{\vec{e_{max}}}\\
\vdots & \vdots & \vdots & \vdots & \vdots \\
\vec{e_{max}}^{\vec{e_1}} & \vec{e_{max}}^{\vec{e_2}} & \vec{e_{max}}^{\vec{e_3}} & \dots & \vec{e_{max}}^{\vec{e_{max}}}
\end{array} \right]}^{-1}
\left[ \begin{array}{c} D[\vec{e_1}] \\ D[\vec{e_2}] \\ D[\vec{e_3}] \\ \vdots \\ D[\vec{e_{max}}] \end{array} \right]\]
\noindent Thus, $a_1,\ a_2,\ a_3, \dots,\ a_3$ has now been found.
\subsection{Cosine Taylor Polynomial Fitting}
\end{document}