Concept
Purpose: Source separation
input: $x \in \mathbb{R}^d$, with $n$ examples $x^{(i)}$, each of size $(d, 1)$
output: $s \in \mathbb{R}^d$, where $s$ must not be Gaussian distributed; otherwise the sources cannot be recovered (a rotated Gaussian source looks identical, so $W$ would not be identifiable)
function: $f(x) = s$, such that $x^{(i)} = A s^{(i)}$, or $x = As$
Here, $A$ is called the mixing matrix and $W$ the unmixing matrix, with $W = A^{-1}$, $s = Wx$, and $s_j^{(i)} = \omega_j^T x^{(i)}$, where
$$W = \begin{bmatrix} -\ \omega_1^T\ - \\ -\ \omega_2^T\ - \\ \vdots \\ -\ \omega_d^T\ - \end{bmatrix}$$
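To make the model concrete, here is a minimal sketch of the mixing step under illustrative assumptions (NumPy, $d=2$ Laplace-distributed sources, a made-up mixing matrix $A$); none of these specific choices come from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 2, 1000                     # d sources, n observations
# Sources drawn from a Laplace distribution: deliberately non-Gaussian,
# which is required for the sources to be recoverable (see the note above).
S = rng.laplace(size=(n, d))       # row i is s^(i)
A = np.array([[1.0, 0.5],          # mixing matrix (unknown in practice)
              [0.3, 1.0]])
X = S @ A.T                        # row i is x^(i) = A s^(i)

# ICA's goal: estimate W ≈ A^{-1} from X alone, so that X @ W.T ≈ S
```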
ICA Algorithm
Maximum Likelihood Estimation (MLE) Algorithm
Assuming the sources are independent of each other, then
$$p(s) = \prod_{j=1}^{d} p_s(s_j)$$
Background Knowledge 1: $p_x(x)$ is the density of $x$ and $p_s$ is the density of $s$; since $x = W^{-1}s$ (i.e. $s = Wx$), we have $p_x(x) = p_s(Wx)\,|W|$
Based on Knowledge 1, and factoring $p_s(Wx) = \prod_{j=1}^{d} p_s(\omega_j^T x)$ by the independence assumption, the density of $x$ becomes
$$p_x(x) = \prod_{j=1}^{d} p_s(\omega_j^T x) \cdot |W|$$
Background Knowledge 2: the Cumulative Distribution Function (CDF) $F$ is defined as $F(z_0) = P(z \le z_0) = \int_{-\infty}^{z_0} p_z(z)\,dz$, and the density is $p_z(z) = F'(z)$
Background Knowledge 3: for the sigmoid function, $g'(s) = g(s)(1 - g(s))$
Background Knowledge 4: $\nabla_W |W| = |W|\,(W^{-1})^T$
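As a quick numerical sanity check on Knowledge 3 and 4 (a sketch I am adding, not part of the original derivation), both identities can be verified with finite differences:

```python
import numpy as np

def g(z):
    """Sigmoid g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

eps = 1e-6

# Knowledge 3: g'(s) = g(s)(1 - g(s))
s = 0.7
numeric = (g(s + eps) - g(s - eps)) / (2 * eps)
print(abs(numeric - g(s) * (1 - g(s))) < 1e-8)            # True

# Knowledge 4: grad_W |W| = |W| (W^{-1})^T, checked entry by entry
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))
analytic = np.linalg.det(W) * np.linalg.inv(W).T
numeric = np.zeros_like(W)
for i in range(3):
    for j in range(3):
        E = np.zeros_like(W)
        E[i, j] = eps
        numeric[i, j] = (np.linalg.det(W + E) - np.linalg.det(W - E)) / (2 * eps)
print(np.allclose(numeric, analytic, atol=1e-5))          # True
```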
To specify the density of $s_j$, we choose a specific CDF, which can be any reasonable CDF as long as it is not the Gaussian CDF. For computational convenience (per Knowledge 3), we choose the sigmoid function $g(s) = 1/(1 + e^{-s})$; then, per Knowledge 2, the density is $p_s(s) = g'(s)$. Substituting this into the density of $x$ above and taking logs over all $n$ examples gives the log-likelihood
$$\ell(W; x) = \sum_{i=1}^{n} \left( \sum_{j=1}^{d} \log g'(\omega_j^T x^{(i)}) + \log |W| \right)$$
$|W|$ refers to $\det(W)$
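A small sketch of how this log-likelihood could be evaluated, assuming NumPy and a data matrix `X` with $x^{(i)}$ as rows (the function name is mine):

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(W, X):
    """l(W; x) = sum_i ( sum_j log g'(w_j^T x^(i)) + log|det W| ).

    X: (n, d) array with x^(i) as rows; W: (d, d) unmixing matrix.
    """
    n = X.shape[0]
    Z = X @ W.T                                      # Z[i, j] = w_j^T x^(i)
    log_gprime = np.log(g(Z)) + np.log(1.0 - g(Z))   # log g'(z) = log g + log(1 - g)
    # abs() guards against a negative determinant; only its magnitude matters here
    return log_gprime.sum() + n * np.log(np.abs(np.linalg.det(W)))
```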
Iteration
To maximize $\ell(W; x)$, we use Knowledge 3 and 4 and perform stochastic gradient ascent on $W$. The update rule for a single training example $x^{(i)}$ is:
$$W := W + \alpha \nabla_W \ell(W; x) = W + \alpha \left( \begin{bmatrix} 1 - 2g(\omega_1^T x^{(i)}) \\ 1 - 2g(\omega_2^T x^{(i)}) \\ \vdots \\ 1 - 2g(\omega_d^T x^{(i)}) \end{bmatrix} x^{(i)T} + (W^T)^{-1} \right)$$
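A minimal sketch of this update for one example, assuming NumPy (the function name and default learning rate are mine):

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

def ica_step(W, x_i, alpha=0.01):
    """One stochastic gradient-ascent step on W for a single example x_i of shape (d,)."""
    z = W @ x_i                                # z[j] = w_j^T x^(i)
    grad = np.outer(1.0 - 2.0 * g(z), x_i)     # [1 - 2 g(w_j^T x^(i))] x^(i)T
    grad += np.linalg.inv(W.T)                 # (W^T)^{-1} term
    return W + alpha * grad
```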
Convergence
Continue until the algorithm converges, i.e. $W$ no longer changes. Then the original sources can be recovered via $s^{(i)} = W x^{(i)}$.
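Putting it together, a possible training loop and source recovery, reusing `X` and `ica_step` from the sketches above (pass count, learning rate, and stopping threshold are illustrative):

```python
import numpy as np

W = np.eye(X.shape[1])                      # start from the identity
for epoch in range(100):
    W_old = W.copy()
    for i in np.random.permutation(len(X)): # visit examples in random order
        W = ica_step(W, X[i], alpha=0.01)
    if np.max(np.abs(W - W_old)) < 1e-6:    # "W no longer changes"
        break

S_hat = X @ W.T                             # s^(i) = W x^(i), stacked as rows
# Note: ICA recovers the sources only up to permutation and scaling of the components.
```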
Note: The content in this blog is class notes shared for educational purposes only. Some images and content are sourced from textbooks, teacher materials, and the internet. If there is any infringement, please contact aursus.blog@gmail.com for removal.