Concept
Purpose: Source separation
input: $x \in \mathbb{R}^d$, with $n$ examples $x^{(i)}$, each of size $(d, 1)$
output: $s \in \mathbb{R}^d$, where $s$ must not be Gaussian distributed; otherwise the sources cannot be recovered (a rotated Gaussian source looks identical, so $W$ would not be identifiable)
function: $f(x) = s$, such that $x^{(i)} = A s^{(i)}$, or $x = As$
Here, $A$ is called the mixing matrix and $W$ the unmixing matrix, with $W = A^{-1}$, $s = Wx$, and $s_j^{(i)} = \omega_j^T x^{(i)}$, where
$$W = \begin{bmatrix} -\ \omega_1^T\ - \\ -\ \omega_2^T\ - \\ \vdots \\ -\ \omega_d^T\ - \end{bmatrix}$$
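To make the model concrete, here is a minimal sketch of the mixing step under illustrative assumptions (NumPy, $d=2$ Laplace-distributed sources, a made-up mixing matrix $A$); none of these specific choices come from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 2, 1000                     # d sources, n observations
# Sources drawn from a Laplace distribution: deliberately non-Gaussian,
# which is required for the sources to be recoverable (see the note above).
S = rng.laplace(size=(n, d))       # row i is s^(i)
A = np.array([[1.0, 0.5],          # mixing matrix (unknown in practice)
              [0.3, 1.0]])
X = S @ A.T                        # row i is x^(i) = A s^(i)

# ICA's goal: estimate W ≈ A^{-1} from X alone, so that X @ W.T ≈ S
```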
ICA Algorithm
Maximum Likelihood Estimation (MLE) Algorithm
Assuming the sources are independent of each other, then
$$p(s) = \prod_{j=1}^{d} p_s(s_j)$$
Background Knowledge 1: $p_x(x)$ is the density of $x$ and $p_s$ is the density of $s$; since $x = W^{-1}s$ (i.e. $s = Wx$), we have $p_x(x) = p_s(Wx)\,|W|$
Based on Knowledge 1, and factoring $p_s(Wx) = \prod_{j=1}^{d} p_s(\omega_j^T x)$ by the independence assumption, the density of $x$ becomes
$$p_x(x) = \prod_{j=1}^{d} p_s(\omega_j^T x) \cdot |W|$$
Background Knowledge 2: the Cumulative Distribution Function (CDF) $F$ is defined as $F(z_0) = P(z \le z_0) = \int_{-\infty}^{z_0} p_z(z)\,dz$, and the density is $p_z(z) = F'(z)$
Background Knowledge 3: for the sigmoid function, $g'(s) = g(s)(1 - g(s))$
Background Knowledge 4: $\nabla_W |W| = |W|\,(W^{-1})^T$
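As a quick numerical sanity check on Knowledge 3 and 4 (a sketch I am adding, not part of the original derivation), both identities can be verified with finite differences:

```python
import numpy as np

def g(z):
    """Sigmoid g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

eps = 1e-6

# Knowledge 3: g'(s) = g(s)(1 - g(s))
s = 0.7
numeric = (g(s + eps) - g(s - eps)) / (2 * eps)
print(abs(numeric - g(s) * (1 - g(s))) < 1e-8)            # True

# Knowledge 4: grad_W |W| = |W| (W^{-1})^T, checked entry by entry
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))
analytic = np.linalg.det(W) * np.linalg.inv(W).T
numeric = np.zeros_like(W)
for i in range(3):
    for j in range(3):
        E = np.zeros_like(W)
        E[i, j] = eps
        numeric[i, j] = (np.linalg.det(W + E) - np.linalg.det(W - E)) / (2 * eps)
print(np.allclose(numeric, analytic, atol=1e-5))          # True
```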
To specify the density of $s_j$, we choose a specific CDF, which can be any reasonable CDF as long as it is not the Gaussian CDF. For computational convenience (per Knowledge 3), we choose the sigmoid function $g(s) = 1/(1 + e^{-s})$; then, per Knowledge 2, the density is $p_s(s) = g'(s)$. Substituting this into the density of $x$ above and taking logs over all $n$ examples gives the log-likelihood
$$\ell(W; x) = \sum_{i=1}^{n} \left( \sum_{j=1}^{d} \log g'(\omega_j^T x^{(i)}) + \log |W| \right)$$
$|W|$ refers to $\det(W)$
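A small sketch of how this log-likelihood could be evaluated, assuming NumPy and a data matrix `X` with $x^{(i)}$ as rows (the function name is mine):

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(W, X):
    """l(W; x) = sum_i ( sum_j log g'(w_j^T x^(i)) + log|det W| ).

    X: (n, d) array with x^(i) as rows; W: (d, d) unmixing matrix.
    """
    n = X.shape[0]
    Z = X @ W.T                                      # Z[i, j] = w_j^T x^(i)
    log_gprime = np.log(g(Z)) + np.log(1.0 - g(Z))   # log g'(z) = log g + log(1 - g)
    # abs() guards against a negative determinant; only its magnitude matters here
    return log_gprime.sum() + n * np.log(np.abs(np.linalg.det(W)))
```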
Iteration
To maximize $\ell(W; x)$, we use Knowledge 3 and 4 and perform stochastic gradient ascent on $W$. The update rule for a single training example $x^{(i)}$ is:
$$W := W + \alpha \nabla_W \ell(W; x) = W + \alpha \left( \begin{bmatrix} 1 - 2g(\omega_1^T x^{(i)}) \\ 1 - 2g(\omega_2^T x^{(i)}) \\ \vdots \\ 1 - 2g(\omega_d^T x^{(i)}) \end{bmatrix} x^{(i)T} + (W^T)^{-1} \right)$$
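A minimal sketch of this update for one example, assuming NumPy (the function name and default learning rate are mine):

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

def ica_step(W, x_i, alpha=0.01):
    """One stochastic gradient-ascent step on W for a single example x_i of shape (d,)."""
    z = W @ x_i                                # z[j] = w_j^T x^(i)
    grad = np.outer(1.0 - 2.0 * g(z), x_i)     # [1 - 2 g(w_j^T x^(i))] x^(i)T
    grad += np.linalg.inv(W.T)                 # (W^T)^{-1} term
    return W + alpha * grad
```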
Convergence
Continue until the algorithm converges, i.e. $W$ no longer changes. Then the original sources can be recovered via $s^{(i)} = W x^{(i)}$.
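Putting it together, a possible training loop and source recovery, reusing `X` and `ica_step` from the sketches above (pass count, learning rate, and stopping threshold are illustrative):

```python
import numpy as np

W = np.eye(X.shape[1])                      # start from the identity
for epoch in range(100):
    W_old = W.copy()
    for i in np.random.permutation(len(X)): # visit examples in random order
        W = ica_step(W, X[i], alpha=0.01)
    if np.max(np.abs(W - W_old)) < 1e-6:    # "W no longer changes"
        break

S_hat = X @ W.T                             # s^(i) = W x^(i), stacked as rows
# Note: ICA recovers the sources only up to permutation and scaling of the components.
```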
Note: The content in this blog is class notes shared for educational purposes only. Some images and content are sourced from textbooks, teacher materials, and the internet. If there is any infringement, please contact aursus.blog@gmail.com for removal.