Maximum Likelihood Estimation and Maximum A Posteriori

Given observed data, MLE and MAP find the parameters of a distribution that maximize the probability of the observed data. Compared with MLE, MAP takes prior information about $\theta$ into account, while MLE does not.

July 22, 2022 · 2 min · 414 words · Carter Yifeng CHENG
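To make the contrast concrete, here is a minimal sketch (not from the post; the data, prior hyperparameters, and seed are all assumed for illustration) estimating a Gaussian mean with known noise scale: the MLE is the sample mean, while MAP with a Gaussian prior $N(\mu_0, \tau^2)$ shrinks the estimate toward the prior mean $\mu_0$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 20 observations from N(theta_true, sigma^2), sigma known.
theta_true, sigma = 2.0, 1.0
data = rng.normal(theta_true, sigma, size=20)

# MLE: argmax_theta prod_i N(x_i | theta, sigma^2) is the sample mean.
theta_mle = data.mean()

# MAP with an assumed Gaussian prior N(mu0, tau^2) on theta has a closed form:
# a precision-weighted average of the prior mean and the data.
mu0, tau = 0.0, 1.0  # illustrative prior hyperparameters
n = len(data)
theta_map = (mu0 / tau**2 + data.sum() / sigma**2) / (1 / tau**2 + n / sigma**2)

print(f"MLE: {theta_mle:.3f}")  # ignores the prior
print(f"MAP: {theta_map:.3f}")  # pulled toward mu0 by the prior
```

With only 20 observations the two estimates differ visibly; as $n$ grows, the data term dominates and the MAP estimate converges to the MLE.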

A Briefing of Cross-Entropy and KL Divergence

Cross-entropy and KL divergence measure how different two distributions are. In information theory, the KL divergence $KL(p \| q)$ measures the information loss, or extra information needed, when using distribution $q$ to approximate distribution $p$.

July 19, 2022 · 3 min · 530 words · Carter Y. CHENG
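As a quick numerical check (a made-up example, not from the post), the sketch below computes $H(p)$, $H(p,q)$, and $KL(p\|q)$ for two small discrete distributions and verifies the standard identity $H(p,q) = H(p) + KL(p\|q)$.

```python
import numpy as np

# Two made-up discrete distributions over the same three outcomes.
p = np.array([0.5, 0.3, 0.2])  # "true" distribution
q = np.array([0.4, 0.4, 0.2])  # approximating distribution

entropy_p = -np.sum(p * np.log(p))      # H(p), in nats
cross_entropy = -np.sum(p * np.log(q))  # H(p, q)
kl_pq = np.sum(p * np.log(p / q))       # KL(p || q)

# Standard identity: H(p, q) = H(p) + KL(p || q), i.e. KL is the extra
# cost of coding samples from p with a code optimized for q.
assert np.isclose(cross_entropy, entropy_p + kl_pq)
print(f"H(p)={entropy_p:.4f}  H(p,q)={cross_entropy:.4f}  KL(p||q)={kl_pq:.4f}")
```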

Fisher Information

Introduction We may measure a random variable indirectly. However, the indirect measurement is affected by noise that follows a certain distribution, so the measurement itself is represented by a random variable following a certain distribution. For example, suppose we want to measure $\theta$, the temperature of the stomach; it can be measured through the mouth’s temperature $Y$: $$Y = \theta + W$$ where $W$ is noise following the normal distribution $N(0, \sigma^2)$…

July 19, 2022 · 1 min · 149 words · Carter Y. CHENG
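For the excerpt’s measurement model with Gaussian noise, the Fisher information has the closed form $I(\theta) = 1/\sigma^2$. The sketch below (illustrative values for $\theta$ and $\sigma$, not code from the post) estimates it as the variance of the score via Monte Carlo and compares against the closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Measurement model from the excerpt: Y = theta + W, with W ~ N(0, sigma^2).
theta, sigma = 37.0, 0.5  # illustrative values
samples = theta + rng.normal(0.0, sigma, size=200_000)

# Score of a Gaussian: d/dtheta log N(y | theta, sigma^2) = (y - theta) / sigma^2.
score = (samples - theta) / sigma**2

# Fisher information is the variance of the score; for this model the
# closed form is I(theta) = 1 / sigma^2, so the two numbers should agree.
print(f"Monte Carlo I(theta): {score.var():.3f}   closed form: {1 / sigma**2:.3f}")
```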