Carter's Blog

Regular Expression

A regular expression (regex) is an algebraic notation for describing text search strings; it defines a pattern that matches a set of strings. The basic format is /<regex>/, e.g. /love/ matches lover but not Love, because regex is case-sensitive. Disjunction of characters: [] represents a character disjunction (or), e.g. /l[ai]ke/ matches like and lake, since the second character may be either i or a. Matching a range of characters: inside [], a - represents a range of characters, for instance....
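The behaviour described in this excerpt can be checked with a small sketch using Python's standard `re` module. The word list and the `[0-9]` range are illustrative choices, not taken from the post.

```python
import re

# /love/ is case-sensitive: it is found inside "lover" but not inside "Love".
in_lover = re.search(r"love", "lover") is not None
in_Love = re.search(r"love", "Love") is not None

# [ai] matches exactly one character, either "a" or "i".
pattern = re.compile(r"l[ai]ke")
words = ["like", "lake", "luke"]  # illustrative word list
matches = [w for w in words if pattern.search(w)]

# Inside [], a hyphen denotes a range; [0-9] (any digit) is an assumed example.
digits = re.findall(r"[0-9]", "room 42b")

print(in_lover, in_Love, matches, digits)
```

Passing `re.IGNORECASE` to `re.search` would make the first pattern match "Love" as well.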

The Lottery Ticket Hypothesis

Summary. Research Objective: Recent work has found that a large trained neural network can be pruned down to a small subnetwork (even when 90% of the parameters are pruned) without compromising performance. It is natural to ask whether a small network could be trained from scratch to reach performance similar to the large network, saving the energy spent on training. According to current experience, a pruned sparse network is hard to train from scratch....

PAC-Net: A Model Pruning Approach

1. Abstract: When using an over-parameterized model, the author found that the model can be pruned while retaining most of its accuracy. The author identifies the essential weights using LWM to obtain the mask. The pruned model is trained on the source domain with regularization. The model is then transferred to the target domain: the un-pruned parameters learned on the source domain are frozen, and only the pruned parameters are trained on the target domain. 2. Contribution: The very first to use pruning in transfer learning....
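The mask-then-freeze idea in this excerpt can be illustrated with a minimal NumPy sketch. It assumes the mask comes from a simple weight-magnitude threshold, which may differ from how the paper's LWM step works. The pruning fraction, learning rate, and random "gradient" are hypothetical placeholders, and the regularized source-domain training is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 5))   # stand-in for one layer trained on the source domain
prune_fraction = 0.5                # hypothetical pruning ratio

# Assumed rule: keep the largest-magnitude weights, prune the rest.
threshold = np.quantile(np.abs(weights), prune_fraction)
keep_mask = np.abs(weights) > threshold   # True = un-pruned (essential) weight

# Source-domain model after pruning: pruned positions are zeroed.
source_weights = weights * keep_mask

# One target-domain update: un-pruned weights are frozen,
# so only the pruned positions receive the gradient step.
target_grad = rng.normal(size=weights.shape)   # placeholder gradient
learning_rate = 0.1                            # hypothetical value
target_weights = source_weights - learning_rate * target_grad * (~keep_mask)
```

Multiplying the gradient by the inverted mask is one simple way to freeze parameters; a deep learning framework would more commonly do this through per-parameter gradient masking inside the optimizer step.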

Maximum Likelihood Estimation and Maximum A Posteriori

Given observed data, MLE and MAP both estimate the parameters of a distribution. MLE chooses the parameters that maximize the likelihood of the observed data, while MAP maximizes the posterior, which also takes prior information about theta into account.
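A small worked example may make the difference concrete. The sketch below assumes a coin-flip (Bernoulli) model with a Beta(alpha, beta) prior on theta. Both the model and the numbers are illustrative choices, not taken from the post. It uses the closed-form estimates: MLE = k / n and MAP = (k + alpha - 1) / (n + alpha + beta - 2), the latter valid for alpha, beta > 1.

```python
def bernoulli_mle(heads: int, flips: int) -> float:
    """Theta that maximizes the likelihood of the observed flips."""
    return heads / flips


def bernoulli_map(heads: int, flips: int, alpha: float, beta: float) -> float:
    """Mode of the Beta posterior, i.e. theta maximizing likelihood times prior."""
    return (heads + alpha - 1) / (flips + alpha + beta - 2)


# Hypothetical data: 7 heads in 10 flips, with a Beta(2, 2) prior centred on 0.5.
mle = bernoulli_mle(7, 10)
map_est = bernoulli_map(7, 10, alpha=2.0, beta=2.0)

# With a flat Beta(1, 1) prior, the formula reduces to the MLE.
map_flat = bernoulli_map(7, 10, alpha=1.0, beta=1.0)

print(mle, map_est, map_flat)
```

Here the prior pulls the MAP estimate from 0.7 toward 0.5, while the flat prior leaves it equal to the MLE.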

A Briefing of Cross-Entropy and KL Divergence

Cross-entropy and KL divergence measure how different two distributions are. In information theory, the KL divergence KL(p||q) measures the information lost (equivalently, the extra information needed) when distribution q is used to approximate distribution p.
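The relationship between the two quantities can be sketched numerically for discrete distributions: H(p, q) = H(p) + KL(p||q). The distributions `p` and `q` below are made-up examples, and the code assumes q is non-zero wherever p is non-zero.

```python
import math


def entropy(p):
    """H(p) = -sum p_i log p_i, in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """H(p, q) = -sum p_i log q_i, in nats."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL(p||q) = sum p_i log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.5]   # hypothetical "true" distribution
q = [0.9, 0.1]   # hypothetical approximation

kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
gap = cross_entropy(p, q) - entropy(p)   # equals KL(p||q) up to rounding

print(kl_pq, kl_qp, gap)
```

The two directions give different values, which is why the KL divergence is not a symmetric distance.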