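The Expectation-Maximization algorithm (EM algorithm for short) is an iterative optimization algorithm used mainly to estimate the parameters of probabilistic models that contain latent variables. It is widely used in machine learning and statistics, including but not limited to the Gaussian Mixture Model (GMM), the Hidden Markov Model (HMM), and a variety of clustering and classification problems.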
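The algorithm consists of two steps: the E-step (expectation step) and the M-step (maximization step). The E-step computes the posterior distribution of the latent variables under the current parameter estimates, and the M-step re-estimates the parameters using those posteriors. The two steps alternate until the estimates stabilize or an iteration limit is reached.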
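For the one-dimensional Gaussian mixture case, the two steps can be written out explicitly. The notation below is introduced here for illustration and is an assumption, not something fixed by the text: x_i are the N data points, \pi_k, \mu_k and \sigma_k^2 are the weight, mean and variance of component k, and \gamma_{ik} is the responsibility of component k for point i.

```latex
% E-step: responsibility of component k for data point x_i
\gamma_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \sigma_k^2)}
                   {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \sigma_j^2)}

% M-step: re-estimate the parameters from the responsibilities
\begin{align}
N_k        &= \sum_{i=1}^{N} \gamma_{ik} \\
\mu_k      &= \frac{1}{N_k} \sum_{i=1}^{N} \gamma_{ik} \, x_i \\
\sigma_k^2 &= \frac{1}{N_k} \sum_{i=1}^{N} \gamma_{ik} \, (x_i - \mu_k)^2 \\
\pi_k      &= \frac{N_k}{N}
\end{align}
```

Here N_k acts as the effective number of points assigned to component k, so the new weights \pi_k always sum to one.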
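First, we need to define a few mathematical helper functions and a class. The following is a simplified C# implementation of the EM algorithm for estimating the parameters of a Gaussian mixture model: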
using System;
using System.Linq;

public class GaussianMixtureModel
{
    private double[][] data;
    private double[] weights;
    private double[] means;
    private double[] variances;

    public GaussianMixtureModel(double[][] data, int numComponents)
    {
        this.data = data;
        // Start with equal mixing weights.
        weights = Enumerable.Repeat(1.0 / numComponents, numComponents).ToArray();
        means = new double[numComponents];
        variances = new double[numComponents];

        // Initialize means and variances randomly.
        Random random = new Random();
        for (int i = 0; i < numComponents; i++)
        {
            means[i] = random.NextDouble() * 10;
            variances[i] = random.NextDouble() * 10 + 1;
        }
    }

    // Density of a univariate normal distribution at x.
    private double GaussianPdf(double x, double mean, double variance)
    {
        double exponent = Math.Exp(-Math.Pow(x - mean, 2) / (2 * variance));
        return (1 / Math.Sqrt(2 * Math.PI * variance)) * exponent;
    }

    public void ExpectationMaximization(int maxIterations)
    {
        for (int iteration = 0; iteration < maxIterations; iteration++)
        {
            // E-step: compute each component's responsibility for each point.
            double[,] responsibilities = new double[data.Length, weights.Length];
            for (int i = 0; i < data.Length; i++)
            {
                double denominator = 0;
                for (int k = 0; k < weights.Length; k++)
                {
                    responsibilities[i, k] = weights[k] * GaussianPdf(data[i][0], means[k], variances[k]);
                    denominator += responsibilities[i, k];
                }
                // Normalize so the responsibilities of point i sum to 1.
                for (int k = 0; k < weights.Length; k++)
                {
                    responsibilities[i, k] /= denominator;
                }
            }

            // M-step: re-estimate mean, variance and weight of each component.
            for (int k = 0; k < weights.Length; k++)
            {
                double weightDenominator = 0;
                double meanNumerator = 0;
                for (int i = 0; i < data.Length; i++)
                {
                    weightDenominator += responsibilities[i, k];
                    meanNumerator += responsibilities[i, k] * data[i][0];
                }
                means[k] = meanNumerator / weightDenominator;

                // The variance uses the updated mean. An index-based loop is
                // needed here: data.Sum would pass each row, not its index.
                double varianceNumerator = 0;
                for (int i = 0; i < data.Length; i++)
                {
                    varianceNumerator += responsibilities[i, k] * Math.Pow(data[i][0] - means[k], 2);
                }
                variances[k] = varianceNumerator / weightDenominator;
                weights[k] = weightDenominator / data.Length;
            }
        }
    }
}
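The GaussianMixtureModel class initializes a Gaussian mixture model with the specified number of components and runs the EM algorithm through its ExpectationMaximization method. Note that only the first coordinate of each data point (data[i][0]) is used, so the model is effectively one-dimensional, and the loop always runs for the full maxIterations without checking for convergence.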
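One common way to judge whether EM has settled is to track the log-likelihood of the data, which EM never decreases from one iteration to the next. The following is a minimal sketch under stated assumptions, not part of the class above: GmmDiagnostics and LogLikelihood are hypothetical names, and the data is assumed to be a flat one-dimensional array with the parameters passed in explicitly.

```csharp
using System;

// Hypothetical helper for illustration; not part of GaussianMixtureModel.
public static class GmmDiagnostics
{
    // Density of a univariate normal distribution at x.
    private static double GaussianPdf(double x, double mean, double variance)
    {
        double exponent = Math.Exp(-Math.Pow(x - mean, 2) / (2 * variance));
        return exponent / Math.Sqrt(2 * Math.PI * variance);
    }

    // Log-likelihood of 1-D data under a Gaussian mixture:
    // the sum over points of log(sum over components of weight * density).
    public static double LogLikelihood(
        double[] xs, double[] weights, double[] means, double[] variances)
    {
        double total = 0;
        foreach (double x in xs)
        {
            double mixture = 0;
            for (int k = 0; k < weights.Length; k++)
            {
                mixture += weights[k] * GaussianPdf(x, means[k], variances[k]);
            }
            total += Math.Log(mixture);
        }
        return total;
    }
}
```

Evaluating this after each iteration and stopping once the increase falls below a small tolerance is a common alternative to a fixed iteration count. Wiring it into the class would require exposing the weights, means and variances, which are currently private.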