Probability Distributions

Refer to R Tutorial and Exercise Solution.

A probability distribution describes how the values of a random variable are distributed.


Binomial Distribution

The binomial distribution is a discrete probability distribution. It describes the outcome of n independent trials in an experiment. Each trial is assumed to have only two outcomes, labeled success or failure. If the probability of a successful trial is p, then the probability of having x successful trials in an experiment is as follows.

$$ f(x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n $$


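A simple example: roll a die ten times. The number of times a 4 comes up follows a binomial distribution with n = 10 and p = 1/6.

To find the probability of four or fewer successes in n = 12 trials with success probability p = 0.2, we apply the cumulative distribution function pbinom with x = 4, n = 12, p = 0.2.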

> pbinom(4, size=12, prob=0.2)  
[1] 0.92744
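
As a rough cross-check, pbinom returns a lower-tail cumulative probability, so the same number should come out of summing the point probabilities from dbinom for 0 through 4 successes. The variable names below are illustrative only.

```r
# Point probabilities P(X = k) for k = 0..4 with n = 12 trials and p = 0.2
point_probs <- dbinom(0:4, size = 12, prob = 0.2)

# Summing them gives the cumulative probability P(X <= 4)
cum_by_sum <- sum(point_probs)

# pbinom computes the same cumulative probability directly
cum_direct <- pbinom(4, size = 12, prob = 0.2)

print(round(cum_by_sum, 5))
print(isTRUE(all.equal(cum_by_sum, cum_direct)))
```

Both routes should agree up to floating-point error, which is a quick way to confirm which tail a p-function is returning.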

Poisson Distribution

The Poisson distribution is the probability distribution of the number of independent event occurrences in an interval. If λ is the mean number of occurrences per interval, then the probability of having x occurrences within a given interval is:

$$ f(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots $$


If there are twelve cars crossing a bridge per minute on average, find the probability of having seventeen or more cars crossing the bridge in a particular minute.

We compute the upper tail probability of the Poisson distribution with the function ppois. With lower.tail=FALSE, ppois(16, ...) returns P(X > 16), which is the probability of seventeen or more cars.

> ppois(16, lambda=12, lower.tail=FALSE)   # find upper tail
[1] 0.10129

If there are twelve cars crossing a bridge per minute on average, the probability of having seventeen or more cars crossing the bridge in a particular minute is 10.1%.
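
The upper tail returned by ppois is strict: with q = 16 it is P(X > 16), that is, seventeen or more cars. A small sketch of the complement rule, with illustrative variable names, makes this explicit:

```r
lambda <- 12  # mean number of cars per minute, from the example above

# Lower tail: P(X <= 16)
lower <- ppois(16, lambda = lambda)

# Complement: P(X >= 17) = 1 - P(X <= 16)
upper_by_complement <- 1 - lower

# Same quantity via lower.tail = FALSE, which returns P(X > 16)
upper_direct <- ppois(16, lambda = lambda, lower.tail = FALSE)

print(round(upper_direct, 5))
print(isTRUE(all.equal(upper_by_complement, upper_direct)))
```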




Continuous Uniform Distribution

The continuous uniform distribution is the probability distribution of random number selection from the continuous interval between a and b. Its density function is defined by the following.

$$ f(x) = \begin{cases} \dfrac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases} $$

Here is a graph of the continuous uniform distribution with a = 1, b = 3.
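
The following sketch, assuming the same a = 1 and b = 3, evaluates the density and cumulative probability with dunif and punif and draws a few random values with runif. The evaluation points and the seed are arbitrary choices for illustration.

```r
a <- 1
b <- 3

# The density is constant at 1 / (b - a) inside [a, b] and 0 outside
dens_inside <- dunif(2, min = a, max = b)
dens_outside <- dunif(4, min = a, max = b)

# Cumulative probability P(X <= 1.5) = (1.5 - a) / (b - a)
p_below <- punif(1.5, min = a, max = b)

# Ten random draws from the interval; the seed only makes the run repeatable
set.seed(1)
draws <- runif(10, min = a, max = b)

print(c(dens_inside, dens_outside, p_below))
print(range(draws))
```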

Exponential Distribution

The exponential distribution describes the arrival time of a randomly recurring independent event sequence. If μ is the mean waiting time for the next event recurrence, its probability density function is:

$$ f(x) = \begin{cases} \dfrac{1}{\mu} e^{-x/\mu} & x \ge 0 \\ 0 & x < 0 \end{cases} $$

Here is a graph of the exponential distribution with μ = 1.

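The exponential distribution is a continuous probability distribution. It can be used to model the time between independent random events, such as the interval between passengers entering an airport or the interval between new articles appearing on the Chinese Wikipedia.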

Suppose the mean checkout time of a supermarket cashier is three minutes. Find the probability of a customer checkout being completed by the cashier in less than two minutes.

The checkout processing rate equals one divided by the mean checkout completion time. Hence the processing rate is 1/3 checkouts per minute. We then apply the function pexp of the exponential distribution with rate=1/3.

> pexp(2, rate=1/3)  
[1] 0.48658
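
The exponential cumulative distribution function has the closed form 1 - exp(-rate * x), so the result can be checked by hand. A minimal sketch, with illustrative variable names:

```r
mean_time <- 3          # mean checkout time in minutes, from the example
rate <- 1 / mean_time   # checkouts per minute

# Closed-form CDF of the exponential distribution evaluated at 2 minutes
p_closed_form <- 1 - exp(-rate * 2)

# The same probability from pexp
p_direct <- pexp(2, rate = rate)

print(round(p_direct, 5))
print(isTRUE(all.equal(p_closed_form, p_direct)))
```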


Normal Distribution

The normal distribution is defined by the following probability density function, where μ is the population mean and σ² is the variance.

$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^{2} / (2\sigma^{2})} $$

In particular, the normal distribution with μ = 0 and σ = 1 is called the standard normal distribution, and is denoted as N(0,1). It can be graphed as follows.

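The normal distribution, also known as the Gaussian distribution, is one of the most important distributions because of the central limit theorem.

Central Limit Theorem


  • A binomial distribution with parameters n and p is approximately normal when n is fairly large and p is not close to 0 or 1 (some references recommend using this approximation only when np and n(1 − p) are both at least 5). The approximating normal distribution has mean μ = np and variance σ² = np(1 − p).
  • A Poisson distribution with parameter λ is approximately normal when λ is large. The approximating normal distribution has mean μ = λ and variance σ² = λ.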
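
The two approximations can be tried numerically. The sketch below uses arbitrary illustrative parameters (n = 100, p = 0.5, λ = 100) and adds a 0.5 continuity correction, which is a common refinement and not something the notes above mention.

```r
# Binomial case: n = 100, p = 0.5, so np = n(1 - p) = 50 (both well above 5)
n <- 100
p <- 0.5
binom_exact <- pbinom(55, size = n, prob = p)
# Normal approximation with mean np and variance np(1 - p);
# the + 0.5 continuity correction is an assumption added for this sketch
binom_approx <- pnorm(55 + 0.5, mean = n * p, sd = sqrt(n * p * (1 - p)))

# Poisson case: lambda = 100, approximated by a normal with mean and variance lambda
lambda <- 100
pois_exact <- ppois(110, lambda = lambda)
pois_approx <- pnorm(110 + 0.5, mean = lambda, sd = sqrt(lambda))

print(c(binom_exact = binom_exact, binom_approx = binom_approx))
print(c(pois_exact = pois_exact, pois_approx = pois_approx))
```

With parameters this large the exact and approximate probabilities should be close; shrinking n or λ is an easy way to see the approximation degrade.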

Assume that the test scores of a college entrance exam fit a normal distribution. Furthermore, the mean test score is 72, and the standard deviation is 15.2. What is the percentage of students scoring 84 or more in the exam?

We apply the function pnorm of the normal distribution with mean 72 and standard deviation 15.2. Since we are looking for the percentage of students scoring higher than 84, we are interested in the upper tail of the normal distribution.

> pnorm(84, mean=72, sd=15.2, lower.tail=FALSE)  
[1] 0.21492
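
The same upper tail can be obtained by standardizing the score to a z-value and using the standard normal distribution N(0,1). A short sketch with illustrative variable names:

```r
mu <- 72       # mean test score, from the example
sigma <- 15.2  # standard deviation, from the example

# Standardize the cutoff score: z = (x - mu) / sigma
z <- (84 - mu) / sigma

# Upper tail of the standard normal at z
p_standard <- pnorm(z, lower.tail = FALSE)

# Upper tail computed directly on the original scale
p_direct <- pnorm(84, mean = mu, sd = sigma, lower.tail = FALSE)

print(round(p_direct, 5))
print(isTRUE(all.equal(p_standard, p_direct)))
```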


Chi-squared Distribution

If X₁, X₂, …, Xₘ are m independent random variables having the standard normal distribution, then the following quantity follows a Chi-Squared distribution with m degrees of freedom. Its mean is m, and its variance is 2m.

$$ V = X_1^{2} + X_2^{2} + \cdots + X_m^{2} \sim \chi^{2}_{(m)} $$

Here is a graph of the Chi-Squared distribution with 7 degrees of freedom.


Find the 95th percentile of the Chi-Squared distribution with 7 degrees of freedom.

We apply the quantile function qchisq of the Chi-Squared distribution to the probability 0.95.

> qchisq(.95, df=7)        # 7 degrees of freedom  
[1] 14.067
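
The definition as a sum of squared standard normals can be illustrated by simulation. The sketch below is only an approximate check: the seed and the number of simulated draws are arbitrary, and the simulated values will differ slightly from the theoretical ones.

```r
set.seed(42)     # arbitrary seed, only for repeatability
m <- 7           # degrees of freedom
n_sim <- 100000  # number of simulated sums (arbitrary)

# Each row holds m independent standard normal draws
z <- matrix(rnorm(n_sim * m), nrow = n_sim, ncol = m)

# Sum of squares per row: one simulated chi-squared value per row
v <- rowSums(z^2)

sim_mean <- mean(v)                    # should be near m
sim_var <- var(v)                      # should be near 2m
sim_q95 <- unname(quantile(v, 0.95))   # should be near qchisq(.95, df = m)

print(c(mean = sim_mean, var = sim_var, q95 = sim_q95))
print(qchisq(0.95, df = m))
```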


Student t Distribution

Assume that a random variable Z has the standard normal distribution, and another random variable V has the Chi-Squared distribution with m degrees of freedom. Assume further that Z and V are independent. Then the following quantity follows a Student t distribution with m degrees of freedom.

$$ t = \frac{Z}{\sqrt{V/m}} \sim t_{(m)} $$

Here is a graph of the Student t distribution with 5 degrees of freedom.

Find the 2.5th and 97.5th percentiles of the Student t distribution with 5 degrees of freedom.

> qt(c(.025, .975), df=5)   # 5 degrees of freedom  
[1] -2.5706  2.5706
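
The construction from a standard normal Z and an independent Chi-Squared V can also be simulated and compared with qt. This is a rough sketch: the seed and simulation size are arbitrary, and the simulated percentiles only approximate the exact ones.

```r
set.seed(123)    # arbitrary seed, only for repeatability
m <- 5           # degrees of freedom
n_sim <- 200000  # number of simulated values (arbitrary)

z <- rnorm(n_sim)            # standard normal draws
v <- rchisq(n_sim, df = m)   # independent chi-squared draws

# Ratio Z / sqrt(V / m), which should follow a t distribution with m df
t_sim <- z / sqrt(v / m)

sim_q <- unname(quantile(t_sim, c(0.025, 0.975)))
exact_q <- qt(c(0.025, 0.975), df = m)

print(round(exact_q, 4))
print(round(sim_q, 2))
```

The two exact percentiles are mirror images of each other because the t distribution is symmetric about zero.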


F Distribution

If V₁ and V₂ are two independent random variables having the Chi-Squared distribution with m₁ and m₂ degrees of freedom respectively, then the following quantity follows an F distribution with m₁ numerator degrees of freedom and m₂ denominator degrees of freedom, i.e., (m₁, m₂) degrees of freedom.

$$ F = \frac{V_1/m_1}{V_2/m_2} \sim F_{(m_1, m_2)} $$

Here is a graph of the F distribution with (5, 2) degrees of freedom.

Find the 95th percentile of the F distribution with (5, 2) degrees of freedom.

> qf(.95, df1=5, df2=2)  
[1] 19.296
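
Because the F statistic is a ratio, swapping numerator and denominator inverts it: if F follows F(5, 2), then 1/F follows F(2, 5). The sketch below uses that property, plus a round trip through pf, as a sanity check on the percentile. Variable names are illustrative.

```r
# 95th percentile of F(5, 2)
q_upper <- qf(0.95, df1 = 5, df2 = 2)

# Reciprocal property: it equals 1 / (5th percentile of F(2, 5))
q_recip <- 1 / qf(0.05, df1 = 2, df2 = 5)

# Round trip: the CDF evaluated at the percentile returns 0.95
p_back <- pf(q_upper, df1 = 5, df2 = 2)

print(round(q_upper, 3))
print(isTRUE(all.equal(q_upper, q_recip, tolerance = 1e-6)))
print(p_back)
```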


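The Chi-Squared (χ²), t, and F distributions are collectively known as the three major sampling distributions, because all of them are derived from the normal distribution.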
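
Two standard identities make the link between these distributions concrete: the square of a standard normal variable follows a Chi-Squared distribution with 1 degree of freedom, and the square of a t variable with m degrees of freedom follows F(1, m). The sketch below checks both through quantiles; m = 5 is chosen only to match the earlier examples.

```r
m <- 5

# Squared two-sided t quantile versus the F(1, m) quantile
t_sq <- qt(0.975, df = m)^2
f_q <- qf(0.95, df1 = 1, df2 = m)

# Squared two-sided normal quantile versus the chi-squared(1) quantile
z_sq <- qnorm(0.975)^2
chi_q <- qchisq(0.95, df = 1)

print(isTRUE(all.equal(t_sq, f_q, tolerance = 1e-6)))
print(isTRUE(all.equal(z_sq, chi_q, tolerance = 1e-6)))
```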



