Common Formulas from Jia Junping's 《统计学》 (Statistics)
Summary Measures of Data
Median $M_e$
$$M_e = \begin{cases} x_{\left(\frac{n+1}{2}\right)}, & n \text{ odd} \\ \frac{1}{2}\left\{ x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)} \right\}, & n \text{ even} \end{cases}$$
Simple sample mean $\overline{x}$
$$\overline{x} = \frac{1}{n}\sum_{i=1}^n x_i$$
Weighted sample mean $\overline{x}$ (grouped data: $M_i$ is the midpoint and $f_i$ the frequency of class $i$)
$$\overline{x} = \frac{1}{n}\sum_{i=1}^k M_i f_i, \quad \text{where } n=\sum_{i=1}^k f_i$$
Geometric mean $G$
$$G = \sqrt[n]{\prod_{i=1}^n x_i}$$
Variation ratio $V_r$ ($f_m$ is the frequency of the modal class)
$$\begin{aligned} V_r &= \frac{\sum_{i=1}^k f_i - f_m}{\sum_{i=1}^k f_i} \\ &= 1-\frac{f_m}{\sum_{i=1}^k f_i} \end{aligned}$$
Interquartile range $Q_d$
$$Q_d = Q_U-Q_L$$
Range $R$
$$R = \max(x_1,x_2,\cdots,x_n) - \min(x_1,x_2,\cdots,x_n)$$
Simple mean absolute deviation $M_d$
$$M_d = \frac{1}{n}\sum_{i=1}^n|x_i-\overline{x}|$$
Weighted mean absolute deviation $M_d$
$$M_d = \frac{1}{n}\sum_{i=1}^k|M_i-\overline{x}|f_i$$
Simple sample variance $s^2$
$$s^2 = \frac{1}{n-1}\sum_{i=1}^n(x_i - \overline{x})^2$$
Weighted sample variance $s^2$
$$s^2 = \frac{1}{n-1}\sum_{i=1}^k(M_i - \overline{x})^2 f_i$$
Simple sample standard deviation $s$
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^n(x_i - \overline{x})^2}$$
Weighted sample standard deviation $s$
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^k(M_i - \overline{x})^2 f_i}$$
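As a quick numerical check of the grouped-data formulas above, the following minimal Python sketch computes the weighted mean, variance and standard deviation from class midpoints $M_i$ and frequencies $f_i$. The function name `grouped_stats` and the sample data are illustrative assumptions, not taken from the textbook.

```python
import math

def grouped_stats(midpoints, freqs):
    """Weighted mean, sample variance and standard deviation for grouped data."""
    n = sum(freqs)  # n = sum of f_i
    mean = sum(m * f for m, f in zip(midpoints, freqs)) / n
    var = sum((m - mean) ** 2 * f for m, f in zip(midpoints, freqs)) / (n - 1)
    return mean, var, math.sqrt(var)

# Hypothetical data: three classes with midpoints 10, 20, 30
midpoints = [10, 20, 30]
freqs = [2, 5, 3]
g_mean, g_var, g_sd = grouped_stats(midpoints, freqs)
print(g_mean, g_var, g_sd)
```

The divisor is $n-1$ with $n=\sum f_i$, matching the weighted sample variance formula rather than a population variance.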
Standard score $z_i$
$$z_i = \frac{x_i - \overline{x}}{s}$$
Coefficient of variation $v_s$
$$v_s = \frac{s}{\overline{x}}$$
Skewness coefficient $SK$ for ungrouped data
$$SK = \frac{n \sum_{i=1}^n(x_i - \overline{x})^3}{(n-1)(n-2)s^3}$$
Skewness coefficient $SK$ for grouped data
$$SK = \frac{\sum_{i=1}^k(M_i - \overline{x})^3 f_i}{ns^3}$$
Kurtosis coefficient $K$ for ungrouped data
$$K = \frac{n(n+1)\sum_{i=1}^n(x_i - \overline{x})^4 - 3(n-1)\left[\sum_{i=1}^n(x_i - \overline{x})^2\right]^2}{(n-1)(n-2)(n-3)s^4}$$
Kurtosis coefficient $K$ for grouped data
$$K = \frac{\sum_{i=1}^k(M_i - \overline{x})^4 f_i}{ns^4} - 3$$
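The ungrouped-data measures above can be sketched in a few lines of Python. This is a minimal illustration under the assumption that the data are a plain list of numbers with $n \ge 4$; the function name `describe` and the sample data are hypothetical.

```python
import math

def describe(xs):
    """Sample variance, standard deviation, CV, skewness and kurtosis (ungrouped data)."""
    n = len(xs)
    mean = sum(xs) / n
    dev = [x - mean for x in xs]
    sum2 = sum(d ** 2 for d in dev)
    sum3 = sum(d ** 3 for d in dev)
    sum4 = sum(d ** 4 for d in dev)
    s2 = sum2 / (n - 1)
    s = math.sqrt(s2)
    sk = n * sum3 / ((n - 1) * (n - 2) * s ** 3)
    k = (n * (n + 1) * sum4 - 3 * (n - 1) * sum2 ** 2) / (
        (n - 1) * (n - 2) * (n - 3) * s ** 4
    )
    return {"mean": mean, "s2": s2, "s": s, "cv": s / mean, "SK": sk, "K": k}

stats = describe([1, 2, 3, 4, 5])  # hypothetical, symmetric sample
z_scores = [(x - stats["mean"]) / stats["s"] for x in [1, 2, 3, 4, 5]]
print(stats)
```

For a symmetric sample such as this one the skewness is zero, and the standard scores sum to zero by construction.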
Probability and Probability Distributions
Classical definition of probability
$$P(A) = \frac{\text{number of basic events contained in event } A}{\text{number of basic events in the sample space}}$$
Statistical definition of probability
$$P(A) = \frac{m}{n}$$
where event $A$ occurs $m$ times in $n$ repetitions of the random experiment under identical conditions.
Expected value of a discrete random variable (for $X$ taking the values $0, 1, 2, \dots$)
$$E(X) = \sum_{k = 0}^{\infty} k \cdot P(X=k)$$
Variance of a discrete random variable
$$\mathrm{Var}(X) = \sum_{k = 0}^{\infty} (k-E(X))^2 \cdot P(X=k)$$
Probability function of the binomial distribution $b(n, p)$
$$P(X = k) = C_n^k p^k(1-p)^{n-k}$$
Expected value of the binomial distribution
$$\begin{aligned} E(X) &= \sum_{k=0}^{n} k \cdot P(X=k) \\ &= \sum_{k=0}^{n} k \cdot C_n^k p^k(1-p)^{n-k} \\ &= np \end{aligned}$$
Variance of the binomial distribution
$$\begin{aligned} \mathrm{Var}(X) &= \sum_{k=0}^{n} (k-E(X))^2 \cdot P(X=k) \\ &= \sum_{k=0}^{n} (k-E(X))^2 \cdot C_n^k p^k(1-p)^{n-k} \\ &= np(1-p) \end{aligned}$$
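The closed forms $np$ and $np(1-p)$ can be checked numerically by evaluating the defining sums directly. The sketch below does this with Python's `math.comb`; the values $n=10$, $p=0.3$ and the name `binomial_pmf` are arbitrary choices for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ b(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Evaluate the defining sums for E(X) and Var(X)
b_mean = sum(k * pk for k, pk in enumerate(pmf))
b_var = sum((k - b_mean) ** 2 * pk for k, pk in enumerate(pmf))
print(b_mean, n * p)            # both close to 3.0
print(b_var, n * p * (1 - p))   # both close to 2.1
```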
Probability function of the Poisson distribution $P(\lambda)$
$$P(X = k) = \frac{\lambda ^ k}{k!} e^{- \lambda}$$
Expected value of the Poisson distribution
$$\begin{aligned} E(X) &= \sum_{k=0}^{\infty} k \cdot P(X=k) \\ &= \sum_{k=0}^{\infty} k \cdot \frac{\lambda ^ k}{k!} e^{- \lambda} \\ &= \lambda \end{aligned}$$
Variance of the Poisson distribution
$$\begin{aligned} \mathrm{Var}(X) &= \sum_{k=0}^{\infty} (k-E(X))^2 \cdot P(X=k) \\ &= \sum_{k=0}^{\infty} (k-E(X))^2 \cdot \frac{\lambda ^ k}{k!} e^{- \lambda} \\ &= \lambda \end{aligned}$$
Expected value of a continuous random variable
$$E(X) = \int_{- \infty}^{+ \infty} xf(x) \, dx$$
Variance of a continuous random variable
$$\mathrm{Var}(X) = \int_{- \infty}^{+ \infty} (x-E(X))^2f(x) \, dx$$
Probability density function of the normal distribution $N(\mu , \sigma ^ 2)$
$$f(x) = \frac{1}{\sqrt{2 \pi} \sigma} \exp \left\{ -\frac{(x-\mu)^2}{2 \sigma ^2} \right\}$$
Probability density function of the standard normal distribution
$$f(x) = \frac{1}{\sqrt{2 \pi}} \exp \left\{ -\frac{x^2}{2} \right\}$$
Distribution function of the standard normal distribution
$$F(x) = \int_{- \infty}^x \frac{1}{\sqrt{2 \pi}} \exp \left\{ -\frac{t^2}{2}\right\} \, dt$$
Standardization formula (for $X \sim N(\mu, \sigma^2)$)
$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$
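The standard normal distribution function has no closed form, but it can be written with the error function as $F(z) = \frac{1}{2}\left[1 + \mathrm{erf}(z/\sqrt{2})\right]$. The sketch below uses that identity, together with standardization of a normal variable with known $\mu$ and $\sigma$, to compute an interval probability. The helper names and the numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist

def std_normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), obtained by standardizing both endpoints."""
    return std_normal_cdf((b - mu) / sigma) - std_normal_cdf((a - mu) / sigma)

# Probability of falling within two standard deviations of the mean
p_within_2sd = normal_prob(40, 60, mu=50, sigma=5)
print(p_within_2sd)

# Cross-check against the standard library implementation
print(std_normal_cdf(1.96), NormalDist().cdf(1.96))
```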
Statistics and Their Sampling Distributions
Expected value of the sampling distribution of $\overline{X}$
$$E(\overline{X}) = \mu$$
Variance of the sampling distribution of $\overline{X}$
$$\mathrm{Var}(\overline{X}) = \frac{\sigma^2}{n}$$
Sample variance $S^2$
$$S^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \overline{X})^2$$
Sample coefficient of variation $V$
$$V = \frac{S}{\overline{X}}$$
Sample $k$-th moment $m_k$
$$m_k = \frac{1}{n} \sum_{i=1}^n X_i^k$$
Sample $k$-th central moment $v_k$
$$v_k = \frac{1}{n-1} \sum_{i=1}^n (X_i - \overline{X})^k$$
Sample skewness $\alpha_3$
$$\alpha_3 = \frac{\sqrt{n-1} \sum_{i=1}^n(X_i - \overline{X})^3}{\left[\sum_{i=1}^n(X_i - \overline{X})^2\right]^{3/2}}$$
Sample kurtosis $\alpha_4$
$$\alpha_4 = \frac{(n-1) \sum_{i=1}^n(X_i - \overline{X})^4}{\left[\sum_{i=1}^n(X_i - \overline{X})^2\right]^2} - 3$$
Parameter Estimation
Interval estimation for one population parameter
Confidence interval for the population mean (normal population, $\sigma$ known)
$$\overline{x} \pm z_{\alpha /2} \frac{\sigma}{\sqrt{n}}$$
Confidence interval for the population mean ($\sigma$ unknown, large sample)
$$\overline{x} \pm z_{\alpha /2} \frac{s}{\sqrt{n}}$$
Confidence interval for the population mean (normal population, $\sigma$ unknown, small sample)
$$\overline{x} \pm t_{\alpha /2}(n-1) \frac{s}{\sqrt{n}}$$
Confidence interval for the population proportion
$$p \pm z_{\alpha /2} \sqrt{\frac{p(1-p)}{n}}$$
Confidence interval for the population variance
$$\left( \frac{(n-1)s^2}{\chi^2_{\alpha /2}(n-1)}, \frac{(n-1)s^2}{\chi^2_{1-\alpha /2}(n-1)} \right)$$
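A minimal Python sketch of the one-population intervals above. The normal critical value $z_{\alpha/2}$ comes from `statistics.NormalDist`; the standard library has no $t$ quantile function, so the $t$ critical value is assumed to be looked up in a table and passed in by the caller. All function names and numbers are illustrative.

```python
import math
from statistics import NormalDist

def z_critical(alpha):
    """Upper alpha/2 critical value z_{alpha/2} of N(0, 1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_ci_z(xbar, sd, n, alpha=0.05):
    """xbar +/- z_{alpha/2} * sd / sqrt(n); sd is sigma, or s for a large sample."""
    half = z_critical(alpha) * sd / math.sqrt(n)
    return xbar - half, xbar + half

def mean_ci_t(xbar, s, n, t_crit):
    """xbar +/- t_{alpha/2}(n-1) * s / sqrt(n); t_crit is supplied from a t table."""
    half = t_crit * s / math.sqrt(n)
    return xbar - half, xbar + half

def proportion_ci(p, n, alpha=0.05):
    """p +/- z_{alpha/2} * sqrt(p(1-p)/n)."""
    half = z_critical(alpha) * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

z_lo, z_hi = mean_ci_z(100, 15, 36)             # hypothetical sample summary
t_lo, t_hi = mean_ci_t(50, 5, 16, t_crit=2.131)  # t_{0.025}(15) from a table
p_lo, p_hi = proportion_ci(0.65, 100)
print((z_lo, z_hi), (t_lo, t_hi), (p_lo, p_hi))
```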
Interval estimation for two population parameters
Interval estimate of the difference between two means (independent large samples, $\sigma_1^2$ and $\sigma_2^2$ known)
$$\overline{x}_1 - \overline{x}_2 \pm z_{\alpha / 2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Interval estimate of the difference between two means (independent large samples, $\sigma_1^2$ and $\sigma_2^2$ unknown)
$$\overline{x}_1 - \overline{x}_2 \pm z_{\alpha / 2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Interval estimate of the difference between two means (independent small samples, $\sigma_1^2$ and $\sigma_2^2$ unknown but equal)
$$\begin{aligned} &\overline{x}_1 - \overline{x}_2 \pm t_{\alpha / 2}(n_1+n_2-2) \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \\ &\text{where } s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2} \end{aligned}$$
Interval estimate of the difference between two means (independent small samples, $\sigma_1^2$ and $\sigma_2^2$ unknown and unequal, equal sample sizes)
As with matched samples, let $Y = X_1 - X_2 \sim N(\mu_1 - \mu_2, \sigma_1^2 + \sigma_2^2)$ and $S_Y^2 = \frac{1}{n-1} \sum_{i=1}^n (Y_i - \overline{Y})^2$. Then
$$\begin{aligned} T &= \frac{(\overline{X}_1 - \overline{X}_2) - (\mu_1 - \mu_2)}{S_Y / \sqrt{n}} \sim t(n-1) \\ &(\overline{X}_1 - \overline{X}_2) \pm t_{\alpha / 2}(n-1) \frac{S_Y}{\sqrt{n}} \end{aligned}$$
Interval estimate of the difference between two means (independent small samples, $\sigma_1^2$ and $\sigma_2^2$ unknown and unequal, unequal sample sizes)
$$\begin{aligned} &\overline{x}_1 - \overline{x}_2 \pm t_{\alpha / 2}(v) \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \\ &\text{where } v = \frac{\left( s_1^2/n_1 + s_2^2/n_2 \right)^2}{ \frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}} \end{aligned}$$
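The two auxiliary quantities in the small-sample intervals above, the pooled variance $s_p^2$ and the approximate degrees of freedom $v$, are easy to get wrong by hand. A minimal sketch with hypothetical function names follows; in practice $v$ is usually rounded before it is looked up in a $t$ table.

```python
def pooled_variance(s1_sq, n1, s2_sq, n2):
    """s_p^2 for two independent samples with equal population variances."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def approx_df(s1_sq, n1, s2_sq, n2):
    """Degrees of freedom v for unequal variances and unequal sample sizes."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

sp2 = pooled_variance(4.0, 10, 4.0, 10)   # equal variances: pooled value is 4
v_equal = approx_df(4.0, 10, 4.0, 10)     # reduces to n1 + n2 - 2 = 18 here
v_unequal = approx_df(16.0, 12, 4.0, 8)   # hypothetical unequal case
print(sp2, v_equal, v_unequal)
```

When the two sample variances and sizes coincide, $v$ equals $n_1+n_2-2$; otherwise it is smaller, which widens the interval.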
Confidence interval for the difference between two means (matched large samples)
$$\overline{d} \pm z_{\alpha / 2} \frac{\sigma_d}{\sqrt{n}}$$
Confidence interval for the difference between two means (matched small samples)
$$\overline{d} \pm t_{\alpha / 2}(n-1) \frac{s_d}{\sqrt{n}}$$
Confidence interval for the difference between two population proportions
$$(p_1 -p_2) \pm z_{\alpha / 2} \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$$
Confidence interval for the ratio of two population variances
$$\left( \frac{s_1^2}{s_2^2 \cdot F_{\alpha / 2}(n_1-1,n_2-1)}, \frac{s_1^2}{s_2^2 \cdot F_{1 - \alpha / 2}(n_1-1,n_2-1)} \right)$$
Determining the sample size
Sample size for estimating the population mean ($E$ is the desired margin of error)
$$n = \frac{z_{\alpha / 2}^2 \cdot \sigma^2}{E^2}$$
Sample size for estimating the population proportion
$$n = \frac{z_{\alpha / 2}^2 \cdot p(1-p)}{E^2}$$
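Because a sample size must be a whole number, the result of either formula is rounded up. The sketch below applies both formulas; the function names and the planning values ($\sigma = 2000$, $E = 400$, $p = 0.5$, $E = 0.05$) are illustrative assumptions.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, margin, alpha=0.05):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(z ** 2 * sigma ** 2 / margin ** 2)

def sample_size_proportion(p, margin, alpha=0.05):
    """Smallest n with z_{alpha/2} * sqrt(p(1-p)/n) <= margin."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_mean = sample_size_mean(sigma=2000, margin=400)
n_prop = sample_size_proportion(p=0.5, margin=0.05)
print(n_mean, n_prop)
```

Using $p = 0.5$ maximizes $p(1-p)$, so it gives the most conservative sample size when no prior estimate of the proportion is available.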
Hypothesis Testing
Test statistics for one population parameter
Test statistic for the population mean (normal population, $\sigma$ known)
$$\sqrt{n} \, \frac{\overline{x} - \mu_0}{\sigma} \sim N(0, 1)$$
Test statistic for the population mean ($\sigma$ unknown, large sample)
$$\sqrt{n} \, \frac{\overline{x} - \mu_0}{s} \sim N(0, 1)$$
Test statistic for the population mean (normal population, $\sigma$ unknown, small sample)
$$\sqrt{n} \, \frac{\overline{x} - \mu_0}{s} \sim t(n-1)$$
Test statistic for the population proportion (the standard error uses the hypothesized value $\pi_0$)
$$\frac{p - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} \sim N(0, 1)$$
Test statistic for the population variance
$$\frac{(n-1)s^2}{\sigma_0^2} \sim \chi^2(n-1)$$
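A minimal sketch of the large-sample $z$ statistics for one population, with a two-sided p-value from the standard normal distribution. Note that the proportion statistic here places the hypothesized value $\pi_0$ in the standard error, the usual form under $H_0$. Function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_mean(xbar, mu0, sd, n):
    """sqrt(n) * (xbar - mu0) / sd, with sd = sigma (known) or s (large sample)."""
    return math.sqrt(n) * (xbar - mu0) / sd

def z_test_proportion(p, pi0, n):
    """(p - pi0) / sqrt(pi0 * (1 - pi0) / n)."""
    return (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)

def two_sided_p_value(z):
    """P(|Z| >= |z|) for Z ~ N(0, 1)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

z_mean = z_test_mean(xbar=52, mu0=50, sd=6, n=36)
z_prop = z_test_proportion(p=0.6, pi0=0.5, n=100)
p_value = two_sided_p_value(z_mean)
print(z_mean, z_prop, p_value)
```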
Test statistics for two population parameters
Test statistic for the difference between two population means ($\sigma_1^2$, $\sigma_2^2$ known)
$$\frac{\overline{x}_1 - \overline{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0,1)$$
Test statistic for the difference between two population means ($\sigma_1^2$, $\sigma_2^2$ unknown but equal, small samples; $s_p^2$ is the pooled variance)
$$\frac{\overline{x}_1 - \overline{x}_2 - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \sim t(n_1+n_2-2)$$
Test statistic for the difference between two population proportions (testing that the two proportions are equal)
$$\begin{aligned} &\frac{p_1 - p_2 - (\pi_1 - \pi_2)}{\sqrt{p(1-p) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \sim N(0,1) \\ &\text{where } p = \frac{n_1p_1 + n_2p_2}{n_1+n_2} \end{aligned}$$
Test statistic for the difference between two population proportions (testing that the difference equals a nonzero value)
$$\frac{p_1 - p_2 - (\pi_1 - \pi_2)}{\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}} \sim N(0,1)$$
Test statistic for the ratio of two variances
$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F(n_1-1, n_2-1)$$
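When testing whether two population proportions are equal, $\pi_1 - \pi_2 = 0$ under $H_0$ and the pooled proportion $p$ replaces both sample proportions in the standard error. The following sketch illustrates that case with made-up counts; `two_proportion_z` is a hypothetical helper name.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: pi1 = pi2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p = (n1 * p1 + n2 * p2) / (n1 + n2)  # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical counts: 60 successes out of 100 versus 40 out of 100
z_two = two_proportion_z(60, 100, 40, 100)
print(z_two)
```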
Categorical Data Analysis
$\chi^2$ statistic
$$\begin{aligned} \chi^2 &= \sum_{i = 1}^r \sum_{j = 1}^s \frac{(n_{ij} - n\hat{p}_{ij})^2}{n\hat{p}_{ij}} \sim \chi^2((r-1)(s-1)) \\ &= \sum_{i = 1}^r \sum_{j = 1}^s \frac{(n_{ij} - n \hat{p}_{i \cdot} \hat{p}_{\cdot j})^2}{n \hat{p}_{i \cdot} \hat{p}_{\cdot j}} \\ &= \sum_{i = 1}^r \sum_{j = 1}^s \frac{(n_{ij} - n_{i \cdot} n_{\cdot j} / n)^2}{n_{i \cdot} n_{\cdot j} / n} \end{aligned}$$
$\varphi$ correlation coefficient
$$\varphi = \sqrt{\chi^2 / n}$$
Contingency coefficient
$$c = \sqrt{\frac{\chi^2}{\chi^2 + n}}$$
$V$ correlation coefficient
$$V = \sqrt{\frac{\chi^2}{n \times \min\{r-1, s-1 \}}}$$
Clearly, if either dimension of the table equals 2, then $V$ equals $\varphi$.
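The last form of the $\chi^2$ statistic, with expected counts $n_{i\cdot} n_{\cdot j}/n$, translates directly into code. The sketch below computes $\chi^2$ and the three association coefficients for an $r \times s$ table given as a list of rows; the function name and the table are illustrative assumptions.

```python
import math

def chi_square_measures(table):
    """Chi-square statistic plus phi, c and V for an r x s contingency table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (observed - expected) ** 2 / expected
    r, s = len(table), len(table[0])
    return {
        "chi2": chi2,
        "df": (r - 1) * (s - 1),
        "phi": math.sqrt(chi2 / n),
        "c": math.sqrt(chi2 / (chi2 + n)),
        "V": math.sqrt(chi2 / (n * min(r - 1, s - 1))),
    }

measures = chi_square_measures([[10, 20], [20, 10]])  # hypothetical 2 x 2 table
print(measures)
```

For a $2 \times 2$ table, $\min\{r-1, s-1\} = 1$, so the output shows $V$ and $\varphi$ coinciding.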
Analysis of Variance
One-way analysis of variance
Total sum of squares $SST$
$$SST=\sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij} - \overline{\overline{x}})^2$$
Between-group sum of squares $SSA$
$$SSA=\sum_{i=1}^k n_i (\overline{x}_i - \overline{\overline{x}})^2$$
Within-group sum of squares $SSE$
$$SSE=\sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij} - \overline{x}_i)^2$$
Between-group mean square $MSA$
$$MSA = \frac{SSA}{k - 1}$$
Within-group mean square $MSE$
$$MSE = \frac{SSE}{n - k}$$
Test statistic
$$\begin{aligned} F &= \frac{SSA/(k-1)}{SSE / (n-k)} \sim F(k-1, n-k) \\ &= \frac{MSA}{MSE} \end{aligned}$$
Measure of the strength of the relationship $R^2$
$$R^2 = \frac{SSA}{SST}$$
$LSD$ for multiple comparisons
$$LSD = t_{\alpha /2}(n-k) \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
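The one-way decomposition $SST = SSA + SSE$ and the resulting $F$ statistic can be sketched as follows. Groups are passed as a list of lists and may have different sizes; the function name and the data are hypothetical.

```python
def one_way_anova(groups):
    """Sums of squares, F statistic and R^2 for a one-way layout."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ssa = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    sst = sum((x - grand) ** 2 for g in groups for x in g)
    msa, mse = ssa / (k - 1), sse / (n - k)
    return {"SST": sst, "SSA": ssa, "SSE": sse, "MSA": msa, "MSE": mse,
            "F": msa / mse, "R2": ssa / sst}

anova1 = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])  # hypothetical data
print(anova1)
```

The computed $F$ is then compared with the critical value $F_\alpha(k-1, n-k)$ from a table or a statistics package.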
Two-way analysis of variance
Without interaction
Total sum of squares $SST$
$$SST = \sum_{i=1}^k \sum_{j=1}^r(x_{ij} - \overline{\overline{x}})^2$$
Row-factor sum of squares $SSR$
$$SSR = \sum_{i=1}^k \sum_{j=1}^r(\overline{x}_{i \cdot} - \overline{\overline{x}})^2$$
Column-factor sum of squares $SSC$
$$SSC = \sum_{i=1}^k \sum_{j=1}^r(\overline{x}_{\cdot j} - \overline{\overline{x}})^2$$
Error sum of squares $SSE$
$$\begin{aligned} SSE &= \sum_{i=1}^k \sum_{j=1}^r(x_{ij} - \overline{x}_{i \cdot} - \overline{x}_{\cdot j} + \overline{\overline{x}})^2 \\ &= SST - SSR - SSC \end{aligned}$$
Mean square of the row factor $MSR$
$$MSR = \frac{SSR}{k - 1}$$
Mean square of the column factor $MSC$
$$MSC = \frac{SSC}{r - 1}$$
Mean square of the random error $MSE$
$$MSE = \frac{SSE}{(k-1)(r-1)}$$
Test statistic for the row factor $F_R$
$$F_R = \frac{MSR}{MSE} \sim F(k-1, (k-1)(r-1))$$
Test statistic for the column factor $F_C$
$$F_C = \frac{MSC}{MSE} \sim F(r-1, (k-1)(r-1))$$
Measure of the strength of the relationship $R^2$
$$R^2 = \frac{SSR + SSC}{SST}$$
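For a $k \times r$ table with one observation per cell, the no-interaction decomposition can be sketched as below. Since the row mean does not depend on $j$, the double sum for $SSR$ collapses to $r\sum_i(\overline{x}_{i\cdot} - \overline{\overline{x}})^2$, and similarly for $SSC$. The function name and the data are illustrative assumptions.

```python
def two_way_anova(x):
    """Two-way ANOVA without interaction; x is a k x r list of rows."""
    k, r = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (k * r)
    row_means = [sum(row) / r for row in x]
    col_means = [sum(col) / k for col in zip(*x)]
    sst = sum((v - grand) ** 2 for row in x for v in row)
    ssr = r * sum((m - grand) ** 2 for m in row_means)
    ssc = k * sum((m - grand) ** 2 for m in col_means)
    sse = sum((x[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(k) for j in range(r))
    mse = sse / ((k - 1) * (r - 1))
    return {"SST": sst, "SSR": ssr, "SSC": ssc, "SSE": sse,
            "F_R": (ssr / (k - 1)) / mse, "F_C": (ssc / (r - 1)) / mse,
            "R2": (ssr + ssc) / sst}

anova2 = two_way_anova([[1, 2, 3], [2, 4, 3], [6, 5, 7]])  # hypothetical data
print(anova2)
```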
With interaction ($m$ observations in each cell)
Total sum of squares $SST$
$$SST = \sum_{i=1}^k \sum_{j=1}^r \sum_{l=1}^m (x_{ijl} - \overline{\overline{x}})^2$$
Row-factor sum of squares $SSR$
$$SSR = rm\sum_{i=1}^k(\overline{x}_{i \cdot} - \overline{\overline{x}})^2$$
Column-factor sum of squares $SSC$
$$SSC = km \sum_{j=1}^r(\overline{x}_{\cdot j} - \overline{\overline{x}})^2$$
Interaction sum of squares $SSRC$
$$SSRC = m \sum_{i=1}^k \sum_{j=1}^r(\overline{x}_{ij} - \overline{x}_{i \cdot} - \overline{x}_{\cdot j} + \overline{\overline{x}})^2$$
Error sum of squares $SSE$
$$SSE = SST - SSR - SSC - SSRC$$
Mean square of the row factor $MSR$
$$MSR = \frac{SSR}{k-1}$$
Mean square of the column factor $MSC$
$$MSC = \frac{SSC}{r-1}$$
Mean square of the interaction $MSRC$
$$MSRC = \frac{SSRC}{(k-1)(r-1)}$$
Mean square of the random error $MSE$
$$MSE = \frac{SSE}{kr(m-1)}$$
Test statistic for the row factor $F_R$
$$F_R = \frac{MSR}{MSE} \sim F(k-1, kr(m-1))$$
Test statistic for the column factor $F_C$
$$F_C = \frac{MSC}{MSE} \sim F(r-1, kr(m-1))$$
Test statistic for the interaction $F_{RC}$
$$F_{RC} = \frac{MSRC}{MSE} \sim F((k-1)(r-1), kr(m-1))$$
Simple Linear Regression
Correlation coefficient $r$
$$\begin{aligned} r &= \frac{\sum_{i=1}^n(x_i - \overline{x})(y_i - \overline{y})}{\sqrt{\sum_{i=1}^n(x_i - \overline{x})^2} \sqrt{\sum_{i=1}^n(y_i - \overline{y})^2}} \\ &= \frac{\sum_{i=1}^n x_i y_i - n\overline{x} \, \overline{y}}{\sqrt{\sum_{i=1}^n x_i ^2 - n \overline{x} ^2} \sqrt{\sum_{i=1}^n y_i ^2 - n \overline{y} ^2}} \\ &= \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y} \\ &= \frac{l_{xy}}{\sqrt{l_{xx}l_{yy}}} \end{aligned}$$
Test statistic for the correlation coefficient
$$t = r\sqrt{\frac{n-2}{1-r^2}} \sim t(n-2)$$
Simple linear regression model
$$y = \beta_0 + \beta_1 x + \epsilon$$
Simple linear regression equation
$$E(y) = \beta_0 + \beta_1 x$$
Estimated simple linear regression equation
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$
Slope of the regression equation (regression coefficient) $\hat{\beta}_1$
$$\begin{aligned} \hat{\beta}_1 &= \frac{\sum_{i=1}^n(x_i - \overline{x})(y_i - \overline{y})}{\sum_{i=1}^n(x_i - \overline{x})^2} \\ &= \frac{\sum_{i=1}^n x_i y_i - n\overline{x} \, \overline{y}}{\sum_{i=1}^n x_i ^2 - n \overline{x} ^2} \\ &= \frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(X)} \\ &= \frac{l_{xy}}{l_{xx}} \end{aligned}$$
Intercept of the regression equation $\hat{\beta}_0$
$$\hat{\beta}_0 = \overline{y} - \hat{\beta}_1\overline{x}$$
Total sum of squares $SST$
$$SST = \sum_{i=1}^n(y_i - \overline{y})^2 = l_{yy}$$
Regression sum of squares $SSR$
$$\begin{aligned} SSR &= \sum_{i=1}^n(\hat{y}_i - \overline{y})^2 \\ &= \sum_{i=1}^n(\hat{\beta}_0 + \hat{\beta}_1 x_i - \hat{\beta}_0 - \hat{\beta}_1 \overline{x})^2 \\ &= \hat{\beta}_1^2 \sum_{i=1}^n (x_i - \overline{x})^2 \\ &= \hat{\beta}_1^2 l_{xx} \end{aligned}$$
Residual sum of squares $SSE$
$$SSE = \sum_{i=1}^n(y_i - \hat{y}_i)^2$$
Coefficient of determination $R^2$
$$\begin{aligned} R^2 &= \frac{\sum_{i=1}^n(\hat{y}_i - \overline{y})^2}{\sum_{i=1}^n(y_i - \overline{y})^2} \\ &= \frac{\sum_{i=1}^n(\hat{\beta}_0 + \hat{\beta}_1 x_i - \hat{\beta}_0 - \hat{\beta}_1 \overline{x})^2}{\sum_{i=1}^n(y_i - \overline{y})^2} \\ &= \hat{\beta}_1^2 \frac{\sum_{i=1}^n(x_i - \overline{x})^2}{\sum_{i=1}^n(y_i - \overline{y})^2} \\ &= \hat{\beta}_1^2 \frac{l_{xx}}{l_{yy}} \end{aligned}$$
Standard error of estimate $s_e$
$$\begin{aligned} s_e &= \sqrt{\frac{\sum_{i=1}^n(y_i - \hat{y}_i)^2}{n-2}} \\ &= \sqrt{\frac{SSE}{n-2}} \end{aligned}$$
Test statistic for the linear relationship
$$\begin{aligned} F &= \frac{SSR / 1}{SSE / (n-2)} \sim F(1, n-2) \\ &= (n-2)\frac{SSR}{SST - SSR} \\ &= (n-2)\frac{R^2}{1-R^2} \end{aligned}$$
Standard deviation $\sigma_{\hat{\beta}_1}$ of the estimated regression coefficient $\hat{\beta}_1$
$$\sigma_{\hat{\beta}_1} = \frac{\sigma}{\sqrt{\sum_{i=1}^n(x_i - \overline{x})^2}}$$
Estimated standard deviation $s_{\hat{\beta}_1}$ of $\hat{\beta}_1$
$$\begin{aligned} s_{\hat{\beta}_1} &= \frac{s_e}{\sqrt{\sum_{i=1}^n(x_i - \overline{x})^2}} \\ &= \frac{s_e}{\sqrt{l_{xx}}} \end{aligned}$$
Test statistic for the regression coefficient
$$t = \frac{\hat{\beta}_1}{s_{\hat{\beta}_1}} \sim t(n-2)$$
Estimator $s_{\hat{y}_0}$ of the standard deviation of $\hat{y}_0$
$$\begin{aligned} s_{\hat{y}_0} &= s_e\sqrt{\frac{1}{n} + \frac{(x_0 - \overline{x})^2}{\sum_{i=1}^n (x_i - \overline{x})^2}} \\ &= s_e\sqrt{\frac{1}{n} + \frac{(x_0 - \overline{x})^2}{l_{xx}}} \end{aligned}$$
Estimator $s_{ind}$ of the standard deviation of an individual value $y_0$
$$\begin{aligned} s_{ind} &= s_e\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \overline{x})^2}{\sum_{i=1}^n (x_i - \overline{x})^2}} \\ &= s_e\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \overline{x})^2}{l_{xx}}} \end{aligned}$$
Confidence interval for the mean value of $y$ at $x = x_0$
$$\hat{y}_0 \pm t_{\alpha/2}(n-2) \, s_e\sqrt{\frac{1}{n} + \frac{(x_0 - \overline{x})^2}{l_{xx}}}$$
Prediction interval for an individual value of $y$ at $x = x_0$
$$\hat{y}_0 \pm t_{\alpha/2}(n-2) \, s_e\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \overline{x})^2}{l_{xx}}}$$
Residual $e_i$
$$e_i = y_i - \hat{y}_i$$
Standardized residual $z_{e_i}$
$$z_{e_i} = \frac{y_i - \hat{y}_i}{s_e}$$
Distribution of the least squares estimator of $\beta_0$
$$\hat{\beta}_0 \sim N\left(\beta_0, \sigma^2 \left( \frac{1}{n} + \frac{\overline{x}^2}{l_{xx}} \right) \right)$$
Distribution of the least squares estimator of $\beta_1$
$$\hat{\beta}_1 \sim N\left(\beta_1, \frac{\sigma^2}{l_{xx}} \right)$$
Distribution of the estimator of $\sigma^2$
$$\frac{(n-2)\hat{\sigma}^2}{\sigma^2} \sim \chi^2(n-2)$$
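The simple-regression quantities above all follow from $l_{xx}$, $l_{yy}$ and $l_{xy}$. The sketch below computes them for a small made-up data set and also returns the half-widths of the confidence and prediction intervals at a point $x_0$, with the factor $t_{\alpha/2}(n-2)\,s_e$ included. The $t$ critical value is assumed to come from a table; all names and data are hypothetical.

```python
import math

def simple_regression(x, y):
    """Least squares fit of y on x with the usual summary statistics."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    lxx = sum((xi - xbar) ** 2 for xi in x)
    lyy = sum((yi - ybar) ** 2 for yi in y)
    lxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = lxy / lxx
    b0 = ybar - b1 * xbar
    ssr = b1 ** 2 * lxx
    sse = lyy - ssr                      # SSE = SST - SSR
    se = math.sqrt(sse / (n - 2))        # standard error of estimate
    return {"n": n, "xbar": xbar, "lxx": lxx, "b0": b0, "b1": b1,
            "r": lxy / math.sqrt(lxx * lyy), "R2": ssr / lyy, "se": se,
            "F": ssr / (sse / (n - 2)), "t": b1 / (se / math.sqrt(lxx))}

def interval_half_widths(fit, x0, t_crit):
    """Half-widths of the confidence interval (mean of y) and prediction interval."""
    core = 1 / fit["n"] + (x0 - fit["xbar"]) ** 2 / fit["lxx"]
    ci = t_crit * fit["se"] * math.sqrt(core)
    pi = t_crit * fit["se"] * math.sqrt(1 + core)
    return ci, pi

fit = simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])   # hypothetical data
ci_half, pi_half = interval_half_widths(fit, x0=4, t_crit=3.182)  # t_{0.025}(3)
print(fit, ci_half, pi_half)
```

In simple regression the $t$ statistic for $\hat{\beta}_1$ squared equals the $F$ statistic, which is a convenient consistency check on any hand calculation.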
Multiple Linear Regression
Multiple linear regression model
$$y=\beta_0+\beta_1 x_1+ \cdots+\beta_k x_k+\epsilon$$
Multiple linear regression equation
$$E(y)=\beta_0+\beta_1 x_1+ \cdots+\beta_k x_k$$
Estimated multiple linear regression equation
$$\hat{y}=\hat{\beta}_0+\hat{\beta}_1 x_1+ \cdots+\hat{\beta}_k x_k$$
Multiple coefficient of determination
$$R^2=\frac{SSR}{SST} = 1-\frac{SSE}{SST}=\frac{\sum_{i=1}^n(\hat{y}_i - \overline{y})^2}{\sum_{i=1}^n(y_i - \overline{y})^2}$$
Adjusted multiple coefficient of determination
$$R_a^2 = 1-(1-R^2)\frac{n-1}{n-k-1}$$
Standard error of estimate
$$\begin{aligned} s_e &= \sqrt{\frac{\sum_{i=1}^n(y_i - \hat{y}_i)^2}{n-k-1}} \\ &= \sqrt{\frac{SSE}{n-k-1}} \end{aligned}$$
Test statistic for the linear relationship
$$\begin{aligned} F &= \frac{SSR / k}{SSE / (n-k-1)} \sim F(k, n-k-1) \\ &= \frac{n-k-1}{k} \cdot \frac{SSR}{SST - SSR} \\ &= \frac{n-k-1}{k} \cdot \frac{R^2}{1-R^2} \end{aligned}$$
Standard deviation $s_{\hat{\beta}_i}$ of the sampling distribution of the regression coefficient $\hat{\beta}_i$
$$\begin{aligned} s_{\hat{\beta}_i} &= \frac{s_e}{\sqrt{l_{xx}}} \\ &= \sqrt{\frac{MSE}{l_{xx}}} \\ &= \sqrt{\frac{SSE}{l_{xx}(n-k-1)}} \end{aligned}$$
Test statistic for a regression coefficient
$$t_i = \frac{\hat{\beta}_i}{s_{\hat{\beta}_i}} \sim t(n-k-1)$$
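Both the adjusted coefficient of determination and the overall $F$ statistic can be computed from $R^2$, $n$ and $k$ alone. A minimal sketch with hypothetical function names and inputs:

```python
def adjusted_r2(r2, n, k):
    """R_a^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_from_r2(r2, n, k):
    """F = ((n - k - 1)/k) * R^2 / (1 - R^2)."""
    return (n - k - 1) / k * r2 / (1 - r2)

# Hypothetical fit: R^2 = 0.8 with n = 25 observations and k = 4 predictors
ra2 = adjusted_r2(0.8, n=25, k=4)
f_stat = f_from_r2(0.8, n=25, k=4)
print(ra2, f_stat)
```

Because $(n-1)/(n-k-1) > 1$ whenever $k \ge 1$, the adjusted value is always below $R^2$, which penalizes adding predictors that explain little.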
Time Series Analysis and Forecasting
Chain (period-over-period) growth rate
$$G_i = \frac{Y_i - Y_{i-1}}{Y_{i-1}} = \frac{Y_i}{Y_{i-1}} - 1$$
Fixed-base growth rate
$$G_i = \frac{Y_i - Y_{0}}{Y_{0}} = \frac{Y_i}{Y_{0}} - 1$$
Average growth rate
$$\overline{G} = \sqrt[n]{\left( \frac{Y_1}{Y_0} \right) \left( \frac{Y_2}{Y_1} \right) \cdots \left( \frac{Y_n}{Y_{n-1}} \right)} - 1 = \sqrt[n]{\frac{Y_n}{Y_0}} - 1$$
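The three growth rates can be sketched as follows for a series $Y_0, Y_1, \dots, Y_n$ stored in a list. The series is made up so that every period grows by 10%, which makes the results easy to verify; the function names are hypothetical.

```python
def chain_growth(y):
    """Period-over-period growth rates Y_i / Y_{i-1} - 1."""
    return [y[i] / y[i - 1] - 1 for i in range(1, len(y))]

def fixed_base_growth(y):
    """Growth rates relative to the base period, Y_i / Y_0 - 1."""
    return [yi / y[0] - 1 for yi in y[1:]]

def average_growth(y):
    """Geometric average growth rate (Y_n / Y_0)^(1/n) - 1 over n periods."""
    n = len(y) - 1
    return (y[-1] / y[0]) ** (1 / n) - 1

series = [100, 110, 121, 133.1]  # hypothetical series, 10% growth per period
chain = chain_growth(series)
fixed = fixed_base_growth(series)
avg = average_growth(series)
print(chain, fixed, avg)
```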
Mean error $ME$ ($Y_i$ is the observed value and $F_i$ the forecast)
$$ME = \frac{1}{n} \sum_{i=1}^n(Y_i - F_i)$$
Mean absolute deviation $MAD$
$$MAD = \frac{1}{n} \sum_{i=1}^n|Y_i - F_i|$$
Mean squared error $MSE$
$$MSE = \frac{1}{n} \sum_{i=1}^n(Y_i - F_i)^2$$
Mean percentage error $MPE$
$$MPE = \frac{1}{n} \sum_{i=1}^n \left( \frac{Y_i-F_i}{Y_i} \times 100 \right)$$
Mean absolute percentage error $MAPE$
$$MAPE = \frac{1}{n} \sum_{i=1}^n \left( \frac{|Y_i-F_i|}{Y_i} \times 100 \right)$$
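The five accuracy measures share the same error terms $Y_i - F_i$, so they are conveniently computed together. The sketch below assumes all observed values are positive (the percentage measures divide by $Y_i$); the function name and data are hypothetical.

```python
def forecast_errors(actual, forecast):
    """ME, MAD, MSE, MPE and MAPE for paired observed and forecast values."""
    n = len(actual)
    e = [y - f for y, f in zip(actual, forecast)]
    return {
        "ME": sum(e) / n,
        "MAD": sum(abs(v) for v in e) / n,
        "MSE": sum(v * v for v in e) / n,
        "MPE": sum(v / y * 100 for v, y in zip(e, actual)) / n,
        "MAPE": sum(abs(v) / y * 100 for v, y in zip(e, actual)) / n,
    }

acc = forecast_errors([100, 200, 400], [110, 180, 400])  # hypothetical data
print(acc)
```

In this example the positive and negative percentage errors cancel, so $MPE$ is zero while $MAPE$ is not, which is why the absolute versions are usually preferred for comparing methods.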
Simple average forecast
$$F_{t+1} = \frac{1}{t} \sum_{i=1}^t Y_i$$
Moving average forecast
$$F_{t+1} = \frac{1}{k} \sum_{i=1}^k Y_{t-k+i}$$
Exponential smoothing forecast
$$\begin{aligned} F_{t+1} &= \alpha Y_t + (1-\alpha)F_t \\ &= F_t + \alpha(Y_t - F_t) \end{aligned}$$
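A minimal sketch of the three forecasting rules above. Exponential smoothing needs a starting forecast; the code assumes $F_1 = Y_1$ unless another value is supplied, which is a common convention but an assumption here. Function names and data are hypothetical.

```python
def simple_average_forecast(y):
    """F_{t+1} = mean of all t observations."""
    return sum(y) / len(y)

def moving_average_forecast(y, k):
    """F_{t+1} = mean of the last k observations."""
    return sum(y[-k:]) / k

def exponential_smoothing_forecast(y, alpha, f1=None):
    """Apply F_{t+1} = F_t + alpha (Y_t - F_t) through the series; returns F_{t+1}."""
    f = y[0] if f1 is None else f1  # assumption: initial forecast F_1 = Y_1
    for obs in y:
        f = f + alpha * (obs - f)
    return f

history = [10, 12, 14, 16]  # hypothetical observations Y_1..Y_4
f_avg = simple_average_forecast(history)
f_ma = moving_average_forecast(history, k=2)
f_es = exponential_smoothing_forecast(history, alpha=0.5)
print(f_avg, f_ma, f_es)
```

With a rising series like this one, all three forecasts lag behind the latest observation, and the lag shrinks as $k$ gets smaller or $\alpha$ gets larger.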
Intercept and slope of the linear trend equation
For the trend equation $\hat{Y} = b_0 + b_1 t$, the parameters are computed as follows:
$$\begin{aligned} b_1 &= \frac{n\sum tY - \sum t \sum Y}{n\sum t^2 - (\sum t)^2}\\ b_0 &= \overline{Y} - b_1\overline{t} \end{aligned}$$
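The trend coefficients can be computed directly from the sums in the formula. The sketch below assumes the time index is $t = 1, 2, \dots, n$ and uses a made-up series that lies exactly on a line, so the result is easy to verify; `linear_trend` is a hypothetical name.

```python
def linear_trend(y):
    """Least squares b0, b1 for Y_hat = b0 + b1 * t with t = 1..n."""
    n = len(y)
    t = list(range(1, n + 1))
    sum_t, sum_y = sum(t), sum(y)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    sum_t2 = sum(ti * ti for ti in t)
    b1 = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)
    b0 = sum_y / n - b1 * sum_t / n
    return b0, b1

b0, b1 = linear_trend([3, 5, 7, 9, 11])  # hypothetical series on the line 1 + 2t
next_forecast = b0 + b1 * 6              # forecast for t = 6
print(b0, b1, next_forecast)
```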
Normal equations for the exponential curve (for $\hat{Y}_t = b_0 b_1^t$, after taking logarithms)
$$\begin{cases} \sum \ln Y = n \ln b_0 + \ln b_1 \sum t \\ \sum t \ln Y = \ln b_0 \sum t + \ln b_1 \sum t^2 \end{cases}$$
$k$-th order curve equation
$$\hat{Y}_t = b_0 + b_1t + b_2 t^2 + \cdots + b_k t^k$$
Formula for separating the seasonal component
$$\frac{Y}{S} = \frac{T \times S \times I}{S} = T \times I$$