Approximating the Beta Functions
and Related Distributions
Daniel D. Warner
July 31, 1999
The Beta Function
The Beta function is defined by
=
,
where
and
.
Note that the Beta function is symmetric in its arguments,
.
For modest integer values the Beta function can be computed using the simple recurrence for the Binomial coefficient.
For other values of the arguments the Beta function should be computed using the lnGamma function, to avoid overflow.
As a practical matter, the effort to detect the integer case is not worth the trouble since it won't avoid roundoff anyway.
>
B := proc(aa::algebraic, bb::algebraic)
local a, b, m, n, y, k;
a := eval(aa);
b := eval(bb);
m := min(a-1,b-1);
if (type(a,integer) and type(b,integer) and m<26) then
n := a+b-2;
y := 1;
for k from 1 to m do
y := y*(n-k+1);
y := y/k;
od;
RETURN(1/(y*(n+1)));
else
y := exp(lnGAMMA(a)+lnGAMMA(b)-lnGAMMA(a+b));
RETURN(evalf(y));
fi;
end:
>
Bcheck := (a,b)->GAMMA(a)*GAMMA(b)/GAMMA(a+b);
>
Digits := 30;
B(2,3),Bcheck(2,3);
B(3,2),Bcheck(3,2);
B(2,4),Bcheck(2,4);
B(1/2,3),evalf(Bcheck(1/2,3));
B(20,40),Bcheck(20,40);
B(30,40),evalf(Bcheck(30,40));
> plot3d((Bcheck - B), 0..30, 0..10, title="Plot of the Absolute Errors",axes=BOX);
> plot3d(((Bcheck - B)/Bcheck), 0..30, 0..10, title="Plot of the Relative Errors",axes=BOX);
The Incomplete Beta Function
The Incomplete Beta function is defined by
=
,
where
,
,
, and
.
The Incomplete Beta function satisfies the symmetry relationship
.
The following plot displays the Incomplete Beta function for five sets of values,
> IB := (x,a,b) -> GAMMA(a+b)/(GAMMA(a)*GAMMA(b))*int(t^(a-1)*(1-t)^(b-1),t = 0 .. x);
> plot({IB(x,0.5,5),IB(x,1,3),IB(x,8,10),IB(x,0.5,0.5),IB(x,5,0.5)},x=0..1);
The Incomplete Beta Function can be computed from the continued fraction
where
and
.
As noted in Abramowitz and Stegun, the best results are obtained when
.
However the authors of the Numerical Recipes suggest the condition
,
which has the advantage that the condition is always less than 1. Testing indicates
that this is a more robust test.
Note that if
, then
and we can exploit the symmetry
relationship
.
Also, the
and
convergents are less than
and the
and
convergents are larger than
.
>
IBeta := proc(xx::algebraic, aa::algebraic, bb::algebraic)
local x, a, b, c, d, y, p0, q0, p1, q1, p2, q2, p3, q3, d0, d1, d2, d3, k, t0, t1, s, tol;
tol := 0.5*10^(-15);
x := eval(xx);
a := eval(aa);
b := eval(bb);
# Note that we must have 0 <= x <= 1, 0 < a, and 0 < b.
# Compute the continued fraction
if (x < (a+1)/(a+b+2)) then
y := x;
c := a;
d := b
else # Use I[x](a,b) = 1 - I[1-x](b,a)
y := 1-x;
c := b;
d := a;
fi;
p2 := 1;
q2 := 1;
t0 := 1;
p3 := 1;
d1 := -(c+d)/(c+1) * y;
q3 := 1 + d1;
t1 := p3/q3;
k := 0;
while (tol*(t0+t1)/2 < abs(t1-t0)) do
# Note that the alternate odd convergents bracket the answer.
t0 := t1;
k := k + 1;
p0 := p2;
q0 := q2;
p1 := p3;
q1 := q3;
d2 := k*(d-k)/((c+2*k-1)*(c+2*k)) * y;
p2 := p1 + d2*p0;
q2 := q1 + d2*q0;
d3 := -((c+k)*(c+d+k))/((c+2*k)*(c+2*k+1)) * y;
p3 := p2 + d3*p1;
q3 := q2 + d3*q1;
t1 := p3/q3;
od;
if (y = 0) or (y = 1) then
s := 0;
else
s := exp(c*log(y) + d*log(1-y) - log(c) + lnGAMMA(c+d) - lnGAMMA(c) -lnGAMMA(d));
fi;
if (x < (a+1)/(a+b+2)) then
RETURN(evalf(s*t1));
else
RETURN(evalf(1 - s*t1));
fi;
end:
>
IB1 := x -> IBeta(x,0.5,5);
IB2 := x -> IBeta(x,1,3);
IB3 := x -> IBeta(x,8,10);
IB4 := x -> IBeta(x,0.5,0.5);
IB5 := x -> IBeta(x,5,0.5);
plot({IB1,IB2,IB3,IB4,IB5}, 0..1);
>
IB1 := x -> (IB(x,0.5,5) - IBeta(x,0.5,5))/IB(x,0.5,5);
plot(IB1, 0..1);
>
IB2 := x -> (IB(x,1,3) - IBeta(x,1,3))/IB(x,1,3);
plot(IB2, 0..1);
>
IB3 := x -> (IB(x,8,10) - IBeta(x,8,10))/IB(x,8,10);
plot(IB3, 0..1);
>
IB4 := x -> (IB(x,0.5,0.5) - IBeta(x,0.5,0.5))/IB(x,0.5,0.5);
plot(IB4, 0..1);
>
IB5 := x -> (IB(x,5,0.5) - IBeta(x,5,0.5))/IB(x,5,0.5);
plot(IB5, 0..1);
It was necessary to request 30 digits of precision to insure that the
a priori
calculations were done to sufficient accuracy.
Comparing the intermediate results of the following calculation in 20 and 30 digit accuracy show that the continued fraction
is computed to full accuracy but that the integral in the definition is not.
>
Digits := 20;
tol := 0.5*10^(-15);
y := 0.01;
c := 5;
d := 0.5;
p2 := 1:
q2 := 1:
t0 := 1:
p3 := 1:
d1 := -(c+d)/(c+1) * y:
q3 := 1 + d1:
t1 := p3/q3;
k := 0;
while (tol < abs(t1-t0)) do
# Note that the alternate odd convergents bracket the answer.
t0 := t1:
k := k + 1;
p0 := p2:
q0 := q2:
p1 := p3:
q1 := q3:
d2 := k*(d-k)/((c+2*k-1)*(c+2*k)) * y:
p2 := p1 + d2*p0:
q2 := q1 + d2*q0:
d3 := -((c+k)*(c+d+k))/((c+2*k)*(c+2*k+1)) * y:
p3 := p2 + d3*p1:
q3 := q2 + d3*q1:
t1 := p3/q3;
od:
s0 := t1;
s := exp(c*log(y) + d*log(1-y) - log(c) + lnGAMMA(c+d) - lnGAMMA(c) -lnGAMMA(d));
s2 := evalf(s*t1);
s1 := IB(y,c,d);
evalf((s1-s2)/s1);
evalf(s1/s);
>
Digits := 30;
tol := 0.5*10^(-15);
y := 0.01;
c := 5;
d := 0.5;
p2 := 1:
q2 := 1:
t0 := 1:
p3 := 1:
d1 := -(c+d)/(c+1) * y:
q3 := 1 + d1:
t1 := p3/q3;
k := 0;
while (tol < abs(t1-t0)) do
# Note that the alternate odd convergents bracket the answer.
t0 := t1:
k := k + 1;
p0 := p2:
q0 := q2:
p1 := p3:
q1 := q3:
d2 := k*(d-k)/((c+2*k-1)*(c+2*k)) * y:
p2 := p1 + d2*p0:
q2 := q1 + d2*q0:
d3 := -((c+k)*(c+d+k))/((c+2*k)*(c+2*k+1)) * y:
p3 := p2 + d3*p1:
q3 := q2 + d3*q1:
t1 := p3/q3;
od:
s0 := t1;
s := exp(c*log(y) + d*log(1-y) - log(c) + lnGAMMA(c+d) - lnGAMMA(c) -lnGAMMA(d));
s2 := evalf(s*t1);
s1 := IB(y,c,d);
evalf((s1-s2)/s1);
evalf(s1/s);
>
Student-t Distribution
Student-t Probability Density
The Student-t Probability Density,
, is given by
,
> Tpd := (x,df) -> (GAMMA((df+1)/2)/GAMMA(df/2))*(1+x^2/df)^(-(df+1)/2)/sqrt(Pi*df);
>
Tpd(1,2);
evalf(%);
>
evalf(Tpd(0,1));
evalf(Tpd(0,2));
evalf(Tpd(0,5));
evalf(Tpd(0,10));
evalf(Tpd(0,25));
> plot({Tpd(x,1),Tpd(x,2),Tpd(x,5),Tpd(x,10),Tpd(x,25)},x=-4..4);
Student-t Cumulative Distribution
The Student-t Probability Distribution,
, is given by
,
where
is the lower boundary and
is the upper boundary.
Note that the case
, is special since the definition reduces to
=
(
).
> Tcd0 := (a,b,df) -> GAMMA((df+1)/2)/(GAMMA(df/2)*sqrt(Pi*df))*int((1+x^2/df)^(-(df+1)/2),x=a..b);
>
Tcd1 := proc(aa::algebraic, bb::algebraic, df::algebraic)
local x, a, b, d, tlo, thi;
a := eval(aa);
b := eval(bb);
d := eval(df);
if (d = 1) then
tlo := 0.5 - arctan(abs(a))/Pi;
thi := 0.5 - arctan(abs(b))/Pi;
else
tlo := (1/2)*IBeta(df/(df+a^2),df/2,1/2);
thi := (1/2)*IBeta(df/(df+b^2),df/2,1/2);
fi;
if (0<=a) then
tlo := 1 - tlo;
fi;
if (b<0) then
thi := 1 - thi;
fi;
RETURN(evalf(1-thi-tlo));
end:
>
Ttest0 := x -> Tcd0(-10000, x, df);
Ttest1 := x -> Tcd1(-10000, x, 1); Ttest1(3.0);
Ttest2 := x -> Tcd1(-10000, x, 2); Ttest2(3.0);
Ttest3 := x -> Tcd1(-10000, x, 10); Ttest3(3.0);
Ttest4 := x -> Tcd1(-10000, x, 25); Ttest4(3.0);
plot({Ttest1, Ttest2, Ttest3, Ttest4}, -3..3, title="Plots of the Approximated Cumulative Student-t Distribution");
>
df := 1;
plot((Ttest0 - Ttest1), -3..3, title="Detailed Plot of the Errors");
>
df := 2;
plot(((Ttest0 - Ttest2)/Ttest0), -3..3, title="Detailed Plot of the Relative Errors");
>
df := 10;
plot(((Ttest0 - Ttest3)/Ttest0), -3..3, title="Detailed Plot of the Relative Errors");
>
df := 25;
plot(((Ttest0 - Ttest4)/Ttest0), -3..3, title="Detailed Plot of the Relative Errors");
>
Inverse Cumulative t Distribution
Given a confidence level,
, we want to find
such that
=
=
where
.
Now if
and
=
=
,
then Abramowotz and Stegun provide the following asymptotic connection between
and
.
~
+ . . ., where
,
, and
.
To solve this find
such that
.
Then
,
Since we are only getting an approximation to
, we will start with the simple rational approximation to the inverse error function (see Erf.mws). Then calculate the approximation to
, and finally apply a few steps of the Newton-Raphson iteration.
>
InvTcd := proc( yy::algebraic, df::algebraic )
local a, b, c, d, p, nu, x, y, z, zp, yp, tp, told, tol, k;
a := [ 0.886226899, -1.645349621, 0.914624893, -0.140543331];
b := [-2.118377725, 1.442710462, -0.329097515, 0.012229801];
c := [-1.970840454, -1.624906493, 3.429567803, 1.641345311];
d := [ 3.543889200, 1.637067800];
p := evalf(yy);
# p corresponds to a confidence level, e.g. 95%
nu := eval(df);
if (nu = 1) then
RETURN( evalf(tan(p*Pi/2)) );
fi;
tol := 0.5*10^(-15);
y := p;
if abs(y) < 0.7 then
z := y^2;
zp := y*(((a[4]*z+a[3])*z+a[2])*z+a[1]) / ((((b[4]*z+b[3])*z+b[2])*z+b[1])*z+1);
elif (0 < y) then
z := sqrt(-log((1-y)/2));
zp := (((c[4]*z+c[3])*z+c[2])*z+c[1]) / ((d[2]*z+d[1])*z+1);
else
z := sqrt(-log((1+y)/2));
zp := -(((c[4]*z+c[3])*z+c[2])*z+c[1]) / ((d[2]*z+d[1])*z+1);
fi;
yp := evalf(sqrt(2)*zp);
tp := yp + (yp^3+yp)/(4*nu) + (5*yp^5+16*yp^3+3*yp)/(96*nu^2)
+ (3*yp^7+18*yp^5+17*yp^3-15*yp)/(384*nu^3);
# Newton-Raphson iteration
for k from 1 to 100 do
told := tp;
tp := evalf(tp - (Tcd1(-tp,tp,nu)-p)/(2*Tpd(tp,nu)));
if (abs(tp-told) < tol*tp) then
break;
fi;
od;
RETURN( tp );
end:
The following calculations illustrate the accuracy of the initial approximations and the convergence of the Newton-Raphson iteration. For each of these calculations the maximum number of iterations was set as indicated. Note that the initial approximation is better for higher degrees of freedom, except in the case of
, where we exploit the arctan defnition. Note that all the cases, (
= 2, 5,10, and 25) terminate with more than 15 digits in five or less iterations.
>
# Simple check - no iteration
degF := 1;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 2;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 5;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 10;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 25;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
>
# One iteration
degF := 2;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 5;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 10;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 25;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
>
# Two iterations
degF := 2;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 5;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 10;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 25;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
>
# Up to 5 iterations
degF := 2;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 5;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 10;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 25;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
>
# Up to 10 iterations
degF := 2;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 5;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 10;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
degF := 25;tp := InvTcd(0.95,degF);evalf(Tcd0(-tp,tp,degF));Tcd1(-tp,tp,degF);
>
# Compare with the most extreme tabulated values.
degF := 1;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
degF := 2;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
degF := 3;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
degF := 4;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
degF := 5;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
degF := 10;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
degF := 15;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
degF := 20;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
degF := 25;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
degF := 30;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
degF := 120;tp := InvTcd(0.999999,degF);Tcd1(-tp,tp,degF);
> plot3d(InvTcd, 0.5..0.95, 2..25);
>
Id0 := (x,df) -> x:
Id1 := proc(x,df)
local t;
t := InvTcd(x,df);
RETURN( Tcd1(-t,t,df) );
end:
> plot3d((Id0 - Id1)/Id0, 0.5..0.95, 2..25, title="Plot of Relative Errors");
>