Enliven: Bioinformatics

Impact of Large Likelihood Distribution Shift on Bayesian Estimation
General Information

Research Article

Michael Hubig1*, Holger Muggenthaler2, and Gita Mall3

1Department of biomechanics, Institute of Legal Medicine, Jena University Hospital – Friedrich Schiller University Jena


2Head, Department of biomechanics, Institute of Forensic Medicine at the University Hospital Jena of the Friedrich Schiller University Jena


3Director, Institute of Forensic Medicine at the University Hospital Jena of the Friedrich Schiller University Jena


Corresponding author


Michael Hubig, Mathematician, Department of biomechanics, Institute of Legal Medicine, Jena University Hospital – Friedrich Schiller University Jena, Fürstengraben 23 07743 Jena, Germany, Tel: +49-3461-935551; Fax: +49-3461-935552; E-mail: Michael.Hubig@med.uni-jena.de

 

Received Date: 8th July 2014

Accepted Date: 12th August 2014

Published Date: 15th August 2014


Citation


Hubig M, Muggenthaler H, Mall G (2014) Impact of Large Likelihood Distribution Shift on Bayesian Estimation. Enliven: Bioinform 1(3): 003.

Copyright


© 2014 Dr. Michael Hubig. This is an Open Access article published and distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.

Abstract

A major problem arising frequently in Bayesian estimation (BE) is how to cope with a shift (e.g. a bias) of the likelihood probability. This paper investigates the impact of a severely shifted continuous likelihood probability on the output of BE in the real numbers. The result can be interpreted as a classification of asymptotic conditional probability distributions induced on a bounded interval by the far tail of probability distributions. It can be applied to a wide range of probability distributions. A hypothetical example of BE in death time estimation in a homicide trial, designed to resemble a real case, illustrates the practical relevance of the results.

 

Keywords


Bayesian estimation; Likelihood shift; Bias as shift; Asymptotic classification; Auto Asymptotic Limit (AAL)


1. Introduction


Bayesian estimation is a well-known technique in applied science and as such has been addressed by a large number of theoretical studies. Besides notorious problems in prior determination, the method is subject to errors in the likelihoods. One class of likelihood error is induced by a bias or shift in the argument of the likelihood, leading to a shift of the functional graph as well as of the expectation. The present study investigates the effect of likelihood shifts. It focuses on the case of large shifts of an estimator's error probability distribution (pd) on the real numbers and deals with the question of which conditional pd is induced on a bounded interval in the far tail of the estimator's error pd. For a large pd class, containing many pds that are important in application problems, the induced conditional pds on a bounded interval in the far tail are computed and classified.

 

Our study is motivated by a forensic expert opinion in a murder trial where a Bayesian technique called Conditional Probability Distribution (CPD), published by Biermann and Potente [1], was used to include witness reports in confidence interval estimation for temperature based death time estimation. We published a study [2] on the errors caused by large input biases in the CPD method, dealing with the special case of the Gaussian distribution. The present article generalizes the asymptotic result of [2] to a wide spectrum of distributions and gives a classification theorem.

 

2. Terminology

Our considerations use the terminology of general probability theory, which defines the general conditional probability P(A|B) of an event A, given an event B, where A and B are sets or families of sets of a common probability space Ω which is never mentioned explicitly. The sets A and B are specified by stating arithmetical or analytical formulae for random variables; e.g. “A = {ω ∈ Ω | t(ω) = a}” means that A comprises all ω in Ω for which the random variable t takes the value a. We adopt the convention of writing e.g. P(t^ | t), with two random variables t and t^, for the type of probability distribution of the random variable t^ under a fixed but unspecified condition on the random variable t, given by an arithmetic statement (equivalent to a set A in Ω) such as A = {ω ∈ Ω | t(ω) = a}. If the condition is specified, we write e.g. t = a as an abbreviation for the event A above. For basic terminology see e.g. the textbook of Papoulis [3].

 

2.1 Bayesian Estimation

Let t be a random variable taking values in the real numbers IR and let t^ be an estimator of t. The pd of t^ given the true value of the variable t is denoted in a somewhat sloppy style by P(t^ | t) and is called the likelihood, as usual in Bayesian estimation. Generally we will use this sloppy notation and specify values only where the reader would otherwise be misled. The probability distribution density (pdd) of t^, given the true value of the variable t, is labelled f(t^ | t) and will be called the likelihood density. The (unconditional) pds of the variables t and t^ are named the prior P(t) and the marginal distribution P(t^), and we use g(t) and h(t^) as symbols for their pdds. We can now introduce the pdd f(t | t^) of the posterior P(t | t^). Since the posterior density f(t | t^) belongs to a conditional pd, it may be written:

 

$$f(t \mid \hat{t}) = \frac{f(\hat{t} \mid t)\, g(t)}{h(\hat{t})} \tag{1}$$

 

The purpose of Bayesian estimation (BE) is to compute an estimator of the posterior f(t | t^) given an expression f(t^ | t) for the likelihood and another expression for the prior g(t). Expressing h(t^) as the integral of f(t^ | t) g(t) over all t makes it possible to eliminate h(t^) from (1) yielding:

 

$$f(t \mid \hat{t}) = \frac{f(\hat{t} \mid t)\, g(t)}{\int_{\mathbb{R}} f(\hat{t} \mid t)\, g(t)\, dt} \tag{2}$$
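As a minimal numerical sketch of equation (2), the posterior density can be approximated on a grid by replacing the integral in the denominator with a Riemann sum. The Gaussian likelihood, the flat prior on [10, 16], the variance 1.96 and the value t^ = 9 are illustrative assumptions, and the function names are hypothetical.

```python
import math

def posterior_on_grid(likelihood, prior, t_grid, t_hat, dt):
    """Approximate f(t | t_hat) from equation (2) on an equidistant grid."""
    unnormalized = [likelihood(t_hat, t) * prior(t) for t in t_grid]
    evidence = sum(unnormalized) * dt  # Riemann sum standing in for h(t_hat)
    return [u / evidence for u in unnormalized]

def gaussian_likelihood(t_hat, t, variance=1.96):
    """Assumed likelihood density f(t_hat | t): Gaussian centred at t."""
    return math.exp(-(t_hat - t) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)

def flat_prior(t, lower=10.0, upper=16.0):
    """Assumed constant prior density g(t) on [lower, upper]."""
    return 1.0 / (upper - lower) if lower <= t <= upper else 0.0

DT = 0.01
t_grid = [10.0 + DT * (i + 0.5) for i in range(600)]  # midpoints covering [10, 16]
posterior = posterior_on_grid(gaussian_likelihood, flat_prior, t_grid, t_hat=9.0, dt=DT)
```

Because the estimate t^ = 9 lies below the support of the prior, the resulting posterior density decreases monotonically over the grid.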

 

Let E(t^ | t = t0) be the conditional expectation of the variable t^ given a value t0 of the variable t, provided this expectation exists with respect to the likelihood pdd f(t^ | t = t0). We will frequently omit the assignment “ = t0”, sometimes the condition “t = t0” as a whole, and sometimes even the argument brackets and the symbol t^, writing simply E^ where no confusion can occur. The symbol f(t^ | E^, t) will be used if E(t^ | t) exists and we want to emphasize or use this fact. Sometimes we even skip the argument t^ and write f(E^, t) or f(t) for f(t^ | E^, t). The symbols f(t^ | t) or f(t^) will be used for short in more general considerations which do not depend on the existence of E(t^ | t). Writing g(t) for the prior pdd, we can now represent (2) in a form where the conditional expectation E(t^ | t) appears explicitly:

 

$$f(t \mid \hat{t}) = \frac{f(\hat{t} \mid \hat{E}, t)\, g(t)}{\int_{\mathbb{R}} f(\hat{t} \mid \hat{E}, t)\, g(t)\, dt} \tag{3}$$

 

2.2 Shift Families of Probability Densities

Let f: IR → IR be a pdd on the real numbers IR and let Δt be any real number. Let further fΔt: IR → IR, s → f(s + Δt), be the Δt-shift of f, and let Γ be the shift family (or location family) of f, which is the set of all possible Δt-shifts of f:

 

$$\Gamma := \{\, f_{\Delta t} \mid \Delta t \in \mathbb{R} \,\} \tag{4}$$

 

Let E be the expectation of f; then we obtain the expectation EΔt of any fΔt by straightforward computation via substitution:

 

$$E_{\Delta t} = E - \Delta t \tag{5}$$

 

So each fΔt is uniquely determined by its expectation EΔt = E - Δt and therefore the dependency on the expectation EΔt can be used to reparameterize the whole family Γ. This gives rise to a change of terminology integrating the expectation EΔt as second argument in the symbol for every family member fΔt of Γ:

 

$$\forall s \in \mathbb{R}: \quad f(s \mid E_{\Delta t}) := f_{\Delta t}(s) \tag{6}$$

 

As a direct consequence of (5) and (6) we have the following formula, which allows us to move the shift Δt from the second to the first argument of f while changing the sign of Δt:

 

$$\forall s \in \mathbb{R}: \quad f(s \mid E - \Delta t) = f(s \mid E_{\Delta t}) = f_{\Delta t}(s) = f(s + \Delta t \mid E) \tag{7}$$
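Relations (5) and (7) can be checked numerically. The sketch below assumes a Gaussian pdd with expectation E = 3 and a shift Δt = 5 purely for illustration; the helper names are hypothetical, and the expectation is approximated with a midpoint rule on a truncated range.

```python
import math

def normal_pdf(s, mean, sd=1.0):
    """Gaussian density, used here only as an illustrative member of a shift family."""
    return math.exp(-0.5 * ((s - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def shifted(f, delta_t):
    """Return the delta_t-shift f_dt of f, i.e. f_dt(s) = f(s + delta_t)."""
    return lambda s: f(s + delta_t)

def expectation(pdf, lower=-40.0, upper=40.0, steps=80_000):
    """Midpoint-rule approximation of the expectation of pdf on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        s = lower + (i + 0.5) * h
        total += s * pdf(s)
    return total * h

E, DELTA_T = 3.0, 5.0
f = lambda s: normal_pdf(s, E)
f_shift = shifted(f, DELTA_T)
mean_shifted = expectation(f_shift)  # equation (5) predicts E - DELTA_T
```

Here `mean_shifted` approximates E - Δt = -2, and `f_shift(s)` coincides with the Gaussian density of expectation E - Δt evaluated at s, as stated in (7).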

 

2.3 Auto Asymptotic Limits

In the following we present some general assumptions to specify a certain class A of pdds f, defined on open infinite intervals in the real numbers, which is investigated here. We use the symbol “↓↓” to express “strictly monotonic decreasing” and “↑↑” to represent “strictly monotonic increasing” and “??” to denote strict monotonicity.

 

A pdd f, defined on an open infinite interval J in the real numbers IR is an element of the class A, iff there is a positive real number R with:

 

(A1) The function f is continuous on J - ]-R,R[, i.e. on its domain J with the interval ]-R,R[ removed.

 

(A2) (a) If +∞ lies in J, then f is strictly monotonic decreasing to 0 on the interval J ∩ ]R,+∞[:

 

$$x \to +\infty \;\Rightarrow\; f(x) \downarrow\downarrow 0$$

 

(b) If -∞ lies in J, then f is strictly monotonic decreasing to 0 for x → -∞ on the interval J ∩ ]-∞, -R[:

 

$$x \to -\infty \;\Rightarrow\; f(x) \downarrow\downarrow 0$$

 

(A3) f is differentiable and f’ / f is continuous on J – [-R, R] and fulfills the condition:

 

(a) If +∞ lies in J, then f’ / f is strictly monotonic on J ∩ ]R,+∞[:

 

$$x \to +\infty \;\Rightarrow\; \frac{f'(x)}{f(x)} \ \text{strictly monotonic}$$

(b) If -∞ lies in J, then f’ / f is strictly monotonic on J ∩ ]-∞, -R[:

 

$$x \to -\infty \;\Rightarrow\; \frac{f'(x)}{f(x)} \ \text{strictly monotonic}$$

 

2.3.1 Remark

Condition (A3)(a) is equivalent to the existence of a positive constant e and a function α: J ∩ ]R,+∞[ → IR which is continuous and strictly monotonic and fulfills:

 

$$\forall x \in J \cap\, ]R, +\infty[: \quad f(x) = e \cdot \exp\!\left( \int_R^x \alpha(s)\, ds \right)$$

 

The fact that f is a pdd implies an upper boundary to the descending rate of the integral over α.

 

During the investigation of the impact of large likelihood shifts on the result of BE a certain quantity – we call it the auto asymptotic limit - evolves, which measures the asymptotic decline intensity of a pdd on an unbounded interval J of the real numbers IR. We define:

 

2.3.2 Definition

Let J be an unbounded interval in the real numbers IR. The symbol x → ±∞ means x → +∞ in J if J is unbounded in the positive direction and x → -∞ in J if J is unbounded in the negative direction. In case J = IR the symbol stands for both limit processes alternatively. Let f: J → IR be a pdd on J. The auto-asymptotic a[f] (AA) of the pdd f is defined as:

 

$$\forall x \in J,\ x + \delta x \in J: \quad a[f](x, \delta x) := \frac{f(x + \delta x)}{f(x)} \tag{8}$$

 

The auto-asymptotic limit A±[f](δx) (AAL) is defined in case of existence as:

 

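$$\left\{ \begin{matrix} \text{if } +\infty \in J \\ \text{if } -\infty \in J \end{matrix} \right\}: \ \forall \delta x \in \mathbb{R}: \quad A_{\pm}[f](\delta x) := \left\{ \begin{matrix} \lim_{x \to +\infty} \\ \lim_{x \to -\infty} \end{matrix} \right\} a[f](x, \delta x) = \left\{ \begin{matrix} \lim_{x \to +\infty} \\ \lim_{x \to -\infty} \end{matrix} \right\} \frac{f(x + \delta x)}{f(x)} \tag{9}$$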

 

It is well known (see Pericchi and Sanso [4]) that the following proposition on AALs can be derived directly from their definition:

 

2.3.3 Proposition

For the pdd f on J there are λ+ in IR≥0 ∪ {∞} and λ- in IR≤0 ∪ {-∞} such that, in case of existence of the AALs:

 

$$\forall \delta x \in \mathbb{R}: \quad A_{\pm}[f](\delta x) = \exp(-\lambda_{\pm}\, \delta x) \tag{10}$$

 

The following remarks are direct consequences of proposition 2.3.3 and of definition 2.3.2.

 

2.3.4 Remark

For a pdd f in the class A defined by (A1), (A2), (A3):

 

(a) $$\forall \delta x \in \mathbb{R},\ \pm \in \{+, -\}: \quad A_{\pm}[f](-\delta x) = \frac{1}{A_{\pm}[f](\delta x)} \tag{11}$$

 

(b) $$\forall \delta x > 0: \ 0 \le A_{+}[f](\delta x) \le 1 \le A_{-}[f](\delta x) \qquad \forall \delta x < 0: \ 0 \le A_{-}[f](\delta x) \le 1 \le A_{+}[f](\delta x) \tag{12}$$

 

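(c) $$\exists\, \delta x \gtrless 0: \ A_{\pm}[f](\delta x) = 0 \;\Rightarrow\; \forall\, \delta x' \gtrless 0: \ A_{\pm}[f](\delta x') = 0 \tag{13}$$ (upper relation for A+, lower relation for A-)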

 

(d) $$\delta x = 0 \;\Rightarrow\; A_{\pm}[f](\delta x) = 1 \tag{14}$$

 

2.3.5 Examples

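A) The normal distribution ν(μ,σ²):

$$\forall x \in \mathbb{R}: \quad \nu(\mu,\sigma^2)(x) := \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)$$

$$\forall x \in \mathbb{R}: \quad \nu(\mu,\sigma^2)'(x) = \nu(\mu,\sigma^2)(x) \left( -\frac{x-\mu}{\sigma^2} \right)$$

$$\forall x \in \mathbb{R}: \quad \alpha[\nu(\mu,\sigma^2)](x) := -\frac{x-\mu}{\sigma^2}$$

$$\forall d \in \mathbb{R}_{>0}: \forall x \in \mathbb{R}: \quad a[\nu(\mu,\sigma^2)](x, \pm d) = \exp\!\left( \mp \frac{d}{\sigma^2}\, x \right) \exp\!\left( \pm \frac{\mu d}{\sigma^2} - \frac{d^2}{2\sigma^2} \right)$$

$$\forall d \in \mathbb{R}_{>0}: \quad A_{\pm}[\nu(\mu,\sigma^2)](\pm d) = 0$$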

 

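B) The log-normal distribution L[ν], where μ and σ² are the parameters of the underlying normal distribution ν(μ,σ²):

$$\forall x > 0: \quad L[\nu](x) = \frac{1}{\sqrt{2\pi}\,\sigma\, x} \exp\!\left( -\frac{1}{2\sigma^2} (\ln(x) - \mu)^2 \right)$$

$$\forall x > 0: \quad L[\nu]'(x) = L[\nu](x)\, \frac{1}{x} \left[ \frac{\mu - \ln(x)}{\sigma^2} - 1 \right]$$

$$\forall x > 0: \quad \alpha[L](x) = \frac{1}{x} \left[ \frac{\mu - \ln(x)}{\sigma^2} - 1 \right]$$

$$\forall d \in \mathbb{R}: \forall x > |d|: \quad a[L](x, d) = \frac{x}{x+d} \cdot \frac{\exp\!\left( -\frac{1}{2\sigma^2} (\ln(x+d) - \mu)^2 \right)}{\exp\!\left( -\frac{1}{2\sigma^2} (\ln(x) - \mu)^2 \right)} = \frac{x}{x+d} \exp\!\left( -\frac{1}{2\sigma^2} \left( \ln(x+d)^2 - \ln(x)^2 - 2\mu \ln\!\left( \frac{x+d}{x} \right) \right) \right)$$

$$\forall d \in \mathbb{R}: \quad A_{+}[L](d) = \lim_{x \to \infty} \frac{x}{x+d} \exp\!\left( -\frac{1}{2\sigma^2} \left( \ln(x+d)^2 - \ln(x)^2 - 2\mu \ln\!\left( \frac{x+d}{x} \right) \right) \right) = 1$$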

 

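C) The Gamma distribution Γa,b:

$$\forall x > 0: \quad \Gamma_{a,b}(x) = \frac{1}{a^b\, \Gamma(b)}\, x^{b-1} \exp\!\left( -\frac{x}{a} \right)$$

$$\forall x > 0: \quad \Gamma_{a,b}'(x) = \Gamma_{a,b}(x) \left[ \frac{b-1}{x} - \frac{1}{a} \right]$$

$$\forall x > 0: \quad \alpha[\Gamma_{a,b}](x) = \frac{b-1}{x} - \frac{1}{a}$$

$$\forall d \ge 0: \forall x > 0: \quad a[\Gamma_{a,b}](x, d) = \left( \frac{x+d}{x} \right)^{b-1} \exp\!\left( -\frac{d}{a} \right)$$

$$\forall d \ge 0: \quad A_{+}[\Gamma_{a,b}](d) = \exp\!\left( -\frac{d}{a} \right)$$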

 

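D) The t-distribution tn:

$$\forall x \in \mathbb{R}: \quad t_n(x) := \frac{\Gamma((n+1)/2)}{\sqrt{\pi n}\; \Gamma(n/2)} \cdot \frac{1}{\left( 1 + \frac{x^2}{n} \right)^{(n+1)/2}}$$

$$\forall x \in \mathbb{R}: \quad t_n'(x) = t_n(x) \left( -\frac{(n+1)\, x}{n + x^2} \right)$$

$$\forall x \in \mathbb{R}: \quad \alpha[t_n](x) = -\frac{(n+1)\, x}{n + x^2}$$

$$\forall d \ge 0: \forall x \in \mathbb{R}: \quad a[t_n](x, \pm d) = \left( \frac{1 + \frac{(x \pm d)^2}{n}}{1 + \frac{x^2}{n}} \right)^{-\frac{n+1}{2}}$$

$$\forall d \ge 0: \quad A_{\pm}[t_n](d) = 1$$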

 

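E) The Laplace distribution La (in the formulae for La' and α the upper sign applies for x > 0, the lower sign for x < 0):

$$\forall x \in \mathbb{R}: \quad La(x) := \frac{1}{2} \exp(-|x|)$$

$$\forall x \in \mathbb{R} \setminus \{0\}: \quad La'(x) = La(x) \cdot (\mp 1)$$

$$\forall x \in \mathbb{R} \setminus \{0\}: \quad \alpha[La](x) = \mp 1$$

$$\forall x \in \mathbb{R}: \forall d \ge 0: \quad a[La](x, \pm d) = \exp(-|x \pm d| + |x|)$$

$$\forall d \ge 0: \quad A_{\pm}[La](d) = \exp(\mp d)$$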

 

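F) The logistic distribution lc:

$$\forall x \in \mathbb{R}: \quad lc(x) := \frac{\exp(-x)}{(1 + \exp(-x))^2}$$

$$\forall x \in \mathbb{R}: \quad lc'(x) = lc(x)\, \frac{\exp(-x) - 1}{1 + \exp(-x)}$$

$$\forall x \in \mathbb{R}: \quad \alpha[lc](x) = \frac{\exp(-x) - 1}{1 + \exp(-x)}$$

$$\forall x \in \mathbb{R}: \forall d \ge 0: \quad a[lc](x, \pm d) = \left( \frac{1 + \exp(-x \mp d)}{1 + \exp(-x)} \right)^{-2} \exp(\mp d)$$

$$\forall d \ge 0: \quad A_{\pm}[lc](d) = \exp(\mp d)$$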
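The limits A+ listed in examples (A) to (F) can be checked numerically by evaluating the auto-asymptotic a[f](x, δx) from (8) at a large argument x. The sketch below works with log-densities to avoid underflow and drops normalizing constants, which cancel in the quotient. The parameter choices (μ = 0, σ = 1, a = 2, b = 3, n = 5, x = 10^6, δx = 1) and all names are illustrative assumptions.

```python
import math

# Log-densities up to additive constants (the constants cancel in a[f]).
LOG_PDFS = {
    "normal": lambda x: -0.5 * x * x,                                # mu = 0, sigma = 1
    "lognormal": lambda x: -math.log(x) - 0.5 * math.log(x) ** 2,    # mu = 0, sigma = 1
    "gamma": lambda x: (3.0 - 1.0) * math.log(x) - x / 2.0,          # a = 2, b = 3
    "student_t": lambda x: -0.5 * (5 + 1) * math.log1p(x * x / 5),   # n = 5
    "laplace": lambda x: -abs(x),
    "logistic": lambda x: -x - 2.0 * math.log1p(math.exp(-x)),
}

def auto_asymptotic(log_pdf, x, dx):
    """a[f](x, dx) = f(x + dx) / f(x), computed from the log-density."""
    return math.exp(log_pdf(x + dx) - log_pdf(x))

X_LARGE, DX = 1.0e6, 1.0
approx_limits = {
    name: auto_asymptotic(log_pdf, X_LARGE, DX) for name, log_pdf in LOG_PDFS.items()
}
# Limits A+ stated in examples (A)-(F) for dx = 1 and the parameters above.
stated_limits = {
    "normal": 0.0,
    "lognormal": 1.0,
    "gamma": math.exp(-DX / 2.0),
    "student_t": 1.0,
    "laplace": math.exp(-DX),
    "logistic": math.exp(-DX),
}
```

A finite x only approximates the limit; for the log-normal and t-distributions the convergence to 1 is slow, which is why a large x is used.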

 

We will show now that every function f in the function class A, defined by the conditions (A1), (A2), (A3) above, guarantees the existence of the AALs.

 

2.3.6 Remark

Let J be an interval in IR containing +∞ and / or -∞. Let f: J → IR be a pdd with (A1), (A2), (A3). This implies for all δx in IR the unambiguous existence of the AAL A±[f](δx) in IR≥0 U {+∞}.

Remark 2.3.6 is proven in Appendix A.

 

3. Large Shifts

The following proposition investigates the case of large likelihood shifts, which are caused e.g. by large bias. It provides asymptotic formulae for the probability PΔt(t ∈ [a,b] | t ∈ [c,d]) in case Δt → ±∞.

 

3.1 Proposition

Let J be an open interval in the real numbers IR with +∞ in J or -∞ in J. Let further f: J → IR be a pdd which fulfills (A1), (A2), (A3), let [c,d] be a nonempty interval in the real numbers IR and [a,b] a partial interval of [c,d]. For all Δt in IR let fΔt(t) := f(t + Δt) and let PΔt(t ∈ [a,b] | t ∈ [c,d]) be the conditional probability of t in [a,b] under the condition t in [c,d] with respect to the pdd fΔt. Let λ± be the factor in the exponent of the AAL A±[f](δt) from (10). The following expressions can be derived for the asymptotic probability limΔt→±∞ PΔt(t ∈ [a,b] | t ∈ [c,d]):

 

Case (1):

One of the following alternative conditions (15) shall be fulfilled:
$$\left\{ \begin{matrix} A_{+}[f](d - c) \neq 1 \\ A_{-}[f](d - c) \neq 1 \end{matrix} \right\} \tag{15}$$

 

Case (1a):

If, in addition to (15), the matching alternative of (16) holds:
$$\left\{ \begin{matrix} \forall \delta t > 0: \ A_{+}[f](\delta t) \neq 0 \\ \forall \delta t < 0: \ A_{-}[f](\delta t) \neq 0 \end{matrix} \right\} \tag{16}$$

 

we obtain:
$$\lim_{\Delta t \to \pm\infty} P_{\Delta t}(t \in [a,b] \mid t \in [c,d]) = \frac{\exp(-\lambda_{\pm} b) - \exp(-\lambda_{\pm} a)}{\exp(-\lambda_{\pm} d) - \exp(-\lambda_{\pm} c)} \tag{17}$$

 

Case (1b):

If instead, in addition to (15), the matching alternative of (18) holds:
$$\left\{ \begin{matrix} \forall \delta t > 0: \ A_{+}[f](\delta t) = 0 \\ \forall \delta t < 0: \ A_{-}[f](\delta t) = 0 \end{matrix} \right\} \tag{18}$$

 

The following expression represents our probability of interest:
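$$\lim_{\Delta t \to \pm\infty} P_{\Delta t}(t \in [a,b] \mid t \in [c,d]) = \begin{cases} 0 & \text{if } (\Delta t \to \pm\infty) \wedge (b \neq d) \wedge (a \neq c) \\ 1 & \text{if } [(\Delta t \to +\infty) \wedge (a = c)] \vee [(\Delta t \to -\infty) \wedge (b = d)] \end{cases} \tag{19}$$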

 

Case (2):

If, contrary to (15), one of the following conditions (20) is valid:

 

$$\left\{ \begin{matrix} A_{+}[f](d - c) = 1 \\ A_{-}[f](d - c) = 1 \end{matrix} \right\} \tag{20}$$

 

the conditional probability of the true value of t lying in a partial interval [a,b] of the greater interval [c,d] under the additional condition of the true value of t lying in [c,d] can now be rewritten as:

 

$$\lim_{\Delta t \to \pm\infty} P_{\Delta t}(t \in [a,b] \mid t \in [c,d]) = \frac{b - a}{d - c} \tag{21}$$

 

Proposition 3.1 is proven in appendix B.
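The three cases of proposition 3.1 can be illustrated numerically for Δt → +∞ with one representative density each: the Laplace density for case (1a) with λ+ = 1, the standard Gaussian for case (1b), and the t-distribution with n = 5 for case (2). This is a sketch under assumed interval limits and finite shifts, not a proof; the function name and all numerical values are hypothetical, and the midpoint rule replaces the exact integrals.

```python
import math

def conditional_probability(log_pdf, a, b, c, d, delta_t, steps=20_000):
    """P_dt(t in [a,b] | t in [c,d]) for the shifted density f(t + delta_t)."""
    ref = log_pdf(c + delta_t)  # rescaling avoids underflow and cancels in the quotient

    def mass(lo, hi):
        h = (hi - lo) / steps
        return h * sum(
            math.exp(log_pdf(lo + (i + 0.5) * h + delta_t) - ref) for i in range(steps)
        )

    return mass(a, b) / mass(c, d)

a, b, c, d = 1.0, 2.0, 0.0, 3.0

# Case (1a): Laplace tail, lambda_+ = 1, compared with formula (17).
p_laplace = conditional_probability(lambda x: -abs(x), a, b, c, d, delta_t=50.0)
p_formula_17 = (math.exp(-b) - math.exp(-a)) / (math.exp(-d) - math.exp(-c))

# Case (1b): Gaussian tail, formula (19): 0 for an inner interval, 1 if a = c.
p_normal_inner = conditional_probability(lambda x: -0.5 * x * x, a, b, c, d, delta_t=50.0)
p_normal_edge = conditional_probability(lambda x: -0.5 * x * x, c, 1.0, c, d, delta_t=50.0)

# Case (2): t-distribution tail (n = 5), formula (21): relative interval length.
p_student = conditional_probability(
    lambda x: -3.0 * math.log1p(x * x / 5.0), a, b, c, d, delta_t=1.0e6
)
```

With these assumptions `p_laplace` agrees with `p_formula_17`, `p_normal_inner` is close to 0, `p_normal_edge` is close to 1, and `p_student` is close to (b - a)/(d - c) = 1/3.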

The results developed in proposition 3.1 are easily applicable to the case of estimator biases in BE with constant priors:

 

3.2 Remark

Let the presuppositions of proposition 3.1 be given for the conditional pdd f(t^ | t) of the unbiased estimator t^ of a real value t. For every real number Δt, let the shifted pdd fΔt be the pdd of an estimator t^Δt, which therefore has a bias of Δt. Let PΔt(t ∈ [a,b] | t^, t ∈ [c,d]), computed using the shifted pdd fΔt, be the conditional probability of t in [a,b] under the conditions of t in [c,d] and the estimation value t^. The values of the variable t are realized by the transformation t = t0 - s, where t0 is an arbitrarily fixed value of t and s is a matching real random variable. Applying (7) and the definition v := t^ + s leads to:

$$\begin{aligned} P_{\Delta t}(t \in [a,b] \mid \hat{t},\, t \in [c,d]) &:= \frac{\int_a^b f(\hat{t} \mid t - \Delta t)\, dt}{\int_c^d f(\hat{t} \mid t - \Delta t)\, dt} = \frac{\int_{t_0 - a}^{t_0 - b} f(\hat{t} \mid t_0 - s - \Delta t)\, ds}{\int_{t_0 - c}^{t_0 - d} f(\hat{t} \mid t_0 - s - \Delta t)\, ds} \\ &= \frac{\int_{t_0 - a}^{t_0 - b} f(\hat{t} + s \mid t_0 - \Delta t)\, ds}{\int_{t_0 - c}^{t_0 - d} f(\hat{t} + s \mid t_0 - \Delta t)\, ds} = \frac{\int_{t_0 + \hat{t} - b}^{t_0 + \hat{t} - a} f(v \mid t_0 - \Delta t)\, dv}{\int_{t_0 + \hat{t} - d}^{t_0 + \hat{t} - c} f(v \mid t_0 - \Delta t)\, dv} \\ &= \frac{\int_{t_0 + \hat{t} - b}^{t_0 + \hat{t} - a} f(v + \Delta t \mid t_0)\, dv}{\int_{t_0 + \hat{t} - d}^{t_0 + \hat{t} - c} f(v + \Delta t \mid t_0)\, dv} =: P_{\Delta t}(v \in [t_0 + \hat{t} - b,\, t_0 + \hat{t} - a] \mid v \in [t_0 + \hat{t} - d,\, t_0 + \hat{t} - c]) \end{aligned} \tag{22}$$

The orientation of the integration limits changes sign in numerator and denominator alike and therefore cancels in the quotient.

Proposition 3.1 may now be applied to the probability PΔt(t ∈ [a,b] | t^, t ∈ [c,d]), where v plays the role of t and the interval limits a, b, c, d are realized by t0 + t^ - b, t0 + t^ - a, t0 + t^ - d, t0 + t^ - c. Note that in all cases the results do not depend on t0 or on t^.

 

4 Application Example: Forensic Death Time Estimation

Death time estimation (DE) means reconstruction of the time difference tD between the real death time t and the time tM of measuring a value XM of a quantity X = X(s, θ) which depends monotonically on time s and whose value X0 = X(t) at the time of death t is known; e.g. in temperature based DE (TDE), XM is the rectal temperature TM. The parameter θ refers to any vector of measurable quantities influencing the time evolution of the quantity X (e.g. in the TDE approach of Marshall and Hoare [5] with the parameter definition of Henssge [6,7]: the rectal temperature T0 at death time, the ambient temperature TA, the body mass m, and Henssge's corrective factor cf). Reconstruction is performed by solving the following equation system for tD:

 

$$X_0 = X(t), \qquad X_M = X(t_M) = X(t + t_D) \tag{23}$$

 

The time of death estimator t^ can now easily be calculated by computing:

 

$$\hat{t} := t_M - t_D \tag{24}$$
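To make the backcalculation in (23) and (24) concrete, the sketch below solves X(tD) = XM by bisection for a monotonically decreasing cooling curve. The single-exponential curve and its parameters (T0 = 37.2 °C, TA = 18 °C, rate 0.08 per hour) are purely hypothetical stand-ins and are not the Marshall and Hoare model with Henssge's parameters; the function names and the measurement time are likewise assumptions.

```python
import math

def cooling_model(hours_since_death, t0=37.2, ambient=18.0, rate=0.08):
    """Hypothetical single-exponential cooling curve (NOT the Marshall-Hoare/Henssge model)."""
    return ambient + (t0 - ambient) * math.exp(-rate * hours_since_death)

def solve_time_difference(measured, model, lower=0.0, upper=72.0, tol=1e-9):
    """Bisection for t_D with model(t_D) = measured; model must decrease monotonically."""
    if not (model(upper) <= measured <= model(lower)):
        raise ValueError("measured value outside the bracketed range")
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if model(mid) > measured:
            lower = mid  # model still warmer than measured: more time has elapsed
        else:
            upper = mid
    return 0.5 * (lower + upper)

t_m = 18.0                        # assumed measurement time in decimal hours (6:00 p.m.)
x_m = cooling_model(7.5)          # synthetic measurement generated 7.5 h after death
t_d = solve_time_difference(x_m, cooling_model)
t_hat = t_m - t_d                 # equation (24)
```

Since the synthetic measurement was generated 7.5 h after death, the solver recovers tD ≈ 7.5 h and hence t^ ≈ 10.5, i.e. 10:30 a.m.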

 

Since DE uses real world measurements, it is prone to measurement errors of its input variables θ, tM, XM and X0 and to systematic errors of the model X(t) used. To cope with the resulting errors of the death time estimator t^, one usually represents t^ as a random variable associated with a pd P(t^) or with its pdd f(t^) respectively. Since the pd P(t^) is nearly always determined under the assumption of a fixed real value of death time t, one refers to the conditional pd P(t^ | t) of the estimator t^ under the condition of real death time t assuming a particular, though usually unknown, time value. This is the distribution most frequently stressed in application cases. The pd P(t^ | t) is also used as likelihood distribution for BE.

 

4.1 Temperature Based Death Time Determination

Choosing a special body temperature T as the quantity X leads to temperature based death time estimation (TDE). The most frequently used method of temperature based death time determination is the model based approach of Marshall and Hoare [5] with the parameter definition of Henssge [6,7]. It will be referred to here for short as the Henssge method. Relying on the central limit theorem, it is implicitly taken for granted that the death time estimator t^ of the Henssge method is associated with a Gaussian distribution as its conditional pd P(t^ | t):

 

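$$\forall [r,s] \subset \mathbb{R}: \quad P(\hat{t} \in [r,s] \mid t) = \int_r^s \frac{1}{\sqrt{2\pi V}} \exp\!\left( -\frac{(\hat{t} - t)^2}{2V} \right) d\hat{t} = N\!\left( \frac{s - t}{\sqrt{V}} \right) - N\!\left( \frac{r - t}{\sqrt{V}} \right) \tag{25}$$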

 

with the associated conditional pdd f(t^ | t):

 

$$f(\hat{t} \mid t) = \frac{1}{\sqrt{2\pi V}} \exp\!\left( -\frac{(\hat{t} - t)^2}{2V} \right) \tag{26}$$

 

Here t is the true death time value, N is the cumulative distribution function of the standard normal (Gaussian) distribution, V is the variance and [r,s] is any interval in the real numbers IR. It should be emphasized that the assumptions of (25) and (26) imply that the estimator t^ is unbiased, which means that the expected value of P(t^ | t) is the true value t:

 

$$E(\hat{t} \mid t) = t \tag{27}$$

 

Usually there is only one temperature value TM measured at a time tM. Therefore the estimated value t^ = tM - tD(TM, tM) has to be taken as the only valid estimator of the expectation in formula (25) for the likelihood. In practice the following likelihood formula is used:

 

$$f(\hat{t} \mid t) \approx \frac{1}{\sqrt{2\pi V}} \exp\!\left( -\frac{(\hat{t} - t_M + t_D(T_M, t_M))^2}{2V} \right) \tag{28}$$

 

to compute the expression (3) when applying the CPD method.

 

4.2 Influence of Biased Death Time Estimators on Bayesian Estimation

We now assume the scenario of the time interval [c,d] with the particular partial interval [a,b] on the time axis and ask for the posterior probability P(t ∈ [a,b] | t^, t ∈ [c,d]) of the true death time t lying in the partial interval [a,b] of the time interval [c,d] under the conditions that the estimator t^ takes a fixed value and that the true death time lies in [c,d]. The lower limit c of the last condition may be motivated by testimonies of witnesses who had seen the deceased alive at time c, whereas the upper limit d is the time at which the body was found. The question for P(t ∈ [a,b] | t^, t ∈ [c,d]) can be answered by a BE.

 

In cases where the conditional likelihood distribution P(t^ | t) used for BE is biased, the results are distorted in the typical way described in section 3. Integrating the pdd f(t^ | t - Δt) over the intervals [a, b] and [c, d] of the true time of death t, taking the quotient and assuming a constant prior on [c, d] yields the conditional probability PΔt(t ∈ [a,b] | t^, t ∈ [c,d]), which bears the index Δt to remind the reader of the bias of the likelihood distribution:

 

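$$P_{\Delta t}(t \in [a,b] \mid \hat{t},\, t \in [c,d]) = \frac{\int_a^b \exp\!\left( -\frac{((t - \Delta t) - \hat{t})^2}{2V} \right) dt}{\int_c^d \exp\!\left( -\frac{((t - \Delta t) - \hat{t})^2}{2V} \right) dt} \tag{29}$$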

 

Assuming a large bias Δt, we have to take into account the fact that, by example (A), we have A+[ν(E,V)](δt) = 0 for all δt > 0 and A-[ν(E,V)](δt) = 0 for all δt < 0. We apply formula (19) and obtain:

 

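$$\lim_{\Delta t \to \pm\infty} P_{\Delta t}(t \in [a,b] \mid \hat{t},\, t \in [c,d]) = \begin{cases} 0 & \text{if } (\Delta t \to \pm\infty) \wedge (b \neq d) \wedge (a \neq c) \\ 1 & \text{if } [(\Delta t \to +\infty) \wedge (a = c)] \vee [(\Delta t \to -\infty) \wedge (b = d)] \end{cases} \tag{30}$$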

 

4.3 Bayesian Estimation in Temperature Based Death Time Estimation

The following hypothetical example (E) was constructed to show the power and the risks of the CPD method [1] when applied to TDE in a court hearing of a homicide charge. Example (E) is similar to a real homicide case in which we took part as additional experts for the defense counsel.

 

Example (E)

There is non-temperature information from testimonies that the deceased was still alive at a time c = 10:00 a.m. and that his body was found at d = 4:00 p.m. the same day, which places the time of death in the interval [c,d] with a 100% probability a priori. The prime suspect has no alibi in a time interval [a,b] lying in the 100%-interval [c,d]. For a TDE approach there is a measurement consisting of the measurement time tM, the ambient temperature TA and the rectal temperature TM at tM. Backcalculation using Henssge's TDE yields a time difference tD between death and measurement, which leads to an estimated time of death t^. Now the judge asks for the probability of the real time of death lying in the interval [a,b] with respect to the backcalculation value t^. In our terminology this is the conditional probability P(t ∈ [a,b] | t^, t ∈ [c,d]). As usual, the likelihood distribution P(t^ | t) of the TDE result t^, given the fixed time of death t, is assumed to be a Gaussian with expectation E = t and variance V. Let us assume Henssge's TDE yields a 95% confidence interval radius r = 2.8 h, which results in a variance V = (r / 2)² = 1.96 h² of the likelihood distribution P(t^ | t).

 

Let us assume further that there are two alternatives for the alibi time interval [a,b]:

(A) Between a = 10:00 a.m. and b = 11:00 a.m.
(B) Between a = 10:30 a.m. and b = 11:30 a.m.

There is only one backcalculation value t^, but we present three versions (1), (2), (3) of this value, corresponding to three cases of estimator bias caused e.g. by measurement errors of the ambient temperature TA or of the rectal temperature T0 at time of death, or by nonstandard conditions which were not taken into account by choosing an adequate corrective factor cf. In each of the three alternatives, TDE backcalculation produces an estimated value t^ which lies outside the interval [c,d]:

(1) t^ = 9:00 a.m.
(2) t^ = 7:00 a.m.
(3) t^ = 5:00 a.m.

 

We compute the conditional probability P = PΔt(t ∈ [a,b] | t^, t ∈ [c,d]) for each of the six possible combinations of [a,b] and t^ without taking into account the estimator's bias Δt, which is mostly unknown in real casework. This yields the following results:

 

(A1) a = 10:00 a.m., b = 11:00 a.m.; t^ = 9:00 a.m. ⇒ P = 0.678
(A2) a = 10:00 a.m., b = 11:00 a.m.; t^ = 7:00 a.m. ⇒ P = 0.867
(A3) a = 10:00 a.m., b = 11:00 a.m.; t^ = 5:00 a.m. ⇒ P = 0.949
(B1) a = 10:30 a.m., b = 11:30 a.m.; t^ = 9:00 a.m. ⇒ P = 0.442
(B2) a = 10:30 a.m., b = 11:30 a.m.; t^ = 7:00 a.m. ⇒ P = 0.346
(B3) a = 10:30 a.m., b = 11:30 a.m.; t^ = 5:00 a.m. ⇒ P = 0.231
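The six values above can be recomputed from (29) with Δt = 0, since for a Gaussian likelihood and a constant prior both integrals reduce to differences of normal tail probabilities. The following sketch uses decimal hours (c = 10, d = 16) and V = 1.96 h² as given in example (E); the function and variable names are hypothetical.

```python
import math

def upper_tail(z):
    """Upper tail probability 1 - N(z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def interval_posterior(a, b, c, d, t_hat, variance):
    """P(t in [a,b] | t_hat, t in [c,d]) for a flat prior on [c,d] and a Gaussian likelihood."""
    sd = math.sqrt(variance)

    def mass(lo, hi):
        return upper_tail((lo - t_hat) / sd) - upper_tail((hi - t_hat) / sd)

    return mass(a, b) / mass(c, d)

C, D, V = 10.0, 16.0, 1.96                       # decimal hours and h^2 from example (E)
ALIBI = {"A": (10.0, 11.0), "B": (10.5, 11.5)}   # alternatives (A) and (B)
ESTIMATES = {1: 9.0, 2: 7.0, 3: 5.0}             # versions (1), (2), (3) of t_hat

results = {
    f"{label}{k}": interval_posterior(a, b, C, D, t_hat, V)
    for label, (a, b) in ALIBI.items()
    for k, t_hat in ESTIMATES.items()
}
for case, p in results.items():
    print(case, round(p, 3))
```

Working with upper tail probabilities via `math.erfc` keeps the quotient numerically stable even when t^ lies several standard deviations outside [c,d].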

 

It is of some interest to look at the erroneous posterior probability P = PΔt(t ∈ [a,b] | t^, t ∈ [c,d]) for different values of the bias Δt. Since the bias is unknown in this situation, we present plots of the probability P as a function of Δt for scenario (A) in figure 1 and for scenario (B) in figure 2, where we assume a ‘true value’ (i.e. an unbiased value) of t^ = 9:00 a.m. in order to be able to compute P.

 

In scenario (A) the P value falls from 1 down to 0 in an S-shaped curve, crossing the P-axis (Δt = 0) at 0.678, while the Δt values rise from -10 h up to 10 h (figure 1). The result of scenario (B) over an interval [-8 h, 8 h] of biases Δt is a peak of P rising moderately from left to right from ca. P = 0.1 to a maximum value of P = 0.45 at Δt = 1 h. The curve then takes a steeper course down to a value of ca. P = 0.005 at Δt = 8 h (figure 2).

 

The figures 1 and 2 illustrate the two types of scenarios possible for the long range biases:

 

(A) Equal lower limits a = c of the two intervals [a,b] and [c,d]:
PΔt rises to 1 as the bias Δt becomes more negative.
PΔt falls to 0 as the bias Δt becomes more positive.

 

(B) Different limits a > c and b < d of the two intervals [a,b] and [c,d]:
PΔt falls to 0 both for large positive and for large negative values of the bias Δt.
PΔt reaches a maximum for a value of the bias Δt near 0 h.
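The qualitative behaviour of the two scenarios can be reproduced by evaluating (29) over a range of biases. The sketch below assumes the values of example (E) (c = 10, d = 16, V = 1.96 h², unbiased t^ = 9) and a grid of biases from -8 h to 8 h in steps of 0.1 h; the grid and all names are illustrative choices, and the figures of the article are not reproduced here.

```python
import math

def upper_tail(z):
    """Upper tail probability 1 - N(z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def biased_posterior(a, b, c, d, t_hat, variance, bias):
    """Equation (29): the Gaussian in t is centred at t_hat + bias."""
    sd = math.sqrt(variance)
    center = t_hat + bias

    def mass(lo, hi):
        return upper_tail((lo - center) / sd) - upper_tail((hi - center) / sd)

    return mass(a, b) / mass(c, d)

biases = [0.1 * k for k in range(-80, 81)]  # -8 h ... +8 h
curve_a = [biased_posterior(10.0, 11.0, 10.0, 16.0, 9.0, 1.96, dt) for dt in biases]
curve_b = [biased_posterior(10.5, 11.5, 10.0, 16.0, 9.0, 1.96, dt) for dt in biases]
peak_b = max(range(len(biases)), key=lambda i: curve_b[i])  # index of the maximum in (B)
```

Under these assumptions `curve_a` decreases monotonically from nearly 1 to nearly 0, while `curve_b` has a single interior maximum of roughly 0.45 at a small positive bias and falls off towards both ends of the grid.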




5 Discussion

This paper presents an investigation of the influence of likelihood shifts on the results of BE. It was inspired by a work [1] of Biermann and Potente from 2011 who intended their Bayesian approach - which they called Conditional Probability Distribution (CPD) method - to calculate probabilities for time intervals in temperature based death time determination.

 

We analyze the important cases of large systematic errors in the BE input. Systematic errors which result in biases Δt can lead to major deviations of the estimated probabilities. The analysis of the large bias cases reveals that the set A of all pdds investigated (for a definition see: (A1), (A2), (A3)) is divided into three disjoint subsets according to three families of asymptotic limit distributions which are in a one to one correspondence to the three possible range sets for the AAL-values: A±[f](d-c) in {0}, {1}, ]0,1[ U ]1,+∞[.

 

In case (1b) of proposition 3.1 only the position of the interval [a, b] of interest in the large 100% interval matters: For large biases Δt → +∞ (which is equivalent to the expectation E(t^) of the estimator t^ being shifted to -∞ if the interval [c,d] is considered fixed) the conditional probability of [a, b] under the condition of the backcalculated value tD tends to the limit 1 if a = c and to the limit 0 if c < a. For a large negative bias Δt → -∞, the situation is reversed: The limit is 1 if b = d and 0 if b < d. This establishes the paradox that a huge bias can make the probability of an interval higher, the farther the estimated value t^ lies away from this interval.

 

Case (2) of proposition 3.1 shows a dependency on the relative length of the interval [a,b] in the interval [c,d] but no dependency on the position of the interval [c,d] in IR or on the position of [a,b] in the larger interval [c,d]. In case (1a) of proposition 3.1 there is a dependency on the absolute position of c, d as well as on the absolute position of a, b in IR.

 

The fact that large biases can lead to dramatic overestimation of partial intervals is a warning for practical work using Bayesian estimation approaches such as the CPD method. The probabilities calculated are useful only if the likelihood probability P(t^ | t) is unbiased.

 

We illustrated the importance of the issue by a forensic science example (E) representing a typical error induced in the results of the CPD method by a TDE bias. Example (E) was designed to demonstrate the typical sensitivity of Bayesian estimation to biases in the likelihood probability P(t^ | t) used. TDE, having a Gaussian likelihood, implies case (1b) of proposition 3.1 in CPD.

 

From an abstract point of view, the result can be interpreted as a classification of asymptotic conditional probability distributions induced on a bounded interval by the far tail of a probability distribution which can be chosen from a wide range containing many probability distributions which are important in science statistical problems.

 


 

Up to now we are not aware of any further applications of our results apart from their use in death time determination. In the latter case the approach resolved a misunderstanding rather than yielding a new statistical method. To our knowledge there is no concrete statistical application yet, but we believe that our mainly theoretical article might stimulate further statistical applications. In particular, the classification result could be used for designing tests which can differentiate between classes of probability distributions by using samples from their far tails only. This might be interesting for experimental researchers if, for experimental or financial reasons, they can access only the far tails of a distribution.


References


1. Biermann FM, Potente S (2011) The deployment of conditional probability distributions for death time estimation. Forensic Sci Int 210: 82-86.


2. Hubig M, Muggenthaler H, Mall G (2014) Conditional probability distribution (CPD) method in temperature based death time estimation: Error propagation analysis. Forensic Sci Int 238: 53-58.


3. Papoulis A (1990) Probability & Statistics. Prentice Hall, Englewood Cliffs, New Jersey.


4. Pericchi LR, Sanso B (1995) A Note on Bounded Influence in Bayesian-Analysis. Biometrika 82: 223-225.


5. Marshall TK, Hoare FE (1962) Estimating the time of death – The rectal cooling after death and its mathematical expression. J Forensic Sci 7: 56-81.


6. Henssge C (1979) Precision of estimating the time of death by mathematical expression of rectal body cooling (author's transl). Z Rechtsmed 83: 49-67.


7. Henssge C (2000) Experiences with a compound method for estimating the time since death in practical casework at the scene. I Rectal temperature time of death nomogram. Int J Legal Med 113: 320-331.