User:TLeadbe1/sandbox

Events are often triggered when a stochastic or random process first encounters a threshold. The threshold can be a barrier, boundary or specified state of a system. The amount of time required for a stochastic process, starting from some initial state, to encounter a threshold for the first time is referred to as a first hitting time or first passage time. In statistics, first-hitting-time models are a sub-class of survival models. Formally, the first hitting time of the barrier set $B$ with respect to an instance of a stochastic process is the time until the stochastic process first enters $B$ .

More colloquially, a first passage time in a stochastic system is the time taken for a state variable to reach a certain value. Understanding this quantity helps characterize the physical system under observation, and it has therefore been a topic of research in diverse fields, from economics to ecology.^[1]

The idea that a first hitting time of a stochastic process might describe the time to occurrence of an event has a long history, starting with an interest in the first passage time of Wiener diffusion processes in economics and then in physics in the early 1900s.^[2]^[3]^[4] Modeling the probability of financial ruin as a first passage time was an early application in the field of insurance.^[5] An interest in the mathematical properties of first-hitting-times and statistical models and methods for analysis of survival data appeared steadily between the middle and end of the 20th century.^[6]^[7]^[8]^[9]^[10]

Examples

A common example of a first-hitting-time model is a ruin problem, such as Gambler's ruin. In this example, an entity (often described as a gambler or an insurance company) has an amount of money which varies randomly with time, possibly with some drift. The model considers the event that the amount of money reaches 0, representing bankruptcy. It can answer questions such as the probability that this occurs within finite time, or the mean time until it occurs. In finance, the framework for a first-hitting-time problem can be applied to stochastic models for market volatility in order to help predict the frequency of market crashes.^[11]
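As a rough illustration of the ruin problem, the following Python sketch simulates a gambler placing unit bets until the fortune first hits 0. The upper stopping level `target`, the unit bet size, the fair-game default `p_win=0.5`, and all function names are assumptions made for this example only. For a fair game the standard closed forms are a ruin probability of 1 - start/target and a mean duration of start × (target - start), which the simulation can be compared against.

```python
import random

def simulate_ruin(start, target, p_win=0.5, rng=None):
    """Play unit bets until the fortune first hits 0 (ruin) or `target`.

    Returns (ruined, steps); `steps` is the first hitting time of {0, target}.
    """
    if rng is None:
        rng = random.Random()
    fortune, steps = start, 0
    while 0 < fortune < target:
        fortune += 1 if rng.random() < p_win else -1
        steps += 1
    return fortune == 0, steps

def estimate(start, target, n_runs=20000, seed=0):
    """Monte Carlo estimates of the ruin probability and mean hitting time."""
    rng = random.Random(seed)
    runs = [simulate_ruin(start, target, rng=rng) for _ in range(n_runs)]
    ruin_prob = sum(1 for ruined, _ in runs if ruined) / n_runs
    mean_time = sum(steps for _, steps in runs) / n_runs
    return ruin_prob, mean_time

ruin_prob, mean_time = estimate(start=3, target=10)
# Closed forms for a fair game, for comparison with the estimates.
exact_ruin = 1 - 3 / 10
exact_time = 3 * (10 - 3)
print(ruin_prob, exact_ruin)
print(mean_time, exact_time)
```

The estimates fluctuate from seed to seed, so agreement with the closed forms is only expected up to Monte Carlo error.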

First-hitting-time models can be applied to the expected lifetimes of patients or mechanical devices. When the process reaches an adverse threshold state for the first time, the patient dies or the device breaks down.

First passage time of a 1D Brownian Particle

One of the simplest and most ubiquitous stochastic systems is the Brownian particle in one dimension. This system describes the motion of a particle which moves stochastically in one-dimensional space, with equal probability of moving to the left or to the right. Because Brownian motion is often used as a tool to understand more complex phenomena, it is important to understand the distribution of the time the particle takes to first reach some position distant from its start location. This distribution is derived as follows.

The probability density function (PDF) for a particle in one dimension is found by solving the one-dimensional diffusion equation. Namely,

{\frac {\partial p(x,t\mid x_{0})}{\partial t}}=D{\frac {\partial ^{2}p(x,t\mid x_{0})}{\partial x^{2}}},

given the initial condition $p(x,t={0}\mid x_{0})=\delta (x-x_{0})$ ; where $x(t)$ is the position of the particle at some given time, $x_{0}$ is the tagged particle's initial position, and $D$ is the diffusion constant with the S.I. units $m^{2}s^{-1}$ (an indirect measure of the particle's speed).

The PDF solving the diffusion equation is given by

p(x,t;x_{0})={\frac {1}{\sqrt {4\pi Dt}}}\exp \left(-{\frac {(x-x_{0})^{2}}{4Dt}}\right).

This states that the probability of finding the particle at $x(t)$ is Gaussian, and the width of the Gaussian is time dependent. Using the PDF one is able to derive the average of a given function, $L$ , at time $t$ :

\langle L(t)\rangle \equiv \int _{-\infty }^{\infty }L(x,t)p(x,t)dx,

where the average is taken over all space.
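The averaging formula can be checked numerically. The sketch below, with illustrative names and parameter values that are not from the text, integrates the free Gaussian PDF with the trapezoidal rule on a truncated range and evaluates the normalization, the mean position, and the mean squared displacement, which for free diffusion should equal 1, $x_{0}$ and $2Dt$ respectively.

```python
import math

def free_pdf(x, t, x0=0.0, D=1.0):
    """Gaussian solution of the 1D diffusion equation started at x0."""
    return math.exp(-(x - x0) ** 2 / (4 * D * t)) / math.sqrt(4 * math.pi * D * t)

def average(L, t, x0=0.0, D=1.0, half_width=12.0, n=4001):
    """Trapezoidal estimate of <L(t)> = integral of L(x, t) p(x, t) dx.

    The infinite range is truncated to x0 +/- half_width standard deviations.
    """
    sigma = math.sqrt(2 * D * t)
    lo = x0 - half_width * sigma
    hi = x0 + half_width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * L(x, t) * free_pdf(x, t, x0, D)
    return total * h

D, t, x0 = 0.5, 2.0, 1.0  # illustrative values only
norm = average(lambda x, t: 1.0, t, x0, D)
mean = average(lambda x, t: x, t, x0, D)
msd = average(lambda x, t: (x - x0) ** 2, t, x0, D)
print(norm, mean, msd)
```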

The central question is: given that the Brownian particle started at point $x_{0}$ , how long does it take for the particle to first reach another point $x_{c}$ ? Since the motion of the Brownian particle is stochastic, this question cannot be answered with a single time but rather with a distribution of times. This distribution is called the First Passage Time Density (FPTD) and is the probability density that a particle first reaches the point $x_{c}$ at time $t$ and not at any earlier time. To exclude trajectories that have already reached $x_{c}$ at some earlier time, the absorbing boundary condition $p(x_{c},t)=0$ is imposed. The PDF satisfying this boundary condition is given by

p(x,t;x_{0},x_{c})={\frac {1}{\sqrt {4\pi Dt}}}\left(\exp \left(-{\frac {(x-x_{0})^{2}}{4Dt}}\right)-\exp \left(-{\frac {(x-(2x_{c}-x_{0}))^{2}}{4Dt}}\right)\right),

for $x<x_{c}$ . To calculate the FPTD, one uses the Survival probability: the probability that the particle has remained at a position $x<x_{c}$ for all times up to $t$ . It is given by

S(t)\equiv \int _{-\infty }^{x_{c}}p(x,t;x_{0},x_{c})dx=\operatorname {erf} \left({\frac {x_{c}-x_{0}}{2{\sqrt {Dt}}}}\right),
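The closed form for the survival probability can be compared with a direct numerical integration of the absorbing-boundary PDF. The following sketch assumes $x_{0}<x_{c}$ and uses illustrative parameter values. The function names and the truncation of the lower integration limit are choices made for this example.

```python
import math

def absorbed_pdf(x, t, x0, xc, D):
    """Image-method PDF with an absorbing boundary at xc (valid for x < xc)."""
    norm = 1.0 / math.sqrt(4 * math.pi * D * t)
    direct = math.exp(-(x - x0) ** 2 / (4 * D * t))
    image = math.exp(-(x - (2 * xc - x0)) ** 2 / (4 * D * t))
    return norm * (direct - image)

def survival_closed_form(t, x0, xc, D):
    """S(t) = erf((xc - x0) / (2 sqrt(D t)))."""
    return math.erf((xc - x0) / (2 * math.sqrt(D * t)))

def survival_numeric(t, x0, xc, D, n=20001, span=12.0):
    """Trapezoidal integral of absorbed_pdf over (-inf, xc], truncated below."""
    lo = x0 - span * math.sqrt(2 * D * t)
    h = (xc - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * absorbed_pdf(lo + i * h, t, x0, xc, D)
    return total * h

x0, xc, D = 0.0, 1.0, 1.0  # illustrative values only
for t in (0.1, 1.0, 10.0):
    print(t, survival_numeric(t, x0, xc, D), survival_closed_form(t, x0, xc, D))
```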

where $\operatorname {erf}$ is the error function. The relation between the Survival probability and the FPTD is as follows: the probability that a particle has reached the absorption point between times $t$ and $t+dt$ is $f(t)dt=S(t)-S(t+dt)$ . Using the first-order Taylor approximation $S(t+dt)\approx S(t)+{\frac {\partial S(t)}{\partial t}}dt$ , the definition of the FPTD follows:

f(t)=-{\frac {\partial S(t)}{\partial t}}.

By using the diffusion equation and integrating, the explicit FPTD is

f(t)\equiv {\frac {|x_{c}-x_{0}|}{\sqrt {4\pi Dt^{3}}}}\exp \left(-{\frac {(x_{c}-x_{0})^{2}}{4Dt}}\right).

The first-passage time for a Brownian particle therefore follows a Lévy distribution.
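A quick numerical consistency check of these relations is sketched below. It compares the explicit FPTD with a central finite difference of the survival probability, and with the Lévy density written in its usual parametrization with scale $c=\Delta x^{2}/(2D)$. The step size `h` and the parameter values are arbitrary choices for illustration.

```python
import math

def survival(t, dx, D):
    """Survival probability for a barrier at distance dx from the start."""
    return math.erf(dx / (2 * math.sqrt(D * t)))

def fptd(t, dx, D):
    """Explicit first passage time density."""
    return dx / math.sqrt(4 * math.pi * D * t ** 3) * math.exp(-dx ** 2 / (4 * D * t))

def fptd_from_survival(t, dx, D, h=1e-5):
    """f(t) = -dS/dt, approximated by a central finite difference."""
    return -(survival(t + h, dx, D) - survival(t - h, dx, D)) / (2 * h)

def levy_pdf(t, c):
    """Levy density with location 0 and scale c."""
    return math.sqrt(c / (2 * math.pi)) * math.exp(-c / (2 * t)) / t ** 1.5

dx, D = 1.5, 0.7  # illustrative values only
for t in (0.2, 1.0, 5.0):
    print(t, fptd(t, dx, D), fptd_from_survival(t, dx, D),
          levy_pdf(t, dx ** 2 / (2 * D)))
```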

For $t\gg {\frac {(x_{c}-x_{0})^{2}}{4D}}$ , it follows from above that

f(t)\approx {\frac {\Delta x}{\sqrt {4\pi Dt^{3}}}}\sim t^{-3/2},

where $\Delta x\equiv |x_{c}-x_{0}|$ . Thus, the probability for a Brownian particle achieving a first passage at some long time (defined in the paragraph above) becomes increasingly small, but always non-zero.

The first moment of the FPTD diverges (it is a so-called heavy-tailed distribution), so the average FPT cannot be calculated. Instead, one can calculate the typical time, the time at which the FPTD is at a maximum ( $\partial f/\partial t=0$ ), i.e.,

\tau _{\rm {ty}}={\frac {\Delta x^{2}}{6D}}.
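The location of the maximum can also be found numerically. The sketch below applies a ternary search to the FPTD, which has a single peak, and compares the result with $\Delta x^{2}/(6D)$. The search bracket and iteration count are arbitrary choices for this example.

```python
import math

def fptd(t, dx, D):
    """First passage time density for a barrier at distance dx."""
    return dx / math.sqrt(4 * math.pi * D * t ** 3) * math.exp(-dx ** 2 / (4 * D * t))

def typical_time_numeric(dx, D, iters=200):
    """Locate the maximum of the single-peaked FPTD by ternary search."""
    lo = 1e-9 * dx ** 2 / D
    hi = 10.0 * dx ** 2 / D
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if fptd(m1, dx, D) < fptd(m2, dx, D):
            lo = m1  # the peak lies to the right of m1
        else:
            hi = m2  # the peak lies to the left of m2
    return 0.5 * (lo + hi)

dx, D = 2.0, 0.5  # illustrative values only
tau_numeric = typical_time_numeric(dx, D)
tau_formula = dx ** 2 / (6 * D)
print(tau_numeric, tau_formula)
```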

More complex models extend the theory of first-passage times from diffusion in one dimension to diffusion in higher dimensions, where the particle is free to move in more directions. Moreover, first-passage times can be worked out when the allowed paths of travel lie on fractal structures.^[12]^[13]

Mean first passage time in systems obeying a Fokker-Planck equation

One important extension of the previous example is the calculation of first passage times for a broader class of equations called Fokker-Planck equations. The Fokker-Planck equation describes the evolution of the probability distribution associated with a particle undergoing Brownian motion within a potential energy landscape. A specific example is a particle moving in one dimension with potential energy $U$ and diffusion coefficient $D$ , where the Fokker-Planck equation is given by:

{\frac {\partial }{\partial t}}f(x,t)=D{\frac {\partial }{\partial x}}\exp(-U(x)/K_{B}T){\frac {\partial }{\partial x}}\exp(U(x)/K_{B}T)f(x,t)

where $f(x,t)$ is the probability of finding the particle at position $x$ at time $t$ , $T$ is the temperature, and $K_{B}$ the Boltzmann constant.

In general, a first passage time problem for a Fokker-Planck system can be formulated as:

{\frac {\partial }{\partial t}}f({\vec {x}},t)={\text{div}}({\vec {v}}({\vec {x}})f({\vec {x}},t))+{\text{div}}(\mathbf {A} ({\vec {x}})\cdot \nabla f({\vec {x}},t))={\hat {L}}f({\vec {x}},t)\quad {\text{for all }}{\vec {x}}\in R\subseteq \mathbb {R} ^{n}

with, as before, a delta-function initial condition and an absorbing boundary condition

f({\vec {x}},t=0)=\delta ({\vec {x}}-{\vec {x}}_{0})\quad f({\vec {x}},t)=0\quad {\text{for all }}{\vec {x}}\in \partial R

where ${\vec {x}}$ are the coordinates of the particle in question, ${\vec {v}}$ is a vector-valued function of the coordinates corresponding to advection within the system, $\mathbf {A}$ is a diffusion matrix depending on ${\vec {x}}$ , ${\text{div}}$ is the divergence operator, $\nabla$ is the gradient, and $R$ is the region of space, with boundary $\partial R$ in $\mathbb {R} ^{n}$ , from which the particle is escaping. The condition that $f({\vec {x}},t)=0$ on the boundary $\partial R$ is the absorbing boundary condition, which is imposed to exclude any probability that the particle leaves the region at one time and re-enters at a later time. One should also note that the normalization condition

\int _{R}f({\vec {x}},t)dx_{1}\ldots dx_{n}=1

does not hold in general. Instead,

\int _{R}f({\vec {x}},t)dx_{1}\ldots dx_{n}\leq 1

holds, since the particle is allowed to diffuse out of the region $R$ .^[14]^[15]

Equation for mean first passage time

The Fokker-Planck equation is a second-order linear parabolic partial differential equation, which in most cases cannot be solved in closed form. However, a formal operator solution for the equation can be given by

f({\vec {x}},t)=\sum _{m=0}^{\infty }{\frac {t^{m}{\hat {L}}^{m}}{m!}}f({\vec {x}},t=0)=\exp(t{\hat {L}})f({\vec {x}},t=0)

and for some systems the mean first passage time can be computed without knowledge of the general solution. To find an equation for the mean first passage time, we begin by defining the survival probability as in the previous section:

S(t)=\int _{R}f({\vec {x}},t)dx_{1}\ldots dx_{n}.

Since $S(t)$ is the probability that the particle remains within $R$ until time $t$ , $S(t)-S(t+dt)$ is the probability that the particle was within $R$ at time $t$ and exits by time $t+dt$ . This means that the distribution of exit times, $\rho (t)$ , is given by

\rho (t)dt=S(t)-S(t+dt),\quad \rho (t)=-{\frac {d}{dt}}S(t).

The mean first passage time is the first moment in time with respect to $\rho (t)$ :

T_{\text{mfpt}}=\int _{0}^{\infty }t\rho (t)dt

which implicitly depends on ${\vec {x}}_{0}$ through the initial condition $f({\vec {x}},t=0)=\delta ({\vec {x}}-{\vec {x}}_{0})$ . Plugging in the formula for $\rho (t)$ and integrating by parts gives

T_{\text{mfpt}}=-\int _{0}^{\infty }t{\frac {d}{dt}}S(t)dt=\int _{0}^{\infty }S(t)dt

where it has been assumed that $\lim _{t\to \infty }tS(t)=0$ . Now, plugging in the definition of $S(t)$ as well as the formal solution to the Fokker-Planck equation gives

T_{\text{mfpt}}=\int _{0}^{\infty }\int _{R}f({\vec {x}},t)dx_{1}\ldots dx_{n}dt=\int _{0}^{\infty }\int _{R}\exp(t{\hat {L}})\delta ({\vec {x}}-{\vec {x}}_{0})dx_{1}\ldots dx_{n}dt.

Instead of considering the operator $\exp(t{\hat {L}})$ acting on $\delta ({\vec {x}}-{\vec {x}}_{0})$ , we can consider its adjoint operator $\exp(t{\hat {L}}^{*})$ acting on the constant function equal to $1$ :

T_{\text{mfpt}}=\int _{0}^{\infty }\int _{R}\delta ({\vec {x}}-{\vec {x}}_{0})\exp(t{\hat {L}}^{*})(1)dx_{1}\ldots dx_{n}dt=\int _{0}^{\infty }\exp(t{\hat {L}}^{*})(1)dt.

Finally, apply the adjoint operator ${\hat {L}}^{*}$ to $T_{\text{mfpt}}$ to get

{\hat {L}}^{*}T_{\text{mfpt}}=\int _{0}^{\infty }{\hat {L}}^{*}\exp(t{\hat {L}}^{*})(1)dt=\int _{0}^{\infty }{\frac {d}{dt}}\exp(t{\hat {L}}^{*})(1)dt=-1

after applying the fundamental theorem of calculus and noting that ${\hat {L}}^{*}$ is negative definite due to the absorbing boundary, so that $\exp(t{\hat {L}}^{*})(1)\to 0$ as $t\to \infty$ and only the contribution from the lower limit $t=0$ remains. Thus, the mean first passage time satisfies the equation:

{\hat {L}}^{*}T_{\text{mfpt}}=-1

and

T_{\text{mfpt}}=0\quad {\text{for }}{\vec {x}}_{0}\in \partial R

which means that a particle placed on the boundary of the region leaves the region immediately.
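To make the boundary value problem concrete, the sketch below solves it on a grid for the simplest case, assumed here purely for illustration: one dimension, no advection, and a constant diffusion coefficient $D$ , so that ${\hat {L}}^{*}=D\,d^{2}/dx^{2}$ on an interval $(a,b)$ with absorbing ends. In that case the exact solution is $(x-a)(b-x)/(2D)$ , which the finite-difference result can be compared against. The function name and the tridiagonal (Thomas) solver are choices made for this example.

```python
def solve_mfpt_pure_diffusion(a, b, D, n=201):
    """Solve D * T''(x) = -1 with T(a) = T(b) = 0 on a uniform grid."""
    h = (b - a) / (n - 1)
    m = n - 2  # number of interior unknowns
    # Discretization: (D / h^2) * (T[i-1] - 2 T[i] + T[i+1]) = -1
    off = D / h ** 2
    diag = [-2.0 * off] * m
    rhs = [-1.0] * m
    # Thomas algorithm, forward elimination of the sub-diagonal.
    for i in range(1, m):
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    interior = [0.0] * m
    interior[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        interior[i] = (rhs[i] - off * interior[i + 1]) / diag[i]
    xs = [a + i * h for i in range(n)]
    return xs, [0.0] + interior + [0.0]

a, b, D = 0.0, 2.0, 0.5  # illustrative values only
xs, T = solve_mfpt_pure_diffusion(a, b, D)
exact = [(x - a) * (b - x) / (2 * D) for x in xs]
print(max(abs(u - v) for u, v in zip(T, exact)))
```

The mean first passage time is largest at the midpoint of the interval and vanishes at both ends, consistent with the boundary condition above.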

Application to a diffusing particle in a potential energy landscape

Returning to the example mentioned in the previous subsection, we consider the mean first passage time of a particle diffusing in one dimension in a potential energy landscape within the interval $(a,b)$ . The equation governing the probability distribution associated with the particle is

{\frac {\partial }{\partial t}}f(x,t)=D{\frac {\partial }{\partial x}}\exp(-U(x)/K_{B}T){\frac {\partial }{\partial x}}\exp(U(x)/K_{B}T)f(x,t)

and its adjoint operator is

{\hat {L}}^{*}T(x)=D\exp(U(x)/K_{B}T){\frac {\partial }{\partial x}}\exp(-U(x)/K_{B}T){\frac {\partial }{\partial x}}T(x)

assuming the boundary terms produced by integration by parts vanish at $a$ and $b$ . The equation for the mean first passage time is then

D\exp(U(x)/K_{B}T){\frac {\partial }{\partial x}}\exp(-U(x)/K_{B}T){\frac {\partial }{\partial x}}T_{\text{mfpt}}(x)=-1.

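Integrating twice over $x$ , with an absorbing boundary at $b$ ( $T_{\text{mfpt}}(b)=0$ ) and a reflecting boundary at $a$ ( $\partial T_{\text{mfpt}}/\partial x=0$ at $x=a$ ), gives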

T_{\text{mfpt}}(x)={\frac {1}{D}}\int _{x}^{b}\exp(U(y)/K_{B}T)\int _{a}^{y}\exp(-U(z)/K_{B}T)dzdy.
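The double integral for $T_{\text{mfpt}}(x)$ is straightforward to evaluate by quadrature. The sketch below uses the trapezoidal rule on a uniform grid and returns the whole profile at once. For $U=0$ the formula reduces to $((b-a)^{2}-(x-a)^{2})/(2D)$ , which gives a simple check. The double-well potential $U(x)=(x^{2}-1)^{2}$ , the interval, and the value of `kT` (standing for $K_{B}T$ ) are hypothetical choices for illustration and are not taken from the computational experiment described in the text.

```python
import math

def mfpt_profile(U, a, b, D=1.0, kT=1.0, n=2001):
    """Evaluate T(x) = (1/D) int_x^b e^{U(y)/kT} int_a^y e^{-U(z)/kT} dz dy.

    Returns the grid and T on the grid, using trapezoidal quadrature.
    """
    h = (b - a) / (n - 1)
    xs = [a + i * h for i in range(n)]
    boltz = [math.exp(-U(x) / kT) for x in xs]
    # inner[i] approximates the integral of exp(-U/kT) from a to xs[i].
    inner = [0.0] * n
    for i in range(1, n):
        inner[i] = inner[i - 1] + 0.5 * h * (boltz[i - 1] + boltz[i])
    # Outer integrand exp(U(y)/kT) * inner(y) on the grid.
    outer = [inner[i] / boltz[i] for i in range(n)]
    # T[i] approximates (1/D) times the integral of `outer` from xs[i] to b.
    T = [0.0] * n
    for i in range(n - 2, -1, -1):
        T[i] = T[i + 1] + 0.5 * h * (outer[i] + outer[i + 1]) / D
    return xs, T

# Check against the flat-potential closed form.
a, b, D = 0.0, 2.0, 0.5
xs_flat, T_flat = mfpt_profile(lambda x: 0.0, a, b, D=D)
flat_exact = [((b - a) ** 2 - (x - a) ** 2) / (2 * D) for x in xs_flat]

# Hypothetical double well: start in the left well at x = -1, exit at x = 0.
xs_well, T_well = mfpt_profile(lambda x: (x * x - 1.0) ** 2, -2.0, 0.0, kT=0.5)
print(xs_well[1000], T_well[1000])
```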

The accompanying plots depict data from a computational experiment of a particle diffusing in a double-well potential in a viscous system. The upper plot is a sample of points from a single particle trajectory. The simulation begins with the particle at the bottom of the left well and ends once the particle has reached the top of the barrier. The time it takes the particle to reach the top is recorded as the first passage time for that simulation. The lower plot shows how the first passage time distribution fills in as more simulations are run. The simulated mean first passage time is plotted together with the analytical value computed numerically from the equation above.

First-hitting-time applications in many families of stochastic processes

First hitting times are central features of many families of stochastic processes, including Poisson processes, Wiener processes, gamma processes, and Markov chains, to name but a few. The state of the stochastic process may represent, for example, the strength of a physical system, the health of an individual, or the financial condition of a business firm. The system, individual or firm fails or experiences some other critical endpoint when the process reaches a threshold state for the first time. The critical event may be an adverse event (such as equipment failure, congestive heart failure, or lung cancer) or a positive event (such as recovery from illness, discharge from hospital stay, childbirth, or return to work after traumatic injury). The lapse of time until that critical event occurs is usually interpreted generically as a ‘survival time’. In some applications, the threshold is a set of multiple states, so one considers competing first hitting times for reaching the first threshold in the set, as is the case when considering competing causes of failure in equipment or death for a patient.

Threshold regression: first-hitting-time regression

Practical applications of theoretical models for first hitting times often involve regression structures. When first-hitting-time models are equipped with regression structures that accommodate covariate data, the resulting regression structure is called threshold regression.^[16] The threshold state, parameters of the process, and even time scale may depend on corresponding covariates. Threshold regression as applied to time-to-event data has emerged since the start of this century and has grown rapidly, as described in a 2006 survey article^[17] and its references. Connections between threshold regression models derived from first hitting times and the ubiquitous Cox proportional hazards regression model^[18] were investigated by Lee and Whitmore.^[19] Applications of threshold regression range over many fields, including the physical and natural sciences, engineering, social sciences, economics and business, agriculture, health and medicine.^[20]^[21]^[22]^[23]^[24]

Latent vs observable

In many real-world applications, a first-hitting-time (FHT) model has three underlying components: (1) a parent stochastic process $\{X(t)\}\,\,$ , which might be latent, (2) a threshold (or barrier) and (3) a time scale. The first hitting time is defined as the time when the stochastic process first reaches the threshold. It is important to distinguish whether the sample path of the parent process is latent (i.e., unobservable) or observable, and this distinction is a characteristic of the FHT model. Latent processes are by far the most common. For example, a Wiener process $\{X(t),t\geq 0\,\}\,$ can serve as the parent stochastic process. Such a Wiener process can be defined with the mean parameter ${\mu }\,\,$ , the variance parameter ${\sigma ^{2}}\,\,$ , and the initial value $X(0)=x_{0}>0\,$ .
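As an illustration of the Wiener-process example, the sketch below assumes the threshold is at zero and the drift is negative ( $\mu <0$ ), so that the process started at $x_{0}>0$ reaches the threshold with probability one. Neither assumption is stated in the text above. Under them the first hitting time follows the well-known inverse Gaussian distribution with mean $x_{0}/|\mu |$ . The code checks the density numerically and also simulates discretized sample paths, which are only approximate: a finite time step `dt` tends to overestimate hitting times slightly. All names and parameter values are hypothetical.

```python
import math
import random

def fht_density(t, x0, mu, sigma):
    """Inverse Gaussian density of the first time x0 + mu*t + sigma*W(t) hits 0."""
    if t <= 0:
        return 0.0
    return x0 / math.sqrt(2 * math.pi * sigma ** 2 * t ** 3) * math.exp(
        -(x0 + mu * t) ** 2 / (2 * sigma ** 2 * t))

def density_moments(x0, mu, sigma, t_max=80.0, n=80001):
    """Trapezoidal estimates of the total mass and mean of the FHT density."""
    h = t_max / (n - 1)
    mass = mean = 0.0
    for i in range(n):
        t = i * h
        w = 0.5 if i in (0, n - 1) else 1.0
        f = fht_density(t, x0, mu, sigma)
        mass += w * f * h
        mean += w * t * f * h
    return mass, mean

def simulate_fht(x0, mu, sigma, dt=0.01, rng=None):
    """Euler-discretized sample path; returns the first time it is <= 0."""
    if rng is None:
        rng = random.Random()
    x, t = x0, 0.0
    step_sd = sigma * math.sqrt(dt)
    while x > 0:
        x += mu * dt + rng.gauss(0.0, step_sd)
        t += dt
    return t

x0, mu, sigma = 2.0, -1.0, 1.0  # illustrative values only
mass, mean = density_moments(x0, mu, sigma)
rng = random.Random(1)
sim_mean = sum(simulate_fht(x0, mu, sigma, rng=rng) for _ in range(2000)) / 2000
print(mass, mean, sim_mean, x0 / abs(mu))
```

In a latent-process setting only the hitting times, not the paths generated inside `simulate_fht`, would be available as data.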

Operational or analytical time scale

The time scale of the stochastic process may be calendar or clock time or some more operational measure of time progression, such as mileage of a car, accumulated wear and tear on a machine component or accumulated exposure to toxic fumes. In many applications, the stochastic process describing the system state is latent or unobservable and its properties must be inferred indirectly from censored time-to-event data and/or readings taken over time on correlated processes, such as marker processes. The word ‘regression’ in threshold regression refers to first-hitting-time models in which one or more regression structures are inserted into the model in order to connect model parameters to explanatory variables or covariates. The parameters given regression structures may be parameters of the stochastic process, the threshold state and/or the time scale itself.

References

[1] Redner 2001
[2] Bachelier 1900
[3] Von E 1900
[4] Smoluchowski 1915
[5] Lundberg 1903
[6] Tweedie 1945
[7] Tweedie 1957-1
[8] Tweedie 1957-2
[9] Whitmore 1970
[10] Lancaster 1972
[11] Masoliver 2009
[12] Broeck 1989
[13] Yuste 1995
[14] Hänggi 1990
[15] Zwanzig 2001
[16] Lee 2006
[17] Lee 2006
[18] Cox 1972
[19] Lee 2010
[20] Aaron 2010
[21] Chambaz 2014
[22] Aaron 2015
[23] He 2015
[24] Hou 2016

Whitmore, G. A. (1986). "First passage time models for duration data regression structures and competing risks". The Statistician. 35: 207–219. doi:10.2307/2987525. JSTOR 2987525.
Whitmore, G. A. (1995). "Estimating degradation by a Wiener diffusion process subject to measurement error". Lifetime Data Analysis. 1 (3): 307–319. doi:10.1007/BF00985762.
Whitmore, G. A.; Crowder, M. J.; Lawless, J. F. (1998). "Failure inference from a marker process based on a bivariate Wiener model". Lifetime Data Analysis. 4 (3): 229–251. doi:10.1023/A:1009617814586.
Redner, S. (2001). A Guide to First-Passage Processes. Cambridge University Press. ISBN 0-521-65248-0.
Lee, M.-L. T.; Whitmore, G. A. (2006). "Threshold regression for survival analysis: Modeling event times by a stochastic process". Statistical Science. 21 (4): 501–513. arXiv:0708.0346. doi:10.1214/088342306000000330.
Bachelier, L. (1900). "Théorie de la Spéculation". Annales Scientifiques de l'École Normale Supérieure. 3 (17): 21–86.
Schrödinger, E. (1915). "Zur Theorie der Fall- und Steigversuche an Teilchen mit Brownscher Bewegung". Physikalische Zeitschrift. 16: 289–295.
Smoluchowski, M. V. (1915). "Notiz über die Berechnung der Brownschen Molekularbewegung bei der Ehrenhaft-Millikanschen Versuchsanordnung". Physikalische Zeitschrift. 16: 318–321.
Lundberg, F. (1903). Approximerad Framställning av Sannolikhetsfunktionen, Återförsäkring av Kollektivrisker. Almqvist & Wiksell, Uppsala.
Tweedie, M. C. K. (1945). "Inverse statistical variates". Nature. 155: 453. Bibcode:1945Natur.155..453T. doi:10.1038/155453a0.
Tweedie, M. C. K. (1957). "Statistical properties of inverse Gaussian distributions – I". Annals of Mathematical Statistics. 28: 362–377. doi:10.1214/aoms/1177706964.
Tweedie, M. C. K. (1957). "Statistical properties of inverse Gaussian distributions – II". Annals of Mathematical Statistics. 28: 696–705. doi:10.1214/aoms/1177706881.
Whitmore, G. A.; Neufeldt, A. H. (1970). "An application of statistical models in mental health research". Bull. Math. Biophys. 32: 563–579.
Lancaster, T. (1972). "A stochastic model for the duration of a strike". J. Roy. Statist. Soc. Ser. A. 135: 257–271.
Cox, D. R. (1972). "Regression models and life tables (with discussion)". J R Stat Soc Ser B. 34 (2): 187–220.
Lee, M.-L. T.; Whitmore, G. A. (2010). "Threshold Proportional hazards and threshold regression: their theoretical and practical connections". Lifetime Data Analysis. 16: 196–214. doi:10.1007/s10985-009-9138-0. PMC 6447409. PMID 19960249.
Aaron, S. D.; Ramsay, T.; Vandemheen, K.; Whitmore, G. A. (2010). "A threshold regression model for recurrent exacerbations in chronic obstructive pulmonary disease". Journal of Clinical Epidemiology. 63: 1324–1331. doi:10.1016/j.jclinepi.2010.05.007.
Chambaz, A.; Choudat, D.; Huber, C.; Pairon, J.; van der Laan, M. J. (2014). "Analysis of occupational exposure to asbestos based on threshold regression modeling of case-control data". Biostatistics. 15: 327–340. doi:10.1093/biostatistics/kxt042.
Aaron, S. D.; Stephenson, A. L.; Cameron, D. W.; Whitmore, G. A. (2015). "A statistical model to predict one-year risk of death in patients with cystic fibrosis". Journal of Clinical Epidemiology. 68: 1336–1345. doi:10.1016/j.jclinepi.2014.12.010.
He, X.; Whitmore, G. A.; Loo, G. Y.; Hochberg, M. C.; Lee, M.-L. T. (2015). "A model for time to fracture with a shock stream superimposed on progressive degradation: the Study of Osteoporotic Fractures". Statistics in Medicine. 34: 652–663. doi:10.1002/sim.6356. PMC 4314426. PMID 25376757.
Hou, W.-H.; Chuang, H.-Y.; Lee, M.-L. T. (2016). "A threshold regression model to predict return to work after traumatic limb injury". Injury. 47: 483–489. doi:10.1016/j.injury.2015.11.032.
Hänggi, Peter; Talkner, Peter; Borkovec, Michael (1990). "Reaction-rate theory: fifty years after Kramers". Reviews of Modern Physics. 62 (2): 290–299.
Zwanzig, Robert (2001). Nonequilibrium Statistical Mechanics. Oxford University Press.
Masoliver, Jaume; Perelló, Josep (2009). "First-passage and risk evaluation under stochastic volatility". Physical Review E. 80 (1). doi:10.1103/PhysRevE.80.016108.
Van den Broeck, C. (1989). "Renormalization of first-passage times for random walks on deterministic fractals". Physical Review A. 40: 7334–7345. doi:10.1103/PhysRevA.40.7334.
Yuste, S. B. (1995). "First-passage time, survival probability and propagator on deterministic fractals". Journal of Physics A:Mathematical and General. 28 (24): 7027–7038. doi:10.1088/0305-4470/28/24/004.
