Talk:Baum–Welch algorithm
This article has not yet been rated on Wikipedia's content assessment scale. It is of interest to the following WikiProjects:
Template:WikiProject Computational Biology
Please add the quality rating to the {{WikiProject banner shell}} template instead of this project banner. See WP:PIQA for details.
Suggestion
Could Template:HMM example be put to use on this page? (Granted we might have to modify it a bit to make it more general) -- Kowey 14:16, 14 Nov 2004 (UTC)
Content Discussion
There is a document written by Welch in the IEEE Information Theory Society Newsletter, December 2003. See http://www.itsoc.org/publications/nltr/it_dec_03final.pdf
I scrutinized the description, and I also think it is wrong.
We have: gamma_t(i), the probability that the HMM is in state s_i at time t given the observations O and the parameters Lambda. (1)
gamma_t(i) = P(s_i = q_t | O, Lambda) (2)
gamma_t(i) = alpha_t(i) * beta_t(i) / P(O | Lambda) (3)
(3) is the forward–backward probability: the probability of being in state s_i at time t given the observations. This is the role the forward–backward algorithm plays in Baum–Welch.
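The gamma computation in (1)–(3) can be sketched numerically. The transition matrix, emission matrix, and observation sequence below are assumed toy values, not anything from the discussion above:

```python
import numpy as np

A = np.array([[0.7, 0.3],   # assumed transition probabilities a_ij
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],   # assumed emission probabilities b_j(o)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])   # assumed initial state distribution
obs = [0, 1, 1, 0]          # assumed observation sequence O

T, N = len(obs), len(pi)

# Forward pass: alpha[t, i] = P(o_1..o_t, q_t = s_i | Lambda)
alpha = np.zeros((T, N))
alpha[0] = pi * B[:, obs[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]

# Backward pass: beta[t, i] = P(o_{t+1}..o_T | q_t = s_i, Lambda)
beta = np.ones((T, N))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

p_obs = alpha[-1].sum()          # P(O | Lambda)
gamma = alpha * beta / p_obs     # equation (3); each row sums to 1
```

Since gamma_t is a distribution over states, `gamma.sum(axis=1)` is 1 at every t, which is a handy sanity check on the two passes.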
Now, let's look at the probability of going through a TRANSITION from s_i at time t to s_j at time t+1. Let's call it chi.
chi_t(i,j) is the probability of being in state s_i at time t and in state s_j at time t+1, given the observations and the parameters. (4)
chi_t(i,j) = P(s_i = q_t, s_j = q_{t+1} | O, Lambda) (5)
chi_t(i,j) = alpha_t(i) * a_ij * b_j(o_{t+1}) * beta_{t+1}(j) / P(O | Lambda) (6)
SIGMA_{t=1}^{T-1} chi_t(i,j) is the expected number of transitions from s_i to s_j.
SIGMA_{t=1}^{T-1} gamma_t(i) is the expected number of transitions from s_i.
Updated a_ij = [ SIGMA_{t=1}^{T-1} chi_t(i,j) ] / [ SIGMA_{t=1}^{T-1} gamma_t(i) ]
It is the last step I am not getting. Certainly the numerator is not constant, as the current article claims, nor is it equal to the probability of the entire string. I cannot see why the updated a_ij should be greater than the old value.
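The numerator indeed varies with (i, j): it sums chi_t(i,j) over t for each fixed pair. And since sum_j chi_t(i,j) = gamma_t(i), each row of the updated transition matrix sums to 1, so the updated entries cannot all exceed the old ones. A numerical sketch with assumed toy parameters (nothing here is from the article itself):

```python
import numpy as np

A = np.array([[0.7, 0.3], [0.4, 0.6]])   # assumed a_ij
B = np.array([[0.9, 0.1], [0.2, 0.8]])   # assumed b_j(o)
pi = np.array([0.5, 0.5])                # assumed initial distribution
obs = [0, 1, 1, 0]                       # assumed observations
T, N = len(obs), len(pi)

# Forward and backward passes, as in the forward-backward algorithm.
alpha = np.zeros((T, N))
alpha[0] = pi * B[:, obs[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
beta = np.ones((T, N))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
p_obs = alpha[-1].sum()

gamma = alpha * beta / p_obs   # equation (3)

# Equation (6): chi[t, i, j] = alpha_t(i) a_ij b_j(o_{t+1}) beta_{t+1}(j) / P(O|Lambda)
chi = np.zeros((T - 1, N, N))
for t in range(T - 1):
    chi[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :] / p_obs

# Re-estimation: expected i->j transitions over expected transitions out of i.
A_new = chi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
```

Here `chi.sum(axis=0)` (the numerator) is a full N-by-N matrix, and every row of `A_new` sums to 1, which shows why the update cannot increase every entry at once.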
References
Cutting, D., Kupiec, J., Pedersen, J., and Sibun, P., "A Practical Part-of-Speech Tagger", Xerox Palo Alto Research Center.
ChengXiang Zhai, A Brief Note On The Hidden Markov Models, Lecture Notes, Dept. of CS, University of Illinois at Urbana-Champaign.
http://ocw.mit.edu/NR/rdonlyres/Electrical-Engineering-and-Computer-Science/6-345Automatic-Speech-RecognitionSpring2003/4C063D4A-3B8B-4F05-B475-85AAD0D66586/0/lecture10.pdf
Koos van der Wilt 00:52, 13 May 2007 (UTC)
- Were you discussing the article or the Welch presentation? Which one do you think is wrong? I believe the latter; please confirm. I've not read your contribution, but I remember several headaches when trying to make sense of this algorithm in Pattern Classification (2nd ed.) by Richard Duda et al. (ISBN 0 471 05669 3); I was convinced the description was wrong, and I remember thinking it was difficult to fix.
--Blaisorblade (talk) 03:09, 24 June 2008 (UTC)
Confusion over Forward-backward algorithm
I think the link is confusing. The current text of Forward-backward algorithm seems to describe a different algorithm, one that uses a forward and a backward pass but combines them for other purposes (at least, so it appears to me). The Baum–Welch algorithm also performs such forward and backward passes, which is why it is itself called the forward–backward algorithm in some places (for instance in Pattern Classification (2nd ed.) by Richard Duda et al. (ISBN 0 471 05669 3)). --Blaisorblade (talk) 03:09, 24 June 2008 (UTC)
Ambiguous indices
In the formula for the epsilon variables, i and j are both free indices and are also used as summation indices. 2001:6B0:1:1DF0:C185:CDB5:7AF7:29BA (talk) 17:58, 19 October 2013 (UTC)
Confusing chicken and egg example
There is one chicken, and we observe each day whether it lays an egg. Why are the observations then like NN, NE, EE, EN; that is, why are there two observations each day? I don't get it. HenningThielemann (talk) 17:33, 25 January 2015 (UTC)
Working out the probability all day works up an appetite, wherefore the chicken is fried at suppertime. (Probability of laying an egg drops to zero.)
Convergence?
The description ends with "These steps are now repeated iteratively until a desired level of convergence." What steps? Although it would be nice to specify how convergence is measured/detected, I'll leave that aside for now. Isn't the algorithm meant to be evaluated for T=1, then T=2, T=3, and so on until a desired level of convergence is attained? If true, the text should reflect that (it currently does not, since the "steps" are not linked with T). Urhixidur (talk) 15:54, 24 October 2017 (UTC)
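For what it's worth, in typical implementations (an assumption about common practice, not a claim about what the article intends) the "steps" are the E-step (forward-backward, gamma, chi) and the M-step (re-estimating pi, A, B), repeated on the same fixed-length observation sequence until the log-likelihood log P(O|Lambda) stops improving; T does not grow. A sketch with assumed toy parameters:

```python
import numpy as np

A = np.array([[0.7, 0.3], [0.4, 0.6]])   # assumed initial guesses
B = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
obs = np.array([0, 1, 1, 0, 0, 1])       # assumed fixed observation sequence
T, N = len(obs), len(pi)

def e_step(pi, A, B):
    """Forward-backward pass; returns gamma, chi, and log P(O | Lambda)."""
    alpha = np.zeros((T, N)); alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    p = alpha[-1].sum()
    gamma = alpha * beta / p
    chi = np.array([alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :] / p
                    for t in range(T - 1)])
    return gamma, chi, np.log(p)

prev_ll = -np.inf
for _ in range(200):
    gamma, chi, ll = e_step(pi, A, B)
    if ll - prev_ll < 1e-8:      # one possible "desired level of convergence"
        break
    prev_ll = ll
    # M-step: re-estimate all parameters from the expected counts.
    pi = gamma[0]
    A = chi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    for k in range(B.shape[1]):
        B[:, k] = gamma[obs == k].sum(axis=0) / gamma.sum(axis=0)
```

EM guarantees the log-likelihood is non-decreasing across iterations, so stopping when the improvement falls below a tolerance is a standard convergence test; the tolerance `1e-8` here is an arbitrary choice.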