一、Introduction
      Often we are interested in finding patterns which appear over a space of time. These patterns occur in many areas; the pattern of commands someone uses in instructing a computer, sequences of words in sentences, the sequence of phonemes in spoken words - any area where a sequence of events occurs could produce useful patterns.
      Consider the simple example of someone trying to deduce the weather from a piece of seaweed - folklore tells us that 'soggy' seaweed means wet weather, while 'dry' seaweed means sun. If it is in an intermediate state ('damp'), then we cannot be sure. However, the state of the weather is not restricted to the state of the seaweed, so we may say on the basis of an examination that the weather is probably raining or sunny. A second useful clue would be the state of the weather on the preceding day (or, at least, its probable state) - by combining knowledge about what happened yesterday with the observed seaweed state, we might come to a better forecast for today.

       This is typical of the type of system we will consider in this tutorial.

       First we will introduce systems which generate probabilistic patterns in time, such as the weather fluctuating between sunny and rainy.

       We then look at systems where what we wish to predict is not what we observe - the underlying system is hidden. In the above example, the observed sequence would be the seaweed and the hidden system would be the actual weather.

       We then look at some problems that can be solved once the system has been modeled. For the above example, we may want to know

       1、What the weather was for a week given each day's seaweed observation.
       2、Given a sequence of seaweed observations, is it winter or summer? Intuitively, if the seaweed has been dry for a while it may be summer, if it has been soggy for a while it might be winter.

二、Deterministic Patterns
 
    Consider a set of traffic lights; the sequence of lights is red - red/amber - green - amber - red. The sequence can be pictured as a state machine, where the different states of the traffic lights follow each other.

    [figure: traffic-light state machine]

    Notice that each state is dependent solely on the previous state, so if the lights are green, an amber light will always follow - that is, the system is deterministic. Deterministic systems are relatively easy to understand and analyse, once the transitions are fully known.
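A deterministic state machine like this is simple enough to sketch directly in code - each state maps to exactly one successor, so the whole system is a lookup table (the state names here are just the labels used above):

```python
# Deterministic state machine for the traffic-light cycle:
# red -> red/amber -> green -> amber -> red -> ...
NEXT_STATE = {
    "red": "red/amber",
    "red/amber": "green",
    "green": "amber",
    "amber": "red",
}

def run_lights(start, steps):
    """Return the sequence of states visited, starting from `start`."""
    states = [start]
    for _ in range(steps):
        states.append(NEXT_STATE[states[-1]])
    return states

print(run_lights("red", 4))
# ['red', 'red/amber', 'green', 'amber', 'red']
```

Because each state has exactly one successor, the sequence is fully determined by the starting state - there is no probability anywhere in the system.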

三、Non-deterministic patterns

    To make the weather example a little more realistic, introduce a third state - cloudy. Unlike the traffic light example, we cannot expect these three weather states to follow each other deterministically, but we might still hope to model the system that generates a weather pattern.

    One way to do this is to assume that the state of the model depends only upon the previous states of the model. This is called the Markov assumption and simplifies problems greatly. Obviously, this may be a gross simplification and much important information may be lost because of it.

    When considering the weather, the Markov assumption presumes that today's weather can always be predicted solely given knowledge of the weather of the past few days - factors such as wind, air pressure etc. are not considered. In this example, and many others, such assumptions are obviously unrealistic. Nevertheless, since such simplified systems can be subjected to analysis, we often accept the assumption in the knowledge that it may generate information that is not fully accurate. 



    A Markov process is a process which moves from state to state depending (only) on the previous n states. The process is called an order n model, where n is the number of states affecting the choice of next state. The simplest Markov process is a first order process, where the choice of state is made purely on the basis of the previous state. Notice this is not the same as a deterministic system, since we expect the choice to be made probabilistically, not deterministically.

     The figure below shows all possible first order transitions between the states of the weather example.

     [figure: first order transitions between the three weather states]

     Notice that for a first order process with M states, there are M² transitions between states, since it is possible for any one state to follow another. Associated with each transition is a probability called the state transition probability - this is the probability of moving from one state to another. These M² probabilities may be collected together in an obvious way into a state transition matrix. Notice that these probabilities do not vary in time - this is an important (if often unrealistic) assumption.

     The state transition matrix below shows possible transition probabilities for the weather example;

     [figure: weather state transition matrix]

     that is, if it was sunny yesterday, there is a probability of 0.5 that it will be sunny today, and 0.375 that it will be cloudy. Notice that (because the numbers are probabilities) the sum of the entries for each row is 1.
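A minimal sketch of such a matrix in code follows. Only the "sunny" row (0.5, 0.375, 0.125) is given in the text; the "cloudy" and "rainy" rows here are made-up illustrative values, chosen only so that each row sums to 1:

```python
STATES = ["sunny", "cloudy", "rainy"]

# A[i][j] = P(tomorrow is STATES[j] | today is STATES[i]).
# Only the "sunny" row comes from the text; the other two rows
# are assumed, purely for illustration.
A = [
    [0.500, 0.375, 0.125],  # today sunny
    [0.250, 0.250, 0.500],  # today cloudy (assumed)
    [0.250, 0.375, 0.375],  # today rainy  (assumed)
]

# Every row is a probability distribution, so each must sum to 1.
for row in A:
    assert abs(sum(row) - 1.0) < 1e-9

def transition_prob(today, tomorrow):
    """P(tomorrow's weather | today's weather)."""
    return A[STATES.index(today)][STATES.index(tomorrow)]

print(transition_prob("sunny", "cloudy"))  # 0.375
```

The row-sum check makes the time-independence assumption concrete: the same matrix A is consulted on every day, no matter how the system has evolved.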

     To initialise such a system, we need to state what the weather was (or probably was) on the day after creation; we define this in a vector of initial probabilities, called the π vector. For the weather example:

     π = ( 1 , 0 , 0 )

- that is, we know it was sunny on day 1.
We have now defined a first order Markov process consisting of:

     states : three states - sunny, cloudy, rainy.
     π vector : defining the probability of the system being in each of the states at time 0.
     state transition matrix : the probability of the weather given the previous day's weather.

     Any system that can be described in this manner is a Markov process.
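Given the π vector and the transition matrix, the distribution over tomorrow's states is just a vector-matrix product. A small sketch (the matrix rows other than "sunny" are assumed, as before):

```python
STATES = ["sunny", "cloudy", "rainy"]

# pi vector: the system starts in "sunny" with certainty (day 1).
pi = [1.0, 0.0, 0.0]

# State transition matrix; only the "sunny" row is from the text,
# the other two rows are illustrative.
A = [
    [0.500, 0.375, 0.125],
    [0.250, 0.250, 0.500],
    [0.250, 0.375, 0.375],
]

def step(dist, A):
    """Advance the state distribution by one day:
    new[j] = sum over i of dist[i] * A[i][j]."""
    n = len(dist)
    return [sum(dist[i] * A[i][j] for i in range(n)) for j in range(n)]

# Because day 1 is certainly sunny, day 2's distribution is
# exactly the "sunny" row of A.
day2 = step(pi, A)
print(dict(zip(STATES, day2)))  # {'sunny': 0.5, 'cloudy': 0.375, 'rainy': 0.125}
```

Calling `step` repeatedly propagates the distribution forward day by day, which is all a first order Markov process is.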

四、Patterns generated by a hidden process

     When a Markov process may not be powerful enough

     In some cases the patterns that we wish to find are not described sufficiently by a Markov process. Returning to the weather example, a hermit may perhaps not have access to direct weather observations, but does have a piece of seaweed. Folklore tells us that the state of the seaweed is probabilistically related to the state of the weather - the weather and seaweed states are closely linked. In this case we have two sets of states, the observable states (the state of the seaweed) and the hidden states (the state of the weather). We wish to devise an algorithm for the hermit to forecast weather from the seaweed and the Markov assumption without actually ever seeing the weather.
   
     A more realistic problem is that of recognising speech; the sound that we hear is the product of the vocal cords, size of throat, position of tongue and several other things. Each of these factors interacts to produce the sound of a word, and the sounds that a speech recognition system detects are the changing sounds generated by the internal physical changes in the person speaking.

     Some speech recognition devices work by considering the internal speech production to be a sequence of hidden states, and the resulting sound to be a sequence of observable states generated by the speech process that at best approximates the true (hidden) states. In both examples it is important to note that the number of states in the hidden process and the number of observable states may be different. In a three state weather system (sunny, cloudy, rainy) it may be possible to observe four grades of seaweed dampness (dry, dryish, damp, soggy); pure speech may be described by (say) 80 phonemes, while a physical speech system may generate a number of distinguishable sounds that is either more or less than 80.

      In such cases the observed sequence of states is probabilistically related to the hidden process. We model such processes using a hidden Markov model (HMM), where there is an underlying hidden Markov process changing over time, and a set of observable states which are related somehow to the hidden states.

五、Hidden Markov Models

    Definition of a hidden Markov model

    A hidden Markov model (HMM) is a triple (π, A, B).

    1、π : the vector of the initial state probabilities;
    2、A : the state transition matrix;
    3、B : the confusion matrix;

     Each probability in the state transition matrix and in the confusion matrix is time independent - that is, the matrices do not change in time as the system evolves. In practice, this is one of the most unrealistic assumptions of Markov models about real processes.
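The triple (π, A, B) is enough to generate data from the model: sample a hidden weather sequence from π and A, and at each step sample a seaweed observation from the corresponding row of B. In the sketch below, every number except the "sunny" row of A is an assumed illustrative value:

```python
import random

HIDDEN = ["sunny", "cloudy", "rainy"]          # hidden weather states
OBSERVED = ["dry", "dryish", "damp", "soggy"]  # observable seaweed states

pi = [1.0, 0.0, 0.0]  # initial state probabilities: sunny on day 1

# State transition matrix A; only the "sunny" row is from the text.
A = [
    [0.500, 0.375, 0.125],
    [0.250, 0.250, 0.500],
    [0.250, 0.375, 0.375],
]

# Confusion matrix B[i][k] = P(seaweed is OBSERVED[k] | weather is HIDDEN[i]).
# All values assumed for illustration.
B = [
    [0.60, 0.20, 0.15, 0.05],  # sunny  -> mostly dry
    [0.25, 0.25, 0.25, 0.25],  # cloudy -> uninformative
    [0.05, 0.10, 0.35, 0.50],  # rainy  -> mostly soggy
]

def sample(weights, items, rng):
    return rng.choices(items, weights=weights, k=1)[0]

def generate(days, rng=None):
    """Sample a (hidden weather, observed seaweed) sequence from the HMM."""
    rng = rng or random.Random()
    weather = sample(pi, HIDDEN, rng)          # draw the initial hidden state
    sequence = []
    for _ in range(days):
        seaweed = sample(B[HIDDEN.index(weather)], OBSERVED, rng)
        sequence.append((weather, seaweed))
        weather = sample(A[HIDDEN.index(weather)], HIDDEN, rng)
    return sequence

for weather, seaweed in generate(5, random.Random(42)):
    print(f"{weather:6s} -> {seaweed}")
```

The hermit, of course, sees only the second element of each pair; the problems listed in the introduction (recovering the weather sequence, classifying the season) are questions about the hidden first element given only the observations.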

     Note: the original article is at http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html

     Readers who want to learn about hidden Markov models should visit the original page; it will repay the effort. Only part of it is excerpted here, for reference.

   


