Update README.md
        README.md
    CHANGED
    
    | @@ -235,8 +235,8 @@ $$\mathbf{s}_0 \sim \mathcal{N}(\mathbf{0}, \sigma^2 I_{n\cdot h})$$ | |
| 235 |  | 
| 236 | 
             
            $$\mathbf{s}_i = R(\mathbf{e}, \mathbf{s}_{i-1}) \; \textnormal{for} \;  i \in \lbrace 1, \dots, r \rbrace$$
         | 
| 237 |  | 
| 238 | 
            -
            $$\mathbf{p} =  | 
| 239 | 
            -
            where \\(\sigma\\) is the standard deviation of the initial random state. Given an init random state \\(\mathbf{s}_0\\), the model repeatedly applies the core 
         | 
| 240 | 
             
            block \\(R\\), which accepts the latent state \\(\mathbf{s}_{i-1}\\) and the embedded input \\(\mathbf{e}\\) and outputs a new latent state \\(\mathbf{s}_i\\). 
         | 
| 241 | 
             
            After finishing all iterations, the coda block processes the last state and produces the probabilities of the next token.
         | 
| 242 |  | 
|  | |
| 235 |  | 
| 236 | 
             
            $$\mathbf{s}_i = R(\mathbf{e}, \mathbf{s}_{i-1}) \; \textnormal{for} \;  i \in \lbrace 1, \dots, r \rbrace$$
         | 
| 237 |  | 
| 238 | 
            +
            $$\mathbf{p} = C(\mathbf{s}_r)$$
         | 
| 239 | 
            +
            where \\(\sigma\\) is the standard deviation of the initial random state. Given an init random state \\(\mathbf{s}_0\\), the model repeatedly applies the core recurrent 
         | 
| 240 | 
             
            block \\(R\\), which accepts the latent state \\(\mathbf{s}_{i-1}\\) and the embedded input \\(\mathbf{e}\\) and outputs a new latent state \\(\mathbf{s}_i\\). 
         | 
| 241 | 
             
            After finishing all iterations, the coda block processes the last state and produces the probabilities of the next token.
         | 
| 242 |  | 
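
The recurrence described in the changed lines can be sketched as follows: draw \\(\mathbf{s}_0\\), apply \\(R\\) for \\(r\\) iterations, then pass the final state to \\(C\\). This is a minimal NumPy sketch under stated assumptions, not the model's actual implementation: `recurrent_depth_forward`, `toy_R`, `toy_C`, the weight matrices, and all sizes are hypothetical stand-ins chosen only to make the loop runnable.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - x.max(axis=-1, keepdims=True)
    ez = np.exp(z)
    return ez / ez.sum(axis=-1, keepdims=True)


def recurrent_depth_forward(e, R, C, r, sigma, rng):
    """Compute p = C(s_r), with s_0 ~ N(0, sigma^2 I) and s_i = R(e, s_{i-1})."""
    # s_0 has one entry per (position, hidden unit), i.e. n * h values in total.
    s = rng.normal(loc=0.0, scale=sigma, size=e.shape)
    for _ in range(r):
        # The same block R is reused at every iteration.
        s = R(e, s)
    # The coda C maps the last latent state to next-token probabilities.
    return C(s)


# Hypothetical toy sizes and weights (assumptions, not taken from the README).
n, h, vocab = 4, 8, 16
rng = np.random.default_rng(0)
W_e = rng.normal(size=(h, h)) / np.sqrt(h)
W_s = rng.normal(size=(h, h)) / np.sqrt(h)
W_out = rng.normal(size=(h, vocab)) / np.sqrt(h)


def toy_R(e, s):
    """Stand-in for the core recurrent block: mixes the input with the state."""
    return np.tanh(e @ W_e + s @ W_s)


def toy_C(s):
    """Stand-in for the coda block: projects to the vocabulary and normalizes."""
    return softmax(s @ W_out)


e = rng.normal(size=(n, h))  # stand-in for the embedded input
p = recurrent_depth_forward(e, toy_R, toy_C, r=4, sigma=1.0, rng=rng)
print(p.shape)  # (4, 16)
```

Because \\(R\\) is passed in as a plain callable and applied in a loop, the number of iterations `r` can be changed at call time without touching any weights.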
