Math Problem Statement
Why am I getting the answer below?
Transition to state $s'$ with probability 0.4, reward $R' = 10$, and $v(s') = 0$ (assumed).
Transition to state $s''$ with probability 0.6, reward $R'' = 3$, and $v(s'') = 0$ (assumed).
Discount factor $\gamma = 0.5$.
Formula:
$$v(s) = \text{Probability to } s' \times \left(R' + \gamma v(s')\right) + \text{Probability to } s'' \times \left(R'' + \gamma v(s'')\right)$$
Substitute the values:
$$v(s) = 0.4 \times \left(10 + 0.5 \times 0\right) + 0.6 \times \left(3 + 0.5 \times 0\right)$$
Simplify:
$$v(s) = 0.4 \times (10 + 0) + 0.6 \times (3 + 0)$$
$$v(s) = 0.4 \times 10 + 0.6 \times 3$$
$$v(s) = 4 + 1.8$$
$$v(s) = 5.8$$
Thus, the value of state $s$ is 5.8.
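The arithmetic above can be checked with a short script. This is a minimal sketch, assuming the same setup as the worked solution: two possible transitions out of $s$, and successor values $v(s')$ and $v(s'')$ both taken to be 0. The function name `state_value` and the tuple layout are illustrative choices, not part of the original problem.

```python
def state_value(transitions, gamma):
    """One-step Bellman backup: sum over transitions of p * (r + gamma * v_next)."""
    return sum(p * (r + gamma * v_next) for p, r, v_next in transitions)


# Each tuple is (probability, reward, assumed value of the next state).
transitions = [
    (0.4, 10.0, 0.0),  # transition to s'
    (0.6, 3.0, 0.0),   # transition to s''
]
gamma = 0.5  # discount factor

v_s = state_value(transitions, gamma)
print(round(v_s, 2))  # 5.8
```

Because both successor values are assumed to be 0, the discount factor does not affect the result here; supplying nonzero values for `v_next` would change the answer.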
Math Problem Analysis
Mathematical Concepts
Markov Decision Process
Bellman Equation
Formulas
Bellman equation for state value
Suitable Grade Level
Advanced