Math Problem Statement

A simple environment has 3 states, encoded as the integers 0 through 2, and 2 actions, encoded as the integers 0 and 1. The environment is stochastic. The two tables below give the transition probabilities for each action; in each table, a row is the current state and a column is the next state.

For example, if the agent is in state 2 and takes action 0, there is a 0.7 probability of transitioning to state 0, a 0.1 probability of transitioning to state 1, and a 0.2 probability of transitioning to state 2.

Transition Probabilities for Action 0

Current state   Next 0   Next 1   Next 2
0               0.1      0.5      0.4
1               0.1      0.6      0.3
2               0.7      0.1      0.2

Transition Probabilities for Action 1

Current state   Next 0   Next 1   Next 2
0               0.5      0.2      0.3
1               0.6      0.3      0.1
2               0.1      0.4      0.5
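
One way to hold both tables in code is a nested list indexed as P[action][state][next_state]. The name P and this index order are assumptions made here for illustration, not part of the problem. The sketch checks that every row sums to 1, as a probability distribution over next states should, and reproduces the lookup for state 2 under action 0 described earlier.

```python
import math

# P[action][state][next_state]; the name and index order are illustrative choices.
P = [
    [[0.1, 0.5, 0.4], [0.1, 0.6, 0.3], [0.7, 0.1, 0.2]],  # action 0
    [[0.5, 0.2, 0.3], [0.6, 0.3, 0.1], [0.1, 0.4, 0.5]],  # action 1
]

# Each row is a distribution over next states, so it should sum to 1
# (compared with a tolerance because of floating-point rounding).
rows_valid = all(math.isclose(sum(row), 1.0) for table in P for row in table)

# Worked example from the text: the agent is in state 2 and takes action 0.
example_row = P[0][2]

print(rows_valid)
print(example_row)
```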

The agent starts in state 0 and generates an episode according to some policy. The states visited are provided by the list below.

states_visited = [0, 2, 1, 1, 2]

The actions taken by the agent are as follows:

actions_taken = [0, 1, 0, 0]

Use the tables above to determine the probability of each transition that occurred, given the actions that were taken. Then multiply these probabilities together to determine the probability of obtaining this sequence of states as a result of the actions taken.

Solution
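
Step t pairs actions_taken[t] with the move from states_visited[t] to states_visited[t + 1]. Reading the tables, the four transitions are 0 to 2 under action 0 (0.4), 2 to 1 under action 1 (0.4), 1 to 1 under action 0 (0.6), and 1 to 2 under action 0 (0.3), so the probability of the sequence is 0.4 x 0.4 x 0.6 x 0.3 = 0.0288. The following is a minimal Python sketch of the same calculation, assuming the tables are stored as P[action][state][next_state]; the name P and the helper episode_probability are illustrative and do not come from the problem.

```python
# P[action][state][next_state]; the name and index order are illustrative choices.
P = [
    [[0.1, 0.5, 0.4], [0.1, 0.6, 0.3], [0.7, 0.1, 0.2]],  # action 0
    [[0.5, 0.2, 0.3], [0.6, 0.3, 0.1], [0.1, 0.4, 0.5]],  # action 1
]

states_visited = [0, 2, 1, 1, 2]
actions_taken = [0, 1, 0, 0]


def episode_probability(P, states, actions):
    """Multiply the probability of each observed transition, given its action."""
    prob = 1.0
    # zip stops at the shortest input, so this yields one tuple per action.
    for s, a, s_next in zip(states, actions, states[1:]):
        prob *= P[a][s][s_next]
    return prob


# Per-step probabilities, kept separately so each lookup can be checked by hand.
step_probs = [
    P[a][s][s_next]
    for s, a, s_next in zip(states_visited, actions_taken, states_visited[1:])
]
result = episode_probability(P, states_visited, actions_taken)

print(step_probs)
print(f"{result:.4f}")
```

The result is formatted to four decimal places because the raw floating-point product may differ from 0.0288 in its last digits.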

Math Problem Analysis

Mathematical Concepts

Markov Processes
Stochastic Processes
Probability Theory

Suitable Grade Level

Advanced Undergraduate