First-order Markov chain: the next state depends only on the current state.
On-policy algorithms are easier to parallelize.
Off-policy algorithms have to fit both a transition network and a policy network, so they are much more computationally expensive.
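As a rough illustration of the first-order Markov chain mentioned in these notes, here is a minimal sketch of sampling a trajectory in which each step looks only at the current state. The two states, the transition probabilities, and the names `TRANSITIONS`, `step`, and `sample_chain` are all made up for this example and do not come from the notes.

```python
import random

# Hypothetical two-state chain; states and probabilities are illustrative only.
TRANSITIONS = {
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.5, "B": 0.5},
}


def step(state, rng):
    """Sample the next state using only the current state (first-order property)."""
    next_states = list(TRANSITIONS[state])
    weights = [TRANSITIONS[state][s] for s in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]


def sample_chain(start, length, seed=0):
    """Return a trajectory of `length` states that begins at `start`."""
    rng = random.Random(seed)
    trajectory = [start]
    for _ in range(length - 1):
        # Only trajectory[-1] is consulted; earlier history is ignored.
        trajectory.append(step(trajectory[-1], rng))
    return trajectory
```

Calling `sample_chain("A", 10)` gives a list of ten states starting with "A". Because `step` never reads anything but the last state, the history before it has no influence on the next draw.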
Time: 2021-09-16 01:04:36