Mastering Atari, Go, chess and shogi by planning with a learned model

  • 1.

    Campbell, M., Hoane, A. J. Jr & Hsu, F.-h. Deep Blue. Artif. Intell. 134, 57–83 (2002).

  • 2.

    Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).

  • 3.

    Bellemare, M. G., Naddaf, Y., Veness, J. & Bowling, M. The arcade learning environment: an evaluation platform for general agents. J. Artif. Intell. Res. 47, 253–279 (2013).

  • 4.

    Machado, M. et al. Revisiting the arcade learning environment: evaluation protocols and open problems for general agents. J. Artif. Intell. Res. 61, 523–562 (2018).

  • 5.

    Silver, D. et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 1140–1144 (2018).

  • 6.

    Schaeffer, J. et al. A world championship caliber checkers program. Artif. Intell. 53, 273–289 (1992).

  • 7.

    Brown, N. & Sandholm, T. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals. Science 359, 418–424 (2018).

  • 8.

    Moravčík, M. et al. DeepStack: expert-level artificial intelligence in heads-up no-limit poker. Science 356, 508–513 (2017).

  • 9.

    Vlahavas, I. & Refanidis, I. Planning and Scheduling. Technical Report (EETN, 2013).

  • 10.

    Segler, M. H., Preuss, M. & Waller, M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555, 604–610 (2018).

  • 11.

    Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction 2nd edn (MIT Press, 2018).

  • 12.

    Deisenroth, M. & Rasmussen, C. PILCO: a model-based and data-efficient approach to policy search. In Proc. 28th International Conference on Machine Learning, ICML 2011 465–472 (Omnipress, 2011).

  • 13.

    Heess, N. et al. Learning continuous control policies by stochastic value gradients. In NIPS'15: Proc. 28th International Conference on Neural Information Processing Systems Vol. 2 (eds Cortes, C. et al.) 2944–2952 (MIT Press, 2015).

  • 14.

    Levine, S. & Abbeel, P. Learning neural network policies with guided policy search under unknown dynamics. Adv. Neural Inf. Process. Syst. 27, 1071–1079 (2014).

  • 15.

    Hafner, D. et al. Learning latent dynamics for planning from pixels. Preprint at https://arxiv.org/abs/1811.04551 (2018).

  • 16.

    Kaiser, L. et al. Model-based reinforcement learning for Atari. Preprint at https://arxiv.org/abs/1903.00374 (2019).

  • 17.

    Buesing, L. et al. Learning and querying fast generative models for reinforcement learning. Preprint at https://arxiv.org/abs/1802.03006 (2018).

  • 18.

    Espeholt, L. et al. IMPALA: scalable distributed deep-RL with importance weighted actor-learner architectures. In Proc. International Conference on Machine Learning, ICML Vol. 80 (eds Dy, J. & Krause, A.) 1407–1416 (2018).

  • 19.

    Kapturowski, S., Ostrovski, G., Dabney, W., Quan, J. & Munos, R. Recurrent experience replay in distributed reinforcement learning. In International Conference on Learning Representations (2019).

  • 20.

    Horgan, D. et al. Distributed prioritized experience replay. In International Conference on Learning Representations (2018).

  • 21.

    Puterman, M. L. Markov Decision Processes: Discrete Stochastic Dynamic Programming 1st edn (John Wiley & Sons, 1994).

  • 22.

    Coulom, R. Efficient selectivity and backup operators in Monte-Carlo tree search. In International Conference on Computers and Games 72–83 (Springer, 2006).

  • 23.

    Wahlström, N., Schön, T. B. & Deisenroth, M. P. From pixels to torques: policy learning with deep dynamical models. Preprint at http://arxiv.org/abs/1502.02251 (2015).

  • 24.

    Watter, M., Springenberg, J. T., Boedecker, J. & Riedmiller, M. Embed to control: a locally linear latent dynamics model for control from raw images. In NIPS'15: Proc. 28th International Conference on Neural Information Processing Systems Vol. 2 (eds Cortes, C. et al.) 2746–2754 (MIT Press, 2015).

  • 25.

    Ha, D. & Schmidhuber, J. Recurrent world models facilitate policy evolution. In NIPS'18: Proc. 32nd International Conference on Neural Information Processing Systems (eds Bengio, S. et al.) 2455–2467 (Curran Associates, 2018).

  • 26.

    Gelada, C., Kumar, S., Buckman, J., Nachum, O. & Bellemare, M. G. DeepMDP: learning continuous latent space models for representation learning. In Proc. 36th International Conference on Machine Learning: Vol. 97 Proc. Machine Learning Research (eds Chaudhuri, K. & Salakhutdinov, R.) (PMLR, 2019).

  • 27.

    van Hasselt, H., Hessel, M. & Aslanides, J. When to use parametric models in reinforcement learning? Preprint at https://arxiv.org/abs/1906.05243 (2019).

  • 28.

    Tamar, A., Wu, Y., Thomas, G., Levine, S. & Abbeel, P. Value iteration networks. Adv. Neural Inf. Process. Syst. 29, 2154–2162 (2016).

  • 29.

    Silver, D. et al. The predictron: end-to-end learning and planning. In Proc. 34th International Conference on Machine Learning Vol. 70 (eds Precup, D. & Teh, Y. W.) 3191–3199 (JMLR, 2017).

  • 30.

    Farahmand, A. M., Barreto, A. & Nikovski, D. Value-aware loss function for model-based reinforcement learning. In Proc. 20th International Conference on Artificial Intelligence and Statistics: Vol. 54 Proc. Machine Learning Research (eds Singh, A. & Zhu, J.) 1486–1494 (PMLR, 2017).

  • 31.

    Farahmand, A. Iterative value-aware model learning. Adv. Neural Inf. Process. Syst. 31, 9090–9101 (2018).

  • 32.

    Farquhar, G., Rocktäschel, T., Igl, M. & Whiteson, S. TreeQN and ATreeC: differentiable tree-structured models for deep reinforcement learning. In International Conference on Learning Representations (2018).

  • 33.

    Oh, J., Singh, S. & Lee, H. Value prediction network. Adv. Neural Inf. Process. Syst. 30, 6118–6128 (2017).

  • 34.

    Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012).

  • 35.

    He, K., Zhang, X., Ren, S. & Sun, J. Identity mappings in deep residual networks. In 14th European Conference on Computer Vision 630–645 (2016).

  • 36.

    Hessel, M. et al. Rainbow: combining improvements in deep reinforcement learning. In Thirty-Second AAAI Conference on Artificial Intelligence (2018).

  • 37.

    Schmitt, S., Hessel, M. & Simonyan, K. Off-policy actor-critic with shared experience replay. Preprint at https://arxiv.org/abs/1909.11583 (2019).

  • 38.

    Azizzadenesheli, K. et al. Surprising negative results for generative adversarial tree search. Preprint at http://arxiv.org/abs/1806.05780 (2018).

  • 39.

    Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).

  • 40.

    OpenAI. OpenAI Five. OpenAI https://blog.openai.com/openai-five/ (2018).

  • 41.

    Vinyals, O. et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575, 350–354 (2019).

  • 42.

    Jaderberg, M. et al. Reinforcement learning with unsupervised auxiliary tasks. Preprint at https://arxiv.org/abs/1611.05397 (2016).

  • 43.

    Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017).

  • 44.

    Kocsis, L. & Szepesvári, C. Bandit based Monte-Carlo planning. In European Conference on Machine Learning 282–293 (Springer, 2006).

  • 45.

    Rosin, C. D. Multi-armed bandits with episode context. Ann. Math. Artif. Intell. 61, 203–230 (2011).

  • 46.

    Schadd, M. P., Winands, M. H., van den Herik, H. J., Chaslot, G. M.-B. & Uiterwijk, J. W. Single-player Monte-Carlo tree search. In International Conference on Computers and Games 1–12 (Springer, 2008).

  • 47.

    Pohlen, T. et al. Observe and look further: achieving consistent performance on Atari. Preprint at https://arxiv.org/abs/1805.11593 (2018).

  • 48.

    Schaul, T., Quan, J., Antonoglou, I. & Silver, D. Prioritized experience replay. In International Conference on Learning Representations (2016).

  • 49.

    Cloud TPU. Google Cloud https://cloud.google.com/tpu/ (2019).

  • 50.

    Coulom, R. Whole-history rating: a Bayesian rating system for players of time-varying strength. In International Conference on Computers and Games 113–124 (2008).

  • 51.

    Nair, A. et al. Massively parallel methods for deep reinforcement learning. Preprint at https://arxiv.org/abs/1507.04296 (2015).

  • 52.

    Lanctot, M. et al. OpenSpiel: a framework for reinforcement learning in games. Preprint at http://arxiv.org/abs/1908.09453 (2019).