B. Dutertre, Dynamic scan scheduling, 23rd IEEE Real-Time Systems Symposium (RTSS 2002), pp.327-336, 2002.
DOI : 10.1109/REAL.2002.1181586

E. Koksal, Periodic Search Strategies For Electronic Countermeasure Receivers With Desired Probability Of Intercept For Each Frequency Band, 2010.

C. Winsor and E. J. Hughes, Optimisation and evaluation of receiver search strategies for electronic support, IET Radar, Sonar & Navigation, vol.6, issue.4, pp.233-240, 2012.
DOI : 10.1049/iet-rsn.2010.0377

R. G. Wiley, ELINT: The Interception and Analysis of Radar Signals, 2006.

I. V. Clarkson, Optimal periodic sensor scheduling in electronic support, Proceedings of the Defence Applications of Signal Processing (DASP'05), 2005.

I. V. Clarkson, E. D. El-Mahassni, and S. D. Howard, Sensor scheduling in electronic support using Markov chains, IEE Proceedings - Radar, Sonar and Navigation, pp.325-332, 2006.
DOI : 10.1049/ip-rsn:20050055

I. V. Clarkson, Synchronisation in scan-on-scan-on-scan problems, Proceedings of the Defence Applications of Signal Processing, 2009.

I. V. Clarkson and A. D. Pollington, Performance limits of sensor-scheduling strategies in electronic support, IEEE Transactions on Aerospace and Electronic Systems, vol.43, issue.2, pp.645-650, 2007.

S. W. Kelly, G. P. Noone, and J. E. Perkins, Synchronization effects on probability of pulse train interception, IEEE Transactions on Aerospace and Electronic Systems, vol.32, issue.1, pp.213-220, 1996.

E. D. El-Mahassni and G. P. Noone, A new way of estimating radar pulse intercepts, ANZIAM Journal, vol.45, pp.448-460, 2004.
DOI : 10.21914/anziamj.v45i0.900

Y. Xun, M. M. Kokar, and K. Baclawski, Control based sensor management for a multiple radar monitoring scenario, Information Fusion, vol.5, issue.1, pp.49-63, 2004.
DOI : 10.1016/j.inffus.2003.08.001

Y. Xun, M. M. Kokar, and K. Baclawski, Using a task-specific QoS for controlling sensing requests and scheduling, Proceedings of the Third IEEE International Symposium on Network Computing and Applications, pp.269-276, 2004.

M. Rosencrantz, G. Gordon, and S. Thrun, Learning low dimensional predictive representations, Twenty-first International Conference on Machine Learning, ICML '04, p.88, 2004.
DOI : 10.1145/1015330.1015441

B. Boots, S. M. Siddiqi, and G. J. Gordon, Closing the learning-planning loop with predictive state representations, The International Journal of Robotics Research, vol.30, issue.7, pp.954-966, 2011.
DOI : 10.1177/0278364911404092

B. Wolfe, M. R. James, and S. Singh, Learning predictive state representations in dynamical systems without reset, Proceedings of the Twenty-second International Conference on Machine Learning (ICML-05), pp.980-987, 2005.

M. Bowling, P. McCracken, M. James, J. Neufeld, and D. Wilkinson, Learning predictive state representations using non-blind policies, Proceedings of the 23rd International Conference on Machine Learning, ICML '06, pp.129-136, 2006.
DOI : 10.1145/1143844.1143861

A. Kulesza, R. Rao, and S. Singh, Low-rank spectral learning, Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS-14), pp.522-530, 2014.

H. Glaude, O. Pietquin, and C. Enderli, Subspace identification for predictive state representation by nuclear norm minimization, 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2014.
DOI : 10.1109/ADPRL.2014.7010609

URL : https://hal.archives-ouvertes.fr/hal-01104423

J. Cai, E. J. Candès, and Z. Shen, A Singular Value Thresholding Algorithm for Matrix Completion, SIAM Journal on Optimization, vol.20, issue.4, pp.1956-1982, 2010.
DOI : 10.1137/080738970

B. Boots and G. J. Gordon, An online spectral learning algorithm for partially observable nonlinear dynamical systems, Proceedings of the Twenty-fifth AAAI Conference on Artificial Intelligence (AAAI-11), 2011.

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI : 10.1137/S0097539701398375

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.130.158

M. L. Littman, R. S. Sutton, and S. Singh, Predictive representations of state, Proceedings of the Fifteenth Conference on Neural Information Processing Systems (NIPS-01), pp.1555-1561, 2001.

A. Anandkumar, N. Michael, A. K. Tang, and A. Swami, Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret, IEEE Journal on Selected Areas in Communications, vol.29, issue.4, pp.731-745, 2011.
DOI : 10.1109/JSAC.2011.110406

URL : http://arxiv.org/abs/1006.1673

C. Tekin and M. Liu, Online Learning of Rested and Restless Bandits, IEEE Transactions on Information Theory, pp.5588-5611, 2012.
DOI : 10.1109/TIT.2012.2198613

C. Tekin and M. Liu, Adaptive learning of uncontrolled restless bandits with logarithmic regret, 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2011.
DOI : 10.1109/Allerton.2011.6120273

K. Liu and Q. Zhao, Distributed Learning in Multi-Armed Bandit With Multiple Players, IEEE Transactions on Signal Processing, pp.5667-5681, 2010.
DOI : 10.1109/TSP.2010.2062509

R. Ortner, D. Ryabko, P. Auer, and R. Munos, Regret bounds for restless Markov bandits, Theoretical Computer Science, pp.62-76, 2014.
DOI : 10.1007/978-3-642-34106-9_19

URL : https://hal.archives-ouvertes.fr/hal-00765450