J. Allan, B. Carterette, J. A. Aslam, V. Pavlu, B. Dachev et al., Overview of the TREC 2007 Million Query Track, Proceedings of TREC, 2007.

J. A. Aslam, V. Pavlu, and E. Yilmaz, A statistical method for system evaluation using incomplete judgments, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '06, 2006.
DOI : 10.1145/1148170.1148263

J. A. Aslam and R. Savell, On the effectiveness of evaluating retrieval systems in the absence of relevance judgments, Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '03, 2003.
DOI : 10.1145/860435.860501

B. Carterette, J. Allan, and R. Sitaraman, Minimal test collections for retrieval evaluation, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '06, 2006.
DOI : 10.1145/1148170.1148219

R. Nuray and F. Can, Automatic ranking of information retrieval systems using data fusion, Information Processing and Management: an International Journal, v.42, pp.595-614, 2006.

I. Soboroff, C. Nicholas, and P. Cahan, Ranking retrieval systems without relevance judgments, Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '01, pp.66-73, 2001.
DOI : 10.1145/383952.383961

A. Spoerri, How the overlap between search results correlates with relevance, Proceedings of the 68th annual meeting of the American Society for Information Science and Technology, 2005.

A. Spoerri, Using the structure of overlap between search results to rank retrieval systems without relevance judgments, Information Processing and Management: an International Journal, v.43, pp.1059-1070, 2007.

E. M. Voorhees, Variations in relevance judgments and the measurement of retrieval effectiveness, Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pp.315-323, 1998.

E. M. Voorhees and D. Harman, Overview of the eighth text retrieval conference (TREC-8), The 8th text retrieval conference (TREC-8), 1999.

S. Wu and F. Crestani, Methods for ranking information retrieval systems without relevance judgments, Proceedings of the 2003 ACM symposium on Applied computing, SAC '03, 2003.
DOI : 10.1145/952532.952693

J. Zobel, How reliable are the results of large-scale information retrieval experiments?, Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '98, pp.307-314, 1998.
DOI : 10.1145/290941.291014