G. Graefe, Implementing sorting in database systems, ACM Computing Surveys, vol.38, issue.3, p.10, 2006.
DOI : 10.1145/1132960.1132964

L. Bishop, D. Eberly, T. Whitted, M. Finch, and M. Shantz, Designing a PC game engine, IEEE Computer Graphics and Applications, vol.18, issue.1, pp.46-53, 1998.
DOI : 10.1109/38.637270

Intel, Intel Xeon Phi processors.
URL : …en/products/processors/xeon-phi/xeon-phi-processors.html

Intel, Intel architecture instruction set extensions programming reference, 2016.
URL : https://software.intel.com/sites…architecture-instruction-set-extensions-programming-reference.pdf

C. A. R. Hoare, Quicksort, The Computer Journal, vol.5, issue.1, pp.10-16, 1962.
DOI : 10.1093/comjnl/5.1.10

D. R. Musser, Introspective Sorting and Selection Algorithms, Software: Practice and Experience, vol.27, issue.8, pp.983-993, 1997.
DOI : 10.1002/(SICI)1097-024X(199708)27:8<983::AID-SPE117>3.0.CO;2-#

K. E. Batcher, Sorting networks and their applications, Proceedings of the April 30-May 2, 1968, Spring Joint Computer Conference, AFIPS '68 (Spring), pp.307-314, 1968.
DOI : 10.1145/1468075.1468121

URL : http://www.cs.kent.edu/~parallel/papers/sort.pdf

D. Nassimi and S. Sahni, Bitonic Sort on a Mesh-Connected Parallel Computer, IEEE Transactions on Computers, vol.28, issue.1, pp.2-7, 1979.
DOI : 10.1109/TC.1979.1675216

J. D. Owens, M. Houston, D. Luebke, S. Green, J. E. Stone et al., GPU Computing, Proceedings of the IEEE, pp.879-899, 2008.
DOI : 10.1109/JPROC.2008.917757

P. M. Kogge, The architecture of pipelined computers, 1981.

Intel, Intel 64 and IA-32 architectures software developer's manual: Instruction set reference (2A, 2B, 2C, and 2D). Accessed: December 2016.
URL : https://software.intel.com/en-us/articles/intel-sdm

Intel, Introduction to Intel advanced vector extensions.

P. Sanders and S. Winkel, Super Scalar Sample Sort, European Symposium on Algorithms, pp.784-796, 2004.
DOI : 10.1007/978-3-540-30140-0_69

URL : http://rw4.cs.uni-sb.de/~sewi/ssss.pdf

H. Inoue, T. Moriyama, H. Komatsu, and T. Nakatani, AA-Sort: A New Parallel Sorting Algorithm for Multi-Core SIMD Processors, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), pp.189-198, 2007.
DOI : 10.1109/PACT.2007.4336211

URL : http://www.research.ibm.com/trl/people/inouehrs/pdf/PACT2007-SIMDsort.pdf

T. Furtak, J. N. Amaral, and R. Niewiadomski, Using SIMD registers and instructions to enable instruction-level parallelism in sorting algorithms, Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, SPAA '07, pp.348-357, 2007.
DOI : 10.1145/1248377.1248436

URL : http://ce.sharif.ac.ir/~ghodsi/archive/d-papers/SPAA/2007/Using SIMD Registers and Instructions to Enable Instruction-Level Parallelism in Sorting Algorithms.pdf

J. Chhugani, A. D. Nguyen, V. W. Lee, W. Macy, M. Hagog et al., Efficient implementation of sorting on multi-core SIMD CPU architecture, Proceedings of the VLDB Endowment, pp.1313-1324, 2008.
DOI : 10.14778/1454159.1454171

S. Gueron and V. Krasnov, Fast Quicksort Implementation Using AVX Instructions, The Computer Journal, vol.59, pp.83-90, 2016.
DOI : 10.1093/comjnl/bxv063