Add Broadwell performance results.

Reviewed-by: Emilia Käsper <emilia@openssl.org>
(cherry picked from commit b3d7294976)
			
			
@@ -61,8 +61,12 @@
 #
 # rsa2048 sign/sec	OpenSSL 1.0.1	scalar(*)	this
 # 2.3GHz Haswell	621		765/+23%	1113/+79%
+# 2.3GHz Broadwell(**)	688		1200(***)/+74%	1120/+63%
 #
 # (*)	if system doesn't support AVX2, for reference purposes;
+# (**)	scaled to 2.3GHz to simplify comparison;
+# (***)	scalar AD*X code is faster than AVX2 and is preferred code
+#	path for Broadwell;
 
 $flavour = shift;
 $output  = shift;
@@ -22,7 +22,10 @@
 # [1] and [2], with MOVBE twist suggested by Ilya Albrekht and Max
 # Locktyukhin of Intel Corp. who verified that it reduces shuffles
 # pressure with notable relative improvement, achieving 1.0 cycle per
-# byte processed with 128-bit key on Haswell processor.
+# byte processed with 128-bit key on Haswell processor, and 0.74 -
+# on Broadwell. [Mentioned results are raw profiled measurements for
+# favourable packet size, one divisible by 96. Applications using the
+# EVP interface will observe a few percent worse performance.]
 #
 # [1] http://rt.openssl.org/Ticket/Display.html?id=2900&user=guest&pass=guest
 # [2] http://www.intel.com/content/dam/www/public/us/en/documents/software-support/enabling-high-performance-gcm.pdf
@@ -63,6 +63,7 @@
 # Sandy Bridge	1.80(+8%)
 # Ivy Bridge	1.80(+7%)
 # Haswell	0.55(+93%) (if system doesn't support AVX)
+# Broadwell	0.45(+110%)(if system doesn't support AVX)
 # Bulldozer	1.49(+27%)
 # Silvermont	2.88(+13%)
 
@@ -73,7 +74,8 @@
 # CPUs such as Sandy and Ivy Bridge can execute it, the code performs
 # sub-optimally in comparison to above mentioned version. But thanks
 # to Ilya Albrekht and Max Locktyukhin of Intel Corp. we knew that
-# it performs in 0.41 cycles per byte on Haswell processor.
+# it performs in 0.41 cycles per byte on Haswell processor, and in
+# 0.29 on Broadwell.
 #
 # [1] http://rt.openssl.org/Ticket/Display.html?id=2900&user=guest&pass=guest
 
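
Note on the rsa2048 table: the "/+N%" suffixes appear to be the relative gain of each sign/sec figure over the OpenSSL 1.0.1 column of the same row. The sketch below recomputes them under that assumption. The helper name `speedup_percent` and the `RESULTS` layout are illustrative only and are not part of the commit; the numbers are copied from the table.

```python
def speedup_percent(baseline: float, measured: float) -> int:
    """Relative gain of `measured` over `baseline`, rounded to a whole percent."""
    return round((measured / baseline - 1.0) * 100)


# rsa2048 sign/sec figures copied from the table in the first hunk.
# "baseline" is the OpenSSL 1.0.1 column; "scalar" and "this" are the
# other two columns. The dictionary layout is an assumption for illustration.
RESULTS = {
    "Haswell": {"baseline": 621, "scalar": 765, "this": 1113},
    "Broadwell": {"baseline": 688, "scalar": 1200, "this": 1120},
}

if __name__ == "__main__":
    for cpu, row in RESULTS.items():
        scalar_gain = speedup_percent(row["baseline"], row["scalar"])
        this_gain = speedup_percent(row["baseline"], row["this"])
        print(f"{cpu}: scalar {scalar_gain:+d}%, this {this_gain:+d}%")
```

With these inputs the recomputed gains agree with the percentages quoted in the table, including the Broadwell row where the scalar path comes out ahead of the AVX2 one.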