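The ARM device is an XScale-PXA270 @ 520 MHz with 128MB (possibly somewhat slower SDRAM), running Linux, and it always has enough free memory. Its performance is comparable to a jailbroken iPhone.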
Benchmarking the production database (SQLite) gave us promising results (ARM is only 20-30% slower), so I tried building Ruby (1.9.2p0).
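The Rails application runs very slowly on ARM (fetching from SQL and generating templates is 10-20 times slower). I decided to run some benchmarks to find the bottleneck.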
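Again, some results are OK (about 6 times slower than Ruby 1.9.2 on the Atom, which is comparable to the old Ruby 1.8.6 we use now), while others are very slow (20-30 times slower). For example, hash methods appear to be 40 times slower on ARM. Running the Ruby Benchmark Suite reveals more bottlenecks: strings, threads, arrays...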
I knew ARM is slower than Atom. I just did not expect such a big difference, especially after SQLite ran fine.
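Is there some Ruby bug on ARM, or a patch I need to apply? Or is this hopeless, so that I should rewrite the whole application in C if I want to use an ARM device? Or does the device simply lack the computing power?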
Example:
# Naive recursive Fibonacci: stresses method calls and integer arithmetic.
def fib(n)
  return 1 if n < 2
  fib(n-1) + fib(n-2)
end

Benchmark.bm do |x|
  x.report { fib(32) }
  x.report { fib(36) }
  # Integer-keyed hash inserts at three sizes.
  x.report { h = {}; (0..10**3).each {|i| h[i] = i} }
  x.report { h = {}; (0..10**4).each {|i| h[i] = i} }
  x.report { h = {}; (0..10**5).each {|i| h[i] = i} }
end
ruby -rbenchmark bench.rb
Atom N270, 1GB
ruby 1.9.2p0 (2010-08-18) [i686-linux]
user system total real
2.440000 0.000000 2.440000 ( 2.459400)
16.780000 0.030000 16.810000 ( 17.293015)
0.000000 0.000000 0.000000 ( 0.001180)
0.020000 0.000000 0.020000 ( 0.012180)
0.160000 0.000000 0.160000 ( 0.161803)
ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-linux]
user system total real
12.500000 0.020000 12.520000 ( 12.628106)
84.450000 0.170000 84.620000 ( 85.879380)
0.010000 0.000000 0.010000 ( 0.002216)
0.040000 0.000000 0.040000 ( 0.032939)
0.240000 0.010000 0.250000 ( 0.255756)
XScale-PXA270 @ 520 MHz, 128MB
ruby 1.9.2p0 (2010-08-18) [arm-linux]
user system total real
12.470000 0.000000 12.470000 ( 12.526507)
85.480000 0.000000 85.480000 ( 85.939294)
0.060000 0.000000 0.060000 ( 0.060643)
0.640000 0.000000 0.640000 ( 0.642136)
6.460000 0.130000 6.590000 ( 6.605553)
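To put the two Ruby 1.9.2 tables side by side, here is a minimal sketch that divides the ARM wall-clock times by the Atom ones. The figures are copied from the "real" column above. The labels, the constant names and the `slowdown` helper are made up for this illustration.

```ruby
# Wall-clock ("real") seconds copied from the Ruby 1.9.2 tables above.
ATOM_192 = {
  "fib(32)"    => 2.459400,
  "fib(36)"    => 17.293015,
  "hash 10**3" => 0.001180,
  "hash 10**4" => 0.012180,
  "hash 10**5" => 0.161803
}
ARM_192 = {
  "fib(32)"    => 12.526507,
  "fib(36)"    => 85.939294,
  "hash 10**3" => 0.060643,
  "hash 10**4" => 0.642136,
  "hash 10**5" => 6.605553
}

# Hypothetical helper: how many times slower `slow` is than `fast`,
# per benchmark label.
def slowdown(fast, slow)
  ratios = {}
  fast.each { |label, secs| ratios[label] = slow[label] / secs }
  ratios
end

slowdown(ATOM_192, ARM_192).each do |label, ratio|
  printf("%-12s %5.1fx\n", label, ratio)
end
```

On these numbers fib comes out about 5 times slower on ARM, while the hash loops are roughly 41 to 53 times slower, so the gap is clearly not uniform across workloads.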
Build:
./configure --host=arm-linux --without-X11 --disable-largefile \
--enable-socket=yes --without-Win32API --disable-ipv6 \
--disable-install-doc --prefix=/opt --with-openssl-include=/opt/include/ \
--with-openssl-lib=/opt/include/lib
ENV:
PFX=arm-iwmmxt-linux-gnueabi
export DISCIMAGE="/opt"
export CROSS_COMPILE="arm-linux-"
export HOST="arm-linux"
export TARGET="arm-linux"
export CROSS_COMPILING=1
export CC=$PFX-gcc
export CFLAGS="-O3 -I/opt/include"
export LDFLAGS="-O3 -L/opt/lib/"
#LIBS=
#CPPFLAGS=
export CXX=$PFX-g++
#CXXFLAGS=
export CPP=$PFX-cpp
export OBJCOPY="$PFX-objcopy"
export LD="$PFX-ld"
export AR="$PFX-ar"
export RANLIB="$PFX-ranlib"
export NM="$PFX-nm"
export STRIP="$PFX-strip"
export ac_cv_func_setpgrp_void=yes
export ac_cv_func_isinf=no
export ac_cv_func_isnan=no
export ac_cv_func_finite=no
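One thing worth checking is whether the flags exported above actually ended up in the cross-built interpreter. The sketch below prints a few entries from `RbConfig::CONFIG` so they can be compared between the Atom and ARM builds. The selection of keys and the `build_summary` helper are my own choices for illustration, and the `RubyVM::OPTS` lookup assumes an MRI build that defines it, hence the guard.

```ruby
require 'rbconfig'

# Hypothetical helper: collect build settings recorded when this
# interpreter was compiled. Keys missing from a build show "(not set)".
def build_summary
  keys = %w[host arch CC CFLAGS optflags LDFLAGS]
  summary = {}
  keys.each { |k| summary[k] = RbConfig::CONFIG.fetch(k, "(not set)").to_s }
  # RubyVM::OPTS lists VM options compiled in on MRI; guarded in case
  # this interpreter does not define it.
  summary["vm_opts"] =
    defined?(RubyVM::OPTS) ? RubyVM::OPTS.join(", ") : "(unavailable)"
  summary
end

build_summary.each { |key, value| puts "#{key}: #{value}" }
```

Running this on both machines and diffing the output would show, for instance, whether `-O3` is really present in the ARM build's `CFLAGS`.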
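It seems that what you are complaining about is that the new optimizations in Ruby 1.9.2 (compared with 1.8.x) are x86-specific. With Ruby 1.8.x, performance on Atom and ARM is comparable. Perhaps you could ask on a Ruby-specific mailing list. A quick search shows that yes, a lot changed in Ruby 1.9.x: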
Ruby 1.9.2 brings […] major speed improvements to Ruby by way of the Yet Another Ruby VM (YARV) interpreter
Perhaps the right question is: "Does YARV have x86-specific optimizations? Can those optimizations be reproduced in the ARM port?"
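Before asking, it might help to narrow down where the time goes. The following is only a sketch, not something from the question: it times an empty loop, integer-keyed inserts and lookups separately, so plain interpreter overhead can be told apart from hash work. The `hash_breakdown` helper and its labels are hypothetical names.

```ruby
require 'benchmark'

# Hypothetical helper: time an empty loop, hash inserts and hash lookups
# separately so loop overhead can be compared against hash work.
def hash_breakdown(n)
  h = {}
  timings = {}
  timings[:empty_loop] = Benchmark.realtime { (0..n).each { |i| i } }
  timings[:insert]     = Benchmark.realtime { (0..n).each { |i| h[i] = i } }
  timings[:lookup]     = Benchmark.realtime { (0..n).each { |i| h[i] } }
  timings
end

hash_breakdown(10**5).each do |label, secs|
  printf("%-12s %.6f s\n", label, secs)
end
```

If the empty loop shows roughly the same ARM/Atom ratio as fib while the inserts are far worse, that would point at hash growth or memory allocation rather than at VM dispatch, which is a more specific thing to bring to a mailing list.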
