Follow-up to this question about aget performance
There seems to be something very strange going on optimisation-wise. We know the following to be true:
=> (def xa (int-array (range 100000)))
#'user/xa

=> (set! *warn-on-reflection* true)
true

=> (time (reduce + (for [x xa] (aget ^ints xa x))))
"Elapsed time: 42.80174 msecs"
4999950000

=> (time (reduce + (for [x xa] (aget xa x))))
"Elapsed time: 2067.673859 msecs"
4999950000
Reflection warning, NO_SOURCE_PATH:1 - call to aget can't be resolved.
Reflection warning, NO_SOURCE_PATH:1 - call to aget can't be resolved.
However, some further experimenting really surprised me:
=> (for [f [get nth aget]]
     (time (reduce + (for [x xa] (f xa x)))))
("Elapsed time: 71.898128 msecs"
"Elapsed time: 62.080851 msecs"
"Elapsed time: 46.721892 msecs"
4999950000 4999950000 4999950000)
No reflection warnings, and no hints needed. The same behaviour shows up when aget is bound to a root var or in a let:
=> (let [f aget]
     (time (reduce + (for [x xa] (f xa x)))))
"Elapsed time: 43.912129 msecs"
4999950000
Any idea why a bound aget seems to "know" how to optimise, where the core function doesn't?
It has to do with the :inline directive on aget, which expands to (. clojure.lang.RT (aget ~a (int ~i))), whereas the normal function call goes through the Reflector. Try these:

user> (time (reduce + (map #(clojure.lang.Reflector/prepRet
                              (.getComponentType (class xa))
                              (. java.lang.reflect.Array (get xa %)))
                           xa)))
"Elapsed time: 63.484 msecs"
4999950000

user> (time (reduce + (map #(. clojure.lang.RT (aget xa (int %))) xa)))
Reflection warning, NO_SOURCE_FILE:1 - call to aget can't be resolved.
"Elapsed time: 2390.977 msecs"
4999950000
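To see the inline directive for yourself, here is a minimal sketch that looks at the metadata on aget's var. The names inline-fn, expansion and xs are made up for this illustration, and the exact printed form may differ between Clojure versions.

```clojure
;; Illustrative sketch; `inline-fn`, `expansion` and `xs` are hypothetical names.

;; aget's var carries an :inline function in its metadata.
(def inline-fn (:inline (meta #'aget)))

;; Calling it with two symbols returns the form the compiler substitutes
;; for a direct (aget a i) call.
(def expansion (inline-fn 'a 'i))
(println expansion)

;; Once aget is bound to a local, the compiler only sees a call to `f`,
;; so no inlining happens and aget runs as an ordinary function.
(def xs (int-array [10 20 30]))
(let [f aget]
  (println (f xs 1)))   ; prints 20
```

This is why the let-bound version produces no reflection warning: the reflective RT call only appears when the compiler inlines an unhinted aget.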
So you might wonder what the point of inlining is. Well, check out these results:
user> (def xa (int-array (range 1000000))) ;; going to one million elements
#'user/xa

user> (let [f aget] (time (dotimes [n 1000000] (f xa n))))
"Elapsed time: 187.219 msecs"

user> (time (dotimes [n 1000000] (aget ^ints xa n)))
"Elapsed time: 8.562 msecs"
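It turns out that in your example, once you get past the reflection warnings, your new bottleneck is the reduce part rather than the array access. This example eliminates that and shows an order-of-magnitude advantage for the type-hinted, inlined aget.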
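If you still want the sum without going through a lazy seq and reduce, one possible approach (a sketch, not something measured here) is areduce with a type-hinted array. The function name sum-ints and the var xs are assumptions for this example.

```clojure
;; Illustrative sketch; `sum-ints` and `xs` are hypothetical names.
(def xs (int-array (range 100000)))

(defn sum-ints
  "Sums an int array with a primitive loop. The ^ints hint lets the
   compiler inline aget without reflection."
  [^ints arr]
  (areduce arr i acc 0 (+ acc (aget arr i))))

;; Timings depend on the machine and JVM warm-up, so none are claimed here.
(time (sum-ints xs))
```

The accumulator starts as the long 0, so the total does not overflow even though the elements are ints.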