ArrayFire will make your code run as fast as possible. It beats efforts to manually write CUDA or OpenCL kernels. It beats compiler optimizations. It beats other libraries. ArrayFire is the best way to accelerate your code.
ArrayFire developers are amazingly talented at accelerating code; that's all we do - ever!
The array object is beautifully simple. It's fun to use!
Array-based notation expresses computational algorithms in readable, math-like form. You do not need expertise in parallel programming to use ArrayFire. A few lines of ArrayFire code accomplish what can take hundreds of complicated lines in CUDA or OpenCL kernels.
Save yourself from verbose templates, ineffective and complicated compiler directives, and time-wasting low-level development. Arrays are the best possible way to accelerate your code.
You can easily switch between CUDA and OpenCL with ArrayFire, without changing your code.
ArrayFire contains hundreds of functions for matrix arithmetic, signal processing, linear algebra, statistics, image processing, and more. Each function is hand-tuned by ArrayFire developers with all possible low-level optimizations.
ArrayFire operates on common data shapes and sizes, including vectors, matrices, volumes, and N-dimensional arrays. It supports common data types, including single and double precision floating point values, complex numbers, booleans, and 32-bit signed and unsigned integers.
ArrayFire can be used on its own in a stand-alone application or integrated with existing CUDA or OpenCL code. All ArrayFire arrays can be interchanged with other CUDA or OpenCL data structures.
ArrayFire performs run-time analysis of your code to increase arithmetic intensity and memory throughput, while avoiding unnecessary temporary allocations. It has an awesome internal JIT compiler to make optimizations for you.
ArrayFire can also execute loop iterations in parallel with the gfor construct.
ArrayFire supports easy multi-GPU or multi-device scaling.
Here's an example of ArrayFire code. You create [arrays](Array allocation, initialization) that reside on CUDA or OpenCL devices. Then you can use ArrayFire functions on those arrays.
    // sample 20 million points on the GPU
    array x = randu(20e6), y = randu(20e6);
    array dist = sqrt(x * x + y * y);

    // pi is 4 times the fraction of points that fell inside the unit circle
    array pi = 4.0 * sum(dist < 1) / 20e6;
    print(pi);
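For comparison, the same Monte Carlo estimate can be written as a scalar loop in standard C++. This is a minimal CPU sketch, not ArrayFire code: the helper name `estimate_pi`, the choice of `std::mt19937`, and the seed parameter are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <random>

// Hypothetical helper (not part of ArrayFire): estimate pi by sampling
// n uniform random points in the unit square on the CPU, one at a time.
double estimate_pi(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    std::size_t inside = 0;
    for (std::size_t i = 0; i < n; ++i) {
        double x = unif(gen);
        double y = unif(gen);
        // Same test as `dist < 1` in the array version, without the sqrt:
        // x*x + y*y < 1 exactly when sqrt(x*x + y*y) < 1.
        if (x * x + y * y < 1.0) {
            ++inside;
        }
    }

    // The quarter circle covers pi/4 of the unit square.
    return 4.0 * static_cast<double>(inside) / static_cast<double>(n);
}
```

Calling `estimate_pi(1000000, 42)` from `main` should return a value close to 3.14. The loop makes the per-point work explicit, whereas the array version expresses the same computation as whole-array operations.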