Case Studies
Learn how ArrayFire has worked in real code, including applications in academia, finance, government, life sciences, manufacturing, media, and oil & gas. With ArrayFire, you get the best in GPU and accelerator computing, ensuring real success in your business and research objectives.
Academia
|
Accelerating LTE Simulation Tsinghua University |
Speedup: 3X |
![]() |
Accelerating LTE Simulation
Authors: Yuan Gao, Yin Sun, Chun Hui Zhou, Xin Su, Xi Bin Xu, Shi Dong Zhou, Tsinghua University
Simulations are a driving force in many research projects, but long simulation times can be a drag on them. This article uses the work on 3GPP LTE system simulation by Yuan Gao et al. (Tsinghua University, Beijing) to show how AccelerEyes software can significantly improve simulator performance and shorten validation times. Last Updated: 8 Aug 2011 |
|
High Performance Compressive Sensing Rice University |
Speedup: 5X |
![]() |
High Performance Compressive Sensing
Authors: Nabor Reyna and Wotao Yin from Rice University
This work deals with the reconstruction of signals using partial Fourier matrices (RecPF). The major computational components of the algorithm are shrinkage and FFTs. AccelerEyes software is employed to accelerate this compute-heavy code. Last Updated: 27 Jul 2011 |
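As a rough illustration of the shrinkage step mentioned above, the sketch below implements plain soft-thresholding, the operator that "shrinkage" usually refers to in sparse reconstruction. It is an assumption that this matches the variant used in RecPF; the function name `shrink` and the sample values are hypothetical, and no GPU library is involved.

```python
def shrink(values, threshold):
    """Soft-threshold each value: pull it toward zero by `threshold`, clamping at zero."""
    result = []
    for v in values:
        magnitude = abs(v) - threshold
        if magnitude <= 0:
            # Values within the threshold band collapse to exactly zero.
            result.append(0.0)
        else:
            # Larger values keep their sign but lose `threshold` in magnitude.
            result.append(magnitude if v > 0 else -magnitude)
    return result


print(shrink([3.0, -0.5, 1.0, -2.5], 1.0))  # [2.0, 0.0, 0.0, -1.5]
```

Because each element is processed independently, this kind of operation maps naturally onto data-parallel hardware.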
|
Power System Simulations Indian Institute of Technology, Roorkee, India |
Speedup: 35X |
![]() |
Power Flow on the GPU
Authors: Indian Institute of Technology, Roorkee
Power flow studies are one of the most important aspects of power system planning and operation. The power flow reveals the sinusoidal steady-state characteristics of the entire system (voltages, real and reactive power generated and absorbed, and line losses), elucidating the voltage magnitudes and angles at each bus, the generation of each generating unit, and the real and reactive power losses in the system. All this is necessary to ensure the security, economy, and control of electrical energy distribution. Learn how AccelerEyes software can deliver large performance improvements over CPU-based solutions. Last Updated: 18 Apr 2010 |
|
Antenna Array Simulations University of Naples Federico II |
Speedup: 4.5X |
![]() |
Design and simulate echo generators
Authors: A. Capozzoli, C. Curcio, A. Liseno at University of Naples Federico II
Antenna array design involves repeated simulation to tune the many parameters involved, and waiting around for simulations to finish is no fun. Offloading the optimization problem onto the GPU cuts that time down significantly. In their recent paper, Capozzoli, Curcio, and Liseno of the University of Naples Federico II demonstrated how a simple modification to their echo generator array simulation took advantage of the GPU to bring immediate speedups. Last Updated: 20 Jul 2011 |
|
Laplace Transform Inversion Acunum Algorithms and Simulations |
Speedup: 3.8X |
![]() |
Laplace Transform Inversion on the GPU
Authors: Patrick Kano and Moysey Brio at Acunum Algorithms and Simulations
The numerical inversion of the Laplace transform is a long-standing problem due to its implicit ill-posedness. Patrick Kano and Moysey Brio of Acunum Algorithms and Simulations, with their experience in computational methods and algorithm development, found a solution that not only works but is also very fast. Last Updated: 13 May 2011 |
|
Compressed Sensing for Image Reconstruction College of Engineering, Roorkee, India |
Speedup: 8X |
![]() |
Compressed Sensing Algorithms
Authors: Kuldeep Yadav, Ankush Mittal, M.A. Ansar and Avi Srivastava, College of Engineering, Roorkee, India
Compressed sensing is critical in areas such as medical image reconstruction, image acquisition, and sensor networks. A compressed sensing algorithm built on Basis Pursuit shows over 8X speedup when run on an NVIDIA GPU. Last Updated: 5 May 2011 |
|
Fat/Water Reconstruction for Medical Images Case Western Reserve University |
Speedup: 11.6X |
![]() |
Improved Fat/Water Reconstruction Algorithm
Authors: D. H. Johnson, S. Narayan, C. A. Flask and D. L. Wilson, Case Western Reserve University
Case Western Reserve University researchers turned to GPUs running AccelerEyes software to develop a fast and robust version of the "Iterative Decomposition of water and fat with an Echo Asymmetry and Least-squares" (IDEAL) reconstruction algorithm. The algorithm relies on many image processing routines for reconstruction and was shown to achieve very high speedups. Last Updated: 25 Mar 2011 |
Finance
|
Option Pricing Koch Supply & Trading |
Speedup: 51.8X |
![]() |
Option Pricing
Authors: Koch Supply & Trading
Andrew Shin, Market Risk Manager of Koch Supply & Trading, achieves significant performance increases on option pricing algorithms by using AccelerEyes software to accelerate his code with GPUs. Andrew says, "My buddy and I are, at best, novice programmers and we couldn't imagine having to figure out how to code all this in CUDA." But he found AccelerEyes software to be straightforward. With these results, he says he can see AccelerEyes software and GPUs populating Koch's mark-to-futures cube, which contains its assets, simulations, and simulated asset prices. Last Updated: 13 Aug 2012 |
|
GPU Computing in Automated Trader Automated Trader |
Speedup: 37.5X |
![]() |
GPU Computing in Automated Trader
Authors: Automated Trader
The Q1 2012 issue of Automated Trader contains an excellent Mashup piece reviewing software for algorithmic trading. The article provides a wonderful glimpse into the 1-2 month adventure of Andy Webb, Automated Trader's founder, and the Wrecking Crew as they built a fast trading platform from several technologies. The full trading platform they built was quite extensive. The part that caught our eye was the core computational component of the pipeline, which involved permuting 1,000 potential pairs and running cointegration tests for 350 time windows on each potential pair. Last Updated: 28 Feb 2012 |
GPUs in Quantitative Analytics and Finance
Private Bank
![]() |
The world of quantitative finance is all about getting accurate results really, really fast. AccelerEyes is working with one of the largest banks in Spain to maximize their output using GPUs. This case study gives an overview of the uses of GPU computing in finance. Last Updated: 17 Mar 2010 |
Government
|
Powering Mars Research NASA and UAA in Anchorage |
Speedup: 5X |
![]() |
Powering Mars Research
Authors: NASA and UAA in Anchorage
The main thrust of this research is improving Mars rover image compression via GPUs and genetic algorithms. With AccelerEyes software and GPUs, the researchers were able to achieve 5X speedups on the larger data sizes. The algorithm works by pairing neighboring pixels with a random one and then adjusting the random pixel based on whether it incrementally improves the original image. Babb described the algorithm as an embarrassingly parallel process, ideally suited to GPU acceleration. He estimates he has been able to achieve a 20 to 30 percent error reduction in subjects like fingerprints and satellite imagery. Last Updated: 6 Aug 2012 |
|
Radar Image Formation System Planning Corporation |
Speedup: ~45X |
![]() |
Radar Image Formation
Authors: Gary Rubin and Earl Sager - System Planning Corporation
Radar imaging is computationally intensive. As a result, many imaging algorithms apply FFT-based approximations. While efficient, these algorithms sacrifice data fidelity for speed. Other algorithms better preserve information but are often too slow for many applications. At System Planning Corporation (SPC), we have implemented a SAR/ISAR imaging routine based on the backprojection algorithm. Using AccelerEyes software, we have demonstrated speedups of roughly 45X for large datasets. Last Updated: 26 May 2010 |
|
Radar Clutter Reduction System Planning Corporation |
Speedup: 5X - 10X |
![]() |
Radar Clutter Reduction
Authors: David Berger and Gary Rubin - System Planning Corporation
System Planning Corporation (SPC) uses AccelerEyes software to accelerate radar processing algorithms. The system processes raw data from marine navigation radars using a variety of thresholding techniques to extract real targets from clutter. This involves highly data-parallel processing in which each radar pulse is subjected to the same computations; very few operations occur across multiple pulses. Using AccelerEyes software, SPC has achieved 10X speed improvements relative to a Core i7-920 CPU and 5X improvements relative to a real-time DSP implementation. Last Updated: 26 May 2010 |
|
Novel Algorithms for Linear Algebra SAIC |
Speedup: 3.5X |
![]() |
Novel Algorithms for LU Decomposition
Authors: Nolan Davis and Daniel Redig, SAIC
Nolan Davis and Daniel Redig at SAIC recently presented work on "Hybrid GPU/Multicore Solutions for Large Linear Algebra Problems", in which they developed a novel algorithm for LU decomposition, one of the most important routines in linear algebra. They presented a hybrid CPU/GPU computing approach in which even problems too large to fit in GPU memory can be solved faster than on the CPU alone. Last Updated: 26 Jul 2011 |
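For readers unfamiliar with the routine, here is a minimal textbook LU factorization (Doolittle form, no pivoting) in plain Python. It is not the hybrid CPU/GPU algorithm presented by Davis and Redig, only a sketch of what an LU decomposition computes; the helper names `lu_decompose` and `matmul` are hypothetical.

```python
def lu_decompose(a):
    """Factor a square matrix into unit-lower L and upper U so that L * U == a.

    Textbook Doolittle elimination without row pivoting, so it fails on a zero pivot.
    """
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [row[:] for row in a]  # work on a copy, leave the input untouched
    for k in range(n):
        if upper[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does no row pivoting")
        for i in range(k + 1, n):
            factor = upper[i][k] / upper[k][k]
            lower[i][k] = factor
            # Eliminate column k from row i.
            for j in range(k, n):
                upper[i][j] -= factor * upper[k][j]
    return lower, upper


def matmul(x, y):
    """Plain matrix product, used here only to check the factorization."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


L, U = lu_decompose([[4.0, 3.0], [6.0, 3.0]])
print(L)  # [[1.0, 0.0], [1.5, 1.0]]
print(U)  # [[4.0, 3.0], [0.0, -1.5]]
```

Production libraries add row pivoting for numerical stability and work on blocks of the matrix at a time, which is what makes splitting the work between CPU and GPU possible.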
|
Geolocation BAE Systems |
Speedup: 17X |
Geolocation
Authors: BAE Systems
Geolocation is the identification of the real-world geographic location of a target of interest. In this application, the system receives the signal with an array of several antennas and computes the direction of arrival of the radio energy by measuring the time difference of arrival (or the phase difference) at the different antennas. Last Updated: 13 Apr 2009 |
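The relationship between a measured delay (or phase difference) and the direction of arrival can be sketched for the simplest case of two antennas and a distant source. This is a far-field, two-element illustration under assumed names (`arrival_angle_from_tdoa`, `arrival_angle_from_phase`), not the BAE Systems implementation, which uses an array of several antennas.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def _angle_from_sine(sine_value):
    """Convert sin(theta) to theta, rejecting physically impossible inputs."""
    if abs(sine_value) > 1.0:
        raise ValueError("measurement is larger than the antenna spacing allows")
    return math.asin(sine_value)


def arrival_angle_from_tdoa(delay_s, spacing_m):
    """Angle from broadside (radians) given the arrival-time difference.

    Far-field geometry: the extra path length is spacing * sin(theta).
    """
    return _angle_from_sine(SPEED_OF_LIGHT * delay_s / spacing_m)


def arrival_angle_from_phase(phase_rad, spacing_m, wavelength_m):
    """Angle from broadside (radians) given the phase difference.

    Uses phase = 2 * pi * spacing * sin(theta) / wavelength.
    """
    return _angle_from_sine(phase_rad * wavelength_m / (2.0 * math.pi * spacing_m))


# Hypothetical example: half-wavelength spacing and a quarter-cycle phase lead.
angle = arrival_angle_from_phase(math.pi / 2, spacing_m=0.5, wavelength_m=1.0)
print(math.degrees(angle))
```

With more than two antennas, each pair yields its own estimate, and the estimates are combined to resolve ambiguity and reduce noise.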
|
Tsunami Modeling University of Minnesota, Boise State, Saint Scholastica, and NCAR |
Speedup: 3X - 5X |
![]() |
Tsunami Modeling
Authors: University of Minnesota, Boise State, Saint Scholastica, and NCAR
Catastrophic natural disasters like tsunamis commonly strike with little warning. Most people underrate tsunamis as major hazards, sometimes wrongly believing that they occur infrequently and only along distant coasts. Tsunamis are usually caused by earthquakes. Seismic signals can give some margin of warning, since tsunami waves travel at 1/30 the speed of seismic waves. Still, there is little time between the creation of a tsunami and its impact, making fast processing critical to producing effective warning systems. AccelerEyes software was used to run an RBF simulation on the GPU with a time to solution not achievable with the alternatives. Last Updated: 20 Dec 2009 |
Life Sciences
|
Parallelized Gene Predictors University of Quebec |
Speedup: 43X |
![]() |
Authors: University of Quebec
Computerized approaches to studying the human genome are challenged by the exploding amount of data, which doubles roughly every 6 months. To deal with these burgeoning datasets, demand for faster processing continues to grow. This work focuses on predicting genes using frequency analysis with FFTs and with an equivalent technique known as Goertzel's algorithm. The emphasis of the paper is on offering geneticists and molecular biologists tools for predicting or identifying new genes using existing complementary strategies. The criteria for these tools are speed, reliability, accuracy, and ease of use, so that little training is required. Last Updated: 26 Jun 2012 |
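To make the comparison between FFT-style frequency analysis and Goertzel's algorithm concrete, the sketch below computes the power of a single frequency bin both ways. It uses a generic numeric test signal rather than genomic data, and the function names `goertzel_power` and `dft_power` are hypothetical; how the authors encode sequences and choose bins is not covered here.

```python
import cmath
import math


def goertzel_power(samples, k):
    """Squared magnitude of DFT bin k, computed with the Goertzel recurrence."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # Second-order recurrence: one multiply and two adds per sample.
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def dft_power(samples, k):
    """Same quantity from the DFT definition, for comparison."""
    n = len(samples)
    total = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
    return abs(total) ** 2


# A cosine that completes exactly 3 cycles over 16 samples.
signal = [math.cos(2.0 * math.pi * 3 * i / 16) for i in range(16)]
print(goertzel_power(signal, 3), dft_power(signal, 3))  # both approximately 64
```

Goertzel's algorithm is attractive when only a few bins are needed, since it avoids computing the full spectrum.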
|
Pathology advances with GPUs Northeastern University |
Speedup: 100X+ |
![]() |
Pathology advances with GPUs
Authors: Laboratory for Spectral Diagnosis at Northeastern University
One element of the hyperspectral image analysis workflow that requires more than a traditional desktop workstation or personal computer is hierarchical cluster analysis (HCA). HCA requires a large amount of data space and substantial computation time (~11 hours) for typical datasets on a single-processor personal computer. Rather than following the traditional approach of moving to a lower-level programming language like C or C++ and complex parallel programming paradigms such as OpenMP or the Message Passing Interface (MPI), the lab used graphics processing units (GPUs) and the AccelerEyes software platform. The solution allowed the lab to dramatically increase the performance of the analysis while substantially decreasing the calendar time needed to reach the desired results. Last Updated: 27 May 2010 |
|
Hepatitis C Virus - mutation modeling Centers for Disease Control and Prevention |
Speedup: ~20X |
![]() |
Hepatitis C Virus - mutation modeling
Authors: CDC Research and Development Team
This case study provides a look at biological research regarding coordinated mutations of the Hepatitis C Virus (HCV). AccelerEyes provided collaborative R&D resources and greatly improved the speed of this HCV research through parallelization, reducing the computing time from 40 days to less than 1 day. Most importantly, the conclusion of the case study illustrates that the price-performance of personal supercomputers that leverage GPUs and AccelerEyes software provides a compelling solution versus other architectures and approaches. Last Updated: 10 Sep 2010 |
|
Accelerating the SPM package for Neuroimaging Georgia Institute of Technology |
Speedup: 3.5X |
![]() |
fMRI with SPM in Neuroimaging
Authors: Georgia Institute of Technology
The Georgia Tech team explores the value of AccelerEyes software and GPUs for fMRI workflows within SPM (Statistical Parametric Mapping), a popular software package widely used in neuroscience research. Last Updated: 8 Mar 2010 |
|
Medical Image Compression Indian Institute of Technology, Roorkee, India |
Speedup: 38X |
![]() |
Medical Image Compression
Authors: Jaideep Singh, Ipseeta Aruni, R. Balasubramanian - IIT - Roorkee, India
This study presents the acceleration of a Haar wavelet-based image compression algorithm for medical imaging on the graphics processing unit (GPU) using AccelerEyes software. Due to the bandwidth and image size constraints of medical imaging systems, image compression plays a vital role in reducing the bit rate of transmission or storage. Wavelet-based image compression provides the most promising approach for high quality image compression. Last Updated: 23 Jun 2010 |
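The idea behind Haar wavelet compression can be sketched in one dimension: split a row of pixels into pairwise averages and differences, discard small differences, and reconstruct. This is a single-level, averaging-style variant with hypothetical function names and a made-up sample row, not the study's GPU implementation, which would apply the transform across rows and columns of an image and typically over several levels.

```python
def haar_forward(signal):
    """One Haar level: pairwise averages and pairwise half-differences.

    Assumes the signal has an even number of samples.
    """
    averages = [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]
    return averages, details


def haar_inverse(averages, details):
    """Rebuild the signal: each pair is (average + detail, average - detail)."""
    signal = []
    for a, d in zip(averages, details):
        signal.extend([a + d, a - d])
    return signal


def compress(signal, threshold):
    """Lossy round trip: zero out detail coefficients at or below the threshold."""
    averages, details = haar_forward(signal)
    kept = [d if abs(d) > threshold else 0.0 for d in details]
    return haar_inverse(averages, kept)


row = [10.0, 12.0, 50.0, 20.0]
print(haar_forward(row))   # ([11.0, 35.0], [-1.0, 15.0])
print(compress(row, 2.0))  # [11.0, 11.0, 50.0, 20.0]
```

The zeroed coefficients are what a real codec would avoid storing, which is where the reduction in bit rate comes from.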
|
Brain Displacement Spencer Technologies |
Speedup: 12X |
Brain Displacement
Authors: Spencer Technologies
Spencer describes how AccelerEyes software facilitates the development of fast algorithms enabling observation of brain displacement across depth with sampling density that far surpasses previous benchmarks. Last Updated: 23 Jan 2010 |
|
Multidimensional Scaling for Genomics Leibniz Institute of Plant Genetics and Crop Plant Research |
Speedup: 20X - 35X |
![]() |
Multidimensional Scaling for Genomics
Authors: Leibniz Institute of Plant Genetics and Crop Plant Research
Multidimensional scaling (MDS) is a general computing technique that turns a distance matrix into a set of points, usually in a low-dimensional space, whose pairwise distances approximate the original ones. AccelerEyes software is used to enhance execution of the HiT-MDS procedure and delivers considerable performance improvement. Last Updated: 26 Jun 2009 |
|
Drug Delivery Model Georgia Institute of Technology |
Speedup: 70X |
|
|
Drug Delivery Model
Authors: Georgia Institute of Technology
In this work, the researchers simulate the delivery of a novel nanoparticle chemotherapy drug to cancerous tissue. Simulation allows scientists to predict experimental outcomes and thus reduce the cost of development and time to clinical relevance. The simulation model includes blood vessels, tumor cells, and healthy cells, and an engine to calculate the spatial distributions of both drug and oxygen. AccelerEyes software is used to speed up the diffusion calculations for the drug and oxygen within the tissue. Last Updated: 9 Nov 2008 |
|
Biomedical Infrared Spectroscopy University of Manchester and Nofima Mat, Norway |
Speedup: Hours of runtime reduction |
![]() |
Biomedical Infrared Spectroscopy
Authors: University of Manchester and Nofima Mat, Norway
The authors present an iterative algorithm that applies full Mie scattering theory and avoids noise accumulation by integrating a curve-fitting step. AccelerEyes software and NVIDIA GPUs are leveraged to reduce the time added by the curve-fitting step. Last Updated: 20 May 2010 |
Manufacturing
|
Feature Learning on Images Stanford University |
Speedup: Hours of runtime reduction |
![]() |
Feature Learning Architectures with GPU-acceleration
Authors: Andrew Ng, Stanford University
Stanford researchers in Andrew Ng's group used GPUs and AccelerEyes software to speed up their work on Feature Learning Architectures. They decided to use AccelerEyes software for this study because of the need to quickly evaluate many architectures on thousands of images. AccelerEyes software taps into the immense computing power of GPUs and speeds up research utilizing many images. Last Updated: 9 Apr 2011 |
|
Tomography of Vegetation - Filtered Back-Projection and Non-Uniform FFTs Universita di Napoli Federico II |
Speedup: 10X |
![]() |
Tomography of Vegetation - Filtered Back-Projection and Non-Uniform FFTs
Authors: Drs. Capozzoli, Curcio, di Vico, and Liseno, Universita di Napoli Federico II
In order to investigate changes in forest biomass, scientists use microwave tomography to image the vegetation. At the smallest scale, individual plants can be imaged to investigate branching and growth, but even synthetic aperture radar can reveal large-scale changes in regional ecology. Last Updated: 16 Aug 2011 |
|
Action Recognition with Independent Subspace Analysis Stanford University |
Speedup: 4.4X |
![]() |
Action Recognition with Independent Subspace Analysis
Authors: Quoc Le, Will Zou, Serena Yeung, Andrew Ng, Stanford University
In a paper at CVPR 2011 entitled "Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis", the authors explain how their unsupervised feature learning algorithm competes with other algorithms that are hand-crafted or use learned features. For training, they used a multi-layered stacked convolutional independent subspace analysis (ISA) network. ISA is used for learning features from image patches without supervision. Last Updated: 19 Aug 2011 |
Media & Computer Vision
|
Fast Computer Vision with OpenCV and ArrayFire OpenCV Blogger |
Speedup: 10X |
![]() |
Fast Computer Vision with OpenCV and ArrayFire
Authors: OpenCV Blogger
The OpenCV library is the de facto standard for computer vision and image processing research projects. OpenCV includes several hundred computer vision algorithms aimed at real-time vision applications. This case study shows how to use OpenCV and ArrayFire together. A simple example application demonstrates using OpenCV for webcam access and ArrayFire for some basic processing routines and for displaying results. Last Updated: 24 Aug 2011 |
|
Video Processing |
Speedup: 10X - 20X |
![]() |
Video Processing
Authors: Google and Stanford University
Video content analysis is the basis for categorizing videos and enabling search by content. Growing interest in using sparse-coding methods to extract motion features from video for content analysis led to the application of AccelerEyes software, which improves performance by substantially accelerating the solution of the L1-regularized least-squares optimization problem. Last Updated: 13 Jan 2010 |
|
Action Recognition with Independent Subspace Analysis Stanford University |
Speedup: 4.4X |
![]() |
Action Recognition with Independent Subspace Analysis
Authors: Quoc Le, Will Zou, Serena Yeung, Andrew Ng, Stanford University
In a paper at CVPR 2011 entitled "Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis", the authors explain how their unsupervised feature learning algorithm competes with other algorithms that are hand-crafted or use learned features. For training, they used a multi-layered stacked convolutional independent subspace analysis (ISA) network. ISA is used for learning features from image patches without supervision. Last Updated: 19 Aug 2011 |
|
Music Beat Analysis Georgia Tech |
Speedup: 15X |
![]() |
Music Beat Analysis
Authors: Vidhur Vohra - Georgia Tech
Did you ever wonder how the music visualizer in your media player works? Watching it pulsate in synchrony with the beats of the song is almost as entertaining as listening to the song itself! Researchers have been attempting to detect beats in audio signals for many years, and there are many techniques available, from the simplest (and least accurate) to more complicated algorithms that are highly accurate. All algorithms, though, perform some form of signal processing and frequency analysis, applications highly suited to GPU computing. Last Updated: 11 Aug 2011 |
|
Optimization methods for deep learning Stanford Artificial Intelligence Laboratory |
Speedup: Improved Accuracy |
![]() |
Optimization methods for deep learning
Authors: Stanford Artificial Intelligence Laboratory
Researchers at SAIL (Stanford Artificial Intelligence Laboratory) have done it again. They have successfully used AccelerEyes software to speed up the training part of deep learning algorithms. In their paper titled "On Optimization Methods for Deep Learning", they experiment with some of the well-known training algorithms and demonstrate their scalability across parallel architectures (GPUs as well as multi-machine networks). The algorithms include SGD (stochastic gradient descent), L-BFGS (limited-memory BFGS, used for solving nonlinear problems), and CG (conjugate gradient). Last Updated: 20 Sep 2011 |
|
Feature Learning on Images Stanford University |
Speedup: Hours of runtime reduction |
![]() |
Feature Learning Architectures with GPU-acceleration
Authors: Andrew Ng, Stanford University
Stanford researchers in Andrew Ng's group used AccelerEyes software to speed up their work on Feature Learning Architectures. They decided to use AccelerEyes software for this study because of the need to quickly evaluate many architectures on thousands of images. AccelerEyes software taps into the immense computing power of GPUs and speeds up research utilizing many images. Last Updated: 9 Apr 2011 |
|
Digital Holography for Imaging National University of Ireland, Maynooth |
Speedup: 17X |
![]() |
Digital Holography
Authors: Nitesh Pandey, Damien Kelly, Bryan Hennelly and Thomas Naughton from the National University of Ireland, Maynooth
Digital holography is a powerful imaging technique with many new applications like true 3D display. It allows the capture of both amplitude and phase information of the light reflected off the surface of 3D objects. Researchers at the National University of Ireland, Maynooth are developing techniques based on digital holography for 3D display applications. |
Oil & Gas
|
3D Mantle Convection - Geodynamics Boise State, University of Colorado, University of Minnesota |
Speedup: 2.5X - 4.5X |
![]() |
3D Mantle Convection - Geodynamics
Authors: Boise State, University of Colorado, University of Minnesota
The authors introduce a GPU implementation of three-dimensional mantle convection modeling at a high Rayleigh number to the solid earth geophysics community. They outline code development time, compare the performance of CPUs versus GPUs, and deliver powerful visualizations. Last Updated: 10 Feb 2010 |
|
Ground Water Simulations Louisiana State University |
Speedup: >20X |
![]() |
Lattice Boltzmann Models - Ground Water Simulations
Authors: Kevin R. Tubbs and Frank T-C. Tsai at Louisiana State University
A lattice Boltzmann method (LBM) for solving the shallow water equations and the advection-dispersion equation is developed and implemented on graphics processing unit (GPU)-based architectures. The proposed LBM is implemented on an NVIDIA computing processor in a single-GPU workstation. GPU computing is performed using AccelerEyes software. Mass transport with velocity-dependent dispersion in shallow water flow is simulated by combining the MRT-LBM model and the TRT-LBM model. The GPU parallel performance increases as the grid size increases. The results indicate the promise of the GPU-accelerated LBM for modeling mass transport phenomena in shallow water flows. Last Updated: 1 Dec 2010 |
|
Shallow Water Fluid Flow Louisiana State University |
Speedup: 10X |
Shallow Water Fluid Flow
Authors: Louisiana State University
A lattice Boltzmann method (LBM) is developed on high performance computing (HPC) environments for three-dimensional shallow water flow fields coupled to mass transport. LBM is an attractive method for solving the multilayered shallow water equations because the extension to multiple layers is straightforward and retains the simplicity and advantages of the LBM for mass transport in shallow water flows, as well as its performance on both central processing unit (CPU)-based and graphics processing unit (GPU)-based HPC environments. Last Updated: 6 Sep 2009 |
































