Recent Developments in Parallel Computing in Finance | June 5 – 6, 2014

The Stevanovich Center
5727 S. University Avenue
Chicago, IL 60637

Check-in and breakfast begin at 9:00 AM both days.

Parallel computing has become an essential tool in modern quantitative finance.  The need to analyze massive amounts of data in near real time exceeds the capabilities of even the fastest single-core processor (CPU).  Many computations in quantitative finance lend themselves to parallelization: many partial computations are carried out simultaneously, and at the end the individual results are aggregated into a single result.  The speed of such a parallel computation scales, sometimes linearly, with the number of processor cores.
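The split-compute-aggregate pattern described above can be sketched in a few lines.  This is a hypothetical illustration, not code from any of the talks; it uses Python threads only to show the structure (for CPU-bound numerical work, real implementations use multiple processes, many CPU cores, or GPU kernels):

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    """Worker task: compute one partial result independently."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, n_workers=4):
    """Split the input into chunks, compute partial results
    concurrently, then aggregate them into a single result."""
    size = max(1, len(data) // n_workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(partial_sum, chunks)
    return sum(partials)  # final aggregation step
```

Because the chunks are independent, the work distributes across however many cores are available, which is why such computations can scale with core count.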

An ordinary off-the-shelf processor usually has 2 – 8 cores.  A graphics processor (GPU), i.e. a processor whose main task is to compute the pixels that form a digital image, may however have a thousand or more small cores.  A digital image on a large computer screen contains upwards of 2 million pixels that must be recomputed about 50 times a second, so a GPU represents a large computational asset.  Recently, practitioners, mathematicians and engineers have been working on harnessing this GPU resource for numerical computations in other fields, including finance.

Researchers from academia and industry will report on the search for parallel algorithms for financial computations, while engineers will report on implementation issues and on new and improved hardware solutions.

On the software implementation side there are two main platforms: CUDA and OpenCL.

In addition to CUDA, OpenCL is gaining popularity as an industry standard for parallel computing.  More recently, Microsoft introduced a proprietary technology, C++ Accelerated Massive Parallelism (C++ AMP), as a way to exploit GPGPU computing in C++ programs.


Professionals in the financial services industry have already started exploring and experimenting with these technologies, in many cases using them daily in areas such as pricing financial instruments, risk management, simulation, order execution and data analysis.

The aims of this conference are:

  1. To provide a platform for industry practitioners and solution providers to share their experience of recent developments in parallel computing with students and finance industry professionals
  2. To facilitate discussion of future directions and create opportunities for further research and collaboration among the participants
  3. To give students hands-on experience with GPU parallel computing through tutorials and workshops

Scheduled Speakers


Check-in and Breakfast
9:00 AM

John Reppy   The University of Chicago
10:00 AM
High-level programming models for GPUs


GPUs provide super-computer performance at commodity prices, but are notoriously hard to program.  The standard languages for programming GPUs, CUDA and OpenCL, expose low-level architectural details, such as an explicit memory hierarchy, that the programmer must exploit to maximize performance.  To make the power of GPUs, accelerators, and heterogeneous architectures more widely applicable, we need to lift the level of abstraction away from the details of the hardware threading and memory models.  In this talk, I will discuss ongoing research to develop higher-level parallelism models that map well onto GPUs, such as nested data parallelism and domain-specific models.

John Ashley   NVIDIA
11:00 AM
Latest Research on GPU implementation of explicit and implicit Finite Difference methods in Finance


Based on a presentation by Professor Mike Giles of Oxford, the talk will cover the latest (2014) joint Oxford-NVIDIA research and implementation work on 1D and 3D finite difference solvers on the GPU.  A hybrid PCR (parallel cyclic reduction) method is outlined that takes advantage of the latest GPU features to accelerate computation.  Pointers to the latest codes available for download will be provided.
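For context on what PCR-style methods parallelize: implicit finite difference schemes reduce to tridiagonal linear systems, and the classic serial solver is the Thomas algorithm, whose forward and backward sweeps are inherently sequential.  A minimal sketch of that serial baseline (illustrative only, not the Oxford-NVIDIA code):

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with sub-diagonal a, diagonal b,
    super-diagonal c, and right-hand side d (lists of length n;
    a[0] and c[-1] are unused).  Returns the solution vector x."""
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):                      # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

Each loop iteration depends on the previous one, which is exactly the dependency that cyclic-reduction methods restructure so that a GPU can work on many equations, and many independent systems, at once.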

12:00 – 1:30 PM

Thomas Luu   University College London
1:30 PM
Parallel non-uniform random number generation


A variety of probability distributions are used in financial modelling.  The most direct and desirable way to generate a non-uniform random number is by inversion.  This of course requires the ability to evaluate the quantile function (inverse CDF) of a distribution — preferably at speed and with negligible approximation error.  Only a handful of distributions (e.g. exponential) possess quantile functions with closed-form expressions, so numerical methods must be used to simulate the others.  These other distributions (e.g. normal and gamma) are usually the ones used in financial simulations.

This talk will focus on the fast and accurate computation of the normal and gamma quantile functions on GPUs.  Traditional algorithms that evaluate the normal quantile function (e.g. Wichura and Moro) are not optimal for GPUs.  This is because these algorithms split the input domain and treat each subdomain separately.  This is a significant source of branch divergence. During this talk we will see how branch divergence can be aggressively minimised via changes of variables.  A surprising change of variable for the normal distribution leads to a very fast double-precision GPU algorithm.

Evaluation of the gamma quantile function is more challenging, because of the presence of a shape parameter.  Algorithms based on root finding are known to be too slow for real-time Monte Carlo simulation.  Instead, we take a differential equation approach.  Our differential equations incorporate changes of variables that make the evaluation of the gamma quantile function easier and more efficient on GPUs.  We will see how this leads to a solution that facilitates real-time inversion of the gamma distribution.

The normal and gamma quantile function algorithms described are currently being integrated into the NAG GPU Library.

Trevor Misfeldt   CenterSpace LLC
Andy Gray   Black Crater Software Solutions
2:30 PM


Lecture by Trevor Misfeldt

CenterSpace Software, a leading provider of numerical component libraries for the .NET platform, will give an overview of their NMath math and statistics libraries and how they are being used in industry.  The Premium Edition of NMath offers GPU parallelization.  Support for technologies of interest, including Xeon Phi, C++ AMP and CUDA, will be discussed, as will CenterSpace’s Adaptive Bridge™ technology, which provides intelligent, adaptive routing of computations between CPUs and GPUs.  The presentation will finish with a demonstration followed by performance charts.

Tutorial by Andy Gray

In this hands-on programming tutorial, we will compare and contrast several approaches to a simple algorithmic problem: a straightforward implementation using managed code, a multi-CPU approach using a parallelization library, coupling object-oriented managed abstractions with high-performance native code, and seamlessly leveraging the power of a GPU for massive parallelization.

Raman Sharma    Microsoft Corporation
4:30 PM
How to obtain superior performance for computational finance workloads using C++ AMP (Accelerated Massive Parallelism)


This session provides insight on how to obtain superior performance for computational finance workloads without compromising developer productivity.  C++ AMP technology lets you write STL-like C++ code that runs on GPUs (and CPUs) in a platform- and vendor-agnostic manner.  The session will start with an overview of C++ AMP, dive into its features, list the compilers that support C++ AMP, and showcase the performance characteristics of options-pricing workloads written using C++ AMP.  Attend this talk to see how you can write productive, easy-to-maintain code that offers superior performance.


Check-in and Breakfast
9:00 AM

John Ashley   NVIDIA
10:00 AM
Implementing correlated FX baskets on GPUs


Leveraging library functions and CUDA code, this talk offers a guided tour through the design and implementation of code to compute a covariance matrix, perform a Cholesky decomposition, and then construct a Monte Carlo analysis of the correlated portfolio exposures over time.  The session will consist of a design overview and a code walkthrough.
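The Cholesky-based correlation step at the heart of this pipeline can be sketched briefly.  This is a generic illustration in plain Python, not the CUDA implementation from the talk: factor the covariance matrix as A = L Lᵀ, then multiply independent standard normals by L to obtain correlated draws:

```python
import math
import random

def cholesky(A):
    """Lower-triangular Cholesky factor L of a symmetric
    positive-definite matrix A (list of lists), so A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def correlated_normals(L, rng):
    """Map independent standard normals z to correlated draws L z,
    whose covariance matrix is L L^T = A."""
    z = [rng.gauss(0.0, 1.0) for _ in L]
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(L))]
```

In a Monte Carlo exposure calculation, each simulated scenario draws one such correlated vector; on a GPU, thousands of scenarios are generated in parallel.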

12:00 – 1:30 PM

Jerry Hanweck   Hanweck Associates
1:30 PM
NVIDIA GPU – Accelerated Stochastic Volatility Modeling for Derivatives Pricing


Pricing derivatives using stochastic volatility models, such as Heston’s, is computationally intensive.  In this presentation, we will review some popular stochastic volatility models and their solutions for options pricing, with a focus on hardware-accelerated numerical methods employing NVIDIA GPUs and CUDA.
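To give a flavor of why Heston pricing is computationally intensive and parallelizes well, here is a minimal full-truncation Euler Monte Carlo sketch of the model (an illustrative assumption of discretization scheme and parameter names, not the presenter's implementation); each simulated path is independent, which maps naturally onto GPU threads:

```python
import math
import random

def heston_paths(s0, v0, kappa, theta, sigma, rho, r, T,
                 n_steps, n_paths, seed=0):
    """Full-truncation Euler simulation of the Heston model.
    s0: spot, v0: initial variance, kappa: mean-reversion speed,
    theta: long-run variance, sigma: vol-of-vol, rho: correlation
    between the price and variance Brownian motions, r: rate.
    Returns a list of terminal asset prices."""
    rng = random.Random(seed)
    dt = T / n_steps
    terminal = []
    for _ in range(n_paths):
        s, v = s0, v0
        for _ in range(n_steps):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
            vp = max(v, 0.0)  # full truncation keeps variance usable
            s *= math.exp((r - 0.5 * vp) * dt + math.sqrt(vp * dt) * z1)
            v += kappa * (theta - vp) * dt + sigma * math.sqrt(vp * dt) * z2
        terminal.append(s)
    return terminal
```

Accurate prices require many paths and fine time steps, so the path-level independence is exactly what hardware acceleration on GPUs exploits.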

Peter M. Phillips   Aon Benfield
Aamir Mohammad   Aon Benfield
2:30 PM
Visual DSL for Actuarial Models – An Industrial Experience Report


Actuarial models for Variable Annuities are characterized by a high degree of logical complexity, a heavy computational burden, and significant implementation risk.  We describe PathWise Modeling Studio (PWMS), a spreadsheet-like visual DSL (domain-specific language) that simplifies the model implementation process while allowing computations to be accelerated on Graphics Processing Units (GPUs) and cloud computing resources.  We comment on the PWMS user experience in an industry setting and on the general applicability of visual DSLs to financial modeling.

Michael D’Mello   Intel
3:30 PM

Part 1: Empowering Financial Services Applications for Intel® Xeon® and Intel® Xeon Phi™ architectures using the Intel® Software Tools

Part 2: Workshop on empowering Financial Services Applications for Intel® Xeon® and Intel® Xeon Phi™ architectures using the Intel® Software Tools


Part 1:  Leveraging the latest hardware platforms and software techniques for performance is a must in today’s financial industry.  This presentation provides an overview of hardware and software from Intel® Corporation that together constitute some of the most powerful solutions available to address the needs of the marketplace in general and the financial industry in particular.  The overview will also touch on how certain key developments in computing have been integrated into common practice.

Part 2:  This workshop provides hands-on experience to students interested in leveraging the latest features of the Intel® Xeon® platform and Intel® Xeon Phi™ coprocessor using the Intel® Parallel Studio XE and Intel® Cluster Studio XE suites of software development tools.  Topics covered include programming models, performance analysis, threading, vectorization, math library usage, and compiler-based optimization.

Organizing Committee

Niels Nygaard  University of Chicago
John Reppy  University of Chicago
Chanaka Liyanaarachchi  University of Chicago
John Ashley  Senior Solutions Architect
Financial Services, NVIDIA Corporation

The Stevanovich Center is supported by the generous philanthropy of University of Chicago Trustee Steve G. Stevanovich, AB ’85, MBA ’90.