Recent Developments in Parallel Computing in Finance | June 5 – 6, 2014
GPUs provide super-computer performance at commodity prices, but are notoriously hard to program. The standard languages for programming GPUs, CUDA and OpenCL, expose low-level architectural details, such as an explicit memory hierarchy, that the programmer must exploit to maximize performance. To make the power of GPUs, accelerators, and heterogeneous architectures more widely applicable, we need to lift the level of abstraction away from the details of the hardware threading and memory models. In this talk, I will discuss ongoing research to develop higher-level parallelism models that map well onto GPUs, such as nested data parallelism and domain-specific models.
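As a rough illustration of the nested-data-parallelism idea (a toy sketch, not code from the talk), a nested computation can be flattened into operations over a flat array plus a segment descriptor; on a GPU the segmented sum below would become a single segmented-scan primitive:

```python
# Illustrative sketch: flattening a nested data-parallel computation into
# flat-array operations. Function names are invented for this example.

def flatten(nested):
    """Represent a list of lists as (flat data, segment lengths)."""
    data = [x for seg in nested for x in seg]
    lengths = [len(seg) for seg in nested]
    return data, lengths

def segmented_sum(data, lengths):
    """Sum each segment of the flat array; on a GPU this maps to a
    segmented scan with one thread per element."""
    sums, i = [], 0
    for n in lengths:
        sums.append(sum(data[i:i + n]))
        i += n
    return sums

nested = [[1, 2, 3], [], [4, 5]]
data, lengths = flatten(nested)
print(segmented_sum(data, lengths))  # [6, 0, 9]
```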
John Ashley NVIDIA
Latest Research on GPU Implementation of Explicit and Implicit Finite Difference Methods in Finance
Based on a presentation by Professor Mike Giles of Oxford, the talk will cover the latest (2014) joint Oxford-NVIDIA research and implementation work on 1D and 3D finite difference solvers on the GPU. A hybrid PCR method is outlined that takes advantage of the latest GPU features to accelerate computation. Pointers to the latest codes available for download will be provided.
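The cyclic-reduction family of tridiagonal solvers behind the hybrid PCR method can be sketched as follows; this is a plain textbook parallel cyclic reduction in Python, not the Oxford-NVIDIA code (in CUDA, each equation's update would run in its own thread):

```python
# Textbook parallel cyclic reduction (PCR) for a tridiagonal system
#   a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i]
# Each reduction step updates every equation independently, which is why
# PCR maps well to one-thread-per-unknown GPU kernels.

def pcr_solve(a, b, c, d):
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    s = 1
    while s < n:
        na, nb, nc, nd = a[:], b[:], c[:], d[:]
        for i in range(n):
            lo, hi = i - s, i + s
            alpha = -a[i] / b[lo] if lo >= 0 else 0.0
            gamma = -c[i] / b[hi] if hi < n else 0.0
            na[i] = alpha * a[lo] if lo >= 0 else 0.0
            nc[i] = gamma * c[hi] if hi < n else 0.0
            nb[i] = (b[i]
                     + (alpha * c[lo] if lo >= 0 else 0.0)
                     + (gamma * a[hi] if hi < n else 0.0))
            nd[i] = (d[i]
                     + (alpha * d[lo] if lo >= 0 else 0.0)
                     + (gamma * d[hi] if hi < n else 0.0))
        a, b, c, d = na, nb, nc, nd
        s *= 2
    # After ~log2(n) steps every equation is fully decoupled.
    return [d[i] / b[i] for i in range(n)]

# 1D Poisson-like system whose exact solution is x = [1, 1, 1, 1]
x = pcr_solve([0, -1, -1, -1], [2, 2, 2, 2], [-1, -1, -1, 0], [1, 0, 0, 1])
print(x)
```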
12:00 – 1:30 PM
Thomas Luu University College London
Parallel non-uniform random number generation
A variety of probability distributions are used in financial modelling. The most direct and desirable way to generate a non-uniform random number is by inversion. This of course requires the ability to evaluate the quantile function (inverse CDF) of a distribution — preferably at speed and with negligible approximation error. Only a handful of distributions (e.g. exponential) possess quantile functions with closed-form expressions, so numerical methods must be used to simulate the others. These other distributions (e.g. normal and gamma) are usually the ones used in financial simulations.
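For a distribution whose quantile function does have a closed form, inversion is a one-liner; a minimal sketch for the exponential distribution (rate parameter `lam` is this example's notation):

```python
import math
import random

def exponential_by_inversion(u, lam):
    """Invert the exponential CDF F(x) = 1 - exp(-lam * x),
    giving the quantile function x = -ln(1 - u) / lam."""
    return -math.log(1.0 - u) / lam

random.seed(0)
samples = [exponential_by_inversion(random.random(), lam=2.0)
           for _ in range(100_000)]
print(sum(samples) / len(samples))  # sample mean, close to 1 / lam = 0.5
```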
This talk will focus on the fast and accurate computation of the normal and gamma quantile functions on GPUs. Traditional algorithms for evaluating the normal quantile function (e.g. Wichura and Moro) are not optimal for GPUs, because they split the input domain and treat each subdomain separately, which is a significant source of branch divergence. During this talk we will see how branch divergence can be aggressively minimised via changes of variables. A surprising change of variables for the normal distribution leads to a very fast double-precision GPU algorithm.
Evaluation of the gamma quantile function is more challenging, because of the presence of a shape parameter. Algorithms based on root finding are known to be too slow for real-time Monte Carlo simulation. Instead, we take a differential equation approach. Our differential equations incorporate changes of variables that make the evaluation of the gamma quantile function easier and more efficient on GPUs. We will see how this leads to a solution that facilitates real-time inversion of the gamma distribution.
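The differential-equation idea can be illustrated on the normal distribution (the talk's specific equations and changes of variables are not reproduced here): any quantile function Q satisfies Q'(u) = 1/f(Q(u)), where f is the density, so Q can be advanced from a known point by a standard ODE integrator:

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_quantile_ode(u, steps=1000):
    """Integrate Q'(u) = 1 / f(Q(u)) from the known point Q(0.5) = 0
    with classical RK4. Illustration only: the methods discussed in the
    talk add changes of variables to make this efficient on GPUs."""
    x = 0.0
    h = (u - 0.5) / steps
    f = lambda y: 1.0 / normal_pdf(y)
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

print(round(normal_quantile_ode(0.9), 4))  # ≈ 1.2816
```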
The normal and gamma quantile function algorithms described are currently being integrated into the NAG GPU Library.
Trevor Misfeldt CenterSpace LLC
Andy Gray Black Crater Software Solutions
Lecture by Trevor Misfeldt
CenterSpace Software, a leading provider of numerical component libraries for the .NET platform, will give an overview of their NMath math and statistics libraries and how they are being used in industry. The Premium Edition of NMath offers GPU parallelization. Support for technologies of interest, including Xeon Phi, C++ AMP, and CUDA, will be discussed, along with CenterSpace’s Adaptive Bridge™ technology, which provides intelligent, adaptive routing of computations between CPUs and GPUs. The presentation will finish with a demonstration and performance charts.
Tutorial by Andy Gray
In this hands-on programming tutorial, we will compare and contrast several approaches to a simple algorithmic problem: a straightforward implementation using managed code, a multi-CPU approach using a parallelization library, coupling object-oriented managed abstractions with high-performance native code, and seamlessly leveraging the power of a GPU for massive parallelization.
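As a language-neutral illustration of the first two approaches (the tutorial itself uses .NET), the same embarrassingly parallel loop can be written serially and then mapped across a worker pool; the workload and names below are invented for this sketch:

```python
from concurrent.futures import ThreadPoolExecutor

def price_one(strike):
    """Toy stand-in for a per-instrument pricing kernel."""
    return sum((strike + i) ** 0.5 for i in range(1000))

strikes = list(range(8))

# Approach 1: straightforward serial implementation.
serial = [price_one(k) for k in strikes]

# Approach 2: the same kernel mapped across a worker pool. In CPython a
# ProcessPoolExecutor would sidestep the GIL for truly CPU-bound kernels.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(price_one, strikes))

print(serial == parallel)  # True
```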
Raman Sharma Microsoft Corporation
How to obtain superior performance for computational finance workloads using C++ AMP (Accelerated Massive Parallelism)
This session provides insight on how to obtain superior performance for computational finance workloads without compromising developer productivity. C++ AMP technology lets you write C++ STL-like code that runs on GPUs (and CPUs) in a platform- and vendor-agnostic manner. The session will start with an overview of C++ AMP, dive into its features, list the compilers that support C++ AMP, and showcase the performance characteristics of options-pricing workloads written using C++ AMP. Attend this talk to see how you can write productive, easy-to-maintain code that offers superior performance.
FRIDAY, JUNE 6
Check-in and Breakfast
John Ashley NVIDIA
Implementing correlated FX baskets on GPUs
A guided tour: leveraging library functions and CUDA code, the talk will walk through the design and implementation of code to compute a covariance matrix, perform a Cholesky decomposition, and then construct a Monte Carlo analysis to compute the correlated portfolio exposures over time. This session will consist of a design overview and a code walkthrough.
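The pipeline described (covariance matrix, Cholesky factor, correlated draws) can be sketched in a few lines; this is a pure-Python toy, not the talk's CUDA/library implementation:

```python
import math
import random

def cholesky(A):
    """Lower-triangular L with L * L^T == A, for a small SPD matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def correlated_normals(L, rng):
    """Multiply independent standard normals by L to impose the covariance."""
    z = [rng.gauss(0.0, 1.0) for _ in L]
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(L))]

cov = [[0.04, 0.018], [0.018, 0.09]]   # toy 2-asset covariance (invented)
L = cholesky(cov)
rng = random.Random(42)
draws = [correlated_normals(L, rng) for _ in range(100_000)]
est = sum(x * y for x, y in draws) / len(draws)
print(est)  # sample cross-moment, close to cov[0][1] = 0.018
```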
12:00 – 1:30 PM
Jerry Hanweck Hanweck Associates
NVIDIA GPU-Accelerated Stochastic Volatility Modeling for Derivatives Pricing
Pricing derivatives using stochastic volatility models, such as Heston’s, is computationally intensive. In this presentation, we will review some popular stochastic volatility models and their solutions for options pricing, with a focus on hardware-accelerated numerical methods employing NVIDIA GPUs and CUDA.
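A minimal Euler-Maruyama sketch of the Heston dynamics, using full-truncation handling of the variance (parameter values and names below are assumptions for illustration, not from the talk):

```python
import math
import random

def heston_path(S0, v0, r, kappa, theta, xi, rho, T, steps, rng):
    """One Euler-Maruyama path of the Heston model:
        dS = r*S dt + sqrt(v)*S dW1,  dv = kappa*(theta - v) dt + xi*sqrt(v) dW2,
    with corr(dW1, dW2) = rho; negative variance is floored at zero
    inside the square roots (full truncation)."""
    dt = T / steps
    S, v = S0, v0
    for _ in range(steps):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        vp = max(v, 0.0)
        S *= math.exp((r - 0.5 * vp) * dt + math.sqrt(vp * dt) * z1)
        v += kappa * (theta - vp) * dt + xi * math.sqrt(vp * dt) * z2
    return S

rng = random.Random(7)
payoffs = [max(heston_path(100, 0.04, 0.02, 1.5, 0.04, 0.3, -0.7,
                           1.0, 100, rng) - 100, 0.0)
           for _ in range(10_000)]
price = math.exp(-0.02) * sum(payoffs) / len(payoffs)
print(round(price, 2))  # rough ATM call estimate; Monte Carlo noise remains
```

On a GPU, each path would run in its own thread, which is what makes this workload such a good fit for CUDA.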
Peter M. Phillips Aon Benfield
Aamir Mohammad Aon Benfield
Visual DSL for Actuarial Models – An Industrial Experience Report
Actuarial models for Variable Annuities are characterized by a high degree of logical complexity, computational burden, and implementation risk. We describe PathWise Modeling Studio (PWMS), a spreadsheet-like visual DSL that simplifies the model implementation process while allowing computations to be accelerated on Graphics Processing Units (GPUs) and cloud computing resources. We comment on the PWMS user experience in an industry setting and on the general applicability of visual DSLs to financial modeling.
Michael D’Mello Intel
Part 1: Empowering Financial Services Applications for Intel® Xeon® and Intel® Xeon Phi™ architectures using the Intel® Software Tools
Part 2: Workshop on empowering Financial Services Applications for Intel® Xeon® and Intel® Xeon Phi™ architectures using the Intel® Software Tools
Part 1: Leveraging the latest hardware platforms and software techniques for performance is a must in today’s Financial industry. This presentation provides an overview of hardware and software from Intel® Corporation that together constitute some of the most powerful solution offerings available to address the needs of the marketplace in general and the Financial industry in particular. This overview will also touch on how certain key developments in computing have been integrated into common practice.
Part 2: This workshop provides hands-on experience to students interested in leveraging the latest features of the Intel® Xeon® platform and Intel® Xeon Phi™ coprocessor using the Intel® Parallel Studio XE and Intel® Cluster Studio XE suites of software development tools. Topics covered include programming models, performance analysis, threading, vectorization, math library usage, and compiler-based optimization.
Niels Nygaard University of Chicago
John Reppy University of Chicago
Chanaka Liyanaarachchi University of Chicago
John Ashley Senior Solutions Architect
Financial Services, NVIDIA Corporation
The Stevanovich Center is supported by the generous philanthropy of University of Chicago Trustee Steve G. Stevanovich, AB ’85, MBA ’90.