# Financial Statistics Conference Program | September 26 – 28, 2014

##### Abstract

The particle Gibbs (PG) sampler is a systematic way of using a particle filter within Markov chain Monte Carlo (MCMC). This results in an off-the-shelf Markov kernel on the space of state trajectories, which can be used to simulate from the full joint smoothing distribution for a state space model in an MCMC scheme. We show that the PG Markov kernel is uniformly ergodic under rather general assumptions, which we will carefully review and discuss. In particular, we provide an explicit rate of convergence which reveals that *(i)* for a fixed number of data points, the convergence rate can be made arbitrarily good by increasing the number of particles, and *(ii)* under general mixing assumptions, the convergence rate can be kept constant by increasing the number of particles superlinearly with the number of observations. We illustrate the applicability of our result by studying in detail a stochastic volatility model with a non-compact state space. This work is joint with R. Douc and F. Lindsten.

10:40 AM

**Qiwei Yao** London School of Economics

*Estimation of Extreme Quantiles for Functions of Dependent Random Variables*

##### Abstract

Motivated by a concrete risk management problem in the financial industry, we propose a new method for estimating the extreme quantiles of a function of several dependent random variables. In contrast to the conventional approach based on extreme value theory, we do not impose the condition that the tail of the underlying distribution admits an approximate parametric form, and, furthermore, our estimation makes use of the full observed data. The proposed method is semiparametric, as no parametric forms are assumed for any of the marginal distributions. Instead, we select appropriate bivariate copulas to model the joint dependence structure, taking advantage of recent developments in constructing large dimensional vine copulas. A sample quantile from a large bootstrap sample drawn from the fitted joint distribution is then taken as the estimate of the extreme quantile. This estimator is proved to be consistent as long as the quantile to be estimated is not too extreme. The reliable and robust performance of the proposed method is further illustrated by simulation.

11:20 AM

**Zhengjun Zhang** University of Wisconsin at Madison

*Copula Structured M4 Processes with Application to High-Frequency Financial Data*

##### Abstract

Statistical applications of classical parametric max-stable processes are still sparse, mostly due to a lack of 1) efficiency of statistical estimation of the many parameters in the processes, 2) flexibility in concurrently modeling asymptotic independence and asymptotic dependence among variables, and 3) capability of fitting real data directly. This paper studies a more flexible model, namely a class of copula structured M4 (multivariate maxima and moving maxima) processes, CSM4 for short. CSM4 processes are constructed by incorporating sparse random coefficients and structured extreme value copulas in asymptotically (in)dependent M4 (AIM4) processes. As a result, the new model overcomes all of the aforementioned constraints. The paper illustrates these new features and advantages of the CSM4 model using simulated examples and real data of intra-daily maxima of high-frequency financial time series. The paper also studies probabilistic properties of the proposed model, statistical estimators and their properties.

12:00 – 1:30 PM

**Lunch**

1:30 PM

**Zheng Tracy Ke** University of Chicago

*Covariate Assisted Multivariate Screening*

##### Abstract

Given a linear regression model, we consider variable selection in a rather challenging situation: the columns of the design matrix are moderately or even heavily correlated, and signals (i.e., nonzero regression coefficients) are rare and weak. In such a case, popular penalization methods are simply overwhelmed. An alternative approach is screening, which has recently become popular. However, univariate screening, while fast, suffers from “signal cancellations”, and exhaustive multivariate screening may overcome “signal cancellations” but is computationally infeasible. We discover that in multivariate screening, if all we wish is to overcome the “signal cancellations”, it is not necessary to screen all m-tuples in an exhaustive fashion: most of them can be safely skipped. In light of this, we propose covariate-assisted multivariate screening (CAS) as a new approach to variable selection, where we first construct a sparse graph using the Gram matrix, then use this graph to decide which m-tuples to skip and which to screen. CAS has a modest computational cost and is able to overcome the challenge of “signal cancellations”. We demonstrate the advantage of CAS over penalization methods and univariate screening in a “rare and weak” signal model. We show that our method yields the optimal convergence rate on the Hamming selection error and the optimal phase diagram. CAS is a flexible idea for incorporating correlation structures into inferences. We discuss its possible extensions to multiple testing and feature ranking.

2:10 PM

**Wei Biao Wu** University of Chicago

*Estimation of High-dimensional Vector Auto-regressive Processes*

##### Abstract

We will present a systematic theory for high-dimensional linear models with dependent errors and/or dependent covariates. To study properties of estimates of the regression parameters, we adopt the framework of functional dependence measures. For the covariates two schemes are addressed: the random design and the deterministic design. For the former we apply the constrained L1 minimization approach, while for the latter the Lasso estimation procedure is used. We provide a detailed characterization of how the error rates of the estimates depend on the moment conditions that characterize the tail behaviors, the dependencies of the underlying processes that generate the errors and covariates, the dimension and the sample size. Our theory substantially extends earlier ones by allowing dependent and/or heavy-tailed errors and covariates.

2:50 – 3:30 PM

**Break**

3:30 PM

**Clifford Hurvich** NYU

*Drift in Transaction-Level Asset Price Models*

##### Abstract

We study the effect of drift in pure-jump transaction-level models for asset prices in continuous time, driven by point processes. The drift is assumed to arise from a nonzero mean in the efficient shock series. It follows that the drift is proportional to the driving point process itself, i.e. the cumulative number of transactions. This link reveals a mechanism by which properties of intertrade durations (such as heavy tails and long memory) can have a strong impact on properties of average returns, thereby potentially making it extremely difficult to determine growth rates. We focus on a basic univariate model for log price, coupled with general assumptions on durations that are satisfied by several existing flexible models, allowing for both long memory and heavy tails in durations. Under our pure-jump model, we obtain the limiting distribution for the suitably normalized log price. This limiting distribution need not be Gaussian, and may have either finite variance or infinite variance. We show that the drift can affect not only the limiting distribution for the normalized log price, but also the rate in the corresponding normalization. Therefore, the drift (or equivalently, the properties of durations) affects the rate of convergence of estimators of the growth rate, and can invalidate standard hypothesis tests for that growth rate. Our analysis also sheds some new light on two longstanding debates as to whether stock returns have long memory or infinite variance.

4:10 PM

**Ruey Tsay** University of Chicago

*Time Evolution of Income Distributions*

##### Abstract

We propose a new method to investigate whether and how the evolution of the income distribution (ID) of a population could be interpreted by a set of time-varying explanatory factors. The proposed method compares the time series of estimated ID with a hypothetical ID sequence generated by the explanatory factors and the subgroup-share sequences when the population is divided into subgroups. The proposed method is also applicable to exploring various aspects of the ID evolution, such as the growth (inequality) trend measured by the mean (Gini coefficient), and to decomposing the changes in the ID, mean, or Gini coefficient over time in search of explanations. In an empirical application, we apply the proposed method to assessing whether and how Taiwan’s family ID (FID) evolution from 1981 to 2010 could be explained by the transition of family structure. This work is joint with Yi-Ting Chen.

5:00 – 6:00 PM

**Reception**

### Saturday, September 27

9:30 AM

**Registration and Breakfast**

10:00 AM

**Laurent E. Calvet** HEC Paris

*Robust Filtering*

##### Abstract

Filtering methods are powerful tools to estimate the hidden state of a state-space model from observations available in real time. However, they are known to be highly sensitive to the presence of small misspecifications of the underlying model and to outliers in the observation process. In this paper, we show that the methodology of robust statistics can be adapted to sequential filtering. We define a filter as being robust if the relative error in the state distribution caused by misspecifications is uniformly bounded by a linear function of the perturbation size. Since standard filters are nonrobust even in the simplest cases, we propose robustified filters which provide accurate state and parameter inference in the presence of model misspecifications. In particular, the robust particle filter naturally mitigates the degeneracy problems that plague the bootstrap particle filter (Gordon, Salmond and Smith, 1993) and its many extensions. We illustrate the good properties of robust filters in linear and nonlinear state-space examples. This work is joint with Veronika Czellar and Elvezio Ronchetti.

10:40 AM

**Eric Renault** Brown University

*Indirect Inference for Estimating Equations*

##### Abstract

Indirect inference is a method to estimate a parameter Θ in models where brute force likelihood maximization can be problematic, either because no analytical closed form is available for the likelihood function (e.g. dynamic models with latent variables) or because non-linearities or boundaries of the parameter space make likelihood optimization computationally involved. Indirect inference estimates Θ by targeting some auxiliary parameters ν that are a function ν(Θ) of the structural parameters and for which a first step consistent estimator is readily available. In this paper, auxiliary parameters ν are such that the prior knowledge of their value would make inference on Θ much simpler. By contrast with standard two-step approaches, we are interested in situations where no nuisance parameters are present but occurrences of ν(Θ) within the estimating equations create complexities for estimation. These difficult occurrences of the parameters, which are a nuisance when it comes to solving estimating equations, show up in many situations often handled by indirect inference. Leading examples are again models with latent variables (their observation would make estimation much simpler), models where it is simpler to first set the focus of inference on marginal distributions, models with highly nonlinear objective functions, etc. Based on targeting and penalization of the auxiliary parameters, we propose a new two-step estimation that leads to stable and user-friendly computations. Moreover, estimators delivered in the second step of the estimation procedure are asymptotically efficient. We compare this new method with existing iterative methods in the framework of copula models and asset pricing models as well. Simulation results illustrate that this new method performs better than existing iterative procedures and is (almost) computationally equivalent. This work is joint with David Frazier.

11:20 AM

**Jianqing Fan** Princeton University

*Projected Principal Component Analysis for Factor Models*

##### Abstract

This paper introduces Projected Principal Component Analysis (Projected-PCA), which is based on the projection of the data matrix onto a given linear space before performing the principal component analysis. When applied to high-dimensional factor analysis, the projection removes idiosyncratic noisy components. We show that the unobserved latent factors can be estimated more accurately than by conventional PCA if the projection is genuine, or more precisely if the factor loading matrices are related to the projected linear space, and that they can be estimated accurately when the dimensionality is large, even when the sample size is finite. To estimate factor loadings more accurately, we propose a flexible semi-parametric factor model, which decomposes the factor loading matrix into a component that can be explained by subject-specific covariates and an orthogonal residual component. The covariates' effects on the factor loadings are further modeled by an additive model via sieve approximations. Using the newly proposed Projected-PCA, we obtain the rates of convergence of the smooth factor loading matrices, which are much faster than those of conventional factor analysis. The convergence is achieved even when the sample size is finite and is particularly appealing in the high-dimension-low-sample-size situation. This leads us to develop nonparametric tests of whether observed covariates have explanatory power for the loadings and whether they fully explain the loadings. Finally, the proposed method is illustrated by both simulated data and the returns of the components of the S&P 500 index. This work is joint with Yuan Liao and Weichen Wang.
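The project-then-PCA idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `projected_pca`, the use of a plain least-squares projection onto the columns of a covariate matrix `X` (instead of a sieve basis), and the normalization `F'F/T = I` are all assumptions made for illustration.

```python
import numpy as np

def projected_pca(Y, X, n_factors):
    """Illustrative project-then-PCA sketch (hypothetical helper).

    Y: (p, T) data matrix, one row per subject, one column per time point.
    X: (p, d) matrix whose columns span the linear space to project onto.
    Returns factors F of shape (T, n_factors) and loadings G of shape (p, n_factors).
    """
    p, T = Y.shape
    # Orthogonal projection matrix onto the column space of X.
    P = X @ np.linalg.pinv(X)
    Y_proj = P @ Y
    # PCA on the projected data: leading eigenvectors of the T x T Gram matrix.
    gram = Y_proj.T @ Y_proj / p
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_factors]
    F = np.sqrt(T) * eigvecs[:, order]  # normalized so that F.T @ F / T = I
    G = Y_proj @ F / T                  # loadings lie in the column space of X
    return F, G

# Small synthetic example (all sizes and values are arbitrary assumptions).
rng = np.random.default_rng(0)
p, T, d, K = 50, 10, 3, 2
X = rng.normal(size=(p, d))
true_F = rng.normal(size=(T, K))
true_G = X @ rng.normal(size=(d, K))            # loadings explained by covariates
Y = true_G @ true_F.T + rng.normal(size=(p, T))  # factor structure plus noise
F, G = projected_pca(Y, X, K)
```

Because the noise is projected out before the eigen-decomposition, the sketch works with a small number of time points `T`, which mirrors the finite-sample-size setting the abstract emphasizes.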

12:00 – 1:30 PM

**Lunch**

1:30 PM

**Nikolaus Hautsch** University of Vienna

*Estimating the Spot Covariation of Asset Prices – Statistical Theory and Empirical Evidence*

##### Abstract

We propose a new type of estimator for the spot covariance matrix of a multi-dimensional semi-martingale log asset price process which is subject to noise. The estimator is constructed based on a local average of block-wise constant spot covariance estimates. The latter originate from the local method of moments (LMM) proposed by Bibinger et al. (2014), building on locally constant approximations of the underlying process. We extend the LMM estimator to allow for autocorrelated noise and propose a consistent estimator of the order of serial dependence. We prove the consistency and asymptotic normality of the proposed spot covariance estimator and show that it benefits from the near rate-optimality of the underlying LMM approach. Based on extensive simulations, we provide empirical guidance on the optimal implementation of the estimator and apply it to high-frequency data based on a cross-section of NASDAQ blue chip stocks. Employing the estimator to estimate spot covariances, correlations and betas in normal, but also extreme-event periods, yields novel insights into intraday covariance and correlation dynamics. We show that intraday (co-)variations (i) follow underlying periodicity patterns, (ii) reveal substantial intraday variability associated with (co-)variation risk, (iii) are strongly serially correlated, and (iv) can increase strongly and nearly instantaneously if new information arrives.

2:10 PM

**Lan Zhang** University of Illinois at Chicago

*Assessment of Uncertainty in High Frequency Data: The Observed Asymptotic Variance*

##### Abstract

High frequency inference has generated a wave of research interest among econometricians and practitioners, as indicated by the increasing number of estimators based on intra-day data. However, we also witness a scarcity of methodology to assess the uncertainty (the standard error) of the estimator. The root of the problem is that, with or without the presence of microstructure, standard errors rely on estimating the asymptotic variance (AVAR), and often this asymptotic variance involves substantially more complex quantities than the original parameter to be estimated. Standard errors are important: they are used to assess the precision of estimators in the form of confidence intervals, to create “feasible statistics” for testing, and also when building forecasting models based on, say, daily estimates. The contribution of this paper is to provide an alternative and general solution to this problem, which we call “Observed Asymptotic Variance”. It is a general nonparametric method for assessing asymptotic variance (AVAR), and it provides consistent estimators of AVAR for a broad class of integrated parameters. The spot parameter process (the integrand) can be a general semimartingale, with continuous and jump components. The construction and the analysis work well in the presence of microstructure noise, and when the observation times are irregular or asynchronous in the multivariate case. The edge effect (phasing in and phasing out the information on the boundary of the data interval) of any relevant estimator is also analyzed and treated rigorously. As part of the theoretical development, the paper shows how to feasibly disentangle the effect of the estimation error and the variation in the parameter spot alone. For the latter, we obtain a consistent estimator of the quadratic variation (QV) of the parameter to be estimated, for example, the QV of the leverage effect. The methodology is valid for a wide variety of estimators, including the standard ones for variance and covariance, and also estimators of, for example, leverage effects, high frequency betas, and semi-variance. This work is joint with Per Mykland.

2:50 – 3:30 PM

**Break**

3:30 PM

**Ilze Kalnina** University of Montreal

*The Idiosyncratic Volatility Puzzle: A Reassessment at High Frequency*

##### Abstract

We consider a nonparametric continuous-time Fama-French factor model, which can be estimated with high frequency data. Our theoretical framework allows us to obtain precise estimates of the factor loadings without the usual assumption of piecewise constant betas. It also allows us to decompose the idiosyncratic volatility into its jump and continuous components. We provide an asymptotic theory for the betas and for each idiosyncratic volatility component. Empirically, we use the panel of all traded stocks from the NYSE, AMEX, and NASDAQ stock markets for 1993-2012 to construct the Fama-French factors at 5-minute frequency, and apply the new methodology to the idiosyncratic volatility puzzle documented by Ang et al. (2006). This work is joint with Yacine Ait-Sahalia and Dacheng Xiu.

4:10 PM

**J. E. Figueroa-Lopez** Purdue University

*Optimally Thresholded Realized Power Variations for Lévy Jump Diffusion Models*

##### Abstract

Thresholded Realized Power Variations are popular nonparametric estimators for continuous-time processes with jumps. An important issue in their application lies in the necessity of choosing a suitable threshold for the estimator. In this talk, a selection method for the threshold is proposed based on desirable optimality properties of the estimators. Concretely, we introduce a well-posed optimization problem which, for a fixed sample size and time horizon, selects a threshold that minimizes the expected total number of jump misclassifications committed by the thresholding mechanism. The leading term of the optimal threshold sequence is shown to be proportional to Lévy’s modulus of continuity of the underlying Brownian motion, hence theoretically justifying and sharpening several selection methods previously proposed in the literature based on power functions or multiple testing procedures. Furthermore, building on the aforementioned asymptotic characterization, we develop an estimation algorithm, which allows for a feasible implementation of the newfound optimal sequence and extensions to larger classes of processes. This work is joint with Jeff Nisen.
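As a rough illustration of the estimator this abstract refers to, the sketch below sums the powers of those increments whose absolute value does not exceed a threshold, and builds a threshold proportional to Lévy's modulus of continuity, sqrt(2 h log(1/h)). The function names, the volatility input `sigma`, and the proportionality constant `c` are hypothetical; the abstract does not state the constant or the estimation algorithm.

```python
import numpy as np

def thresholded_power_variation(x, threshold, power=2.0):
    """Sum of |increment|**power over increments with |increment| <= threshold."""
    increments = np.diff(np.asarray(x, dtype=float))
    keep = np.abs(increments) <= threshold  # larger increments are treated as jumps
    return float(np.sum(np.abs(increments[keep]) ** power))

def levy_modulus_threshold(sigma, h, c=1.0):
    """Threshold proportional to Levy's modulus of continuity of Brownian motion.

    sigma: volatility level (assumed known here), h: time step with 0 < h < 1,
    c: hypothetical proportionality constant, not taken from the abstract.
    """
    return c * sigma * np.sqrt(2.0 * h * np.log(1.0 / h))

# Toy path with one obvious jump between the third and fourth observations.
path = [0.0, 0.01, -0.01, 1.0, 1.01]
tpv = thresholded_power_variation(path, threshold=0.1, power=2.0)
thr = levy_modulus_threshold(sigma=1.0, h=np.exp(-2.0))
```

In the toy path the jump of size 1.01 is excluded, so only the three small increments contribute to `tpv`.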

### Sunday, September 28

9:30 AM

**Registration and Breakfast**

10:00 AM

**Rong Chen** Rutgers University

*Dynamic Modeling and Prediction of Risk Neutral Densities*

##### Abstract

Risk neutral density is extensively used in option pricing and risk management in finance. It is often implied from observed option prices through a complex nonlinear relationship. In this study, we model the dynamic structure of risk neutral density through time, and investigate modeling approaches, estimation methods, and prediction performance. State space models, the Kalman filter, and sequential Monte Carlo methods are used. Simulation and real data examples are presented.

10:40 AM

**Marc Hallin** ECARES, Université libre de Bruxelles and ORFE, Princeton University

*General Dynamic Factors and Volatilities*

##### Abstract

Whether static or dynamic, factor model methods in the analysis of large panels of time series so far are exclusively based on the covariance structure of levels or returns, hence do not say anything about volatilities. A simple idea would consist in calling “market volatility” the volatility of the common component in a factor representation of returns. Market volatility, however, does not only impact the level-common shocks, but typically is also present in the level-idiosyncratic ones. Based on this observation, we propose a two-stage generalized dynamic factor joint analysis for the returns and the volatilities. Our approach is entirely model-free and avoids curse of dimensionality problems. It allows for volatility prediction and dynamic portfolio optimization. When applied to S&P100 asset returns data, the method confirms that a considerable proportion of the common volatility of returns originates in their idiosyncratic components. Based on joint work with Matteo Barigozzi, LSE, London.

11:20 AM

**Ching-Kang Ing** Academia Sinica, Taipei

*Toward Optimal Model Averaging in Regression Models with Serially Correlated Errors*

##### Abstract

Consider a regression model with infinitely many parameters and autocorrelated errors. We are interested in choosing weights for averaging a set of misspecified models, thereby leading to a reliable estimator of the mean function. We propose an “autocorrelation-corrected Mallows model averaging” (AMMA) estimator which selects the weights by minimizing a Mallows-type criterion based on the feasible generalized least squares (FGLS) estimator. One distinctive feature of our approach is that a consistent estimator of the high-dimensional inverse covariance matrix of the error process is provided to construct the FGLS estimator. We show that our AMMA estimator is asymptotically efficient in the sense that the resultant averaging estimator of the mean function achieves the lowest possible conditional mean squared error. We also conduct an extensive Monte Carlo study to demonstrate the finite sample performance of AMMA in comparison with several existing model averaging methods.

12:00 – 1:30 PM

**Lunch**

1:30 PM

**Mohsen Pourahmadi** Texas A&M University

*Cholesky-log-GARCH Multivariate Volatility Models: Asset Ordination*

##### Abstract

Parsimonious modeling of high-dimensional covariance matrices is of fundamental importance in multivariate statistics and in finance, where forecasting the time-varying volatility matrices of several asset returns is of interest. The commonly used multivariate GARCH-type models impose strong restrictions on the coefficient matrices to ensure positive-definiteness, and they do not scale up easily to higher dimensions. We consider two alternative (sequential) models which are effectively univariate in nature and better suited for larger portfolios. They are based on the modified Cholesky decomposition (MCD) of the time-varying covariance matrices and the standard Cholesky decomposition (SCD) of their correlation matrices, respectively. The unconstrained regression-based and hyperspherical parametrizations of the two Cholesky factors allow using covariates to model them parsimoniously. Each Cholesky factor is then combined with univariate log-GARCH models for their respective time-varying volatilities. These methods require an order among the assets; we discuss the issue of asset ordination in detail and compare the performances of a regression (BIC)-based ordering and a random sample of permutations of assets. Application of the proposed methodologies to two real financial datasets of 12 and 92 assets reveals better performance of the SCD and a random sample of asset permutations.
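For readers unfamiliar with the modified Cholesky decomposition, the sketch below shows one common formulation for a single fixed covariance matrix: find a unit lower triangular matrix `T` and positive variances `D` with `T Σ T' = diag(D)`. The abstract does not spell out the definition, so this formulation, the function name `modified_cholesky`, and the example matrix are assumptions; the time-varying and log-GARCH parts of the talk are not covered.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) with T unit lower triangular and T @ sigma @ T.T = diag(D).

    Built from the standard Cholesky factor L (sigma = L @ L.T): scaling each
    column of L by its diagonal entry gives a unit lower triangular C with
    sigma = C @ diag(D) @ C.T, and T is the inverse of C.
    """
    L = np.linalg.cholesky(sigma)
    d_sqrt = np.diag(L)
    C = L / d_sqrt          # divides column j by L[j, j]
    T = np.linalg.inv(C)    # inverse of a unit lower triangular matrix
    D = d_sqrt ** 2         # innovation variances
    return T, D

# Arbitrary positive-definite example matrix (illustrative values only).
sigma = np.array([[4.0, 2.0, 0.4],
                  [2.0, 3.0, 0.5],
                  [0.4, 0.5, 1.0]])
T_mat, D = modified_cholesky(sigma)
```

The below-diagonal entries of `T_mat` are unconstrained, which is what makes regression-style modeling with covariates convenient; they also depend on the order of the variables, which is why the ordering of assets matters.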

2:10 PM

**David Stoffer** University of Pittsburgh

*Stochastic Volatility: Not Just Another State Space Model*

##### Abstract

The stochastic volatility model (SVM) has become one of the standard models used to explain the volatility of an asset. And while appearing to be a simple extension of the linear Gaussian state space model (SSM), the SVM can be a pain to fit. In this talk, I will discuss some aspects of fitting these models; I’ll also discuss the analysis of some extensions of SVM.

2:50 PM

**Concluding Remarks**

**Scientific Committee**

**Local Organizing Committee** Per Mykland, Wei Biao Wu, and Mary King

**Conference Location** 5727 S. University

Chicago, IL 60637

The Stevanovich Center is supported by the generous philanthropy of University of Chicago Trustee Steve G. Stevanovich, AB ’85, MBA ’90.