Vector Quantile Regression: An Optimal Transport Approach
We propose a notion of conditional vector quantile function and a vector quantile regression. A \emph{conditional vector quantile function} (CVQF) of a random vector $Y$, taking values in $\mathbb{R}^d$ given covariates $Z=z$, taking values in $\mathbb{R}% ^k$, is a map $u \longmapsto Q_{Y\mid Z}(u,z)$, which is monotone, in the sense of being a gradient of a convex function, and such that given that vector $U$ follows a reference non-atomic distribution $F_U$, for instance uniform distribution on a unit cube in $\mathbb{R}^d$, the random vector $Q_{Y\mid Z}(U,z)$ has the distribution of $Y$ conditional on $Z=z$. Moreover, we have a strong representation, $Y = Q_{Y\mid Z}(U,Z)$ almost surely, for some version of $U$. The \emph{vector quantile regression} (VQR) is a linear model for CVQF of $Y$ given $Z$. Under correct specification, the notion produces strong representation, $Y=\beta \left(U\right) ^\top f(Z)$, for $f(Z)$ denoting a known set of transformations of $Z$, where $u \longmapsto \beta(u)^\top f(Z)$ is a monotone map, the gradient of a convex function, and the quantile regression coefficients $u \longmapsto \beta(u)$ have the interpretations analogous to that of the standard scalar quantile regression. As $f(Z)$ becomes a richer class of transformations of $Z$, the model becomes nonparametric, as in series modelling. A key property of VQR is the embedding of the classical Monge-Kantorovich's optimal transportation problem at its core as a special case. In the classical case, where $Y$ is scalar, VQR reduces to a version of the classical QR, and CVQF reduces to the scalar conditional quantile function. An application to multiple Engel curve estimation is considered.
PDF Abstract