CFE-CMStatistics 2025: Tutorials

Tutorials

The tutorials are organized by the COST Action HiTEc (see HiTEc Winter Course 2025) and chaired by Prof. Erricos Kontoghiorghes and Prof. Ana Colubi on behalf of the Action. Conference participants can register for each tutorial separately. For further information, send an email to info@CMStatistics.org.

Dates: 10-12 December 2025.

Venue: MAL 404 + 405, Floor 4, Birkbeck Malet Street (main building, Stairs A, Lift A1, A2), Birkbeck, University of London, UK. For virtual participation, please see below.

Registration and coffee breaks will take place in MAL 151, Birkbeck Malet Street (main building, Stairs B, Lifts B1, B2).

Tutorial I (7 hours)

Probabilistic programming for statistical analysis in Julia

Presenters: Mattias Villani, Stockholm University, Sweden and the developers of the Turing.jl ecosystem at the Alan Turing Institute and University of Cambridge, UK.
Email: Contact

Dates: 10th December 2025.

Julia has emerged as an important language for statistical data analysis and machine learning. It is a high-level language that is easy to learn, yet its just-in-time compilation gives it a speed close to that of C/C++. Despite its relatively young age, Julia already has an impressive set of libraries for statistics and can easily be integrated into a workflow in R or Python.

The first half of this tutorial introduces the Julia programming language with a focus on statistical analysis. The second half focuses on likelihood and Bayesian inference using the Turing.jl probabilistic programming ecosystem in Julia. Participants are encouraged to install Julia and some statistical packages before the tutorial so that they can follow along on their own laptops. More details, including installation instructions, will be posted as we approach the conference.

The following topics will be covered:

Part 1

introduction to the Julia programming language
the package manager and tooling
statistical distributions
optimization and automatic differentiation

Part 2

probabilistic programming for statistical inference using Turing.jl

Tutorial II (6 hours)

Bayesian variable selection

Presenter: Jim Griffin, UCL, UK.
Email: Contact

Dates: 11th December 2025.

The routine availability of large numbers of covariates for many data sets has led to interest in variable selection methods which find a subset of the covariates that explain the variation in the response. I will review Bayesian approaches to variable selection. These automatically provide a measure of model uncertainty through the posterior distribution, which is attractive as an alternative to choosing a best model according to some criterion. I will look at the basic ideas, the choice of prior distributions (which is key for effective variable selection), applications to linear, generalized linear, and nonlinear models, computational approaches, and methods to summarise the posterior distribution. The methods will be illustrated on a range of applications including biology, chemometrics, economics and finance, and a range of data set sizes from tens to thousands of covariates.
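As a rough illustration of how a posterior distribution over models quantifies model uncertainty, the following Python sketch enumerates all subsets of three simulated covariates in a linear model and computes posterior inclusion probabilities. It is not taken from the tutorial material. It assumes a Zellner g-prior with g = n and a uniform prior over models, and the function names and simulated data are hypothetical choices made for illustration only.

```python
import itertools
import math
import random


def r_squared(columns, y):
    """R^2 from an OLS fit of y on an intercept plus the given columns."""
    n, k = len(y), len(columns)
    if k == 0:
        return 0.0
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    # Centring the columns removes the intercept from the normal equations.
    xc = []
    for col in columns:
        m = sum(col) / n
        xc.append([v - m for v in col])
    xty = [sum(u * v for u, v in zip(col, yc)) for col in xc]
    a = [[sum(u * v for u, v in zip(xc[i], xc[j])) for j in range(k)] + [xty[i]]
         for i in range(k)]
    # Solve (X'X) beta = X'y by Gaussian elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(c + 1, k):
            f = a[r][c] / a[c][c]
            for j in range(c, k + 1):
                a[r][j] -= f * a[c][j]
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (a[i][k] - tail) / a[i][i]
    explained = sum(b * v for b, v in zip(beta, xty))
    return explained / sum(v * v for v in yc)


def inclusion_probabilities(columns, y, g):
    """Posterior inclusion probabilities under a g-prior and a uniform model prior."""
    n, p = len(y), len(columns)
    log_bf = {}
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            r2 = r_squared([columns[j] for j in subset], y)
            # Log Bayes factor of this model against the intercept-only model.
            log_bf[subset] = (0.5 * (n - 1 - k) * math.log(1 + g)
                              - 0.5 * (n - 1) * math.log(1 + g * (1 - r2)))
    top = max(log_bf.values())
    weights = {s: math.exp(v - top) for s, v in log_bf.items()}
    total = sum(weights.values())
    return [sum(w for s, w in weights.items() if j in s) / total for j in range(p)]


# Simulated data (an assumption): only the first covariate affects the response.
rng = random.Random(0)
n = 60
x = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(3)]
y = [2.0 * x[0][i] + rng.gauss(0, 1) for i in range(n)]
pip = inclusion_probabilities(x, y, g=n)
print([round(v, 3) for v in pip])
```

Full enumeration is feasible only for a handful of covariates; with thousands of covariates, the posterior has to be explored with computational approaches such as those covered in the tutorial.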

Tutorial III (4 hours)

Small ball probabilities for functional data analysis

Presenter: Enea Bongiorno, Università del Piemonte Orientale, Italy.
Email: Contact

Dates: 12th December 2025 (morning).

This presentation explores the use of Small Ball Probabilities (SmBP) in non-parametric statistics as a powerful tool for the analysis of functional data.

Part I: SmBP and Classification

In the first part, SmBPs are utilized to define a concept of pseudodensity for statistical processes taking values in general spaces. This pseudodensity is then specifically applied within functional spaces to construct (un)supervised classification procedures. This approach provides a new way to measure the concentration of functional data, which is crucial for differentiating between classes or clusters.
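To make the idea concrete, the toy Python sketch below estimates the small ball probability at a curve by the fraction of sample curves within L2 distance h of it, and assigns a new curve to the group where this estimate is largest. This is only an illustration under simplifying assumptions (curves observed on a common uniform grid, equal group sizes, a fixed radius h, hypothetical function names), not the presenter's procedure.

```python
import math


def l2_distance(f, g):
    """Approximate L2([0, 1]) distance between curves sampled on a common uniform grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) / len(f))


def small_ball_estimate(x, sample, h):
    """Fraction of sample curves lying in the ball of radius h centred at x."""
    return sum(l2_distance(x, curve) <= h for curve in sample) / len(sample)


def classify(x, groups, h):
    """Assign x to the group whose estimated small ball probability at x is largest."""
    return max(groups, key=lambda label: small_ball_estimate(x, groups[label], h))


# Two synthetic groups of curves: vertically shifted sines and shifted cosines.
grid = [i / 50 for i in range(50)]
shifts = [-0.2, -0.1, 0.0, 0.1, 0.2]
groups = {
    "A": [[math.sin(2 * math.pi * t) + s for t in grid] for s in shifts],
    "B": [[math.cos(2 * math.pi * t) + s for t in grid] for s in shifts],
}

new_curve = [math.sin(2 * math.pi * t) + 0.05 for t in grid]
h = 0.3
estimates = {name: small_ball_estimate(new_curve, curves, h)
             for name, curves in groups.items()}
label = classify(new_curve, groups, h)
print(estimates, label)
```

The radius h plays the role of a bandwidth: too small and every estimate is zero, too large and the groups become indistinguishable.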

Part II: Complexity and Model Structure

The second part introduces the notion of complexity for stochastic processes, derived by exploiting an appropriate factorizing hypothesis concerning the SmBP. After demonstrating how this concept of complexity generalizes the standard notion of dimensionality, we will proceed to construct a mixture of complexities. Finally, we will illustrate methods for studying and identifying the inherent structural complexity of this mixture when applied to real-world data.

E. G. Bongiorno, L. Chan, A. Goia. Complexity Mixture Processes on Metric Spaces. Journal of Statistical Computation and Simulation, (2025) Accepted. (doi: 10.1080/00949655.2025.2565623)

E. G. Bongiorno, L. Chan, A. Goia. Detecting the Complexity of a Functional Time Series. Journal of Nonparametric Statistics, (2024), 36(3), 600–622. (doi: 10.1080/10485252.2023.2234507)

E. G. Bongiorno, A. Goia, P. Vieu. Estimating the complexity index of functional data: Some asymptotics. Statistics and Probability Letters, 161 June (2020) (doi: 10.1016/j.spl.2020.108731)

E. G. Bongiorno, A. Goia, P. Vieu. Modeling Functional Data. A test procedure. Computational Statistics, 34 (2) June (2019) pp. 451–468 (doi: 10.1007/s00180-018-0816-9)

E. G. Bongiorno, A. Goia, P. Vieu. Evaluating the complexity of some families of functional data. SORT-Statistics and Operations Research Transactions, 42 (1) January-June (2018). (doi: 10.2436/20.8080.02.50)

E. G. Bongiorno, A. Goia. Some Insights About the Small Ball Probability Factorization for Hilbert Random Elements. Statistica Sinica, 27 (2017) pp. 1949–1965. (doi: 10.5705/ss.202016.0128)

E. G. Bongiorno, A. Goia. Classification methods for Hilbert data based on surrogate density. Computational Statistics & Data Analysis, 99 (2016) pp. 204–222. (doi: 10.1016/j.csda.2016.01.019)

Tutorial IV (4 hours)

Measure transportation, statistical inference, and time series

Presenter: Marc Hallin, Université libre de Bruxelles, Belgium.
Email: Contact

Dates: 12th December 2025 (afternoon).

1. Introduction: Measure Transportation in a Nutshell

From Monge and Kantorovich to Brenier and McCann, a user-friendly introduction to measure transportation.

2. The long quest for multivariate quantiles

Quantiles are a fundamental concept in probability and an essential tool in statistics, from descriptive to inferential. Still, until recently, and despite half a century of attempts (motivating the development of copulas, Tukey depth, spatial and geometric quantiles), no fully satisfactory and fully agreed-upon definition of the concept was available beyond the well-understood case of univariate variables and distributions.

The need for such a definition is particularly critical for variables taking values in R^d, for directional variables (values on the hypersphere), and, more generally, for variables with values on manifolds. Indeed, unlike the real line, these domains admit no canonical ordering.

We show how measure transportation brings a solution to this long-standing problem by characterizing distribution-specific (data-driven, in the empirical case) orderings and center-outward quantile functions that satisfy all the properties expected from such concepts while reducing, in the case of real-valued variables, to the classical univariate notions.

3. Multivariate ranks and multivariate rank tests

Distribution functions and ranks are dual to the concepts of quantile functions and empirical quantiles: measure-transportation-based quantiles, by duality, characterize center-outward distribution functions (in populations) and center-outward ranks (in the sample). These ranks can be used to construct distribution-free tests and R-estimators for a variety of problems: two-sample location, MANOVA, multiple-output regression, vector independence, etc., extending the classical theory of Hájek to a multivariate setting.
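A minimal Python sketch of empirical center-outward ranks in dimension two is given below. It assumes a regular grid of n_R x n_S points in the unit disc (n_S directions on each of n_R circles with radii r/(n_R + 1)) and finds the bijection between sample and grid that minimises the sum of squared distances by brute force over all permutations. The function names and the six sample points are hypothetical, and the brute-force search is a stand-in chosen for brevity; any practical implementation would use an assignment solver instead.

```python
import itertools
import math


def center_outward_grid(n_r, n_s):
    """n_r * n_s points in the unit disc: n_s directions on each of n_r circles."""
    return [
        (r / (n_r + 1) * math.cos(2 * math.pi * s / n_s),
         r / (n_r + 1) * math.sin(2 * math.pi * s / n_s))
        for r in range(1, n_r + 1)
        for s in range(n_s)
    ]


def center_outward_ranks(points, n_r, n_s):
    """Optimal sample-to-grid assignment and the resulting center-outward ranks."""
    grid = center_outward_grid(n_r, n_s)
    n = len(points)
    if n != len(grid):
        raise ValueError("the sample size must equal n_r * n_s")
    cost = [[(px - gx) ** 2 + (py - gy) ** 2 for gx, gy in grid]
            for px, py in points]
    # Brute force over all bijections: only viable for very small samples.
    best = min(itertools.permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))
    # Grid index = (r - 1) * n_s + s, so the rank r is recovered by integer division.
    ranks = [best[i] // n_s + 1 for i in range(n)]
    return best, ranks


sample = [(0.1, 0.2), (2.0, 0.3), (-1.5, 1.0), (-0.3, -0.4), (0.5, -2.2), (-0.2, 1.1)]
assignment, ranks = center_outward_ranks(sample, n_r=2, n_s=3)
print(assignment, ranks)
```

Because the grid is fixed in advance, each rank value occurs exactly n_S times whatever the distribution of the sample, which is the source of the distribution-freeness mentioned above.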

4. Multiple-output quantile regression and multivariate quantile autoregression

Among the most powerful applications of classical quantiles is quantile regression, introduced in a pathbreaking paper by Koenker and Bassett (Econometrica 1978). Unlike traditional mean regression, which deals with the dependence of the expected value of a variable of interest on a set of covariates, quantile regression models the dependence of the entire conditional distribution via its quantiles.
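In the classical univariate setting, a quantile of order tau can be characterised as a minimiser of the expected check loss rho_tau(u) = u (tau - 1{u < 0}), and quantile regression replaces the constant by a function of the covariates. The short Python sketch below illustrates only the covariate-free case on a toy data set; the function names are hypothetical.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def sample_quantile_by_check_loss(data, tau):
    """Data point minimising the total check loss, i.e. an empirical tau-quantile."""
    return min(data, key=lambda q: sum(check_loss(x - q, tau) for x in data))


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
median = sample_quantile_by_check_loss(data, 0.5)
lower_quartile = sample_quantile_by_check_loss(data, 0.25)
print(median, lower_quartile)  # prints: 5 3
```

For tau = 0.5 the check loss is half the absolute error, so the minimiser is the sample median; other values of tau weight positive and negative residuals asymmetrically.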

Due to the lack of an appropriate concept of multivariate quantiles, and despite many attempts (based on marginal quantiles or statistical depth), quantile regression has so far been limited to single-output regression. The related concept of quantile autoregression has similarly been limited to univariate time series. Thanks to the measure-transportation-based concept of center-outward quantiles, these powerful methods are extended to multiple-output regression and vector autoregressions.

References

Hallin, M., del Barrio, E., Cuesta-Albertos, J., and Matrán, C. (2021). Center-outward distribution and quantile functions, ranks, and signs in R^d: a measure transportation approach, Annals of Statistics 49, 1139–1165.
Hallin, M., La Vecchia, D., and Liu, H. (2020). Center-Outward R-Estimation for Semiparametric VARMA Models, Journal of the American Statistical Association 117, 925–938.
Hallin, M., La Vecchia, D., and Liu, H. (2021). Rank-based testing for semiparametric VAR models: A measure transportation approach, Bernoulli 29, 229–273.
Hallin, M. (2022). Measure transportation and statistical decision theory, Annual Review of Statistics and Its Application 9, 401–424.
Hallin, M., Hlubinka, D., and Hudecova, S. (2022). Fully distribution-free center-outward rank tests for multiple-output regression and MANOVA, Journal of the American Statistical Association 118, 1923–1939.
Shi, H., Hallin, M., Drton, M., and Han, F. (2022). On universally consistent and fully distribution-free rank tests of vector independence, Annals of Statistics 50, 1933–1959.
Hallin, M. and Konen, D. (2024). Multivariate quantiles: geometric and measure-transportation-based contours, in Vladik Kreinovich, Woraphon Yamaka, and Supanika Leurcharusmee, Eds, Applications of Optimal Transport to Economics and Related Topics, Proceedings of the 17th International Conference of the Econometrics Society of Thailand, Springer, 61–78.
Shi, H., Hallin, M., Drton, M., and Han, F. (2024). Distribution-free tests of multivariate independence based on center-outward quadrant, Spearman, Kendall, and van der Waerden statistics, Bernoulli 31, 106–129.
Hallin, M. and Liu, H. (2023). Center-outward rank- and sign-based VARMA portmanteau tests: Chitturi, Hosking, and Li–McLeod revisited, Econometrics and Statistics, available online.
Hallin, M., Liu, H., and Verdebout, Th. (2024). Nonparametric measure-transportation-based methods for directional data, Journal of the Royal Statistical Society Series B 86, 1172–1196.
del Barrio, E., Gonzalez-Sanz, A., and Hallin, M. (2025). Nonparametric multiple-output center-outward quantile regression, Journal of the American Statistical Association 120, 818–832.

Programme

Wednesday, 10 December 2025

09:00 - 10:30 Tutorial I
10:30 - 11:00 Coffee break
11:00 - 13:00 Tutorial I
13:00 - 14:30 Lunch break
14:30 - 16:00 Tutorial I
16:00 - 16:30 Coffee break
16:30 - 18:30 Tutorial I

Thursday, 11 December 2025

09:00 - 10:30 Tutorial II
10:30 - 11:00 Coffee break
11:00 - 12:30 Tutorial II
12:30 - 14:00 Lunch break
14:00 - 15:30 Tutorial II
15:30 - 16:00 Coffee break
16:00 - 17:30 Tutorial II

Friday, 12 December 2025

09:00 - 11:00 Tutorial III
11:00 - 11:30 Coffee break
11:30 - 13:30 Tutorial III
13:30 - 15:00 Lunch break
15:00 - 17:00 Tutorial IV
17:00 - 17:30 Coffee break
17:30 - 19:30 Tutorial IV

Instructions for virtual participants to access the tutorials

Read the technical requirements and general information to enter the virtual room. By accessing the tutorials, you accept the conditions.
Log in to the registration tool of CFE-CMStatistics 2025 or HiTEc Winter Course 2025, depending on where you registered, to obtain the password. Only registered participants will have access.
To be redirected to the Zoom room, click here. Once in Zoom, enter the password available on the registration tool.
The conference staff will verify participants in the Zoom rooms. Ensure that you have entered the Zoom meeting with the same name and surname you used to register for the conference. Otherwise, rename yourself as soon as possible. Attendees not on the list of participants will be removed if they fail to identify themselves using the chat.

HiTEc Grants

PhD students and young researchers (under 40 years, according to the COST definition) from eligible COST countries* can apply for a limited number of grants. Grant recipients will receive a daily allowance of 190 euros plus reimbursement of travel expenses of up to 350 euros.

To apply for a grant, candidates should submit their CV by e-mail to hiteccostaction@gmail.com. Candidates will be selected on the basis of their CVs, taking into account the COST inclusiveness criteria.
Deadline for applications: 10th September 2025.
Granted candidates will be informed by e-mail after the deadline and must send their flight tickets and registration within 7 days of the notification to secure their grants. Otherwise, their grants will be revoked and assigned to another candidate.
The granted candidates must attend all the sessions and sign the attendance list in order to obtain their grants.

*Eligible COST countries: Albania, Armenia, Austria, Belgium, Bosnia and Herzegovina, Bulgaria, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Georgia, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, the Republic of Moldova, Montenegro, The Netherlands, The Republic of North Macedonia, Norway, Poland, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey, Ukraine, United Kingdom and Israel.

Organizers and sponsors

Organized by the HiTEc COST Action CA21163 with the collaboration of CFE-CMStatistics.