1st Seminar by CAST—Centre for Applied Statistics and Data Analytics
Recent developments in multivariate methods using multiple scatter matrices
Date: August 10th, 2016 (14h00 - 16h00)
Venue: University of Tampere, Finland
Local Organisation: Ansa Lilja: Ansa.Lilja@uta.fi
The first Seminar by the Centre for Applied Statistics and Data Analytics (CAST) will be held at the School of Health Sciences, University of Tampere, Finland, on August 10th, 2016. It will bring together researchers, other faculty and students interested in applied statistics and data analytics from the University of Tampere and from other universities, research institutes and companies.
The main aims of the seminar events by CAST are: (i) to raise awareness of the importance of statistics and data analysis in research; (ii) to create a forum for discussion where researchers present their work and research questions, followed by discussion and feedback from the audience; (iii) to strengthen the links between schools and research groups that might lead to future collaborations in terms of research articles and funding applications.
The visiting speakers of the seminar are Aurore Archimbaud (Universite Toulouse 1 Capitole), Markus Matilainen (University of Turku), Klaus Nordhausen (University of Turku) and Joni Virta (University of Turku).
Please mark your calendars accordingly and share this information with anyone who may be interested. Registrations.
Wednesday, August 10th, 2016 (group room A308, Arvo building in Kauppi)
14h00 – 14h05: Opening (Klaus Nordhausen)
14h05 – 14h30: Klaus Nordhausen : Is it 'plug & play' or 'plug & pray' in robust multivariate statistics?
14h30 – 15h00: Aurore Archimbaud : Components selection for multivariate outlier detection with ICS
15h00 – 15h30: Markus Matilainen : Some independent component analysis tools for time series data
15h30 – 16h00: Joni Virta : Independent component analysis for tensor-valued data
University of Turku, firstname.lastname@example.org
Title: Is it 'plug & play' or 'plug & pray' in robust multivariate statistics?
Authors: Klaus Nordhausen and David E. Tyler
Abstract: The sample covariance matrix, which is well known to be highly non-robust, plays a central role in many classical multivariate statistical methods. A popular approach for making such multivariate methods more robust is to simply replace the sample covariance matrix with some robust scatter matrix. In this talk we will demonstrate that multivariate methods often require that certain properties of the covariance matrix also hold for the robust scatter matrix in order for the corresponding robust "plug-in" method to be a valid approach, and that not all scatter matrices necessarily possess the desired properties. Plug-in methods for the following three multivariate methods are considered in more detail in this talk: independent components analysis, observational regression and graphical modeling. For each case, it is shown that replacing the sample covariance matrix with a symmetrized robust scatter matrix yields a valid robust multivariate procedure.
TSE-R, University Toulouse 1 Capitole, 21 allee de Brienne, 31000
Title: Components selection for multivariate outlier detection with ICS
Abstract: The detection of a small proportion of multivariate outliers, such as identifying production errors in industrial processes, is an important topic. In this context, the Invariant Coordinate Selection (ICS) method is an efficient identification procedure. The ingenious idea of the method, compared to other multivariate methods such as Principal Component Analysis (PCA) or robust PCA, is to simultaneously diagonalize two scatter matrices. In the case of a small percentage of outliers, the ICS coordinates are ordered decreasingly according to a generalized concept of kurtosis that depends on the pair of scatters considered. Using the coordinates associated with large kurtosis values, the observations far away from the center of the data are declared outliers. One challenging step in the procedure is to select the components that display outliers. Two approaches are introduced and compared. The first one is comparable to a test procedure where the critical value is calculated using simulations. The other approach incorporates univariate normality tests.
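The simultaneous diagonalization described in the abstract can be sketched as follows. This is only a minimal illustration, not the speaker's implementation: it assumes the scatter pair is the sample covariance matrix (COV) and the fourth-moment scatter matrix (COV4), which the abstract does not specify, and the function names `cov4` and `ics` as well as the simulated data are hypothetical. The component selection step discussed in the talk is not covered.

```python
import numpy as np
from scipy.linalg import eigh


def cov4(X):
    """Fourth-moment scatter matrix (COV4) of the rows of X."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(Xc, rowvar=False))
    # Squared Mahalanobis distance of each observation from the mean.
    d2 = np.einsum("ij,jk,ik->i", Xc, S_inv, Xc)
    # Dividing by (p + 2) makes COV4 match COV for Gaussian data.
    return (Xc * d2[:, None]).T @ Xc / (n * (p + 2))


def ics(X):
    """Simultaneously diagonalize COV and COV4 (assumed scatter pair)."""
    S1 = np.cov(X, rowvar=False)
    S2 = cov4(X)
    # Generalized eigenproblem S2 b = rho S1 b; the eigenvectors B
    # satisfy B' S1 B = I and B' S2 B = diag(rho).
    rho, B = eigh(S2, S1)
    order = np.argsort(rho)[::-1]  # decreasing generalized kurtosis
    rho, B = rho[order], B[:, order]
    Z = (X - X.mean(axis=0)) @ B   # invariant coordinates
    return rho, B, Z


# Hypothetical data: 500 Gaussian observations in 4 dimensions,
# the first 10 of which (2%) are shifted along the first variable.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 4))
X[:10, 0] += 10.0

rho, B, Z = ics(X)
# Observations with the largest absolute scores on the first component.
flagged = np.argsort(-np.abs(Z[:, 0]))[:10]
```

With a small cluster of outliers such as this one, the contaminated direction is expected to show up in the first invariant coordinate, the one with the largest generalized kurtosis, so that `flagged` should point to the shifted observations.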
University of Turku, email@example.com
Title: Some independent component analysis tools for time series data
Abstract: Blind Source Separation models are semiparametric models, where the components of an observed p-variate vector x are assumed to be linear combinations of the components of some unobserved p-variate source vector z. In the time series context, the observations are assumed to come from a p-variate time series. We focus on independent component analysis (ICA), which is a special case of Blind Source Separation. We introduce extensions of the classic FOBI (Fourth Order Blind Identification) and JADE (Joint Approximate Diagonalization of Eigen-matrices) estimates and a variant of the SOBI (Second Order Blind Identification) estimate for multivariate time series, with a special focus on time series with stochastic volatility. At the end of the talk some results from a simulation study are presented.
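To make the blind source separation model x = Az concrete, here is a minimal sketch of AMUSE, the well-known single-lag special case of SOBI. It is a basic second-order baseline and not one of the extensions introduced in the talk, and it does not address stochastic volatility. The function name `amuse`, the mixing matrix `A` and the simulated AR(1) sources are all assumptions made for illustration.

```python
import numpy as np


def amuse(X, lag=1):
    """Single-lag second-order separation of a (T, p) time series X."""
    Xc = X - X.mean(axis=0)
    T = Xc.shape[0]
    # Whitening with the symmetric inverse square root of the covariance.
    S0 = Xc.T @ Xc / T
    evals, evecs = np.linalg.eigh(S0)
    S0_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Y = Xc @ S0_inv_sqrt
    # Symmetrized lagged autocovariance matrix of the whitened series.
    R = Y[:-lag].T @ Y[lag:] / (T - lag)
    R = (R + R.T) / 2
    # Its eigenvectors give the rotation that completes the unmixing.
    _, U = np.linalg.eigh(R)
    W = U.T @ S0_inv_sqrt          # unmixing matrix estimate
    return W, Xc @ W.T             # estimated sources as rows z_t = W x_t


# Hypothetical example: two AR(1) sources with different autocorrelation.
rng = np.random.default_rng(0)
T = 5000
Z_true = np.zeros((T, 2))
noise = rng.standard_normal((T, 2))
phi = np.array([0.9, -0.5])
for t in range(1, T):
    Z_true[t] = phi * Z_true[t - 1] + noise[t]

A = np.array([[1.0, 2.0], [3.0, 1.0]])  # assumed mixing matrix
X = Z_true @ A.T                        # observed series, x_t = A z_t

W, Z_hat = amuse(X)
G = W @ A  # should be close to a scaled, signed permutation matrix
```

The sources can only be recovered up to order, sign and scale, which is why the check is on the structure of `G` rather than on `Z_hat` matching `Z_true` directly. The method relies on the sources having distinct autocorrelations at the chosen lag.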
University of Turku, firstname.lastname@example.org
Title: Independent component analysis for tensor-valued data
Authors: Joni Virta (University of Turku, email@example.com), Bing Li, Klaus Nordhausen and Hannu Oja
Abstract: In preprocessing high-dimensional tensor data, e.g. images or videos, a common procedure is to vectorize the observed tensors and subject the resulting vectors to one of the many methods used for independent component analysis (ICA). However, the structure of the original tensor is lost in the vectorization, along with any meaningful interpretation of its modes. To provide a more suitable alternative, we propose the Tensor fourth order blind identification (TFOBI), a tensor-valued analogue of the classic Fourth order blind identification (FOBI), to be used with the semiparametric tensor independent component model. In TFOBI, instead of vectorizing, we stay in the tensor form and in a sense perform FOBI simultaneously on all the modes of the observed tensors. Furthermore, being an extension of FOBI, TFOBI shares its computational simplicity. Simulated and real-world examples are used to showcase the method's usefulness and superiority over the combination of vectorizing and FOBI.