SEMINAR: Big Data: Opportunities and Risks. An Application to Time Series Outlier Detection
- Prof. Daniel Peña - Dep. Estadística - Universidad Carlos III de Madrid
- FCUL - Campo Grande - Building C6, Floor 4 - Room 6.4.30 - 14:30
- Wednesday, 8 February 2017
- Project reference: FCT Project UID/MAT/00006/2013
The talk will discuss how Big Data is changing the methods used to analyze data. It will review some new data analysis procedures proposed for large data sets, along with the risks of applying them to this rich data environment without a statistical model. It will emphasize the importance of cleaning the data before any analysis and propose a new procedure, based on dynamic factor models, for finding outliers in large sets of multivariate time series. The method can find both specific and common outliers and can be applied routinely to clean large data sets of dependent or independent data. The proposed procedure will be illustrated with examples using both simulated and real data.