[HAL] BigDataFr recommends: Sequential linear regression with online standardized data

Abstract

[…] The present study addresses the problem of sequential least squares multidimensional linear regression, particularly in the case of a data stream, using a stochastic approximation process. To avoid the phenomenon of numerical explosion which can be encountered and to reduce the computing time in order to take into account a maximum of arriving data, we propose using a process with online standardized data instead of raw data and the use of several observations per step or all observations until the current step. Herein, we define and study the almost sure convergence of three processes with online standardized data: a classical process with a variable step-size and use of a varying number of observations per step, an averaged process with a constant step-size and use of a varying number of observations per step, and a process with a variable or constant step-size and use of all observations until the current step. […]
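To make the idea in the abstract concrete, here is a minimal Python sketch of the first kind of process it mentions: a stochastic approximation (gradient) update with a decreasing step-size, applied to mini-batches that are standardized with running means and standard deviations. This is an illustration under assumptions, not the authors' exact algorithm: the function name `online_standardized_sgd`, the step-size rule `0.3 / n ** 0.7`, the choice to standardize the response as well as the explanatory variables, and the synthetic data stream are all hypothetical choices made for this example.

```python
import numpy as np


def online_standardized_sgd(batches, n_features, step=lambda n: 0.3 / n ** 0.7):
    """Least-squares regression by stochastic approximation on data that are
    standardized online. Returns (coef, intercept) in the raw units.
    Illustrative sketch only; the step-size rule is an assumption."""
    count = 0
    mean = np.zeros(n_features + 1)  # running means of (x_1..x_p, y)
    m2 = np.zeros(n_features + 1)    # running sums of squared deviations
    std = np.ones(n_features + 1)    # running standard deviations
    theta = np.zeros(n_features)     # coefficients in standardized units

    for n, (X, y) in enumerate(batches, start=1):
        Z = np.column_stack([X, y])
        # Welford update of the running mean and variance, one row at a time.
        for z in Z:
            count += 1
            delta = z - mean
            mean += delta / count
            m2 += delta * (z - mean)
        std = np.sqrt(m2 / count)
        std[std == 0] = 1.0  # guard against constant columns

        # Standardize the current batch with the current running moments.
        Zs = (Z - mean) / std
        Xs, ys = Zs[:, :-1], Zs[:, -1]

        # Gradient step on the mean squared error over the batch.
        grad = Xs.T @ (Xs @ theta - ys) / len(ys)
        theta -= step(n) * grad

    # Map the standardized coefficients back to the raw scale.
    coef = theta * std[-1] / std[:-1]
    intercept = mean[-1] - coef @ mean[:-1]
    return coef, intercept


rng = np.random.default_rng(0)


def stream(n_batches=2000, batch_size=10):
    """Synthetic data stream with very different feature scales (hypothetical)."""
    for _ in range(n_batches):
        X = np.column_stack([rng.normal(0.0, 1.0, batch_size),
                             rng.normal(50.0, 1000.0, batch_size)])
        y = 5.0 + 2.0 * X[:, 0] + 0.003 * X[:, 1] + rng.normal(0.0, 0.1, batch_size)
        yield X, y


coef, intercept = online_standardized_sgd(stream(), n_features=2)
print(coef, intercept)
```

The second feature in the synthetic stream has a scale about a thousand times larger than the first. Running the same gradient update on the raw values with this step-size would make the iterates blow up, which is the kind of numerical explosion the abstract refers to; standardizing online keeps both features on a comparable scale so a single step-size works for all coordinates.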

Read paper

By Kévin Duarte 1,2,3, Jean-Marie Monnez 1,2,3, Eliane Albuisson 4,5
Source: hal-archives-ouvertes.fr

1 – Probabilités et statistiques – IECL – Institut Élie Cartan de Lorraine
2 – BIGS – Biology, genetics and statistics – Inria Nancy – Grand Est, IECL – Institut Élie Cartan de Lorraine
3 – CIC-P Pierre Drouin – Centre d’Investigations Cliniques-Plurithématique [CHU Nancy]
4 – SPI-EAO Santé Publique, Information médicale et Enseignement multimédia Assisté par Ordinateur – Faculté de Médecine [Nancy]
5 – Unité fonctionnelle de la plateforme d’aide à la recherche clinique – ESPRI-Biobase [CHRU de Nancy]
