CMStatistics 2018
Title: Small area model-based estimation using big data: Applications
Authors: Stefano Marchetti - Dipartimento di Economia e Management, Universita di Pisa (Italy) [presenting]
Abstract: National statistical offices aim to produce statistics for citizens and policy-makers. Survey sampling is recognized as an effective method for obtaining timely and reliable estimates for a specific area in socio-economic fields. It is often important to infer population parameters at a finer area level, where the sample size is small and does not allow for reliable direct estimates. Small area estimation (SAE) methods use auxiliary variables to obtain reliable estimates when the direct ones are unreliable. SAE methods can be classified into unit-level and area-level models: unit-level models require a set of auxiliary variables that is common to the survey and the census or registers and is known for all population units; area-level models are based on direct estimates and aggregated auxiliary variables. Privacy policies and high census costs make unit-level data difficult to use, particularly outside statistical offices. Aggregated auxiliary variables from different sources are more easily available and can be used in area-level models. Big data, that is, data of greater variety arriving in increasing volumes and with ever-higher velocity, can be used as auxiliary variables in area-level models once adequately processed. We show two applications of SAE: the use of mobility data to estimate poverty incidence at the local level in Tuscany, Italy, and the use of Twitter data to estimate the share of food consumption expenditure at the province level in Italy.
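To make the idea of an area-level model concrete, the following is a minimal sketch of one standard choice, the Fay-Herriot model, which combines each area's direct estimate with a regression on aggregated auxiliary variables. The abstract does not state which area-level model was used in the two applications, so the model, the function name `fay_herriot_eblup`, the moment estimator for the between-area variance, and the simulated data are all assumptions made for illustration only.

```python
import numpy as np


def fay_herriot_eblup(y, X, D):
    """Sketch of a Fay-Herriot area-level estimator (illustrative assumption).

    y : (m,) direct survey estimates, one per area
    X : (m, p) aggregated auxiliary variables (e.g. big-data indicators)
    D : (m,) sampling variances of the direct estimates, treated as known
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    D = np.asarray(D, dtype=float)
    m, p = X.shape

    # Ordinary least squares fit, used only to estimate the model variance A.
    XtX_inv = np.linalg.inv(X.T @ X)
    resid = y - X @ (XtX_inv @ X.T @ y)
    leverage = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

    # Moment estimator of the between-area variance, truncated at zero.
    A = max(0.0, (resid @ resid - np.sum(D * (1.0 - leverage))) / (m - p))

    # Weighted least squares for the regression coefficients.
    w = 1.0 / (A + D)
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

    # Shrinkage: areas with noisy direct estimates lean on the regression.
    gamma = A / (A + D)
    synthetic = X @ beta
    eblup = gamma * y + (1.0 - gamma) * synthetic
    return eblup, synthetic, gamma, A


if __name__ == "__main__":
    # Simulated data only; these are not results from the talk.
    rng = np.random.default_rng(0)
    m = 20
    X = np.column_stack([np.ones(m), rng.normal(size=m)])  # hypothetical covariate
    D = rng.uniform(0.5, 2.0, size=m)
    theta = X @ np.array([1.0, 2.0]) + rng.normal(scale=1.0, size=m)
    y = theta + rng.normal(scale=np.sqrt(D))
    eblup, synthetic, gamma, A = fay_herriot_eblup(y, X, D)
    print("estimated model variance:", round(A, 3))
    print("first three shrinkage weights:", np.round(gamma[:3], 3))
```

In this sketch each small area estimate is a weighted average of the direct estimate and the regression prediction, and the weight on the direct estimate falls as its sampling variance grows. That is the sense in which aggregated auxiliary variables lend strength to areas with small samples.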