Introduction to Training Data


Transcript of Introduction to Training Data

Page 1: Introduction to  Training Data

Introduction to Training Data

Page 2: Introduction to  Training Data

• SILC data as delivered from Eurostat
• Features of training dataset

Page 3: Introduction to  Training Data

1. UDB_l11D_ver 2011-1 from 01-08-2013.csv
2. UDB_l11H_ver 2011-1 from 01-08-2013.csv
3. UDB_l11R_ver 2011-1 from 01-08-2013.csv
4. UDB_l11P_ver 2011-1 from 01-08-2013.csv
5. UDB_c11D_ver 2011-2 from 01-08-13.csv
6. UDB_c11H_ver 2011-2 from 01-08-13.csv
7. UDB_c11R_ver 2011-2 from 01-08-13.csv
8. UDB_c11P_ver 2011-2 from 01-08-13.csv

Some essentials - Specifics of file names


Page 5: Introduction to  Training Data

UDB_l11D_ver 2011-1 from 01-08-2013.csv
UDB_c11D_ver 2011-2 from 01-08-13.csv

UDB = User database (anonymised data)

_l = longitudinal file / _c = cross-sectional file

11 = year of the survey (c-file) / year of the last wave (l-file)

D = Household Register / H = Household Data / R = Personal Register / P = Personal Data

2011-1 = version number (e.g. 1st version of the 2011 data)

csv = file format (comma-separated values)

Some essentials - Specifics of file names
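The naming scheme on this slide can be decoded mechanically. The following is a minimal sketch in Python, not part of the original slides: the function name parse_udb_name, the dictionary keys, and the regular expression are assumptions made for illustration, and the pattern covers only the name layout shown above (including both the four-digit and the two-digit year in the delivery date).

```python
import re

# Meaning of the file-type letter, as listed on the slide.
FILE_TYPES = {
    "D": "Household Register",
    "H": "Household Data",
    "R": "Personal Register",
    "P": "Personal Data",
}

# Meaning of the letter that follows "UDB_".
KINDS = {"l": "longitudinal", "c": "cross-sectional"}

# Hypothetical pattern, derived only from the eight file names shown above.
PATTERN = re.compile(
    r"^UDB_(?P<kind>[lc])(?P<year>\d{2})(?P<file>[DHRP])"
    r"_ver (?P<version>\d{4}-\d+)"
    r" from (?P<date>\d{2}-\d{2}-(?:\d{4}|\d{2}))\.csv$"
)


def parse_udb_name(name):
    """Split a UDB file name into its documented components."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised UDB file name: {name!r}")
    return {
        "kind": KINDS[match["kind"]],
        "year": match["year"],        # survey year (c-file) or last wave (l-file)
        "file": FILE_TYPES[match["file"]],
        "version": match["version"],
        "date": match["date"],        # kept as written in the file name
    }
```

For example, parse_udb_name("UDB_c11P_ver 2011-2 from 01-08-13.csv") identifies the file as the cross-sectional Personal Data file, version 2011-2. The two-digit year is returned as a string rather than expanded, since the slides do not state how it should be interpreted beyond "11".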

Page 6: Introduction to  Training Data


• GESIS offers tools to transform Eurostat SILC files to SPSS and Stata format

• Go to http://www.gesis.org/en/services/data-analysis/official-microdata/european-microdata/eu-silc/eu-silc-tools/

• Download SPSS or Stata routines, adapt them to your local computing environment and run them

Transform *.csv to *.sav and *.dta

Page 7: Introduction to  Training Data

• Drop households with more than 5 members
• Random selection of 30% of the remaining cases, but not more than 1500 cases per country
• Drop regional information (DB040) and primary sampling unit (DB060)
• 12 countries granted access

• Data are not suitable for research!

SILC longitudinal training data
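The selection rules on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure actually used to build the training data: it treats each "case" as a household record, and the keys "country" and "hh_size", the function name make_training_sample, and the fixed random seed are all hypothetical. Only DB040 and DB060 are taken from the slide.

```python
import random
from collections import defaultdict

# Variables removed from the training data, as named on the slide.
DROPPED_VARIABLES = ("DB040", "DB060")


def make_training_sample(households, share=0.30, cap=1500, max_size=5, seed=0):
    """Apply the three selection rules to a list of household records.

    Each record is a dict; "country" and "hh_size" are hypothetical keys.
    """
    rng = random.Random(seed)

    # Rule 1: drop households with more than max_size members.
    by_country = defaultdict(list)
    for hh in households:
        if hh["hh_size"] <= max_size:
            by_country[hh["country"]].append(hh)

    sample = []
    for country in sorted(by_country):
        rows = by_country[country]
        # Rule 2: 30% of the remaining cases, at most cap per country.
        n = min(cap, round(share * len(rows)))
        for hh in rng.sample(rows, n):
            # Rule 3: drop the regional and sampling variables.
            sample.append(
                {k: v for k, v in hh.items() if k not in DROPPED_VARIABLES}
            )
    return sample
```

With 6000 eligible households in one country, 30% would be 1800, so the cap of 1500 applies; with 10 eligible households, 3 are drawn.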

Page 8: Introduction to  Training Data


SILC longitudinal training data, observation years

[Table: observation years by file. Columns: D (HH-reg.), H (HH-data), R (Pers.-reg.), P (Pers.-data)]
