Statistical Data Presentation

  • 8/13/2019 Statistical Data Presentation

    1/31

    Statistics
    Lecture and exercises (WS 2012/2013)

    22 October 2012 | Fachbereich 11 | Angewandte Geowissenschaften | Dr. Olaf Lenz | 1

    Dr. Olaf Lenz

    Institut für Angewandte Geowissenschaften

    Angewandte Sedimentgeologie
    Technische Universität Darmstadt

    Structure

    Basics (3 lectures with exercises): 17.10.2012, 24.10.2012, 31.10.2012
      Introduction to Statistics
      Data Presentation
      Requirements of Data for Statistical Analysis

    Elementary Statistics (6 lectures with exercises): 07.11.2012, 14.11.2012, 21.11.2012, 28.11.2012, 05.12.2012, 12.12.2012
      t-tests and F-tests
      Analysis of Variance
      Correlation and Regression
      Chi-square Tests
      Non-parametric Tests
      Multivariate ANOVA/Repeated Measures

    Analysis of Multivariate Data (3 lectures with exercises): 16.01.2013, 23.01.2013, 30.01.2013
      Cluster Analysis
      Principal Component Analysis
      (Detrended) Correspondence Analysis

    Time Series Analysis (1 lecture with exercises): 06.02.2013
      Analysis of stationary data: Spectral Analysis
      Analysis of non-stationary data: Wavelet Analysis

    Final exam: 13.02.2013

    Summary Statistics

    Measures of location:
      Mean, Median, Mode: location of the center of the distribution
      Quartiles: location of the other parts of the distribution

    Measures of spread:
      Variance, Standard deviation, Interquartile range: variability of the data values

    Measures of shape:
      Coefficient of skewness, Coefficient of variation, Kurtosis: symmetry and length of the tail
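    The measures named on this slide can be computed in a few lines. The following is a minimal Python sketch using only the standard library. The function name `summarize`, the "inclusive" quartile method, and the moment-based formulas for skewness and kurtosis are assumptions made for illustration; PAST and other packages may use slightly different estimators. The sample values are data set a) from Exercise 1.

```python
import statistics
from math import sqrt


def summarize(values):
    """Return measures of location, spread and shape for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    variance = statistics.variance(values)  # sample variance (divides by n - 1)
    std = sqrt(variance)

    # Central moments (dividing by n), used for the shape measures below.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n

    return {
        # location
        "mean": mean,
        "median": statistics.median(values),
        "modes": statistics.multimode(values),
        "q1": q1,
        "q3": q3,
        # spread
        "variance": variance,
        "std": std,
        "iqr": q3 - q1,
        # shape
        "skewness": m3 / m2 ** 1.5,
        "cv": std / mean,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }


summary = summarize([11, 8, 6, 11, 10, 8, 7, 5, 6])
for name, value in summary.items():
    print(f"{name}: {value}")
```

    For this sample three values (6, 8 and 11) each occur twice, so `multimode` reports all of them rather than picking one.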

    Exercise 1

    Find the median and the mean of these three data sets. Use the PAST software.

    a) 11 8 6 11 10 8 7 5 6
       sorted: 5 6 6 7 8 8 10 11 11
       mean: 72/9 = 8    median: 8

    b) 7 15 9 8 12 6 3 2
       sorted: 2 3 6 7 8 9 12 15
       mean: 62/8 = 7.75    median: (7+8)/2 = 7.5

    c) 7 100 9 8 12 6 3 2
       sorted: 2 3 6 7 8 9 12 100
       mean: 147/8 = 18.375    median: (7+8)/2 = 7.5
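    The hand calculations above can be cross-checked outside PAST. This is a small Python sketch using the standard `statistics` module; the dictionary name `data_sets` is just a label chosen for this example.

```python
import statistics

# The three data sets from Exercise 1.
data_sets = {
    "a": [11, 8, 6, 11, 10, 8, 7, 5, 6],
    "b": [7, 15, 9, 8, 12, 6, 3, 2],
    "c": [7, 100, 9, 8, 12, 6, 3, 2],
}

# (mean, median) for each data set.
results = {
    name: (statistics.mean(values), statistics.median(values))
    for name, values in data_sets.items()
}

for name, (mean, median) in results.items():
    print(f"{name}) sorted: {sorted(data_sets[name])}  mean: {mean}  median: {median}")
```

    Data sets b) and c) differ in a single value (15 versus 100). That one value moves the mean from 7.75 to 18.375, while the median stays at 7.5.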

    Exercise 2

    The incomes of five people in a small village are given. Calculate the mean and the median. Which measure is better? Why?

    Person      Income ($)
    Sam          4 785 320
    Harvey          32 190
    Fred            31 870
    Jill            26 500
    Adrienne        24 200

    Mean           980 016
    Median          31 870

    The median is better, because the mean is quite sensitive to erratic high values or outliers.
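    One way to see this sensitivity is to recompute both measures with and without the largest income. The sketch below does that in Python; the comparison without Sam is an illustration added here, not part of the original exercise.

```python
import statistics

# Incomes in dollars, as given in Exercise 2.
incomes = {
    "Sam": 4_785_320,
    "Harvey": 32_190,
    "Fred": 31_870,
    "Jill": 26_500,
    "Adrienne": 24_200,
}

all_values = list(incomes.values())
without_sam = [value for name, value in incomes.items() if name != "Sam"]

mean_all = statistics.mean(all_values)
median_all = statistics.median(all_values)
mean_rest = statistics.mean(without_sam)
median_rest = statistics.median(without_sam)

print(f"all five:    mean {mean_all}, median {median_all}")
print(f"without Sam: mean {mean_rest}, median {median_rest}")
```

    Dropping the single extreme income changes the mean from 980 016 to 28 690, whereas the median only moves from 31 870 to 29 185.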

    Exercise 3

    The results of 31 vocabulary tests are given:

    20 23 28 30 32 35 22 23 27
    30 33 36 23
    31 30 30 33 36 23 24 26 31 34 23 20 25
    33 32 30 37 23

    mean: 883/31 = 28.48
    median: 30
    mode: 23

    [Figure: positions of mode, median and mean]
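    The three measures for the vocabulary scores can be reproduced with a short Python sketch. The text frequency table printed at the end is an extra added for illustration; it shows why 23 is the mode.

```python
import statistics
from collections import Counter

# The 31 vocabulary test results from Exercise 3, in the order given.
scores = [
    20, 23, 28, 30, 32, 35, 22, 23, 27,
    30, 33, 36, 23,
    31, 30, 30, 33, 36, 23, 24, 26, 31, 34, 23, 20, 25,
    33, 32, 30, 37, 23,
]

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
frequencies = Counter(scores)

print(f"n = {len(scores)}, sum = {sum(scores)}")
print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")

# One row per distinct score, one '#' per occurrence.
for score in sorted(frequencies):
    print(f"{score:>3} {'#' * frequencies[score]}")
```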

    Exercise 4

    Suppose that the following scores were obtained on administering an English language test to ten non-native speakers who had previously taken a language course to brush up their knowledge, and ten otherwise similar people who had not taken a language course:

    mean: 37.8              mean: 42.7
    standard dev.: 22.63    standard dev.: 11.36
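    The two standard deviations reported above are easier to compare when each is related to its mean. The sketch below computes the coefficient of variation (standard deviation divided by mean) from the reported values only. The function name and the labels "first column" and "second column" are placeholders chosen here, since the transcript does not preserve which group each column belongs to.

```python
def coefficient_of_variation(mean, std_dev):
    """Relative spread: the standard deviation as a fraction of the mean."""
    return std_dev / mean


# (mean, standard deviation) pairs as reported above; the labels are placeholders.
groups = {
    "first column": (37.8, 22.63),
    "second column": (42.7, 11.36),
}

cv = {
    name: coefficient_of_variation(mean, std_dev)
    for name, (mean, std_dev) in groups.items()
}

for name, value in cv.items():
    print(f"{name}: CV = {value:.3f}")
```

    The first set of scores has a relative spread of about 0.60, more than twice that of the second (about 0.27).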

    Data Presentation

    Introduction to data presentation

    Graphs help to verify that it is valid to use a particular test
    Graphs may reveal unexpected patterns in the data
    Graphs quickly reveal any mistakes in our data

    Exploratory data analysis

    Does adding the antipollution treatment lead to an increase in invertebrates?

    Antipollution treatment    Insects
    0 mg                       100
    1 mg                       120
    2 mg                       140
    3 mg                       2500
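    Even a crude plot makes the last value in this small table stand out. The following Python sketch draws a text bar chart with the standard library only; the helper `text_bar_chart` is a hypothetical name introduced for this example. The slide does not say whether 2500 is a genuine count or a data-entry mistake, so the chart only flags it as worth checking.

```python
def text_bar_chart(labels, values, width=50):
    """Return one line per value, scaled so the largest value spans `width` characters."""
    largest = max(values)
    lines = []
    for label, value in zip(labels, values):
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:>5} | {bar} {value}")
    return lines


doses = ["0 mg", "1 mg", "2 mg", "3 mg"]
insects = [100, 120, 140, 2500]

chart = text_bar_chart(doses, insects)
print("\n".join(chart))
```

    The first three bars are only two or three characters long, while the last fills the whole width, which is exactly the kind of pattern that is easy to overlook in a column of numbers.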

    Column graphs (bar graphs)

    Townend (2002)

    PAST column graph

    PAST histogram

    Line graph

    Changes in mean stomatal conductance of two tree species over the course of a day

    Townend (2002)

    Line graph

    Mean concentrations of insecticide at different distances from fish cages at a fish farm

    Townend (2002)

    PAST Line graph

    Scatter graph

    Relationship between the mean mass per seed and the number of seeds produced, for a range of plant specimens collected on Mossely Heath. The circled point appears to be an unusual observation, which could warrant further investigation.

    Townend (2002)

    Scatter graph: Independent/dependent variable

    Example: How fast individual lions can run might be controlled by how long their legs are, but the length of their legs is not controlled by how fast they run.

    Controlling (independent) variable: leg length (X-axis)
    Response (dependent) variable: running speed (Y-axis)

    Scatter graph

    Townend (2002)

    PAST Scatter graph

    Ternary plot

    Source: Wikipedia

    PAST Ternary plot

    1) A: 10; B: 80; C: 10
    2) A: 30; B: 60; C: 10
    3) A: 40; B: 40; C: 20

    [Figure: ternary plot of points 1 to 3; point 1 is annotated 10% A, 80% B, 10% C]
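    To place such three-part compositions on a triangle, each one is converted to an x/y position. The sketch below shows one common way to do this in Python. The vertex layout (A at the bottom left, B at the top, C at the bottom right of an equilateral triangle with side length 1) and the function name `ternary_to_xy` are assumptions for illustration; the orientation used by PAST may differ.

```python
from math import sqrt


def ternary_to_xy(a, b, c):
    """Map a composition (a, b, c) to Cartesian coordinates inside a unit triangle.

    Assumed vertices: A = (0, 0), B = (0.5, sqrt(3) / 2), C = (1, 0).
    """
    total = a + b + c
    # Normalise so the three parts sum to 1; A sits at the origin and adds nothing.
    b, c = b / total, c / total
    x = 0.5 * b + c
    y = (sqrt(3) / 2) * b
    return x, y


# The three compositions listed on the slide (in percent).
points = {1: (10, 80, 10), 2: (30, 60, 10), 3: (40, 40, 20)}
xy = {label: ternary_to_xy(*abc) for label, abc in points.items()}

for label, (x, y) in xy.items():
    print(f"point {label}: x = {x:.3f}, y = {y:.3f}")
```

    Under this layout point 1, with 80% B, lies closest to the B vertex at the top, and its equal shares of A and C put it on the vertical centre line.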

    Tables

    There are sometimes situations where a table might be better than a graph:

    When it is important that the reader can obtain the values in your results accurately, or if readers are likely to want to use or compare your actual figures elsewhere, rather than just to compare with other populations in your experiment or survey.

    When you need to present a lot of data and this appears messy when you try to draw a graph of them.

    When data can be presented a lot more compactly in a table.

    Standard error and error bars in graphs

    [Figure: graph with error bars; y-axis: tree height (m)]

    Townend (2002)
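    The transcript does not preserve how the error bars on this slide were calculated. As a sketch under the usual definition, the standard error of the mean is the sample standard deviation divided by the square root of the sample size, and error bars are often drawn at the mean plus and minus one standard error. The Python example below applies this to data set a) from Exercise 1; the function name `standard_error` and the choice of one standard error per bar are assumptions.

```python
import statistics
from math import sqrt


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / sqrt(len(values))


sample = [11, 8, 6, 11, 10, 8, 7, 5, 6]  # data set a) from Exercise 1

mean = statistics.mean(sample)
se = standard_error(sample)

# Assumed convention: the bar extends one standard error either side of the mean.
error_bar = (mean - se, mean + se)

print(f"mean = {mean}, SE = {se:.4f}")
print(f"error bar from {error_bar[0]:.4f} to {error_bar[1]:.4f}")
```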

    Standard error and error bars in tables

    Townend (2002)

    PAST error bars

    Box-and-whisker plot

    [Figure panels: Histogram; Box-and-whisker plots]

    Davis (2002)

    PAST Box-and-whisker plot

    [Figure: box-and-whisker plot with a point labelled "Outlier"]
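    A box-and-whisker plot is drawn from a handful of summary numbers. The Python sketch below computes them for data set c) from Exercise 1. It assumes the "inclusive" quartile method of Python's `statistics` module and the common rule that points more than 1.5 times the interquartile range beyond the quartiles are drawn as outliers. Neither convention is stated on the slides, and PAST may use different ones; `box_plot_summary` is a hypothetical helper name.

```python
import statistics


def box_plot_summary(values, whisker=1.5):
    """Return quartiles, whisker ends and outliers for a box-and-whisker plot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Assumed rule: anything beyond whisker * IQR from the box is an outlier.
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # smallest value that is not an outlier
        "whisker_high": inside[-1],  # largest value that is not an outlier
        "outliers": outliers,
    }


summary = box_plot_summary([7, 100, 9, 8, 12, 6, 3, 2])
for name, value in summary.items():
    print(f"{name}: {value}")
```

    With these assumptions the value 100 is the only point flagged as an outlier, and the upper whisker ends at 12.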

    Next week

    Basics (3 lectures with exercises): 17.10.2012, 24.10.2012, 31.10.2012
      Introduction to Statistics
      Data Presentation
      Requirements of Data for Statistical Analysis

    Elementary Statistics (6 lectures with exercises): 07.11.2012, 14.11.2012, 21.11.2012, 28.11.2012, 05.12.2012, 12.12.2012
      t-tests and F-tests
      Analysis of Variance
      Correlation and Regression
      Chi-square Tests
      Non-parametric Tests
      Multivariate ANOVA/Repeated Measures

    Analysis of Multivariate Data (3 lectures with exercises): 16.01.2013, 23.01.2013, 30.01.2013
      Cluster Analysis
      Principal Component Analysis
      (Detrended) Correspondence Analysis

    Time Series Analysis (1 lecture with exercises): 06.02.2013
      Analysis of stationary data: Spectral Analysis
      Analysis of non-stationary data: Wavelet Analysis

    Final exam: 13.02.2013