Descriptive statistics
Cross tabulation and other methods for comparing groups
To compare 2 groups, e.g. treatment vs control, one can superimpose
the two histograms, using different colors or line styles.
With more than 2 or 3 groups, this gets confusing.
A cross tab is an alternative to histograms when the information is broken down
both by group and by range,
e.g. blood pressure, age and treatment group. It is useful to look at each age group
separately. Plotting this would require a 2-dimensional histogram, so it is clearer to
show it as a table.
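
As a minimal sketch of the idea above, the following builds one small cross tab per age group using only the Python standard library. The records, the age groups, the blood-pressure ranges and the helper names `cross_tab` and `print_table` are all made up for illustration; they are not taken from any real study.

```python
from collections import Counter

# Hypothetical records: (age group, treatment group, blood-pressure range).
# The values are invented purely to illustrate the table layout.
records = [
    ("18-39", "treatment", "<120"),
    ("18-39", "treatment", "120-139"),
    ("18-39", "control", "120-139"),
    ("18-39", "control", "140+"),
    ("40-74", "treatment", "120-139"),
    ("40-74", "treatment", "120-139"),
    ("40-74", "control", "140+"),
    ("40-74", "control", "140+"),
]

BP_RANGES = ["<120", "120-139", "140+"]   # row labels, in display order
GROUPS = ["treatment", "control"]         # column labels


def cross_tab(rows, row_key, col_key):
    """Count how many rows fall into each (row label, column label) cell."""
    return Counter((row_key(r), col_key(r)) for r in rows)


def print_table(counts, row_labels, col_labels):
    """Print the counts as a plain-text table; empty cells show 0."""
    print("".ljust(10) + "".join(c.rjust(12) for c in col_labels))
    for r in row_labels:
        cells = "".join(str(counts[(r, c)]).rjust(12) for c in col_labels)
        print(r.ljust(10) + cells)


# One table per age group: blood-pressure range (rows) by group (columns).
for age in sorted({rec[0] for rec in records}):
    subset = [rec for rec in records if rec[0] == age]
    table = cross_tab(subset, row_key=lambda r: r[2], col_key=lambda r: r[1])
    print(f"Age {age}")
    print_table(table, BP_RANGES, GROUPS)
    print()
```

Each printed table plays the role of one slice of the 2-dimensional histogram: the counts in a column are what the bars of that group's histogram would show.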
Mean and Standard Deviation
- Problem: Health and Nutrition Examination Survey (HANES). A cross-sectional
(not longitudinal) study of 20K Americans age 1-74.
How do you represent the data graphically?
You could use a cross tab.
- How do you characterize the data for each group?
- The average tells you the middle and the standard deviation tells you
the spread.
Look at the two histograms: normal and uniform. Which one has the greater
spread?
- The average and the histogram: how are they related intuitively?
Look at fig5.pdf and fig6.pdf. The median doesn't change.
- What happens when the distribution is symmetric, has a long left tail,
or has a long right tail?
- The r.m.s. (root mean square) tells you the size of the data values:
r.m.s. = sqrt(average(entries ** 2))
- The standard deviation (SD) measures how far the numbers are from their
average. Take the histogram in figure 8, with average = 63.5 and SD = 2.5. What
is the percent of people within 1 SD of the average? What about within 2 SD of
the average?
- SD = sqrt(average(deviationFromMean ** 2)), i.e. the r.m.s. of the
deviations from the average.
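
The r.m.s. and SD formulas above can be sketched directly in Python. The list `data` is a made-up example, not the data behind figure 8, and the helper names (`average`, `rms`, `sd`, `percent_within`) are hypothetical. The last few lines also illustrate, under the same assumptions, how a long right tail moves the average while leaving the median where it was.

```python
import math
from statistics import median


def average(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def rms(values):
    """Root mean square: sqrt(average(entries ** 2))."""
    return math.sqrt(average([v ** 2 for v in values]))


def sd(values):
    """Standard deviation: the r.m.s. of the deviations from the average."""
    avg = average(values)
    return rms([v - avg for v in values])


def percent_within(values, k):
    """Percent of entries lying within k SDs of the average."""
    avg, spread = average(values), sd(values)
    inside = [v for v in values if abs(v - avg) <= k * spread]
    return 100 * len(inside) / len(values)


# Made-up data, chosen so the arithmetic is easy to check by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(average(data))            # 5.0
print(sd(data))                 # 2.0
print(rms(data))                # sqrt(29), about 5.39
print(percent_within(data, 1))  # 75.0  (entries between 3 and 7)
print(percent_within(data, 2))  # 100.0 (entries between 1 and 9)

# Stretch the right tail: replace the largest entry, 9, with 90.
stretched = [2, 4, 4, 4, 5, 5, 7, 90]
print(median(data), median(stretched))    # 4.5 4.5
print(average(data), average(stretched))  # 5.0 15.125
```

Note that `sd` divides by the number of entries, matching the formula in these notes, rather than by one less than the number of entries as some sample-SD conventions do.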