QERM 598 - HW 4
Introduction to ANOVA

(Note: this assignment requires a fair amount of math to be done by hand and some writing. For this reason, the solutions should be handed in on paper as well as electronically, and should contain more than just R code and output. I would also prefer to have the plots in hand rather than run your R code. Remember, you can copy-paste R plots into a Word or graphics document, or output plots directly to files using the "jpeg()", "bmp()" or "pdf()" functions.)

Part 1: Due February 5, 2007

1) In the Preface to the notes for this week, I identify 4 important results that are necessary for deriving the results of ANOVA. Use simulations in R to illustrate facts (3) and (4). Plot the results of the simulation and the theoretical predictions.

2) Discuss briefly the advantages and disadvantages of the two kinds of plots presented on page 4.

3) Show how equation (5) leads to equation (6) on page 9 of the notes.

4) Derive expression (10).

Part 2: Due February 12, 2007

5) Randomize the results of the Great QERM Pie Zone-Out Experiment, show boxplots, and perform an analysis of variance on the randomized data. What do you expect the result to be? (Note: this can be done in three lines of code!)

6) Analysis of birth timing for Steller sea lions on rookeries in Russia: in 2005, researchers stationed on several reproductive sea lion rookeries in Russia observed the birth of pups of ten randomly selected sea lions on each rookery. There is some interest as to whether the birth timing of the sea lions differs across these rookeries, which span about 1000 in latitude.

   a) Load the Steller sea lion birth date data using the read.table function.
   b) Create a boxplot of the birth-timing results. Does it look from the boxplot as though island is an important factor in predicting birth timing?
   c) Formulate two statistical models that predict birth timing of sea lions, one that excludes the effect of island and one that includes the effect of island. Identify the number of parameters in each model. Make sure to specify the indices and their range of values. What are "a", "n", and "N"?
   d) Obtain the group means and sample variances for the six islands.
   e) Obtain the Total Sum of Squares, the Mean Sum of Squares and the Treatment Sum of Squares.
   f) Obtain the Mean Square Error and Treatment effects.
   g) Obtain a statistic that will allow you to test the hypothesis of a significant island effect. Obtain a p-value by comparing this statistic to a null distribution.
   h) Fill out the components of the single-factor fixed-effects ANOVA table.
   i) Obtain an ANOVA table in one or two lines using the "lm" and "anova" commands.
   j) Show some diagnostic plots to assess the validity of the ANOVA assumptions.
   k) What are your conclusions regarding the effect of island on the birth timing of Steller sea lions in Russia?
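
As a rough illustration of the by-hand steps in problem 6, parts (d) through (i), the following R sketch runs a single-factor ANOVA on simulated data. It does not use the sea lion data set. The variable names (island, birth.day), the group means and the standard deviation are all made up for illustration. Only the layout of six groups with ten observations each follows the problem statement.

```r
# Simulated stand-in data: NOT the Steller sea lion data set.
set.seed(598)
n.groups <- 6                       # six islands, as in the problem statement
n.per    <- 10                      # ten observations per island
n.total  <- n.groups * n.per        # total number of observations

# Hypothetical island labels, group means and common standard deviation
island     <- factor(rep(paste0("Island", 1:n.groups), each = n.per))
true.means <- c(150, 152, 155, 158, 160, 163)   # assumed, for illustration only
birth.day  <- rnorm(n.total, mean = rep(true.means, each = n.per), sd = 5)

# Group means, sample variances and the grand mean
group.means <- tapply(birth.day, island, mean)
group.vars  <- tapply(birth.day, island, var)
grand.mean  <- mean(birth.day)

# Sums of squares (balanced design)
SS.total <- sum((birth.day - grand.mean)^2)
SS.treat <- n.per * sum((group.means - grand.mean)^2)
SS.error <- sum((n.per - 1) * group.vars)

# Mean squares, F statistic and p-value from the F null distribution
MS.treat <- SS.treat / (n.groups - 1)
MS.error <- SS.error / (n.total - n.groups)
F.stat   <- MS.treat / MS.error
p.value  <- pf(F.stat, df1 = n.groups - 1, df2 = n.total - n.groups,
               lower.tail = FALSE)

# The same table from lm() and anova(), for comparison
fit <- lm(birth.day ~ island)
tab <- anova(fit)
print(tab)
```

In this sketch the hand-computed F.stat and p.value should agree with the "F value" and "Pr(>F)" columns of the anova() table, and SS.treat and SS.error should add up to SS.total. That gives a simple check on the hand calculations before applying the same steps to the real data.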