Using R for Data Analysis and Graphics
Introduction, Code and Commentary
J H Maindonald
Centre for Mathematics and Its Applications,
Australian National University.
©J. H. Maindonald 2000, 2004, 2008. A licence is granted for personal study and classroom use.
Redistribution in any other form is prohibited.
Languages shape the way we think, and determine what we can think about (Benjamin Whorf).
This latest revision has corrected several errors. I plan, in due course, to post a new document that will largely
replace this now somewhat dated document, taking more adequate account of recent changes and enhancements
to the R system and its associated packages since 2002.
19 January 2008
[Figure: possum body measurements (lengths, including ear conch length) for females and males at seven sites: Cambarville, Bellbird, Whian Whian, Byrangery, Conondale, Allyn River, Bulburin.]
Lindenmayer, D. B., Viggers, K. L., Cunningham, R. B., and Donnelly, C. F.: Morphological variation among populations of the mountain brushtail possum, Trichosurus caninus Ogilby (Phalangeridae: Marsupialia). Australian Journal of Zoology 43: 449-459, 1995.
possum n.
1 Any of many chiefly herbivorous, long-tailed, tree-dwelling, mainly Australian marsupials, some of which are gliding animals (e.g. brush-tailed possum, flying possum).
2 a mildly scornful term for a person.
3 an affectionate mode of address.
From the Australian Oxford Paperback Dictionary, 2nd ed, 1996.
Introduction
The R System
The Look and Feel of R
The Use of these Notes
The R Project
Web Pages and Email Lists
Datasets that relate to these notes
1. Starting Up
1.1 Getting started under Windows
1.2 Use of an Editor Script Window
1.3 A Short R Session
1.3.1 Entry of Data at the Command Line
1.3.2 Entry and/or editing of data in an editor window
1.3.3 Options for read.table()
1.3.4 Options for plot() and allied functions
1.4 Further Notational Details
1.5 On-line Help
1.6 The Loading or Attaching of Datasets
1.7 Exercises
2. An Overview of R
2.1 The Uses of R
2.1.1 R may be used as a calculator.
2.1.2 R will provide numerical or graphical summaries of data
2.1.3 R has extensive graphical abilities
2.1.4 R will handle a variety of specific analyses
2.1.5 R is an Interactive Programming Language
2.2 R Objects
*2.3 Looping
2.3.1 More on looping
2.4 Vectors
2.4.1 Joining (concatenating) vectors
2.4.2 Subsets of Vectors
2.4.3 The Use of NA in Vector Subscripts
2.4.4 Factors
2.5 Data Frames
2.5.1 Data frames as lists
2.5.2 Inclusion of character string vectors in data frames
2.5.3 Built-in data sets
2.6 Common Useful Functions
2.6.1 Applying a function to all columns of a data frame
2.7 Making Tables
2.7.1 Numbers of NAs in subgroups of the data
2.8 The Search List
2.9 Functions in R
2.9.1 An Approximate Miles to Kilometers Conversion
2.9.2 A Plotting function
2.10 More Detailed Information
2.11 Exercises
3. Plotting
3.1 plot() and allied functions
3.1.1 Plot methods for other classes of object
3.2 Fine control – Parameter settings
3.2.1 Multiple plots on the one page
3.2.2 The shape of the graph sheet
3.3 Adding points, lines and text
3.3.1 Size, colour and choice of plotting symbol
3.3.2 Adding Text in the Margin
3.4 Identification and Location on the Figure Region
3.4.1 identify()
3.4.2 locator()
3.5 Plots that show the distribution of data values
3.5.1 Histograms and density plots
3.5.3 Boxplots
3.5.4 Normal probability plots
3.6 Other Useful Plotting Functions
3.6.1 Scatterplot smoothing
3.6.2 Adding lines to plots
3.6.3 Rugplots
3.6.4 Scatterplot matrices
3.6.5 Dotcharts
3.7 Plotting Mathematical Symbols
3.8 Guidelines for Graphs
3.9 Exercises
3.10 References
4. Lattice graphics
4.1 Examples that Present Panels of Scatterplots – Using xyplot()
4.2 Some further examples of lattice plots
4.2.1 Plotting columns in parallel
4.2.2 Fixed, sliced and free scales
4.3 An incomplete list of lattice Functions
4.4 Exercises
5. Linear (Multiple Regression) Models and Analysis of Variance
5.1 The Model Formula in Straight Line Regression
5.2 Regression Objects
5.3 Model Formulae, and the X Matrix
5.3.1 Model Formulae in General
*5.3.2 Manipulating Model Formulae
5.4 Multiple Linear Regression Models
5.4.1 The data frame Rubber
5.4.2 Weights of Books
5.5 Polynomial and Spline Regression
5.5.1 Polynomial Terms in Linear Models
5.5.2 What order of polynomial?
5.5.3 Pointwise confidence bounds for the fitted curve
5.5.4 Spline Terms in Linear Models
5.6 Using Factors in R Models
5.6.1 The Model Matrix
*5.6.2 Other Choices of Contrasts
5.7 Multiple Lines – Different Regression Lines for Different Species
5.8 aov models (Analysis of Variance)
5.8.1 Plant Growth Example
*5.8.2 Shading of Kiwifruit Vines
5.9 Exercises
5.10 References
6. Multivariate and Tree-based Methods
6.1 Multivariate EDA, and Principal Components Analysis
6.2 Cluster Analysis
6.3 Discriminant Analysis
6.4 Decision Tree models (Tree-based models)
6.5 Exercises
6.6 References
*7. R Data Structures
7.1 Vectors
7.1.1 Subsets of Vectors
7.1.2 Patterned Data
7.2 Missing Values
7.3 Data frames
7.3.1 Extraction of Component Parts of Data frames