Hands-on course on July 1 2015
07-01-2015d1710
some notes {
07-01-2015d1012
R Statistics course
Ramon Diaz-Uriarte
do not read Gentleman's bioinfo one, nor Crawley's
RCommander
JGR (Java Gui for R)
JGR + Deducer
runif = random uniform distribution; runif(n, min, max) draws n numbers between min (default 0) and max (default 1)
R variables can hold any type of object (this does not need to be explicitly defined)
rnorm creates random numbers from a normal distribution; rnorm(n, mean, sd), with default mean 0 and sd 1
rep is for repeat
c is for concatenate (combines values into a vector)
a factor object is good when you have something that is a class label in a statistical analysis
head function shows the first 6 lines of an object (by default)
apply function applies a function to an object: apply(object, dimension that you will do things over (1 = rows, 2 = columns), function)
1:20 defines a range from 1 to 20
functions can be defined on the fly while being passed as a parameter into a function
thin line between using and programming R
names tells you the names of each component in a list
the $ sign is used to access the internal components of a list
functional composition
library() loads a library/package
it's better not to save the workspace from RStudio
? in front of a function name brings up help
you can wrap a function name in example() to see its examples run
he likes to be explicit about using the names of the arguments being passed to a function
apropos searches for functions that have certain text in their name
vignettes give you several worked examples (pdf user guides)
sos package for searching help
cran task views
coding style: try not to go beyond column 80
every variable is really a vector in R (a single value is a vector of length 1)
vectors hold elements of the same type.
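A small sketch of the apply notes above, where the third argument is the function to apply. The matrix m and the variable names are made up for illustration and are not from the course:

```r
# A 4 x 5 matrix filled column by column with the range 1:20
m <- matrix(1:20, nrow = 4, ncol = 5)

# Second argument: 1 works over rows, 2 works over columns
row.sums <- apply(m, 1, sum)
col.means <- apply(m, 2, mean)

# A function defined on the fly (anonymous) and passed as an argument
col.ranges <- apply(m, 2, function(v) max(v) - min(v))

print(row.sums)    # 45 50 55 60
print(col.means)   # 2.5 6.5 10.5 14.5 18.5
print(col.ranges)  # 3 3 3 3 3
```

The anonymous function never gets a name; it only exists for the duration of the apply call.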
can't mix numbers and strings (the numbers get converted to strings)
if a vector has 2 dimensions, then it is a matrix (more than 2: an array)
dataframes can have elements of different types
dataframes organize things in terms of columns, like a spreadsheet
dataframes are rectangular table objects
dataframes are a type of list
a list is a general container
getwd() is get working directory
the working directory can also be changed with setwd() (in RStudio: Session -> Set Working Directory)
summary of a data frame
str of a data frame (str is for structure)
read.table function (the header parameter is VERY important to indicate and pay attention to)
dealing with missing values: use NA
read.csv function to read a csv file
you can save specific objects in R with save()
ggplot2
source command to run a script
vectors
regular sequences with the seq function
range by 1 like 2:7
rep function (repeat function)
output often has the length of the largest object
recycling rule can lead to nasty surprises
he doesn't like using "=" for assignment (prefers "<-")
identical function
floating point arithmetic in R (be careful)
which gives the indices of the elements of a vector that meet a criterion
to get elements of a vector, pass another vector with the positions you want
you can name each element in a vector in R
ages <- c(Juan = 23, Maria = 35, Irene = 12, Ana = 93)
vectors that have names are not dataframes
this is like a lookup table (e.g. look up an age by name)
factors
factors are stored as numeric codes; as.numeric() returns those codes
rbind to bind rows
cbind to bind columns
you can use rbind to add something (a row) to a matrix
drop = FALSE to not drop dimensions
the apply function can be used on a matrix, but not on a vector
obtaining the indices of elements of a matrix
which(A==999, arr.ind = TRUE)
     row col
[1,]   2   3
a list is a very general container
s3 and s4 classes allow objects to have functions, which allows for a type of object-oriented programming
s4 classes are more sophisticated, whereas s3 is for things like "print", "plot" (more built-in things), etc.
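Three of the gotchas noted above (recycling, floating point comparison, factor codes), sketched in a few lines. The variable names are made up for illustration, and all.equal() is my addition, since the notes only mention identical:

```r
# Recycling: the shorter vector is reused to match the longer one.
# No warning here because 6 is a multiple of 2.
a <- 1:6
b <- c(10, 20)
recycled <- a + b
print(recycled)                      # 11 22 13 24 15 26

# Floating point: exact comparison can fail where a tolerance-based one succeeds.
x <- 0.1 + 0.2
print(x == 0.3)                      # FALSE
print(identical(x, 0.3))             # FALSE
print(isTRUE(all.equal(x, 0.3)))     # TRUE

# Factors: as.numeric() returns the internal codes, not the original labels.
f <- factor(c("10", "20", "10"))
codes <- as.numeric(f)
values <- as.numeric(as.character(f))
print(codes)                         # 1 2 1
print(values)                        # 10 20 10
```

Going through as.character() first is one common way to get the original numbers back out of a factor.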
A dataframe is a special type of list
transform a dataframe into a matrix
data.matrix(AB)
as.matrix(AB)
with function to get at the columns of a dataframe (table) by name
attach adds something (e.g. a dataframe) to the search path, the list of packages and environments that R searches through when finding a name, but this strategy is not used often anymore.
iteration in R
names.of.friends <- c("Ana", "Rebeca", "Marta", "Quique", "Virgilio")
for(friend in names.of.friends) {
  cat("\n I should call", friend, "\n")
}
the apply functions can be used instead of for loops, and they are easier to parallelize
defining a function in R
example {
  multByTwo <- function(x) {
    z <- 2 * x
    return(z)
  }
}
optional arguments in R
example {
  # title is optional; it defaults to "A figure" when not supplied
  plotAndLm <- function(x, y, title = "A figure") {
    lm1 <- lm(y ~ x)
    cat("\n Printing the summary of x\n")
    print(summary(x))
    cat("\n Printing the summary of y\n")
    print(summary(y))
    cat("\n Printing the summary of the linear regression\n")
    print(summary(lm1))
    plot(y ~ x, main = title)
    abline(lm1)
    return(lm1)
  }
}
}
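A possible way to write the friends loop without a for loop, combined with an optional argument. This is only a sketch: callMessage and its greeting argument are hypothetical and not from the course, and sapply is used because apply itself does not work on a plain vector:

```r
names.of.friends <- c("Ana", "Rebeca", "Marta", "Quique", "Virgilio")

# Hypothetical helper: builds the message instead of printing it.
# greeting is an optional argument with a default value.
callMessage <- function(friend, greeting = "I should call") {
  paste(greeting, friend)
}

# sapply calls the function once per element; no explicit loop needed.
msgs <- sapply(names.of.friends, callMessage)

# Extra named arguments are passed on to the function,
# here overriding the default greeting.
msgs2 <- sapply(names.of.friends, callMessage, greeting = "Already called")

print(msgs[["Ana"]])      # "I should call Ana"
print(msgs2[["Marta"]])   # "Already called Marta"
```

The result is a character vector named after the friends, so it can be indexed by name like the ages lookup table.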