
R: Notes

R, and every package, provides help files for functions. The general syntax to search for help on any function,
function_name, that is in a package loaded into the namespace is ?function_name or help(function_name), e.g. ?glm
The read.table function is used for reading in tabular data stored in a text file where the columns of data are
separated by punctuation characters, as in CSV files (csv = comma-separated values). Tabs and commas are
the most common punctuation characters used to separate or delimit data points in such files. For convenience R
provides two other versions of read.table: read.csv for files where the data are separated with commas
and read.delim for files where the data are separated with tabs. Of these three functions read.csv is the most
commonly used. If needed, the default delimiter can be overridden for both read.csv and read.delim through
the sep argument.
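As a rough sketch of how these readers relate, the example below writes two small temporary files and reads them back. The file contents and the variable names (csv_path, semi_path, cats) are made up for illustration.

```r
# Hypothetical data written to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("coat,weight", "calico,2.1", "black,5.0"), csv_path)

# read.csv assumes a header row and comma-separated values
cats <- read.csv(csv_path)

# The same result with read.table, spelling out the defaults read.csv sets for us
cats_same <- read.table(csv_path, header = TRUE, sep = ",")

# Overriding the delimiter: a semicolon-separated file read with read.csv
semi_path <- tempfile(fileext = ".txt")
writeLines(c("coat;weight", "calico;2.1", "black;5.0"), semi_path)
cats_semi <- read.csv(semi_path, sep = ";")

print(cats)
```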
To check the data type we use = typeof()
To find the class of an object we use = class()
To check the structure we use = str()
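A small sketch of these three checks, assuming a hypothetical cats data frame (the columns and values are invented for illustration):

```r
# Hypothetical example data; column names and values are illustrative only
cats <- data.frame(
  coat = c("calico", "black", "tabby"),
  weight = c(2.1, 5.0, 3.2),
  likes_string = c(TRUE, FALSE, TRUE)
)

weight_type <- typeof(cats$weight)  # storage type of a single column
cats_class <- class(cats)           # class of the whole object

print(weight_type)
print(cats_class)
str(cats)                           # compact overview: dimensions, column names, types
```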
Each row is an observation of different variables, itself a data.frame, and thus can be composed of elements of
different types = cats[1,]
To get the number of rows and columns in our dataset = nrow(), ncol()
To add a row or a column we use = rbind or cbind
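The row and column operations above might look like this in practice. The cats data frame and the values added to it are hypothetical.

```r
# Hypothetical example data
cats <- data.frame(
  coat = c("calico", "black", "tabby"),
  weight = c(2.1, 5.0, 3.2)
)

first_row <- cats[1, ]                    # one observation, itself a data.frame
dim_before <- c(nrow(cats), ncol(cats))   # rows and columns before any changes

# Add a column: the new vector needs one value per existing row
cats <- cbind(cats, age = c(2, 3, 5))

# Add a row: supply a data.frame whose column names match the existing ones
new_cat <- data.frame(coat = "tortoiseshell", weight = 3.3, age = 9)
cats <- rbind(cats, new_cat)

print(cats)
```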
We can use the != (not-equals) operator to construct a logical vector that skips elements by name.
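For instance, with a hypothetical named vector x, a logical vector built with != drops the element named "a":

```r
# Hypothetical named vector
x <- c(a = 5.4, b = 6.2, c = 7.1, d = 4.8)

# TRUE for every name that is not "a"
keep <- names(x) != "a"

# Subsetting with the logical vector keeps only the TRUE positions
x_without_a <- x[keep]
print(x_without_a)
```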
Matrices are also subsetted using the [ function.
There are three functions used to subset lists: [, [[, and $.
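A minimal sketch of both kinds of subsetting, using a made-up matrix m and list xlist:

```r
# Matrix subsetting takes two indices: [row, column]
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns, filled column by column
single_value <- m[1, 2]      # row 1, column 2
third_column <- m[, 3]       # leaving an index empty selects everything along it

# List subsetting
xlist <- list(a = "some text", b = 1:10)
as_list <- xlist["b"]        # [ returns a (shorter) list
as_element <- xlist[["b"]]   # [[ extracts the element itself
by_name <- xlist$b           # $ extracts an element by name

print(single_value)
print(third_column)
```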
If / else =
# if
if (condition is true) {
  perform action
}

# if ... else
if (condition is true) {
  perform action
} else {  # that is, if the condition is false,
  perform alternative action
}
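A runnable instance of the if ... else template, with an arbitrary value of x chosen for illustration:

```r
x <- 8

if (x >= 10) {
  message_text <- "x is greater than or equal to 10"
} else {
  # reached because the condition above is FALSE for x = 8
  message_text <- "x is less than 10"
}

print(message_text)
```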
Repeating operations =
for (iterator in set of values) {
  do a thing
}
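A concrete version of the loop template, summing the integers 1 to 10 (the variable names are arbitrary):

```r
total <- 0

# i takes each value in 1:10 in turn
for (i in 1:10) {
  total <- total + i
}

print(total)  # 55
```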
Plotting graphs =
library("ggplot2")
# gapminder is assumed to be a data frame that has already been loaded
ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point()
Functions gather a sequence of operations into a whole, preserving it for ongoing use. Functions provide:
a name we can remember and invoke it by
relief from the need to remember the individual operations
a defined set of inputs and expected outputs
rich connections to the larger programming environment
Defensive programming encourages us to frequently check conditions and throw an error if something is wrong.
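As an illustration of both ideas, here is a small function with a defensive check. The function name fahr_to_kelvin and its argument are chosen for this sketch only.

```r
# Convert a temperature from Fahrenheit to Kelvin
fahr_to_kelvin <- function(temp) {
  # Defensive check: stop with an error unless the input is numeric
  stopifnot(is.numeric(temp))
  (temp - 32) * (5 / 9) + 273.15
}

freezing <- fahr_to_kelvin(32)

# A non-numeric input triggers the check; tryCatch captures the error here
# so that the rest of the script can continue
bad_input <- tryCatch(
  fahr_to_kelvin("hot"),
  error = function(e) "error raised"
)

print(freezing)
print(bad_input)
```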
To save the most recent plot = ggsave("My_most_recent_plot.pdf")
To save as pdf =
pdf("Life_Exp_vs_time.pdf", width=12, height=4)
ggplot(data=gapminder, aes(x=year, y=lifeExp, colour=country)) +
  geom_line() +
  theme(legend.position = "none")

# You then have to make sure to turn off the pdf device!

dev.off()
To write data = write.table()
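A short sketch of write.table, writing a hypothetical data frame to a temporary file. The argument choices (comma separator, no quotes, no row names) are one common setup, not the only one.

```r
# Hypothetical example data
cats <- data.frame(
  coat = c("calico", "black"),
  weight = c(2.1, 5.5)
)

out_path <- tempfile(fileext = ".csv")

write.table(
  cats,
  file = out_path,
  sep = ",",          # comma-separated output
  quote = FALSE,      # do not wrap text values in quotation marks
  row.names = FALSE   # leave out the row numbers
)

written <- readLines(out_path)
print(written)
```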
The dplyr package provides a number of very useful functions for manipulating data frames in a way that will
reduce repetition, reduce the probability of making errors, and probably even save you some typing.
As an added bonus, you might even find the dplyr grammar easier to read.
Make code readable
The most important part of writing code is making it readable and understandable. You want someone else to
be able to pick up your code and be able to understand what it does: more often than not this someone will be
you six months down the line, who will otherwise be cursing your past self.
Documentation: tell us what and why, not how
When you first start out, your comments will often describe what a command does, since you're still learning
yourself and it can help to clarify concepts and remind you later. However, these comments aren't particularly
useful later on when you don't remember what problem your code is trying to solve. Try to also include
comments that tell you why you're solving a problem, and what problem that is. The how can come after that:
it's an implementation detail you ideally shouldn't have to worry about.
Keep your code modular
Our recommendation is that you should separate your functions from your analysis scripts, and store them in a
separate file that you source when you open the R session in your project. This approach is nice because it leaves
you with an uncluttered analysis script, and a repository of useful functions that can be loaded into any analysis
script in your project. It also lets you group related functions together easily.
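A minimal sketch of this layout. A temporary file stands in for a hypothetical project file (for example one named functions.R), and the function center is made up for illustration.

```r
# Stand-in for a separate functions file kept alongside the analysis script
functions_file <- tempfile(fileext = ".R")
writeLines(c(
  "center <- function(x) {",
  "  x - mean(x)",
  "}"
), functions_file)

# In the analysis script: load every function defined in that file
source(functions_file)

# The analysis itself stays short and uncluttered
centered <- center(c(1, 2, 3))
print(centered)
```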
Break down problems into bite-sized pieces
When you first start out, problem solving and function writing can be daunting tasks, and hard to separate from
code inexperience. Try to break down your problem into digestible chunks and worry about the implementation
details later: keep breaking down the problem into smaller and smaller functions until you reach a point where
you can code a solution, and build back up from there.
Know that your code is doing the right thing
Make sure to test your functions!
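One lightweight way to do this, sketched with base R only: call the function on inputs whose answers you already know and stop if the results differ. The function center and the test values are hypothetical.

```r
# Function under test: shift data so that its mean equals midpoint
center <- function(data, midpoint = 0) {
  (data - mean(data)) + midpoint
}

# Known case 1: a vector of zeros centred on 3 should be all threes
zeros_result <- center(c(0, 0, 0, 0), 3)
stopifnot(isTRUE(all.equal(zeros_result, c(3, 3, 3, 3))))

# Known case 2: after centring on 0, the mean should be 0
centered_mean <- mean(center(c(1, 5, 9)))
stopifnot(isTRUE(all.equal(centered_mean, 0)))

print("all checks passed")
```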
Don't repeat yourself
Functions enable easy reuse within a project. If you see blocks of similar lines of code through your project,
those are usually candidates for being moved into functions.
If your calculations are performed through a series of functions, then the project becomes more modular and
easier to change. This is especially true when a particular input always gives a particular output.
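As a sketch of moving repeated lines into a function: the same rescaling formula, written once as a hypothetical rescale function and then applied to several made-up vectors.

```r
# Written once instead of being copied for every vector
rescale <- function(v) {
  (v - min(v)) / (max(v) - min(v))
}

# Hypothetical measurements
heights <- c(150, 160, 170)
weights <- c(50, 65, 80)

# Apply the same function to each input; the same input always gives the same output
rescaled <- lapply(list(heights = heights, weights = weights), rescale)
print(rescaled)
```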
Remember to be stylish
Apply consistent style to your code.
