
Using functions in R

Functions in R

The default installation of R comes with many functions that perform various tasks, ranging from calculating means to computing advanced regression models. Some functions are very simple to use, while other functions require more care. All functions, however, have the same anatomy, which is shown in the diagram below:

As seen in the diagram, every function call starts with the name of the function, followed by its arguments in parentheses. Arguments are simply the data that you pass into the function. An argument can be a simple numeric vector, a data frame, a regression model or the result of any other R function.

To use a function you must specify the name of the function as well as its arguments. Here is a simple function, mean, which calculates the mean value of a numeric vector. Note that we first create the numeric vector and call it my_values, and then pass that vector to the function in order to obtain the mean:

my_values = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

mean(my_values)

## [1] 5.5

An R function can take multiple arguments, which must be separated by commas. Every argument in the function has a name. You specify which data should be passed to an argument by setting the argument name equal to the data. Using the argument names is optional, but if you leave them out you must pass the data in the correct order, which is the order specified in the documentation (how to open the documentation for a function is shown further down).
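As a small illustration of matching by name versus matching by position, the sketch below uses the base function seq, whose first three arguments are from, to and by. The variable names are chosen here for illustration only:

```r
# Matching by name: each value is tied to an argument explicitly
by_name <- seq(from = 2, to = 10, by = 2)

# With names, the order of the arguments does not matter
by_name_reordered <- seq(by = 2, to = 10, from = 2)

# Matching by position: values are assigned to from, to, by in that order
by_position <- seq(2, 10, 2)

by_name                                  # 2 4 6 8 10
identical(by_name, by_name_reordered)    # TRUE
identical(by_name, by_position)          # TRUE
```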

Let’s see an example where the function does not return the result we want, and we’ll do so by introducing a missing value into the numeric vector. Specifically, we will replace the number 10 with a missing value (NA):

my_values = c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA)

mean(my_values)

## [1] NA

This returns the result NA, which means that the function could not calculate the mean (NA stands for "Not Available" and is how R marks a missing value). This is easy to fix by specifying an additional argument to the function mean. Before we do that, let’s look at the documentation for the mean function:

?mean


After executing the command ?mean we see the documentation for that particular function, and it appears as follows:

At the top it shows the function name and the package containing it (mean {base}). Then follows the title (Arithmetic Mean), as well as a description of the function. The important parts are Usage and Arguments. There are three named arguments: x, trim and na.rm. In this particular case, we need the na.rm argument, since it removes all missing values before the mean is calculated, so let’s try that:

my_values = c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA)

mean(my_values, na.rm=TRUE)

## [1] 5

It’s working!
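To make it clearer what na.rm = TRUE does, here is a minimal sketch that removes the missing value by hand using the base function is.na. The variable names missing_flags, n_missing and manual_mean are made up for this illustration:

```r
my_values <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA)

# Flag the missing elements (TRUE where the value is NA)
missing_flags <- is.na(my_values)

# Count the missing values
n_missing <- sum(missing_flags)
n_missing                                      # 1

# Dropping the missing values by hand gives the same mean as na.rm = TRUE
manual_mean <- mean(my_values[!missing_flags])
manual_mean == mean(my_values, na.rm = TRUE)   # TRUE
```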

Functions within functions

R gives you the ability to wrap functions within functions, which is extremely powerful. Consider the following example, which carries out a calculation in three sequential steps:

# Use the c() function to create a numeric vector
my_values = c(10, 30.5, 99.43, 98.3)

# Calculate the mean of that vector
my_mean <- mean(my_values)

# Round the value to nearest integer
my_result <- round(my_mean)

# See results
my_result

## [1] 60

You can write all these functions within each other, as follows:

my_result <- round(mean(c(10, 30.5, 99.43, 98.3)))

# See results
my_result
## [1] 60

Hence, when you wrap a function within another function, R evaluates the calls from the innermost to the outermost. In the example above it starts with the c() function, then applies mean() and finally round().
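When functions are nested, the position of the parentheses decides which function receives an argument. The following sketch reuses the vector from above and passes digits = 1 to round (digits is a documented argument of round; the name rounded_mean is chosen for illustration):

```r
my_values <- c(10, 30.5, 99.43, 98.3)

# digits = 1 sits inside round()'s parentheses, after the closing
# parenthesis of mean(), so it is passed to round() and not to mean()
rounded_mean <- round(mean(my_values), digits = 1)

rounded_mean
## [1] 59.6
```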

Typically you will use a few functions very often and learn their arguments well enough that you no longer need to specify the names when calling them. However, it is common to forget the names of the arguments. You can look them up with the function args. Let’s get the arguments for the function rnorm:

args(rnorm)
## function (n, mean = 0, sd = 1)
## NULL

That’s an interesting output, because it shows that R has already set mean to 0 and sd to 1. This is very typical: many arguments have default values. Arguments with default values do not have to be specified when calling a function; they are set to their defaults unless you specify something else.
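The effect of default values can be checked directly. The sketch below fixes the random seed with set.seed so that the results are reproducible, and compares a call that relies on the defaults with one that spells them out. The variable names are made up for illustration:

```r
# Inspect the default values stored in the function definition
formals(rnorm)$mean    # 0
formals(rnorm)$sd      # 1

# Relying on the defaults
set.seed(1)
with_defaults <- rnorm(3)

# Spelling the defaults out gives exactly the same numbers
set.seed(1)
spelled_out <- rnorm(3, mean = 0, sd = 1)

identical(with_defaults, spelled_out)   # TRUE

# Overriding one default while keeping the other (sd stays at 1)
set.seed(1)
shifted <- rnorm(3, mean = 100)
```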

Recommendations

We recommend that you use the names of the arguments when specifying them. This makes your code easier to read, both for others and for yourself at a later point in time, and it also makes debugging easier. Remember that if you leave out the argument names, R matches your values to the arguments of the function by position.
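A short sketch of why this matters, using rnorm from above. Setting sd = 0 is a choice made here purely so that the output is deterministic (every value then equals the mean); the variable names are illustrative:

```r
# Named arguments: the intent is clear and the order is irrelevant
named_call     <- rnorm(n = 3, mean = 10, sd = 0)
reordered_call <- rnorm(sd = 0, mean = 10, n = 3)

# Unnamed arguments are matched by position: n, then mean, then sd.
# Swapping the first two values silently changes the meaning.
swapped_call <- rnorm(10, 3, 0)

named_call      # three values, all equal to 10
swapped_call    # ten values, all equal to 3
```

R raises no error for the swapped call, which is why such mistakes are easy to miss when names are left out.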

Common base R functions

The default installation of R comes with numerous useful functions. Some of these functions are listed below. Note that some of them have more or less been superseded by newer alternatives.

General

builtins() # List all built-in functions
options() # Set options to control how R computes & displays results

?NA        # Help page on handling of missing data values
abs(x)     # The absolute value of "x"
append()   # Add elements to a vector
c(x)       # A generic function which combines its arguments
cat(x)     # Prints the arguments
cbind(), rbind()  # Combine vectors by column/row (cf. "paste" in Unix)
diff(x)    # Returns suitably lagged and iterated differences
gl()       # Generate factors with the pattern of their levels
grep()     # Pattern matching
identical()  # Test if 2 objects are *exactly* equal
jitter()     # Add a small amount of noise to a numeric vector
julian()     # Return Julian date
length(x)    # Return no. of elements in vector x
ls()         # List objects in current environment
mat.or.vec() # Create a matrix or vector
paste(x)     # Concatenate vectors after converting to character
range(x)     # Returns the minimum and maximum of x
rep(1,5)     # Repeat the number 1 five times
rev(x)       # List the elements of "x" in reverse order
seq(1,10,0.4)  # Generate a sequence (1 -> 10, spaced by 0.4)
sequence()     # Create a vector of sequences
sign(x)        # Returns the signs of the elements of x
sort(x)        # Sort the vector x
order(x)       # Return the element indices that would sort x
tolower(),toupper()  # Convert string to lower/upper case letters
unique(x)      # Remove duplicate entries from vector
system("cmd")  # Execute "cmd" in operating system (outside of R)
vector()       # Produces a vector of given length and mode

formatC(x)     # Format x using 'C' style formatting specifications
floor(x), ceiling(x), round(x), signif(x), trunc(x)   # rounding functions

Sys.getenv(x)  # Get the value of the environment variable "x"
Sys.setenv(x = "value")  # Set the value of the environment variable "x" (replaces the defunct Sys.putenv)
Sys.time()     # Return system time
Sys.Date()     # Return system date
getwd()        # Return working directory
setwd()        # Set working directory
?files         # Help on low-level interface to file system
list.files()   # List files in a given directory
file.info()    # Get information about files

# Built-in constants:
pi,letters,LETTERS   # Pi, lower & uppercase letters, e.g. letters[7] = "g"
month.abb,month.name # Abbreviated & full names for months
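A few of the general functions above, applied to a small vector. The vector x is made up for this illustration:

```r
x <- c(3, 1, 2, 3, 1)

sort(x)             # 1 1 2 3 3
order(x)            # 2 5 3 1 4  (indices that would sort x)
x[order(x)]         # same result as sort(x)
unique(x)           # 3 1 2
rev(x)              # 1 3 2 1 3
range(x)            # 1 3
length(x)           # 5
paste("item", 1:3)  # "item 1" "item 2" "item 3"
toupper(letters[7]) # "G"
```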

Maths

log(x),logb(),log10(),log2(),exp(),expm1(),log1p(),sqrt()   # Fairly obvious
cos(),sin(),tan(),acos(),asin(),atan(),atan2()       # Usual stuff
cosh(),sinh(),tanh(),acosh(),asinh(),atanh()         # Hyperbolic functions
union(),intersect(),setdiff(),setequal()             # Set operations
+,-,*,/,^,%%,%/%                                     # Arithmetic operators
<,>,<=,>=,==,!=                                      # Comparison operators
eigen()      # Computes eigenvalues and eigenvectors

deriv()      # Symbolic and algorithmic derivatives of simple expressions

sum()        # Sum of the elements of a vector
?Control     # Help on control flow statements (e.g. if, for, while)
?Extract     # Help on operators acting to extract or replace subsets of vectors
?Logic       # Help on logical operators
?Mod         # Help on functions which support complex arithmetic in R
?Paren       # Help on parentheses
?regex       # Help on regular expressions used in R
?Syntax      # Help on R syntax and giving the precedence of operators
?Special     # Help on special functions related to beta and gamma functions
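The less familiar arithmetic operators and the set operations can be tried out as follows. The vectors a and b are made up for this illustration:

```r
7 %% 3     # remainder after division: 1
7 %/% 3    # integer division: 2
2^10       # exponentiation: 1024
log(exp(1))   # natural logarithm undoes exp(): 1

a <- c(1, 2, 3, 4)
b <- c(3, 4, 5)

union(a, b)          # 1 2 3 4 5
intersect(a, b)      # 3 4
setdiff(a, b)        # 1 2  (elements of a that are not in b)
setequal(a, rev(a))  # TRUE (same elements, order ignored)
```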

Graphical

help(package=graphics) # List all graphics functions

plot()                # Generic function for plotting of R objects
par()                 # Set or query graphical parameters
curve(5*x^3,add=TRUE) # Plot an equation as a curve (add=TRUE draws on the existing plot)
points(x,y)           # Add another set of points to an existing graph
arrows()              # Draw arrows
abline()              # Adds a straight line to an existing graph
lines()               # Join specified points with line segments
segments()            # Draw line segments between pairs of points
hist(x)               # Plot a histogram of x
pairs()               # Plot matrix of scatter plots
matplot()             # Plot columns of matrices

?device               # Help page on available graphical devices
postscript()          # Plot to postscript file
pdf()                 # Plot to pdf file
png()                 # Plot to PNG file
jpeg()                # Plot to JPEG file
X11()                 # Plot to X window
persp()               # Draws perspective plot
contour()             # Contour plot
image()               # Plot an image

Fitting / regression / optimisation

lm      # Fit linear model
glm     # Fit generalised linear model
nls     # Non-linear (weighted) least-squares fitting
lqs     # Resistant regression (requires library(MASS))

optim       # general-purpose optimisation
optimize    # 1-dimensional optimisation
constrOptim # Constrained optimisation
nlm     # Non-linear minimisation
nlminb      # More robust (non-)constrained non-linear minimisation
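A minimal sketch of lm and optimize on made-up data. The data follow y = 2x + 1 exactly, and the function being minimised, (v - 2)^2, is chosen only because its minimum is known to be at v = 2:

```r
# Fit a straight line to data that lie exactly on y = 2x + 1
x <- 1:10
y <- 2 * x + 1
fit <- lm(y ~ x)
coef(fit)      # intercept close to 1, slope close to 2

# Find the minimum of a one-dimensional function on the interval [0, 5]
opt <- optimize(function(v) (v - 2)^2, interval = c(0, 5))
opt$minimum    # close to 2
```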

Statistical

help(package=stats)   # List all stats functions

?Chisquare            # Help on chi-squared distribution functions
?Poisson              # Help on Poisson distribution functions
help(package=survival) # Survival analysis

cor.test()            # Perform correlation test
cumsum(); cumprod(); cummin(); cummax()   # Cumulative functions for vectors
density(x)            # Compute kernel density estimates
ks.test()             # Performs one or two sample Kolmogorov-Smirnov tests
loess(), lowess()     # Scatter plot smoothing
mad()                 # Calculate median absolute deviation
mean(x), weighted.mean(x), median(x), min(x), max(x), quantile(x)
rnorm(), runif()  # Generate random data with Gaussian/uniform distribution
splinefun()           # Perform spline interpolation
smooth.spline()       # Fits a cubic smoothing spline
sd()                  # Calculate standard deviation
summary(x)            # Returns a summary of x: mean, min, max etc.
t.test()              # Student's t-test
var()                 # Calculate variance
sample()              # Random samples & permutations
ecdf()                # Empirical Cumulative Distribution Function
qqplot()              # quantile-quantile plot
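Some of the descriptive functions above, applied to a small made-up vector:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean(x)      # 5
median(x)    # 4.5
min(x)       # 2
max(x)       # 9
var(x)       # sample variance, here 32/7
sd(x)        # square root of the variance
cumsum(x)    # 2 6 10 14 19 24 31 40
summary(x)   # minimum, quartiles, mean and maximum in one call
```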

List was obtained from: http://www.sr.bham.ac.uk/~ajrs/R/r-function_list.html
