Microcredencial “agRo-al” - Session 1


Introduction to R (R 101)

Before starting to work with R, we must know some of the essentials of this tool. R is a system for statistical computing and graphics created by Ross Ihaka and Robert Gentleman [1]. The nature of R is twofold: it is both a program and a programming language.

R is freely distributed under the terms of the GNU General Public License (GPL), a free software license. It is available as source code (Unix, Linux) or as precompiled binary files for the Windows, Linux, and macOS operating systems. The source code and binary files needed to install R locally, as well as most R packages, are available at the Comprehensive R Archive Network (CRAN) [2].

R has a wide variety of functions for statistical analysis and graphics. Graphics can be displayed directly in the R GUI or saved in multiple file formats (e.g., EPS, TIFF, PNG, JPEG, PDF, PS). Similarly, statistical results (p-values, correlation scores, residuals, estimates) can be stored in objects and exported to files.
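As a minimal sketch of storing results and saving a graphic, the example below fits a linear model to the built-in `cars` dataset. The dataset, the object names, and the file names are assumptions chosen for illustration; the files are written to a temporary folder.

```r
# Fit a simple linear model on the built-in 'cars' dataset
fit <- lm(dist ~ speed, data = cars)

# Store statistical results in objects
estimates <- coef(fit)                                # intercept and slope
p_values  <- summary(fit)$coefficients[, "Pr(>|t|)"]  # one p-value per term
res       <- residuals(fit)                           # one residual per row

# Export the estimates and p-values to a CSV file (hypothetical file name)
results <- data.frame(term     = names(estimates),
                      estimate = unname(estimates),
                      p_value  = unname(p_values))
csv_file <- file.path(tempdir(), "cars_results.csv")
write.csv(results, csv_file, row.names = FALSE)

# Save a plot to a PDF file instead of showing it on screen
pdf_file <- file.path(tempdir(), "cars_plot.pdf")
pdf(pdf_file, width = 7, height = 5)   # size in inches
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
abline(fit)                            # add the fitted regression line
dev.off()                              # close the file so it is written
```

Other devices such as `png()` or `tiff()` follow the same open, plot, `dev.off()` pattern.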

For beginners, R can seem very complex because it is driven from the command line. However, its code is quite intuitive, and there is plenty of help on the internet for almost any issue or purpose. To help understand why R is so popular and valuable in comparison with other statistical software and programming languages, we will review the main pros and cons of this tool:

Pros
  • Open Source: anyone can use R without any need for a license or fee.
  • Highly compatible: R can be paired with C, C++, Java, and Python.
  • Platform independent: it is a cross-platform programming language that runs easily on Windows, Linux, and Mac.
  • Flexibility: a vast array of freely available packages (CRAN repository, > 10,000 packages).
  • Plotting: it facilitates quality plotting and comprehensive graphing; packages such as ggplot2 produce polished, publication-quality graphs.
  • High-quality reports: Shiny and R Markdown make reporting analysis results very easy.
  • Statistics: R is known as the lingua franca of statistics; in addition, the CRAN repository provides a vast collection of statistical methods (e.g., Machine Learning, Deep Learning) for robust analyses.
Cons
  • Data handling: R stores objects in physical memory (RAM), so it can use more memory than tools such as Python, especially with large datasets.
  • Security issues: R lacks basic built-in security features, which limits its use in web applications.
  • Low speed: R code, particularly code that relies on explicit loops, can be much slower than equivalent code in languages like MATLAB and Python.
  • Dependencies: R algorithms are spread across many different packages, so dependencies can be difficult to set up and maintain.
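
On the speed point, a common way to soften the cost of interpretation is to use vectorised operations instead of explicit loops. The following is a minimal sketch under that assumption (the object names are illustrative); both versions compute the same squares.

```r
x <- seq(1, 100000)

# Explicit loop: each iteration is interpreted one step at a time
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Vectorised form: the same computation in a single call
squares_vec <- x^2

# Both approaches give the same result
same_result <- isTRUE(all.equal(squares_loop, squares_vec))
same_result
```

Wrapping either version in `system.time()` shows how long it takes on your own machine.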

How does R work?

R is not a compiled language; it is an interpreted language. That means the commands you type are executed directly, without first compiling them into a complete program. R combines functional programming with object-oriented features, and everything it works with is an object: variables, data, graphs, functions, results, and scores are stored in the computer's memory as objects, each with a unique name. Users then act on these objects with operators and functions (which are themselves objects). The use of operators is very intuitive.
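
The idea that data, results, functions, and operators are all objects can be sketched as follows. The object names (`heights`, `avg`, `total_a`, `total_b`) are assumptions made up for this illustration.

```r
# Values and results are objects with a name
heights <- c(1.62, 1.75, 1.80)   # a numeric vector
avg     <- mean(heights)         # the result of a function is also an object

class(heights)    # "numeric"
class(mean)       # "function": functions are objects too

# Operators are functions as well: these two lines are equivalent
total_a <- 2 + 3
total_b <- `+`(2, 3)

# List the names of the objects currently stored in the workspace
ls()
```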

A good example of how R works can be visualized in the image below:


A first taste of R: let’s get hands-on

R functions are stored in packages, which are installed in library folders at a particular location on your personal computer (PC). For a quick look, open the folder that corresponds to your OS:

for Linux users using the latest R distribution (4.4.2)

/home/<username>/R/x86_64-pc-linux-gnu-library/4.4/

for Windows users using the latest R distribution (4.4.2)

C:\Program Files\R\R-4.4.2\library\

for macOS users using the latest R distribution (4.4.2)

/Library/Frameworks/R.framework/Versions/4.4/Resources/library
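
The exact folders depend on the R version and on how R was installed, so it may be simpler to ask R itself from the console. A minimal sketch using base R functions:

```r
# The R version currently in use
R.version.string

# Directories where R looks for installed packages
lib_dirs <- .libPaths()
lib_dirs

# Folder that holds the 'stats' package, which ships with R
stats_dir <- system.file(package = "stats")
stats_dir
```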

Once you have identified where your R functions and packages are installed, please open an R console to start working.

Then, type the following commands:

mydata <- seq(1, 10)

Above, you have created an object called “mydata” containing the sequence of integers from 1 to 10.

mydata

Typing the name of the object prints it, so you can verify its content.

mydata * 2

This multiplies every element of your object by 2.
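
A few more operations on the same object, as a short sketch. The object is recreated here so the block runs on its own, and `doubled` is an illustrative name.

```r
mydata <- seq(1, 10)

length(mydata)        # 10 elements
sum(mydata)           # 55
mean(mydata)          # 5.5
mydata[mydata > 5]    # 6 7 8 9 10

# Store the result of an operation in a new object
doubled <- mydata * 2
doubled               # 2 4 6 8 10 12 14 16 18 20
```

Note that `mydata * 2` on its own only prints the result; the original object changes only if you assign the result back with `<-`.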


RStudio

RStudio is an integrated development environment (IDE) that makes R more user-friendly. RStudio was created by the company Posit PBC [3], and it is available in two editions. RStudio Desktop is a regular desktop application, while RStudio Server runs on a remote server and is accessed through a web browser.

Like R, RStudio is a platform-independent tool (it runs on Windows, Linux, and macOS). RStudio includes a console, a syntax-highlighting editor, and various robust tools for plotting, viewing history, debugging, and managing your workspace. The primary functional features of RStudio Desktop are:

  • Local access.

  • Advanced syntax control, code completion, and smart indentation.

  • Execute R code directly from the editor.

  • Quick navigation to function definitions.

  • Easily manage multiple working directories using Projects.

  • Integrated R help and documentation.

  • Extensive package development tools.

  • Easily publish apps and reports.

For a quick practical session, open RStudio and run the same commands you tried above in the R console.


  1. Ihaka R & Gentleman R. 1996. R: a language for data analysis and graphics. J Comput Graph Stat 5:299-314.

  2. https://cran.r-project.org/

  3. https://posit.co/download/rstudio-desktop/