1 Getting Started
1.1 R System
In this course, we will use a combination of R and RStudio:
- R serves as the computational engine, handling all the calculations.
- RStudio is the interface for writing and sending commands to R and for viewing the results.
Ensure that the latest versions of both R and RStudio are installed on your computer.
1.2 Libraries and packages
The R engine includes a variety of built-in functions, but one of its greatest strengths is its extensibility through add-on packages (often loosely called libraries), which anyone can create. Although packages can technically be installed from any file or website, in practice most commonly used packages are distributed through two repositories:
- CRAN: The main repository for R packages, including most statistical methods.
- Bioconductor: A specialized repository for packages focused on bioinformatics.
To install a package, use the command
install.packages("LIBRARY")
Replace “LIBRARY” with the name of the package you wish to install, keeping the quotation marks. By default, R searches for the package on CRAN, but you can specify alternative repositories or file locations if needed.
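On Windows and macOS, R typically works right out of the box because CRAN provides precompiled packages for these systems. On Linux and other UNIX-like systems, packages are usually built from source, so you may need to install additional system dependencies first.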
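As a minimal sketch of choosing a repository, the snippet below inspects and sets the `repos` option for the current session. The mirror URL is an assumption (any CRAN mirror would do), and the file path in the comments is purely hypothetical.

```r
# Show which repositories R currently searches.
print(getOption("repos"))

# Point R at a specific CRAN mirror for this session.
# The mirror URL is an assumption; any CRAN mirror works.
options(repos = c(CRAN = "https://cloud.r-project.org"))

# From now on, install.packages("LIBRARY") uses that mirror.
# One-off alternative without changing the option:
# install.packages("LIBRARY", repos = "https://cloud.r-project.org")

# Installing from a local source file (hypothetical path):
# install.packages("path/to/LIBRARY.tar.gz", repos = NULL, type = "source")

# Bioconductor packages are installed through the BiocManager package:
# install.packages("BiocManager")
# BiocManager::install("LIBRARY")
```

The install calls are left commented out because they need internet access; uncomment the one that matches your situation.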
In this course, we need the following packages:
- dplyr
- ggplot2
- ggpubr
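One possible way to check which of these packages still need installing is sketched below. The helper `missing_packages()` is a hypothetical name introduced only for this illustration; it is not part of R or of the course material.

```r
# The three packages listed above.
course_packages <- c("dplyr", "ggplot2", "ggpubr")

# Hypothetical helper: return the packages in `pkgs` that are not
# found in the local package library.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(course_packages)
print(to_install)

# Uncomment to install whatever is missing (needs internet access):
# if (length(to_install) > 0) install.packages(to_install)
```

Passing a character vector to `install.packages()` installs several packages in one call, so there is no need to repeat the command for each name.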
Load these packages with
library(dplyr)
library(ggplot2)
library(ggpubr)
1.3 Data format
In this course we will work with simulated biological data that can be saved as .csv. CSV (Comma-Separated Values) files are plain text files used to store tabular data, such as a spreadsheet or database. Each line in a CSV file represents a row, with individual values (or fields) separated by commas. CSV files are easier to read and write in R compared to .xlsx files because R has built-in functions specifically designed for handling CSV data:
- Reading CSV files: Use read.csv("filename.csv") to load data into R.
- Writing CSV files: Use write.csv(data, "filename.csv") to save data to a CSV file.
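The two functions above can be tried together in a short round trip. This is only a sketch: the data frame is invented toy data standing in for the course's simulated data, and a temporary file is used so that nothing in your working directory is overwritten.

```r
# Hypothetical toy data standing in for the course's simulated data.
measurements <- data.frame(
  sample_id  = c("s1", "s2", "s3"),
  group      = c("control", "treated", "treated"),
  expression = c(2.5, 3.75, 4.0)
)

# Write to a temporary file instead of a fixed file name.
csv_path <- tempfile(fileext = ".csv")

# row.names = FALSE keeps R's row numbers out of the file;
# without it, they are written as an extra unnamed column.
write.csv(measurements, csv_path, row.names = FALSE)

# The file is plain text: one header line, then one line per row.
raw_lines <- readLines(csv_path)
print(raw_lines)

# Read the file back into a data frame and inspect its structure.
reloaded <- read.csv(csv_path)
str(reloaded)
```

Printing `raw_lines` shows the comma-separated layout directly, which is a quick way to check a file when `read.csv()` returns something unexpected.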
These functions are straightforward and do not require additional libraries. In contrast, working with .xlsx files typically requires external packages, such as readxl or openxlsx, which add extra steps for installation and setup. CSV files are also more widely supported across software and simpler in structure, which makes them a practical default for many tasks.