2  R Coding Basics

You now have a basic understanding of R and RStudio, R Projects, and R Markdown. Next, we cover some fundamental syntax and concepts in R programming. These basics form the foundation for data journalism, from loading election data and survey results to calculating statistics that reveal hidden stories in the numbers.

2.1 Coding Basics

Basic Arithmetic in R

You can use R as a calculator to perform basic arithmetic operations. Here are some examples:

25 / 26 * 27
[1] 25.96154
(25 + 26) / 2
[1] 25.5
cos(pi)
[1] -1
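
As a further illustration, the same operators cover calculations that come up often in reporting, such as a percentage change. The figures 5000 and 6200 below are made-up values chosen for the sketch, not real data.

```r
# Percentage change from 5000 to 6200 (illustrative numbers only)
(6200 - 5000) / 5000 * 100   # prints 24

# Parentheses change the order of operations
2 + 3 * 4      # multiplication happens first: 14
(2 + 3) * 4    # parentheses force the addition first: 20

# Exponent and remainder operators
2^10           # 2 raised to the power of 10: 1024
17 %% 5        # remainder after dividing 17 by 5: 2
```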

Creating Objects

You can create objects (variables) in R using the assignment operator <-. For example:

x <- 2025

Note that the value of x is stored but not printed. To print the value of x, you can simply type x.

x
[1] 2025
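
Once an object exists, it can be used in calculations and overwritten with a new value. A minimal sketch, reusing the object x from above:

```r
x <- 2025

x - 2000      # use the stored value in a calculation: 25

x <- x + 1    # overwrite x with a new value; nothing is printed
x             # x is now 2026
```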

Combining Elements into a Vector

You can combine multiple elements into a vector using the c() function. For example:

ages <- c(21, 22, 19)

Again, if you want to view the content of the vector ages, you can simply type ages.

ages
[1] 21 22 19

Basic arithmetic operations can also be performed on vectors, affecting each element of the vector.

ages + 1
[1] 22 23 20
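
The same element-by-element behavior is what makes share calculations short in R. The sketch below uses hypothetical vote counts (the object names votes and vote_share are invented for this example) and the built-in sum() function, which adds up all elements of a vector.

```r
# Hypothetical vote counts for three candidates (made-up numbers)
votes <- c(1200, 800, 500)

# Dividing a vector by a single number divides each element in turn
vote_share <- votes / sum(votes)
vote_share

# Arithmetic between two vectors of equal length pairs up the elements
ages <- c(21, 22, 19)
ages + c(1, 2, 3)   # 22 24 22
```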

Assignment Statement Structure

All R assignment statements follow the same structure:

object_name <- value

When reading the code, think of it as “object name gets value.” To save time, use the shortcut Alt + - (Windows) or Option + - (Mac) to insert the assignment operator <-.

Note

= can also be used for assignment, but it is generally recommended to use <- for assignment in R to avoid confusion with the equality operator ==, which is used for comparisons.
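
To make the difference between assignment and comparison concrete, here is a small sketch. The object y is introduced only for this illustration.

```r
x <- 2025      # assignment: x gets 2025
y = 2025       # also assignment, but easier to confuse with ==

x == 2025      # comparison: is x equal to 2025? TRUE
x == 2024      # FALSE
x == y         # both objects hold the same value: TRUE
```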

2.2 Comments in Code

R ignores any text following a # symbol on the same line. You can use comments to explain your code or add notes.

# Create a vector of birth years
birth_year <- c(1995, 2003, 2007)

# Calculate age in 2025
2025 - birth_year
[1] 30 22 18

Tip

Use comments to explain the why behind your code, not the how or what. This prevents confusion in the future, especially when revisiting complex projects or after a long time.
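
One way to read this tip is sketched below with the birth_year example. The survey detail in the second comment is invented purely for illustration, as are the object names age_what and age_why.

```r
birth_year <- c(1995, 2003, 2007)

# Less helpful: restates what the code already says
# Subtract birth_year from 2025
age_what <- 2025 - birth_year

# More helpful: records the reasoning behind the choice
# Ages are measured in 2025 because the (hypothetical) survey ran that year
age_why <- 2025 - birth_year
age_why
```

Both lines compute the same result; only the second comment would still be useful to someone reading the script a year later.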

2.3 Naming Conventions

Object names must start with a letter and can contain letters, numbers, underscores (_), and periods (.). We recommend using snake_case for multi-word object names, which means using lowercase letters and underscores to separate words (e.g., birth_year, student_name).

# this is a good object name
birth_year
# these are bad object names
BirthYear    # valid, but not snake_case
birth-year   # R reads this as birth minus year
birth year   # spaces are not allowed in names

Remember that R is case-sensitive, so birth_year and Birth_Year are considered different objects.
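
Case sensitivity can be checked directly with the built-in exists() function, which reports whether an object with a given name can be found. This sketch assumes your session has no object called Birth_Year.

```r
birth_year <- c(1995, 2003, 2007)

exists("birth_year")   # TRUE: this object was just created
exists("Birth_Year")   # FALSE, assuming no object with this capitalization exists
```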

2.4 Calling Functions

R provides a variety of built-in functions. They are called by typing the function name followed by parentheses (). If the function requires arguments, you can pass them inside the parentheses.

function_name(argument1 = value1, argument2 = value2, ...)

For example, to create a sequence of numbers from 1 to 10, you can use the seq() function:

seq(from = 1, to = 10)
 [1]  1  2  3  4  5  6  7  8  9 10

You can omit argument names if you supply the values in the order the function expects:

seq(1, 10)
 [1]  1  2  3  4  5  6  7  8  9 10

Tip

RStudio has an auto-complete feature that suggests function names and arguments as you type. You can also call ?function_name to access the help file for that function.
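
A short sketch of how named and positional arguments interact, using the built-in seq(), round(), and args() functions. The object names by_three and rounded are chosen only for this example.

```r
# Named arguments can be given in any order
seq(to = 10, from = 1)

# Positional and named arguments can be mixed:
# count from 1 to 10 in steps of 3
by_three <- seq(1, 10, by = 3)
by_three   # 1 4 7 10

# round() takes the number first, then how many decimal places to keep
rounded <- round(3.14159, digits = 2)
rounded    # 3.14

# args() shows the arguments a function accepts
args(round)
```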

2.5 Common Data Types

There are several data types in R, but we will focus on four common ones: numeric, character, logical, and date.

Numeric

The numeric data type represents numbers, such as 21 or 3.14.

# check the data type of ages
class(ages)
[1] "numeric"
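
A few more checks on numeric values, as a minimal sketch that rebuilds the ages vector so it runs on its own:

```r
ages <- c(21, 22, 19)

is.numeric(ages)   # TRUE
mean(ages)         # average of the three values
class(3.5)         # decimals are numeric
class(2025)        # whole numbers typed this way are numeric as well
```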

Character

The character data type represents text (string) values. Character values are enclosed in double quotes (") or single quotes (').

language <- c("R", "Python")

class(language)
[1] "character"
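
Character vectors work with text functions in the same element-by-element way as numbers. The sketch below uses the built-in nchar() and paste() functions; the sentence being pasted is just an example.

```r
language <- c("R", "Python")

nchar(language)                    # number of characters in each string: 1 6
paste("I am learning", language)   # joins the text to each element
class("2025")                      # quotes make this text, not a number
```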

Logical

The logical data type represents boolean values, which can be either TRUE or FALSE.

# create logical variables
is_student <- FALSE

class(is_student)
[1] "logical"
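
Logical values most often appear as the result of a comparison. A minimal sketch, where the object name is_over_20 is invented for this example:

```r
ages <- c(21, 22, 19)

is_over_20 <- ages > 20   # compares each element with 20
is_over_20                # TRUE TRUE FALSE
class(is_over_20)         # "logical"

# TRUE counts as 1 and FALSE as 0, so sum() counts the TRUE values
sum(is_over_20)           # 2
```

Counting TRUE values this way is a quick answer to questions such as "how many respondents are over 20?".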

Date

The date data type represents dates. The standard date format in R is YYYY-MM-DD.

today <- as.Date("2025-03-02")

class(today)
[1] "Date"

Warning

To create a date object in R, you can use the as.Date() function with the date in the format YYYY-MM-DD. If you just use x <- "2025-03-02", it will be a character type.
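
The difference matters because only Date objects support date arithmetic. The sketch below uses invented object names (date_text, date_value) and also shows the format argument of as.Date(), which describes dates written in another layout.

```r
date_text <- "2025-03-02"
date_value <- as.Date(date_text)

class(date_text)    # "character"
class(date_value)   # "Date"

# Date objects support date arithmetic; plain text does not
date_value + 7                        # one week later: 2025-03-09
as.Date("2025-12-31") - date_value    # time difference of 304 days

# For other layouts, describe the layout with the format argument
# (%d = day, %m = month, %Y = four-digit year)
as.Date("02/03/2025", format = "%d/%m/%Y")
```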

2.6 Data Frame

A data frame is a two-dimensional data structure that stores data in rows and columns. Each row represents an observation, and each column represents a variable. Think of it as a spreadsheet: rows are individual records (a news story, survey respondent, or election result) and columns are attributes (title, date published, views).

We can build a data frame using the data.frame() function.

First, let’s create a simple data frame with two columns: name and age.

# Create a data frame
df <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35))
df

This is a 3x2 data frame with 3 rows and 2 columns.

Exploring a Data Frame

In journalism, you often start with a dataset you’ve obtained (election results, crime statistics, public spending). R provides several functions to quickly understand what you’re working with:

# View the first few rows
head(df)
# Check the number of rows and columns
nrow(df)
[1] 3
ncol(df)
[1] 2
# See column names
colnames(df)
[1] "name" "age" 
# Get structure and data types
str(df)
'data.frame':   3 obs. of  2 variables:
 $ name: chr  "Alice" "Bob" "Charlie"
 $ age : num  25 30 35
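
A few more base R tools are often useful at this stage. This sketch rebuilds the small df data frame so it runs on its own, and uses the $ operator to pull out a single column.

```r
df <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35))

dim(df)        # rows and columns in one call: 3 2
df$age         # pull out a single column as a vector
mean(df$age)   # average age: 30
summary(df)    # quick summary of every column
```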

Journalism Example

Here’s a more realistic example, a dataset of news articles with views and engagement:

articles <- data.frame(
  title = c("Election Results Shock Markets", "Climate Report Released", "Survey: Public Trust Declining"),
  date = as.Date(c("2025-03-01", "2025-02-28", "2025-02-27")),
  views = c(45000, 32000, 28000),
  shares = c(3200, 1800, 1500),
  section = c("Politics", "Environment", "Society"))

head(articles)

Note

In this course, we will primarily work with data frames and use the tidyverse package to manipulate, analyze, and visualize them for storytelling.
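
Before moving to the tidyverse, here is a base R sketch of the kind of question such a data frame can answer. It rebuilds a trimmed copy of articles (same illustrative values) and adds a column named share_rate, a name chosen only for this example.

```r
articles <- data.frame(
  title = c("Election Results Shock Markets", "Climate Report Released", "Survey: Public Trust Declining"),
  views = c(45000, 32000, 28000),
  shares = c(3200, 1800, 1500))

# Add a new column: shares per 100 views
articles$share_rate <- articles$shares / articles$views * 100
round(articles$share_rate, 1)   # 7.1 5.6 5.4

# Find the title of the most-viewed article
articles$title[which.max(articles$views)]
```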

2.7 Errors/Warnings/Messages

  • Error: An error occurs when R encounters a problem that prevents it from executing the code. The output begins with “Error” and includes a message that describes the problem.

  • Warning: A warning occurs when R encounters a potential problem but is able to continue executing the code. The output begins with “Warning message” and describes the potential issue.

  • Message: A message is general output from R that provides information about the code execution, such as a note printed when a package loads. Messages do not indicate a problem.

How these are colored in the console depends on your RStudio version and theme, so rely on the wording rather than the color.

Note

You can usually ignore messages, but you should always pay attention to errors and warnings.
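
The three kinds of output can be triggered deliberately to see what each looks like. In this sketch, try() is used only so that the script keeps running past the error; the object names result and converted are invented for the example.

```r
# Error: execution of this expression stops because "a" is text, not a number
result <- try(1 + "a", silent = TRUE)
class(result)   # "try-error"

# Warning: R carries on, but flags that "abc" could not be converted
converted <- as.numeric(c("10", "abc"))
converted       # 10 NA

# Message: purely informational output
message("Data loaded: 3 rows")
```

Running 1 + "a" on its own, without try(), shows the error as it would normally appear in the console.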

Debugging Tips

  • Read the error message: The error message provides information about what went wrong. Read the error message carefully to understand the problem.

  • Check the code: Review the code that caused the error. Look for syntax errors, missing parentheses, brackets, or quotation marks, and other common mistakes.

  • Use the help panel: RStudio has a help panel that shows the documentation for functions and packages. Use it to check how the function that raised the error expects its arguments.

  • Search online: If you are unable to resolve the error, search online for solutions. Websites like Stack Overflow, RStudio Community, and the R documentation can be helpful resources.

  • Post your questions: If you are still unable to resolve the error, ask for help. Post your code and the error message on the RStudio Community or another forum to get assistance from the community.

Tips for learning R coding

  • “Copy, Paste, and Tweak”: When writing code, it’s common to copy and paste existing code and then tweak it to fit your needs. This can save time and reduce errors. Once you have a working script that analyzes one dataset, adapting it for another is much faster.

  • “Save, Save, Save”: Save your work frequently to avoid losing your progress. For journalism projects, also keep a clear record of the analysis you did; this record becomes your methodology document.

  • Practice, practice, practice: The more you practice writing R code, the more comfortable and proficient you will become. Start with simple data questions and gradually tackle more complex analyses.

  • Call ? for help: If you are unsure about how to use a function or need more information, you can call ?function_name to access the help file for that function.

Best Practices for Data Journalism Projects

  • Name your scripts clearly: Use meaningful names like analysis_election_2025.R or data_cleaning_covid_survey.R so you know what each script does months later.

  • Document your data sources: Add comments at the top of your script noting where the data came from, when you obtained it, and any data issues you discovered.

  • Organize your project: Keep related files in the same R Project, with separate folders for raw data, processed data, scripts, and visualizations.

  • Validate your results: Always double-check calculations and findings against the original data before publishing. A single mistake can undermine your credibility.
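
As one possible way to apply the naming, documentation, and organization points above, a script might open with a header like the following. The file name, folder layout, and placeholder fields are all hypothetical; adapt them to your own project.

```r
# analysis_election_2025.R
# Purpose: summarise turnout by district (hypothetical project)
# Data source: <where the file came from>
# Date obtained: <when you downloaded or received it>
# Known issues: <e.g. missing values, duplicated rows>

# The folder names below are one possible layout, not a required convention.
# file.path() joins the pieces into a single path.
raw_path <- file.path("data", "raw", "election_results.csv")
raw_path
```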

2.8 Exercises

Identify Errors

Each snippet below produces the error shown in its comment. Find the cause and fix it.

my_variable <- 10
my_varıable
# Error: object 'my_varıable' not found

my_vector <- (1, 2, 3, 4, 5)
mean(my_vector)
# Error: unexpected ',' in "my_vector <- (1,"

my_vector1 <- C(1, 2, 3, 4, 5)
mean(my_vector1)
# Error in `contrasts<-`(`*tmp*`, how.many, value = contr) : contrasts can be applied only to factors with 2 or more levels

RStudio Shortcuts

Press Option + Shift + K (Mac) or Alt + Shift + K (Windows). What happens? How can you get to the same place through the menus?

2.9 Key Functions

Command        Purpose             Example
<-             Assign value        x <- 5
c()            Create vector       ages <- c(21, 25, 19)
data.frame()   Create data frame   df <- data.frame(id=1:3)
?              Get help            ?ggplot
#              Add comment         # Calculate average