-Dealing with Number

Learn the basics of working with numbers in R

#How to manage the numeric type (integer vs. double)

Numeric Types (integer vs. double):

R automatically converts between integer and double values as needed in mathematical operations.

Creating Integer and Double Vectors:

By default, the c() function produces a vector of double values. To create integers, place an L after each number.

# create a vector of doubles
double_var <- c(5, 9.5, 89.5)  

# placing an L after the values creates integers
integer_var <- c(1L, 6L, 10L)

Numeric Type Test:

typeof(double_var)
## [1] "double"
typeof(integer_var)
## [1] "integer"
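
As a small illustration of the automatic conversion mentioned above, the sketch below mixes integer and double values and checks the resulting type. The variable names and values are made up for this example.

```r
# Hypothetical example values, not taken from the original text
int_val <- 5L      # integer
dbl_val <- 2.5     # double

# integer combined with double: the integer is promoted to double
mixed_sum <- int_val + dbl_val
typeof(mixed_sum)
## [1] "double"

# integer combined with integer stays integer for +, - and *
int_sum <- int_val + 1L
typeof(int_sum)
## [1] "integer"

# division with / always returns a double, even for two integers
int_div <- int_val / 2L
typeof(int_div)
## [1] "double"
```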

Converting Between Integer and Double Values:

The colon operator, as in x <- 1:10, produces an integer vector by default. Convert between the two types with these functions:

# convert integers to doubles
as.double(integer_var)

# identical to as.double()
as.numeric(integer_var)

# convert doubles to integers
as.integer(double_var)
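
One detail worth checking when converting doubles to integers: as.integer() drops the fractional part (truncation toward zero) rather than rounding. A minimal sketch, reusing the double_var values from above plus one made-up negative value:

```r
double_var <- c(5, 9.5, 89.5)

# as.integer() truncates toward zero; it does not round
truncated <- as.integer(double_var)
truncated
## [1]  5  9 89

# truncation toward zero also applies to negative values
neg_truncated <- as.integer(-2.7)
neg_truncated
## [1] -2

# round first if the nearest integer is wanted
# (R rounds exact halves to the even neighbour)
rounded <- as.integer(round(double_var))
rounded
## [1]  5 10 90
```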

#The different ways of generating non-random numbers

Specifying Numbers within a Sequence:

1:10         
##  [1]  1  2  3  4  5  6  7  8  9 10
c(1, 5, 10)   
## [1]  1  5 10
# save the vector of integers between 1 and 10 as object x
x <- 1:10 
x
##  [1]  1  2  3  4  5  6  7  8  9 10

Generating Regular Sequences:

seq(from = 1, to = 21, by = 2)             
##  [1]  1  3  5  7  9 11 13 15 17 19 21
seq(0, 21, length.out = 15)
##  [1]  0.0  1.5  3.0  4.5  6.0  7.5  9.0 10.5 12.0 13.5 15.0 16.5 18.0 19.5
## [15] 21.0
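
The spacing that length.out produces can be worked out by hand: the step is (to - from) / (length.out - 1). The sketch below checks this for the sequence above; the variable names are chosen for this example only.

```r
from <- 0
to <- 21
n_points <- 15

# step implied by length.out: (21 - 0) / (15 - 1)
step <- (to - from) / (n_points - 1)
step
## [1] 1.5

# building the sequence with by = step gives the same result
by_step <- seq(from, to, by = step)
by_length <- seq(from, to, length.out = n_points)
all.equal(by_step, by_length)
## [1] TRUE
```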

Generating Repeated Sequences:

rep(1:4, times = 2)   
## [1] 1 2 3 4 1 2 3 4
rep(1:4, each = 2)    
## [1] 1 1 2 2 3 3 4 4
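
The times and each arguments can also be combined, and times accepts a vector of per-element counts. A short sketch with made-up inputs:

```r
# each is applied first, then the whole result is repeated times times
each_then_times <- rep(1:4, each = 2, times = 2)
each_then_times
##  [1] 1 1 2 2 3 3 4 4 1 1 2 2 3 3 4 4

# a vector for times repeats each element a different number of times
per_element <- rep(1:3, times = c(3, 2, 1))
per_element
## [1] 1 1 1 2 2 3

# length.out cuts or recycles the pattern to a fixed length
fixed_length <- rep(1:4, length.out = 6)
fixed_length
## [1] 1 2 3 4 1 2
```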

#The different ways of generating random numbers

R has pseudo-random number generators that allow you to simulate the most common probability distributions.

Uniform numbers:

# generate n random numbers between the default values of 0 and 1
runif(n)            

# generate n random numbers between 0 and 25
runif(n, min = 0, max = 25)       

# generate n random integers between 0 and 25 (with replacement)
sample(0:25, n, replace = TRUE)

# generate n random integers between 0 and 25 (without replacement);
# n cannot exceed 26, the number of values in 0:25
sample(0:25, n, replace = FALSE)
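
The snippets above leave n undefined. A runnable sketch, assuming a sample size of 5 (a value picked for this example), shows the difference between runif() and sample():

```r
n <- 5  # hypothetical sample size

# runif() returns doubles anywhere in the interval
unif_draws <- runif(n, min = 0, max = 25)
typeof(unif_draws)
## [1] "double"
all(unif_draws >= 0 & unif_draws <= 25)
## [1] TRUE

# sample() picks from the integers 0, 1, ..., 25
int_draws <- sample(0:25, n, replace = FALSE)
typeof(int_draws)
## [1] "integer"

# without replacement, no value can appear twice
anyDuplicated(int_draws) == 0
## [1] TRUE
```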

Each of the following non-uniform distributions has four primary functions, identified by a one-letter prefix:

  • r: random number generation

  • d: density or probability mass function

  • p: cumulative distribution

  • q: quantiles
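
To make the four prefixes concrete, here is a minimal sketch using the standard normal distribution. The input value 1.5 is arbitrary, and the variable names are made up for this example.

```r
z <- 1.5

# p: cumulative probability P(X <= z)
prob_below <- pnorm(z)

# q: the quantile function inverts the cumulative distribution
z_back <- qnorm(prob_below)
all.equal(z, z_back)
## [1] TRUE

# the standard normal is symmetric around 0
pnorm(0)
## [1] 0.5

# d: density; at 0 the standard normal density is 1 / sqrt(2 * pi)
all.equal(dnorm(0), 1 / sqrt(2 * pi))
## [1] TRUE

# r: random draws; the values differ on every run unless a seed is set
draws <- rnorm(3)
length(draws)
## [1] 3
```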

Normal Distribution Numbers:

# generate n random numbers from a normal distribution with given 
# mean & st. dev.
rnorm(n, mean = 0, sd = 1)    

# generate CDF probabilities for value(s) in vector q 
pnorm(q, mean = 0, sd = 1)    

# generate quantile for probabilities in vector p
qnorm(p, mean = 0, sd = 1)    

# generate density function values for value(s) in vector x
dnorm(x, mean = 0, sd = 1)

Binomial Distribution Numbers:

# generate n random numbers from a binomial distribution with
# 100 trials and success probability 0.5
rbinom(n, size = 100, prob = 0.5)

# generate CDF probabilities for value(s) in vector q
pbinom(q, size = 100, prob = 0.5)

# generate quantiles for probabilities in vector p
qbinom(p, size = 100, prob = 0.5)

# generate probability mass function values for value(s) in vector x
dbinom(x, size = 100, prob = 0.5)
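
For a discrete distribution such as the binomial, the d function returns probabilities that sum to 1, and the p function is their running total. The sketch below checks this for size = 100 and prob = 0.5; the helper variable names are assumptions of this example.

```r
size <- 100
prob <- 0.5

# probability of every possible outcome 0, 1, ..., 100
pmf <- dbinom(0:size, size = size, prob = prob)
all.equal(sum(pmf), 1)
## [1] TRUE

# P(X <= 50) equals the sum of the first 51 probabilities (outcomes 0 to 50)
cdf_50 <- pbinom(50, size = size, prob = prob)
all.equal(cdf_50, sum(pmf[1:51]))
## [1] TRUE

# the median: smallest outcome whose cumulative probability reaches 0.5
median_outcome <- qbinom(0.5, size = size, prob = prob)
median_outcome
## [1] 50
```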

Poisson Distribution Numbers:

# generate n random numbers from a Poisson distribution when
# lambda (mean rate) equals 4
rpois(n, lambda = 4)

# generate CDF probabilities for value(s) in vector q when
# lambda (mean rate) equals 4
ppois(q, lambda = 4)

# generate quantiles for probabilities in vector p when
# lambda (mean rate) equals 4
qpois(p, lambda = 4)

# generate probability mass function values for value(s) in vector x when
# lambda (mean rate) equals 4
dpois(x, lambda = 4)

Exponential Distribution Numbers:

# generate n random numbers from an exponential distribution when rate = 1
rexp(n, rate = 1)

# generate CDF probabilities for value(s) in vector q when rate = 1
pexp(q, rate = 1)

# generate quantiles for probabilities in vector p when rate = 1
qexp(p, rate = 1)

# generate density function values for value(s) in vector x
# when rate = 1
dexp(x, rate = 1)
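
The exponential functions can be checked against the closed-form expressions: the CDF is 1 - exp(-rate * q), the density is rate * exp(-rate * x), and the median is log(2) / rate. A sketch with arbitrary evaluation points:

```r
rate <- 1
points <- c(0.5, 1, 2)  # arbitrary evaluation points

# CDF: 1 - exp(-rate * q)
all.equal(pexp(points, rate = rate), 1 - exp(-rate * points))
## [1] TRUE

# density: rate * exp(-rate * x)
all.equal(dexp(points, rate = rate), rate * exp(-rate * points))
## [1] TRUE

# median: log(2) / rate
exp_median <- qexp(0.5, rate = rate)
all.equal(exp_median, log(2) / rate)
## [1] TRUE
```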

Gamma Distribution Numbers:

# generate n random numbers from a gamma distribution when
# shape parameter = 1
rgamma(n, shape = 1)

# generate CDF probabilities for value(s) in vector q when
# shape parameter = 1
pgamma(q, shape = 1)

# generate quantiles for probabilities in vector p when
# shape parameter = 1
qgamma(p, shape = 1)

# generate density function values for value(s) in vector x
# when shape parameter = 1
dgamma(x, shape = 1)

#Setting Seed Values

# set the seed, then draw two random numbers
set.seed(197)
rnorm(n = 2, mean = 0, sd = 1)
##  [1]  0.6091700 -1.4391423

# resetting the same seed reproduces the same values
set.seed(197)
rnorm(n = 2, mean = 0, sd = 1)
##  [1]  0.6091700 -1.4391423
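
Rather than comparing printed output by eye, reproducibility can be checked programmatically. A minimal sketch; the sample size of 5 and the variable names are choices made for this example:

```r
set.seed(197)
first_run <- rnorm(n = 5)

# same seed, same call: the draws are reproduced exactly
set.seed(197)
second_run <- rnorm(n = 5)
identical(first_run, second_run)
## [1] TRUE

# without resetting the seed, the generator simply continues
third_run <- rnorm(n = 5)
identical(first_run, third_run)
## [1] FALSE
```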

#Comparing Numeric Values

Comparison Operators:

x < y     # is x less than y
x > y     # is x greater than y
x <= y    # is x less than or equal to y
x >= y    # is x greater than or equal to y
x == y    # is x equal to y
x != y    # is x not equal to y

x <- 9
y <- 10
x == y
## [1] FALSE

x <- c(1, 4, 9, 12)
y <- c(4, 4, 9, 13)

x == y
## [1] FALSE  TRUE  TRUE FALSE

# How many pairwise equal values are in vectors x and y
sum(x == y)
## [1] 2

# Where are the pairwise equal values located in vectors x and y
which(x == y)
## [1] 2 3

Exact Equality:

x <- c(4, 4, 9, 12)
y <- c(4, 4, 9, 13)

# identical() returns a single TRUE or FALSE for the whole object
identical(x, y)
## [1] FALSE

x <- c(4, 4, 9, 12)
y <- c(4, 4, 9, 12)

identical(x, y)
## [1] TRUE

Floating Point Comparison:

x <- c(4.00000005, 4.00000008)
y <- c(4.00000002, 4.00000006)

all.equal(x, y)
## [1] TRUE

# If the difference is greater than the tolerance level 
# the function will return the mean relative difference:
x <- c(4.005, 4.0008)
y <- c(4.002, 4.0006)

all.equal(x, y)
## [1] "Mean relative difference: 0.0003997102"
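
The reason all.equal() matters is that many decimal fractions cannot be stored exactly as doubles, so == can fail on values that look equal. The sketch below uses the well-known 0.1 + 0.2 case; because all.equal() returns a character string on failure, wrapping it in isTRUE() is the safer way to use it in conditions.

```r
a <- 0.1 + 0.2
b <- 0.3

# exact comparison fails because of floating point representation
a == b
## [1] FALSE

# all.equal() compares within a small tolerance
isTRUE(all.equal(a, b))
## [1] TRUE

# all.equal() returns a string, not FALSE, when values differ,
# so isTRUE() turns the result into a plain logical
isTRUE(all.equal(4.005, 4.002))
## [1] FALSE

# the tolerance can be loosened explicitly
isTRUE(all.equal(4.005, 4.002, tolerance = 1e-3))
## [1] TRUE
```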

#Rounding numeric Values

x <- c(1, 1.35, 1.7, 2.05, 2.4)
# Round to the nearest integer
round(x)
##  [1] 1 1 2 2 2 

# Round up
ceiling(x)
##  [1] 1 2 2 3 3
 
# Round down
floor(x)
##  [1] 1 1 1 2 2 
 
# Round to a specified decimal
round(x, digits = 1)
##  [1] 1.0 1.4 1.7 2.0 2.4
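
A few related cases are easy to get wrong: round() sends exact halves to the even neighbour, and trunc() differs from floor() for negative numbers. A short sketch with made-up inputs:

```r
# exact halves are rounded to the even neighbour
half_rounded <- round(c(0.5, 1.5, 2.5))
half_rounded
## [1] 0 2 2

# trunc() drops the fractional part (rounds toward zero)
truncated <- trunc(c(-1.7, 1.7))
truncated
## [1] -1  1

# floor() always rounds down, so the two differ for negatives
floored <- floor(c(-1.7, 1.7))
floored
## [1] -2  1

# signif() rounds to a number of significant digits instead of decimals
sig <- signif(1234.567, digits = 3)
sig
## [1] 1230
```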

Last updated 6 years ago