Linear Regression in R

Regression analysis is a widely used statistical technique for modeling the relationship between two variables. One is the predictor variable, whose values are gathered through experiments. The other is the response variable, whose values are estimated from the predictor variable.

In linear regression, these two variables are related through an equation in which the exponent (power) of both variables is 1. Plotted as a graph, a linear relationship is a straight line. A non-linear relationship, in which the exponent of any variable is not equal to 1, produces a curve.

The general mathematical equation for a linear regression is:

y = ax + b

Following is the description of the parameters used:

  • y is the response variable.
  • x is the predictor variable.
  • a and b are constants called the coefficients: a is the slope of the line and b is its intercept.
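
As a quick illustration of the difference between a linear and a non-linear relationship, the sketch below evaluates both for a few values of x. The coefficient values a = 2 and b = 1 are arbitrary assumptions chosen for the example, not estimates from any data.

```r
# Hypothetical coefficients, chosen only for illustration.
a <- 2
b <- 1
x <- 1:5

# Linear relationship: the exponent of x is 1.
y_linear <- a * x + b

# Non-linear relationship: the exponent of x is 2.
y_curve <- a * x^2 + b

# Equal steps in x give equal steps in y only for the straight line.
print(diff(y_linear))  # 2 2 2 2
print(diff(y_curve))   # 6 10 14 18
```

The constant differences in the first result are what make the plotted points fall on a straight line.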

Input Data 

Below is the sample data representing the observations: 

# Values of height
151, 174, 138, 186, 128, 136, 179, 163, 152, 131

# Values of weight
63, 81, 56, 91, 47, 57, 76, 72, 62, 48
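
For use in R, the observations can be kept together in a data frame so that each height stays paired with its weight. This is a minimal sketch; the names observations, height, and weight are assumptions introduced here and do not appear in the data listing above.

```r
# Sample observations; the object and column names are illustrative assumptions.
observations <- data.frame(
  height = c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131),
  weight = c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
)

# Confirm that there are ten paired observations.
print(nrow(observations))

# A correlation close to 1 suggests that a straight line is a reasonable model.
print(cor(observations$height, observations$weight))
```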

 

lm() Function 

The lm() function fits a linear model, creating the relationship model between the predictor and the response variable.

Syntax 

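The basic syntax for the lm() function in linear regression is:

lm(formula, data)

Following is the description of the parameters used:

  • formula is a symbolic description of the relation between x and y, written as y ~ x.
  • data is an optional data frame containing the variables named in the formula. When it is omitted, the variables are looked up in the environment from which lm() is called.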
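
A minimal sketch of how the two arguments fit together, assuming the sample observations are stored in a data frame. The names observations, height, weight, and model are assumptions made for this example.

```r
# Sample observations; the object and column names are illustrative assumptions.
observations <- data.frame(
  height = c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131),
  weight = c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
)

# The formula names columns of the data frame: response ~ predictor.
model <- lm(weight ~ height, data = observations)

print(model)
```

Passing data this way keeps the variables out of the global workspace and lets the formula use descriptive column names.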

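The following code fits the model directly to the x and y vectors of sample data and prints its coefficients:

# Predictor (height) and response (weight) observations.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Fit the linear model of y on x.
relation <- lm(y ~ x)

print(relation)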
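
The coefficients that print(relation) reports can also be extracted with coef() and checked against the closed-form least-squares estimates for a single predictor. The sketch below is one way to do this; the names estimates, slope, and intercept are introduced here for illustration, with the slope corresponding to a and the intercept to b in y = ax + b.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# coef() returns a named vector with the entries "(Intercept)" and "x".
estimates <- coef(relation)

# Closed-form least-squares estimates for simple linear regression.
slope <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept <- mean(y) - slope * mean(x)

print(estimates)
print(c(intercept = intercept, slope = slope))
```

Both print calls should show the same two numbers, roughly -38.4551 for the intercept and 0.6746 for the slope.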

# Find weight of a person with height 170.
# The column name in the new data frame must match the predictor name x.
a <- data.frame(x = 170)

result <- predict(relation, a)

print(result)
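
predict() also accepts several new values at once, and for models fitted with lm() its interval argument can return bounds around each estimate. The heights 160, 170, and 180 below are arbitrary example inputs, and the names new_heights and predictions are assumptions for this sketch.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# Several hypothetical heights; the column must be named x to match the model.
new_heights <- data.frame(x = c(160, 170, 180))

# interval = "prediction" adds lower and upper bounds for a new observation.
predictions <- predict(relation, new_heights, interval = "prediction")

# One row per height, with the columns fit, lwr, and upr.
print(predictions)
```

The fit column holds the point estimates, so the row for height 170 repeats the single prediction made earlier.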