Introduction to Linear Regression

What is Linear Regression?

If you are a beginner in data science, you have probably heard of linear regression many times. This article introduces linear regression and its algorithm. Linear regression attempts to model the relationship between two variables by fitting a linear equation to the observed data. One variable is treated as the explanatory (independent) variable and the other as the dependent variable. Let us look at an example using stock price data.

How/where do we use it?

Let us consider the data of some stocks in the stock market and analyze the stock price over a month. The data has 6 columns: Time, Opening Price, Closing Price, Low, High and Volume Trade. What valid questions can we ask of this data set?

For example,

• What were the highest and lowest prices this month?

• Does the opening price influence the stock price each day?

• Does the previous day’s closing price influence the next day’s stock price?

• Does the volume of stock traded each day affect the price of the stock?

• Are the price points interdependent?

When analyzing data, we have to ask questions like these to learn how one phenomenon affects another or how variables are related. In short, linear regression is one of the techniques used to find the relationship between two variables. There are other techniques for finding linear relationships as well. The two most commonly used types of linear regression are simple linear regression, which uses a single explanatory variable, and multiple linear regression (often called multivariate linear regression), which uses several.
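To make the difference between the two types concrete, here is a minimal Python sketch of the prediction each model makes. The function names and the coefficient values are made up for illustration and are not fitted to any data.

```python
def predict_simple(a, b, x):
    """Simple linear regression: one explanatory variable x."""
    return a + b * x


def predict_multiple(a, coefficients, xs):
    """Multiple linear regression: one coefficient per explanatory variable."""
    if len(coefficients) != len(xs):
        raise ValueError("need exactly one coefficient per explanatory variable")
    return a + sum(b * x for b, x in zip(coefficients, xs))


# Hypothetical coefficients, chosen only to show the shape of each model.
print(predict_simple(1.0, 2.0, 3.0))                  # 1 + 2*3
print(predict_multiple(1.0, [2.0, 0.5], [3.0, 4.0]))  # 1 + 2*3 + 0.5*4
```

In both cases the model is a straight-line (linear) combination of its inputs; the multiple version simply adds one term per extra variable.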

Algorithm for Linear Regression

Suppose some data points x are taken from the data set, and each x has a corresponding result y. That is, the x values are the inputs of a function and the y values are its outputs. Let us plot the points and draw the line that best fits them. In linear regression, we try to find the equation of such a line in the form y = mx + c, where m is the slope and c is the y-intercept. The best line is the one that keeps the vertical distances between the points and the line as small as possible; the usual least-squares method minimizes the sum of the squares of these distances. The algorithm for linear regression is given below. It writes the same line as y = a + bx, so b is the slope m and a is the intercept c.
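For readers who want the formula behind this idea, the standard least-squares derivation is sketched here. It is written with the a and b of the algorithm's y = a + bx; the symbol S is introduced only for this note.

```latex
S(a, b) = \sum_{i=1}^{n} \left( y_i - a - b x_i \right)^2

\frac{\partial S}{\partial a} = 0 \;\Rightarrow\; \sum y_i = n a + b \sum x_i

\frac{\partial S}{\partial b} = 0 \;\Rightarrow\; \sum x_i y_i = a \sum x_i + b \sum x_i^2

b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2},
\qquad
a = \frac{\sum y_i - b \sum x_i}{n}
```

Solving the two middle equations for a and b gives the last line, which is what the algorithm computes from its four running sums (sumX, sumX2, sumY and sumXY).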

1. Start

2. Read Number of Data (n)

3. For i=1 to n:

Read Xi and Yi

Next i

4. Initialize:

sumX = 0

sumX2 = 0

sumY = 0

sumXY = 0

5. Calculate Required Sum

For i=1 to n:

sumX = sumX + Xi

sumX2 = sumX2 + Xi * Xi

sumY = sumY + Yi

sumXY = sumXY + Xi * Yi

Next i

6. Calculate Required Constant a and b of y = a + bx:

b = ( n * sumXY - sumX * sumY ) / ( n * sumX2 - sumX * sumX )

a = ( sumY - b * sumX ) / n

7. Display value of a and b

8. Stop
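The steps above can be sketched in Python as follows. This is a minimal illustration rather than a production implementation: the function name fit_line and the sample points are made up for this example, and the check for a zero denominator is an addition that the steps above do not include.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    # Steps 4 and 5: accumulate the four running sums.
    sum_x = sum_x2 = sum_y = sum_xy = 0.0
    for x, y in zip(xs, ys):
        sum_x += x
        sum_x2 += x * x
        sum_y += y
        sum_xy += x * y

    # Step 6: closed-form slope b and intercept a.
    denominator = n * sum_x2 - sum_x * sum_x
    if denominator == 0:
        # Added guard (an assumption, not in the steps above):
        # every x is the same, so the slope is undefined.
        raise ValueError("all x values are identical; slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = (sum_y - b * sum_x) / n
    return a, b


# Made-up points that lie exactly on y = 1 + 2x.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"a = {a}, b = {b}")
```

Because the sample points sit exactly on a line, the sketch should recover that line's intercept and slope (a = 1.0, b = 2.0). With real, noisy data the points will not all fall on the fitted line.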

Real World Scenario

Linear regression is one of the most commonly used techniques in statistics. It is used to measure the relationship between one or more predictor variables and one response variable. The following are some examples of the use of linear regression in real-world situations.

#1 Businesses often use linear regression to understand the relationship between advertising costs and revenue.

#2 In the medical field, researchers use linear regression to understand the relationship between drug dosage and patients’ blood pressure.

#3 Agronomists often use linear regression to measure the effect of fertilizer and water on crop yields.

#4 Data scientists for professional sports teams use linear regression to measure the impact of different training conditions on player performance.

Summary

This article gave an introduction to linear regression and some real-world examples. Linear regression is a widely used technique in statistics. It models the relationship between a dependent variable and one or more independent variables. In data science, there are many more techniques and strategies for predicting the behavior of, and the relationships between, data points in a data set.

If you want to know more about Data Science and its technologies, you can refer to the best Data Science training in Kochi. Never stop learning. There may be plenty of free resources available, but an interactive class makes learning more effective and can point you to those free resources. Here you can check out the best Data Science training institute in Kochi. Happy learning.

Author: STEPS
