Multivariate logistic regression

Multivariate logistic regression is a statistical analysis that predicts a categorical outcome from multiple independent (predictor) variables.^[1]^[2]

Procedure

First, the baseline odds of having a specific outcome versus not having it are calculated without any predictors, giving a constant (the intercept).^[3] Next, the independent variables are entered into the model, yielding a regression coefficient (beta) and a "P" value for each independent variable.^[4] The "P" value indicates whether that variable contributes significantly to the occurrence of the outcome.^[5]
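The two steps can be sketched in plain Python. This is a minimal illustration, not the routine any particular statistical program uses: the data set, the predictor meanings, and the helper names (`sigmoid`, `solve`, `fit_logistic`) are all hypothetical, the fit uses Newton-Raphson iteration, and the "P" values are assumed to be two-sided Wald tests.

```python
import math

def sigmoid(z):
    """Logistic function mapping a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_logistic(rows, outcomes, iterations=25):
    """Fit by Newton-Raphson; return (coefficients, p_values), intercept first."""
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(X[0])
    beta = [0.0] * k

    def probabilities(b):
        return [sigmoid(sum(bj * xj for bj, xj in zip(b, xi))) for xi in X]

    def information(p):
        # Fisher information: sum over observations of p(1-p) * x x^T
        return [[sum(pi * (1 - pi) * xi[i] * xi[j] for xi, pi in zip(X, p))
                 for j in range(k)] for i in range(k)]

    for _ in range(iterations):
        p = probabilities(beta)
        score = [sum((yi - pi) * xi[j] for xi, yi, pi in zip(X, outcomes, p))
                 for j in range(k)]
        step = solve(information(p), score)
        beta = [bj + sj for bj, sj in zip(beta, step)]

    info = information(probabilities(beta))
    p_values = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        variance = solve(info, unit)[j]    # j-th diagonal entry of the inverse
        z = beta[j] / math.sqrt(variance)  # Wald statistic
        p_values.append(math.erfc(abs(z) / math.sqrt(2)))  # two-sided P value
    return beta, p_values

# Hypothetical data: two predictors per subject, outcome coded 1 (present) or 0 (absent)
rows = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (4, 0), (4, 0), (5, 1),
        (5, 1), (6, 0), (6, 1), (7, 0), (7, 1), (8, 0), (8, 1)]
outcomes = [0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1]

# Step 1: baseline odds without any predictor; their logarithm is the intercept
baseline_odds = sum(outcomes) / (len(outcomes) - sum(outcomes))
baseline_intercept = math.log(baseline_odds)

# Step 2: enter the predictors; each gets a coefficient (beta) and a P value
coefficients, p_values = fit_logistic(rows, outcomes)
for name, b, p in zip(("intercept", "x1", "x2"), coefficients, p_values):
    print(f"{name}: beta={b:.3f}, P={p:.3f}")
```

In practice a statistics package would report the same quantities; the hand-written solver here only keeps the sketch free of external dependencies.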

Types

Regression analysis with multiple variables is commonly divided into two main types: linear regression and logistic regression.

Linear regression

Linear regression produces a continuous set of results that shows a linear relationship with the independent variables (IVs); for a single IV, the results can be plotted on a graph as a straight line.^[6]
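As a minimal sketch of the single-IV case, the following fits a straight line by ordinary least squares. The function name `fit_line` and the data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: one independent variable and a continuous result
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

intercept, slope = fit_line(hours, score)
# Predictions fall on a straight line: each extra unit of x adds `slope` to y
predicted = [intercept + slope * h for h in hours]
```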

Logistic regression

In contrast, logistic regression models a nonlinear relationship: plotting its results on a graph produces an S-shaped curve called a sigmoid. Unlike linear regression, which yields a continuous set of results, logistic regression sorts results into two or more categories.^[7]^[8]^[2]
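The difference in shape can be seen by evaluating both kinds of function at a few points. This is only a sketch: the slope and intercept of the straight line are arbitrary values picked for the comparison.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function; output always lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def straight_line(z, slope=0.25, intercept=0.5):
    """A linear function with arbitrary illustrative parameters; output is unbounded."""
    return intercept + slope * z

for z in (-6, -2, 0, 2, 6):
    print(f"z={z:+d}  line={straight_line(z):.3f}  sigmoid={sigmoid(z):.3f}")
```

The straight line leaves the 0 to 1 range at the extremes, while the sigmoid flattens out toward 0 and 1, which is why its output can be read as the probability of belonging to a category.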

There are three main types of dependent variables (DVs) in logistic regression: binary, multi-class, and ordinal.^[9]

Binary

A binary dependent variable has only two possible outcomes, which must be opposite of each other and mutually exclusive.^[10]

Multi-class

A multi-class dependent variable has three or more qualitative (non-numerical) categories, which are usually represented in datasets by a numerical stand-in.^[11]

Ordinal

An ordinal dependent variable has three or more possible outcomes that are ranked in order.^[12]
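One way the three kinds of dependent variable might be coded in a dataset is sketched below. All labels and code values are hypothetical; the point is only that multi-class codes are arbitrary stand-ins, whereas ordinal codes carry a ranking.

```python
# Hypothetical label-to-number codings for the three kinds of dependent variable
binary_codes = {"no": 0, "yes": 1}                  # two mutually exclusive outcomes
multiclass_codes = {"bus": 0, "car": 1, "train": 2}  # stand-ins; the order means nothing
ordinal_codes = {"low": 1, "medium": 2, "high": 3}   # ranks; the order is meaningful

def encode(labels, codes):
    """Replace each label with its numerical code."""
    return [codes[label] for label in labels]

binary_dv = encode(["yes", "no", "yes"], binary_codes)
multiclass_dv = encode(["car", "train", "bus"], multiclass_codes)
ordinal_dv = encode(["medium", "high", "low"], ordinal_codes)

# Comparing codes is meaningful only for the ordinal variable
is_high_above_low = ordinal_codes["high"] > ordinal_codes["low"]
```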

Scientists

When scientists use logistic regression, they usually include as many independent variables as necessary.^[2]

Artificial intelligence

Multivariate logistic regression is also a common classification algorithm in data science and machine learning.^[13]

References

[1] "Multivariate logistic regression is a type of analysis that can help predict results when you're working with multiple variables." (Indeed)
[2] Sperandei, Sandro (2014). "Understanding logistic regression analysis". Biochemia Medica. 24 (1): 12–18. doi:10.11613/BM.2014.003. ISSN 1330-0962. PMC 3936971. PMID 24627710.
[3] "The statistical program first calculates the baseline odds of having the outcome versus not having the outcome without using any predictor." (National Library of Medicine)
[4] "Then, the chosen independent (input/predictor) variables are entered into the model, and a regression coefficient (known also as “beta”) and “P” value for each of these are calculated." (National Library of Medicine)
[5] "The “P” value indicates whether the particular variable contributes significantly to the occurrence of the outcome or not." (National Library of Medicine)
[6] "Linear regression has a continuous set of results that can easily be mapped on a graph as a straight line." (Indeed)
[7] "Logistic regressions are non-linear and are portrayed on a graph with a curved shape called a sigmoid. Instead of a continuous set of results, a logistical regression has two or more categories for data." (Indeed)
[8] "Logistic regression analysis is a statistical technique to evaluate the relationship between various predictor variables (either categorical or continuous) and an outcome which is binary (dichotomous)." (National Library of Medicine)
[9] "Logistic regression includes three basic types: ..." (Indeed)
[10] "A binary output is a variable where there are only two possible outcomes. These outcomes must be opposite of each other and mutually exclusive." (Indeed)
[11] "A multi-class has three or more categories without any numerical value, though they usually have a numerical stand-in for datasets." (Indeed)
[12] "An ordinal output also has three or more categories, though they're in a ranked output." (Indeed)
[13] "This is a common classification algorithm used in data science and machine learning." (Indeed)
