Multivariate logistic regression
Multivariate logistic regression is a statistical analysis that predicts a categorical outcome from multiple independent variables.[1][2]
Procedure
First, the baseline odds of having a specific outcome versus not having it are calculated without any predictor, giving a constant (intercept).[3] Next, the independent variables are entered into the model, giving a regression coefficient (beta) and a "P" value for each independent variable.[4] The "P" value indicates whether that independent variable contributes significantly to the occurrence of the outcome.[5]
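The first step of the procedure can be illustrated with a minimal Python sketch. The sample data, the coefficient and the standard error are hypothetical, and the function names are made up for this example. The sources do not say which significance test is used; the Wald z-test below is an assumption, chosen because it is a common default in statistical software.

```python
import math


def baseline_odds(outcomes):
    """Odds and log-odds (intercept) of the outcome with no predictors.

    `outcomes` is a list of 0/1 values: 1 = outcome present, 0 = absent.
    """
    events = sum(outcomes)
    non_events = len(outcomes) - events
    if events == 0 or non_events == 0:
        raise ValueError("both outcome values must occur at least once")
    odds = events / non_events
    return odds, math.log(odds)


def odds_ratio(beta):
    """Multiplicative change in the odds for a one-unit increase in an IV."""
    return math.exp(beta)


def wald_p_value(beta, standard_error):
    """Two-sided P value from a Wald z-test (an assumed, common choice)."""
    z = abs(beta / standard_error)
    return math.erfc(z / math.sqrt(2))


# Hypothetical sample: 3 of 10 subjects have the outcome.
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
odds, intercept = baseline_odds(outcomes)
print(f"baseline odds = {odds:.3f}, intercept = {intercept:.3f}")

# Hypothetical coefficient and standard error for one independent variable.
beta, se = 0.8, 0.3
print(f"odds ratio = {odds_ratio(beta):.3f}, P = {wald_p_value(beta, se):.4f}")
```

The intercept is the natural logarithm of the baseline odds, and exponentiating a coefficient converts it back into an odds ratio.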
Types
Regression analysis has two main types, linear regression and logistic regression. Multivariate logistic regression belongs to the second type.
Linear regression
Linear regression produces a continuous set of results. With a single independent variable (IV), the relationship is linear and can be plotted on a graph as a straight line.[6]
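As an illustration of the straight-line case, the following sketch fits a line to one independent variable by ordinary least squares. The data points and the function name `fit_line` are hypothetical and serve only to show the idea.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that lie exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```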
Logistic regression
In contrast, logistic regression models a nonlinear relationship. Plotting its results on a graph produces a curved line called a sigmoid. Unlike linear regression, whose results are continuous, logistic regression sorts its results into two or more categories.[7][8][2]
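The sigmoid curve can be sketched in a few lines of Python. The function names and the example coefficients are hypothetical. The sketch assumes the standard logistic function, which maps the intercept plus the weighted sum of the independent variables to a probability between 0 and 1.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(intercept, coefficients, values):
    """Probability of the outcome for one observation.

    `coefficients` and `values` are equal-length lists, one entry per IV.
    """
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return sigmoid(z)


# Hypothetical model with two independent variables.
p = predict_probability(-1.0, [0.5, 0.25], [2.0, 4.0])
print(f"predicted probability = {p:.3f}")

# Sampling the curve shows the S shape: flat, then steep, then flat again.
for z in (-6, -3, 0, 3, 6):
    print(z, round(sigmoid(z), 3))
```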
There are three main types of logistic regression dependent variables (DVs): binary, multi-class, and ordinal.[9]
Binary
A binary dependent variable has only two possible outcomes, which must be opposites of each other and mutually exclusive.[10]
Multi-class
A multi-class dependent variable has three or more qualitative (non-numerical) categories. In datasets, each category is usually represented by a numerical stand-in.[11]
Ordinal
An ordinal dependent variable has three or more categories that are ranked in order.[12]
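The three kinds of dependent variable differ mainly in how their categories are coded. The following sketch shows one possible coding for each; every label, code and name in it is hypothetical.

```python
# Binary: two mutually exclusive outcomes, coded 0 and 1.
binary_codes = {"no": 0, "yes": 1}

# Multi-class: unordered categories. The numbers are arbitrary stand-ins,
# so their size and order carry no meaning.
multiclass_codes = {"car": 0, "bus": 1, "train": 2}

# Ordinal: ranked categories. Here the order of the codes is meaningful.
ordinal_levels = ["low", "medium", "high"]
ordinal_codes = {level: rank for rank, level in enumerate(ordinal_levels)}


def encode(labels, codes):
    """Replace each label with its numerical code."""
    return [codes[label] for label in labels]


print(encode(["yes", "no", "yes"], binary_codes))
print(encode(["train", "car"], multiclass_codes))
print(encode(["high", "low", "medium"], ordinal_codes))
```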
Scientists
When scientists use logistic regression, they usually include as many independent variables as necessary.[2]
Artificial intelligence
Multivariate logistic regression is also a common classification algorithm in data science and machine learning.[13]
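As a rough illustration of the machine-learning use, the sketch below trains a binary logistic regression classifier on two independent variables. It assumes plain batch gradient descent on the log-loss, which is one common training method but not the only one. The data, function names, learning rate and number of epochs are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit an intercept and one weight per IV by batch gradient descent."""
    n = len(rows)
    n_features = len(rows[0])
    intercept = 0.0
    weights = [0.0] * n_features
    for _ in range(epochs):
        grad_b = 0.0
        grad_w = [0.0] * n_features
        for x, y in zip(rows, labels):
            p = sigmoid(intercept + sum(w * v for w, v in zip(weights, x)))
            error = p - y  # gradient of the log-loss with respect to z
            grad_b += error
            for j, v in enumerate(x):
                grad_w[j] += error * v
        # Step against the average gradient.
        intercept -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    return intercept, weights


def predict(intercept, weights, x):
    """Classify one observation as 1 if its probability is at least 0.5."""
    z = intercept + sum(w * v for w, v in zip(weights, x))
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical training set: two independent variables, binary outcome.
rows = [[1, 1], [2, 1], [1, 2], [4, 5], [5, 4], [5, 5]]
labels = [0, 0, 0, 1, 1, 1]
intercept, weights = fit_logistic(rows, labels)
predictions = [predict(intercept, weights, x) for x in rows]
print(predictions)
```

In practice, a library implementation would normally be used instead of hand-written gradient descent; the sketch only shows what such a classifier computes.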
References
1. "Multivariate logistic regression is a type of analysis that can help predict results when you're working with multiple variables." (Indeed)
2. Sperandei, Sandro (2014). "Understanding logistic regression analysis". Biochemia Medica. 24 (1): 12–18. doi:10.11613/BM.2014.003. ISSN 1330-0962. PMC 3936971. PMID 24627710.
3. "The statistical program first calculates the baseline odds of having the outcome versus not having the outcome without using any predictor." (National Library of Medicine)
4. "Then, the chosen independent (input/predictor) variables are entered into the model, and a regression coefficient (known also as “beta”) and “P” value for each of these are calculated." (National Library of Medicine)
5. "The “P” value indicates whether the particular variable contributes significantly to the occurrence of the outcome or not." (National Library of Medicine)
6. "Linear regression has a continuous set of results that can easily be mapped on a graph as a straight line." (Indeed)
7. "Logistic regressions are non-linear and are portrayed on a graph with a curved shape called a sigmoid. Instead of a continuous set of results, a logistical regression has two or more categories for data." (Indeed)
8. "Logistic regression analysis is a statistical technique to evaluate the relationship between various predictor variables (either categorical or continuous) and an outcome which is binary (dichotomous)." (National Library of Medicine)
9. "Logistic regression includes three basic types: ..." (Indeed)
10. "A binary output is a variable where there are only two possible outcomes. These outcomes must be opposite of each other and mutually exclusive." (Indeed)
11. "A multi-class has three or more categories without any numerical value, though they usually have a numerical stand-in for datasets." (Indeed)
12. "An ordinal output also has three or more categories, though they're in a ranked output." (Indeed)
13. "This is a common classification algorithm used in data science and machine learning." (Indeed)