Draft:Multivariate logistic regression
Multivariate logistic regression is a type of data analysis that predicts outcomes based on multiple independent variables.[1][2]
Procedure
First, the baseline odds of having a specific outcome versus not having it are calculated without any predictor, giving a constant (the intercept).[3] Next, the chosen independent variables are entered into the model, and a regression coefficient (beta) and a "P" value are calculated for each one.[4] The "P" value indicates whether that independent variable contributes significantly to the occurrence of the outcome.[5]
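For a single binary independent variable, these steps can be worked by hand from a 2×2 table, which gives a rough feel for what statistical software reports. The sketch below uses hypothetical counts; the names `exposed` and `unexposed` are illustrative, and the "P" value shown is a Wald (large-sample) approximation, which is not necessarily the method a given program uses.

```python
import math

# Hypothetical counts (not from any real study): (with outcome, without outcome)
exposed = (30, 20)
unexposed = (15, 35)

# Step 1: baseline odds of the outcome, ignoring any predictor.
with_outcome = exposed[0] + unexposed[0]
without_outcome = exposed[1] + unexposed[1]
baseline_odds = with_outcome / without_outcome
intercept_only = math.log(baseline_odds)  # constant of the predictor-free model

# Step 2: add the binary predictor. For one binary predictor, the fitted
# coefficient (beta) equals the natural log of the odds ratio.
odds_ratio = (exposed[0] / exposed[1]) / (unexposed[0] / unexposed[1])
beta = math.log(odds_ratio)

# Step 3: Wald test of beta = 0 (a large-sample approximation).
std_error = math.sqrt(sum(1 / n for n in exposed + unexposed))
z = beta / std_error
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, standard normal

print(f"baseline odds = {baseline_odds:.3f}, intercept = {intercept_only:.3f}")
print(f"beta = {beta:.3f}, odds ratio = {odds_ratio:.2f}, P = {p_value:.4f}")
```

With more than one independent variable the coefficients no longer have a closed form and are estimated iteratively, but each is read the same way: as a change in the log-odds of the outcome.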
Types
Logistic regression is usually contrasted with linear regression, the other main type of regression analysis.
Linear regression
Linear regression produces a continuous set of results. With a single independent variable (IV), the fitted relationship can be plotted on a graph as a straight line.[6]
Logistic regression
In contrast, logistic regression models a nonlinear relationship: plotting the predicted probability on a graph produces a curved line called a sigmoid. Unlike linear regression, whose results are continuous, logistic regression predicts an outcome that falls into two or more categories, based on one or more independent variables that may be categorical or continuous.[7][8][2]
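The difference between the straight line and the sigmoid can be seen numerically. The following is a minimal sketch; the intercept `b0` and slope `b1` are hypothetical values chosen only for illustration.

```python
import math

def sigmoid(t: float) -> float:
    """Logistic function: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

# Hypothetical intercept and slope, chosen only for illustration.
b0, b1 = -1.0, 0.5

for x in range(-10, 11, 5):
    linear = b0 + b1 * x      # straight line: unbounded in both directions
    curved = sigmoid(linear)  # sigmoid: flattens toward 0 and toward 1
    print(f"x={x:>3}  linear={linear:>5.1f}  sigmoid={curved:.3f}")
```

The linear column keeps growing with `x`, while the sigmoid column stays between 0 and 1, which is why it can be read as a probability.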
Assumptions
Multivariate logistic regression assumes that the observations are independent of one another.[9] It also assumes a linear relationship between the independent variables and the natural logarithm of the odds of the outcome (the log-odds). However, it does not assume that the independent variables are normally distributed.
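The linearity assumption is usually written in terms of the log-odds (logit). In the notation below, which is assumed here rather than taken from the cited sources, p is the probability of the outcome, x_1, ..., x_k are the independent variables, β_0 is the intercept, and β_1, ..., β_k are the regression coefficients:

```latex
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k
\quad\Longleftrightarrow\quad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}}
```

The left-hand form is linear in the independent variables; solving it for p gives the sigmoid curve on the right.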
Null hypothesis
The null hypothesis of a multivariate logistic regression is that there is no relationship between the independent variables and the dependent variable.[10]
Dependent variables
Logistic regression dependent variables (DVs) fall into three main types: binary, multi-class, and ordinal.[11]
Binary
A binary dependent variable has only two possible outcomes, which must be opposite to each other and mutually exclusive.[12]
Multi-class
A multi-class dependent variable has three or more qualitative (non-numerical) categories, which are usually represented in datasets by numerical stand-ins.[13]
Ordinal
An ordinal dependent variable also has three or more possible categories, but they are ranked.[14]
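The three types differ mainly in how their categories are coded for analysis. The sketch below shows one common coding convention; the category names and dictionary names are hypothetical and chosen only for illustration.

```python
# Hypothetical example data; the category names are illustrative only.

# Binary: two mutually exclusive outcomes, conventionally coded 0 and 1.
binary_codes = {"no disease": 0, "disease": 1}

# Multi-class: unordered categories; the numbers are arbitrary stand-ins.
multiclass_codes = {"bus": 0, "car": 1, "bicycle": 2}

# Ordinal: categories are ranked, so the order of the codes is meaningful.
ordinal_levels = ["mild", "moderate", "severe"]
ordinal_codes = {level: rank for rank, level in enumerate(ordinal_levels)}

observations = ["moderate", "mild", "severe", "moderate"]
encoded = [ordinal_codes[o] for o in observations]
print(encoded)
```

For the multi-class variable, swapping the codes of "bus" and "car" would change nothing of substance, whereas reordering the ordinal codes would destroy the ranking.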
Scientists
When scientists use logistic regression, they usually include as many independent variables as the analysis requires.[2]
Artificial intelligence
Multivariate logistic regression is also a common classification algorithm in data science and machine learning.[15]
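Used as a classifier, a fitted model turns several independent variables into a predicted probability and then into a class label. The following is a minimal sketch: the coefficients, the intercept, and the 0.5 threshold are hypothetical values, and the function names are illustrative rather than taken from any library.

```python
import math

def predict_probability(features, coefficients, intercept):
    """Probability of the positive class under a fitted logistic model."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-log_odds))

def classify(features, coefficients, intercept, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return int(predict_probability(features, coefficients, intercept) >= threshold)

# Hypothetical fitted values for two independent variables.
coefficients = [0.8, -0.4]
intercept = -1.0

print(classify([3.0, 1.0], coefficients, intercept))  # log-odds of about 1.0
print(classify([0.5, 2.0], coefficients, intercept))  # log-odds of about -1.4
```

In practice the coefficients would be estimated from training data, and the threshold can be moved away from 0.5 when false positives and false negatives carry different costs.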
References
- ^ "Multivariate logistic regression is a type of analysis that can help predict results when you're working with multiple variables." - [1] (Indeed)
- ^ a b c Sperandei, Sandro (2014). "Understanding logistic regression analysis". Biochemia Medica. 24 (1): 12–18. doi:10.11613/BM.2014.003. ISSN 1330-0962. PMC 3936971. PMID 24627710.
- ^ "The statistical program first calculates the baseline odds of having the outcome versus not having the outcome without using any predictor." - [2] (National Library of Medicine)
- ^ "Then, the chosen independent (input/predictor) variables are entered into the model, and a regression coefficient (known also as “beta”) and “P” value for each of these are calculated." - [3] (National Library of Medicine)
- ^ "The “P” value indicates whether the particular variable contributes significantly to the occurrence of the outcome or not." - [4] (National Library of Medicine)
- ^ "Linear regression has a continuous set of results that can easily be mapped on a graph as a straight line." - [5] (Indeed)
- ^ "Logistic regressions are non-linear and are portrayed on a graph with a curved shape called a sigmoid. Instead of a continuous set of results, a logistical regression has two or more categories for data." - [6] (Indeed)
- ^ "Logistic regression analysis is a statistical technique to evaluate the relationship between various predictor variables (either categorical or continuous) and an outcome which is binary (dichotomous)." - [7] (National Library of Medicine)
- ^ "Multiple logistic regression assumes that the observations are independent." - [8] (Statistics LibreTexts)
- ^ "The main null hypothesis of a multiple logistic regression is that there is no relationship between the X variables and the Y variable;" - [9] (LibreTexts)
- ^ "Logistic regression includes three basic types: ..." - [10] (Indeed)
- ^ "A binary output is a variable where there are only two possible outcomes. These outcomes must be opposite of each other and mutually exclusive." - [11] (Indeed)
- ^ "A multi-class has three or more categories without any numerical value, though they usually have a numerical stand-in for datasets." - [12] (Indeed)
- ^ "An ordinal output also has three or more categories, though they're in a ranked output." - [13] (Indeed)
- ^ "This is a common classification algorithm used in data science and machine learning." - [14] (Indeed)