Friday, September 27, 2019

Why is logistic regression a linear classifier? - Cross Validated

Logistic regression vs. linear regression: since we are using the logistic function to transform a linear combination of the input into a non-linear output, how can logistic regression be considered a linear classifier? Linear regression is just like a neural network without a hidden layer, so why are neural networks considered non-linear classifiers while logistic regression is linear?

Logistic regression is linear in the sense that the predictions can be written as $$ \hat{p} = \frac{1}{1 + e^{-\hat{\mu}}}, \quad \text{where } \hat{\mu} = \hat{\theta} \cdot x. $$ Thus, the prediction can be written in terms of $\hat{\mu}$, which is a linear function of $x$. (More precisely, the predicted log-odds is a linear function of $x$.) Conversely, there is no way to summarize the output of a neural network in terms of a linear function of $x$, and that is why neural networks are called non-linear. Also, for logistic regression, the decision boundary $\{x : \hat{p}(x) = 0.5\}$ is linear: it is the solution set of $\hat{\theta} \cdot x = 0$. The decision boundary of a neural network is in general not linear.

As Stefan Wagner notes, the decision boundary for a logistic classifier is linear (so it can separate the classes perfectly only if they are linearly separable). I wanted to expand on the math for this in case it is not obvious. The decision boundary is the set of $x$ such that $$ \frac{1}{1 + e^{-\theta \cdot x}} = 0.5. $$ A little bit of algebra shows that this is equivalent to $$ 1 = e^{-\theta \cdot x} $$ or, taking the natural log of both sides, $$ 0 = \theta \cdot x = \sum\limits_{i=0}^{n} \theta_i x_i, $$ so the decision boundary is linear. The reason the decision boundary of a neural network is not linear is that there are two layers of sigmoid functions in the network: one in each of the hidden nodes, plus an additional sigmoid function to combine and threshold the results of the hidden nodes.

If we have two classes, $C_0$ and $C_1$, then we can express the conditional probability as $$ P(C_0|x) = \frac{P(x|C_0)P(C_0)}{P(x)}. $$ Applying Bayes' theorem, $$ P(C_0|x) = \frac{P(x|C_0)P(C_0)}{P(x|C_0)P(C_0) + P(x|C_1)P(C_1)} = \frac{1}{1 + \exp\left(-\log\frac{P(x|C_0)}{P(x|C_1)} - \log\frac{P(C_0)}{P(C_1)}\right)}, $$ so the denominator has the form $1 + e^{-z}$, where $z$ collects the two log terms. Under which conditions does the first term, the log-likelihood ratio, reduce to a linear function of $x$? If you consider the exponential family (a canonical form for distributions such as the Gaussian or the Poisson), $$ P(x|C_i) = \exp\left(\frac{\theta_i x - b(\theta_i)}{a(\phi)} + c(x,\phi)\right), $$ then you end up with a linear form: $$ \log\frac{P(x|C_0)}{P(x|C_1)} = \left[(\theta_0 - \theta_1)x - b(\theta_0) + b(\theta_1)\right]/a(\phi). $$ Notice that we assume both distributions belong to the same family and share the same dispersion parameter. Under that assumption, logistic regression can model the class probabilities for the whole exponential family of distributions.
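
To make the first answer concrete, here is a minimal NumPy sketch (the weights `theta_hat` and the test points are invented for illustration). It computes the logistic prediction $\hat{p} = 1/(1 + e^{-\hat{\theta} \cdot x})$, recovers the log-odds, checks that it equals the linear function $\hat{\theta} \cdot x$, and verifies that a point on the hyperplane $\hat{\theta} \cdot x = 0$ gets $\hat{p} = 0.5$:

```python
import numpy as np

# Hypothetical fitted weights; the last entry multiplies a constant 1 (intercept).
theta_hat = np.array([2.0, -1.0, 0.5])

def predict_proba(x):
    """Logistic prediction: p_hat = sigmoid(theta_hat . x)."""
    mu = theta_hat @ x                 # linear part: mu_hat = theta_hat . x
    return 1.0 / (1.0 + np.exp(-mu))

x = np.array([1.5, 0.3, 1.0])          # arbitrary input, bias term appended
p = predict_proba(x)

# The log-odds (logit) recovers the linear function of x exactly.
log_odds = np.log(p / (1 - p))
print(np.isclose(log_odds, theta_hat @ x))   # True: log-odds is linear in x

# A point on the hyperplane theta_hat . x = 0 lies on the decision boundary.
x_boundary = np.array([0.5, 1.0, 0.0])       # 2*0.5 - 1*1.0 + 0.5*0 = 0
print(predict_proba(x_boundary))             # 0.5
```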
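For contrast with the neural-network part of the answer, here is a sketch of a one-hidden-layer network with sigmoid units (all weights made up for illustration). Its log-odds is $w_2 \cdot \sigma(W_1 x + b_1) + b_2$, which is not a linear function of $x$, so no linear summary like the one above exists and the $\hat{p} = 0.5$ level set need not be a hyperplane:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical network: two hidden sigmoid units, one sigmoid output unit.
W1 = np.array([[ 3.0, -2.0],       # hidden-layer weights (2 units x 2 inputs)
               [-1.0,  4.0]])
b1 = np.array([0.5, -1.0])
w2 = np.array([2.5, -3.0])         # output-layer weights
b2 = 0.2

def nn_predict(x):
    h = sigmoid(W1 @ x + b1)       # first layer of sigmoids (hidden nodes)
    return sigmoid(w2 @ h + b2)    # second sigmoid combines the hidden outputs

# Walk along a straight line in input space: the log-odds of the output
# does not change linearly, unlike in logistic regression.
for t in [0.0, 1.0, 2.0]:
    p = nn_predict(np.array([t, 0.0]))
    print(t, np.log(p / (1 - p)))
```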
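As a worked instance of the exponential-family argument, take two univariate Gaussians with means $\mu_0, \mu_1$ and a shared variance $\sigma^2$ as the class-conditional densities. The log-likelihood ratio is then $$ \log\frac{P(x|C_0)}{P(x|C_1)} = -\frac{(x-\mu_0)^2}{2\sigma^2} + \frac{(x-\mu_1)^2}{2\sigma^2} = \frac{\mu_0 - \mu_1}{\sigma^2}\,x + \frac{\mu_1^2 - \mu_0^2}{2\sigma^2}, $$ which is linear in $x$ and matches the general form with $\theta_i = \mu_i$, $b(\theta_i) = \theta_i^2/2$ and $a(\phi) = \sigma^2$. If the two variances differ, a quadratic term in $x$ survives and the resulting decision boundary is no longer linear, which is exactly the "same dispersion parameter" assumption at work.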
