With ordinary least squares (OLS), the objective function is, from maximum likelihood estimation, `L = ½ Σᵢ |yᵢ - 𝒙ᵢ 𝜷|² = ½ Σᵢ rᵢ²`, where `yᵢ` are the values of the response variable, `𝒙ᵢ` are the covectors of individual covariates (rows of the model matrix `X`), `𝜷` is the vector of fitted coefficients and `rᵢ` are the individual residuals.
A `RobustLinearModel` solves instead for the following loss function[1]: `L' = Σᵢ ρ(rᵢ)` (more precisely `L' = Σᵢ ρ(rᵢ/σ)`, where `σ` is a (robust) estimate of the standard deviation of the residuals).
Several loss functions are implemented:
- `CauchyLoss`: `ρ(r) = log(1+(r/c)²)`, a non-convex estimator that also corresponds to a Student's-t distribution (with fixed degrees of freedom). It suppresses outliers more strongly, but convergence is not guaranteed.
- `GemanLoss`: `ρ(r) = ½ (r/c)²/(1 + (r/c)²)`, a non-convex and bounded estimator that suppresses outliers more strongly.
- `WelschLoss`: `ρ(r) = ½ (1 - exp(-(r/c)²))`, a non-convex and bounded estimator that suppresses outliers more strongly.
- `TukeyLoss`: `ρ(r) = if abs(r)<c; ⅙(1 - (1-(r/c)²)³) else ⅙ end`, a non-convex and bounded estimator that suppresses outliers more strongly; it is the preferred estimator for most cases.
- `YohaiZamarLoss`: `ρ(r)` is quadratic for `r/c < 2/3` and bounded by 1; a non-convex estimator optimized to have the lowest bias for a given efficiency.
The value of the tuning constant `c` is optimized for each estimator so that the M-estimators have a high efficiency of 0.95. However, these estimators have a low breakdown point.
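Below is a minimal plain-Julia sketch of the `TukeyLoss` ρ from the list above, only to illustrate how a bounded loss caps the contribution of outliers; it is not the package's implementation, and the tuning constant `c = 4.685` (the value commonly quoted for ~95% efficiency) is an assumption here.

```julia
# Illustrative sketch of the Tukey ρ from the list above (not the package's code).
# The tuning constant c = 4.685 (≈95% efficiency) is assumed for illustration.
function tukey_rho(r::Real; c::Real=4.685)
    if abs(r) < c
        return (1 - (1 - (r / c)^2)^3) / 6   # smooth, nearly quadratic near 0
    else
        return 1 / 6                         # bounded: large residuals are capped
    end
end

tukey_rho(0.5)    # small residual: close to the quadratic (OLS-like) regime
tukey_rho(100.0)  # outlier: its contribution saturates at 1/6
```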
### MQuantile-estimators
Using an asymmetric variant of the `L1Estimator`, quantile regression is performed (although the `QuantileRegression` solver should be preferred because it gives an exact solution). Likewise, with an M-estimator using an asymmetric version of the loss function, a generalization of quantiles is obtained. For instance, using an asymmetric `L2Loss` results in _Expectile Regression_.
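As an illustration of the asymmetric weighting idea (not the package's API), here is a minimal sketch of the asymmetric L2 loss underlying expectile regression, where the residual weight switches between `τ` and `1 - τ` depending on the sign of the residual:

```julia
# Sketch of the asymmetric weighting behind MQuantile estimation (illustrative only).
asymmetric_weight(r::Real, τ::Real) = r > 0 ? τ : 1 - τ

# Asymmetric L2 loss for a single residual: the expectile-regression loss.
expectile_loss(r::Real, τ::Real) = asymmetric_weight(r, τ) * r^2 / 2

expectile_loss(2.0, 0.5)  # τ = 0.5 recovers the symmetric L2 (least-squares) loss
expectile_loss(2.0, 0.9)  # τ = 0.9 penalizes positive residuals more strongly
```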
### Quantile regression
_Quantile regression_ results from minimizing the following objective function:
`L = Σᵢ wᵢ|yᵢ - 𝒙ᵢ 𝜷| = Σᵢ wᵢ(rᵢ) |rᵢ|`,
where `wᵢ = ifelse(rᵢ>0, τ, 1-τ)` and `τ` is the quantile of interest. `τ=0.5` corresponds to _Least Absolute Deviations_.

This problem is solved exactly using linear programming techniques, specifically, interior point methods using the internal API of [Tulip](https://github.com/ds4dm/Tulip.jl). The internal API is considered unstable, but it results in a much lighter dependency than including the [JuMP](https://github.com/JuliaOpt/JuMP.jl) package with the Tulip backend.
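To make the objective concrete, here is a short sketch that evaluates `L` for a candidate coefficient vector on toy data; it only illustrates the function being minimized and is not the Tulip-based solver described above.

```julia
# Evaluate the quantile-regression objective L = Σᵢ wᵢ |rᵢ| for a candidate β.
# This only illustrates the objective, not the interior-point solver.
function quantile_objective(X::AbstractMatrix, y::AbstractVector,
                            β::AbstractVector; τ::Real=0.5)
    r = y .- X * β                    # residuals rᵢ = yᵢ - 𝒙ᵢ 𝜷
    w = ifelse.(r .> 0, τ, 1 - τ)     # wᵢ = ifelse(rᵢ > 0, τ, 1 - τ)
    return sum(w .* abs.(r))
end

X = [1.0 0.0; 1.0 1.0; 1.0 2.0]       # toy model matrix with an intercept column
y = [0.1, 1.0, 2.2]
quantile_objective(X, y, [0.0, 1.0]; τ=0.5)   # τ = 0.5: Least Absolute Deviations
```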
### Robust Ridge regression
By default, all the coefficients (except the intercept) have the same penalty, which assumes that all the feature variables have the same scale. If that is not the case, use a robust estimate of scale to normalize every column of the model matrix `X` before fitting the regression.
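As a sketch of the normalization advice above (one possible convention, assumed here rather than prescribed by the package), each non-intercept column of `X` can be divided by a robust scale estimate such as the median absolute deviation:

```julia
using Statistics: median

# Robust scale estimate: MAD scaled to be consistent with the standard
# deviation under normality (the 1.4826 factor is the usual convention).
robust_scale(x) = 1.4826 * median(abs.(x .- median(x)))

# Divide every column of the model matrix by its robust scale, leaving the
# intercept column untouched. Assumed convention, for illustration only.
function robust_normalize(X::AbstractMatrix; intercept_col::Int=1)
    Xs = float.(copy(X))
    for j in axes(Xs, 2)
        j == intercept_col && continue
        s = robust_scale(view(Xs, :, j))
        s > 0 && (Xs[:, j] ./= s)
    end
    return Xs
end
```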

### Regularized Least Squares

_Regularized Least Squares_ regression results from minimizing the following objective function:
`L = ½ Σᵢ |yᵢ - 𝒙ᵢ 𝜷|² + P(𝜷)`,
where `P(𝜷)` is a (sparse) penalty on the coefficients.

The following penalty functions are defined:

- `NoPenalty`: `cost(𝐱) = 0`, no penalty.
- `SquaredL2Penalty`: `cost(𝐱) = λ ½||𝐱||₂²`, also called Ridge.
- `L1Penalty`: `cost(𝐱) = λ||𝐱||₁`, also called LASSO.
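The sketch below writes the penalties listed above as plain functions together with the regularized objective, purely for illustration; the package defines them as penalty types, and the function names here are not its API.

```julia
using LinearAlgebra: norm

# Penalty costs from the list above, written as plain functions (illustrative only).
no_penalty(β)                = 0.0
squared_l2_penalty(β; λ=1.0) = λ * 0.5 * norm(β, 2)^2   # Ridge
l1_penalty(β; λ=1.0)         = λ * norm(β, 1)           # LASSO

# Regularized least-squares objective L = ½ Σᵢ |yᵢ - 𝒙ᵢ 𝜷|² + P(𝜷).
rls_objective(X, y, β, P) = 0.5 * sum(abs2, y .- X * β) + P(β)

X = [1.0 0.0; 1.0 1.0; 1.0 2.0]
y = [0.1, 1.0, 2.2]
rls_objective(X, y, [0.0, 1.0], β -> l1_penalty(β; λ=0.1))
```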