The logistic_regression.py script, available at https://github.com/statisticspoland/ecoicop_classification/blob/master/Logistic_Regression/logistic_regression.py, trains logistic regression models over every combination of values for the following parameters: C, fit_intercept, class_weight, solver and multi_class. The parameter max_iter is fixed at 200.

for c in [0.1, 1, 2, 3]:
    for fit_intercept in [True, False]:
        for class_weight in [None, 'balanced']:
            for solver in ["newton-cg", "lbfgs", "liblinear", "sag", "saga"]:
                for multi_class in ["ovr", "multinomial"]:
                    # train and evaluate one model for this combination
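
The same grid can also be enumerated without five levels of nesting. The following is a minimal sketch, not the repository's code: the names PARAM_GRID, MAX_ITER and iter_combinations are hypothetical, and skipping the liblinear/multinomial pairing is an assumption based on the fact that the liblinear solver has no multinomial mode in scikit-learn.

```python
from itertools import product

# Parameter values copied from the nested loops above.
PARAM_GRID = {
    "C": [0.1, 1, 2, 3],
    "fit_intercept": [True, False],
    "class_weight": [None, "balanced"],
    "solver": ["newton-cg", "lbfgs", "liblinear", "sag", "saga"],
    "multi_class": ["ovr", "multinomial"],
}
MAX_ITER = 200  # fixed for every model


def iter_combinations(grid=PARAM_GRID):
    """Yield one dict of keyword arguments per parameter combination."""
    names = list(grid)
    # product() varies the last parameter fastest, like the innermost loop.
    for values in product(*grid.values()):
        params = dict(zip(names, values))
        # Assumption: liblinear cannot fit a multinomial model, so skip it.
        if params["solver"] == "liblinear" and params["multi_class"] == "multinomial":
            continue
        params["max_iter"] = MAX_ITER
        yield params


combinations = list(iter_combinations())
print(len(combinations))   # number of models that would be trained
print(combinations[0])     # first combination in loop order
```

Each yielded dict could then be passed as keyword arguments to scikit-learn's LogisticRegression, in versions that still accept the multi_class parameter.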

For each combination, the code calculates the accuracy of the model's predictions on the validation subset (all predictions combined). It also calculates the accuracy on the training dataset and the micro-averaged F1 score on the validation subset. The micro-averaged F1 score equals the validation accuracy because it pools all predictions before computing precision and recall, and each record receives exactly one predicted class.

The code outputs the following Excel file: results_logistic_regression.xlsx. Below are the first few records.



    C    fit_intercept  class_weight  solver     multi_class  max_iter  val_accuracy  train_accuracy  f1_score_micro
0   0.1  TRUE                         newton-cg  ovr          200       0.8200        0.8566          0.8200
1   0.1  TRUE                         newton-cg  multinomial  200       0.8253        0.8667          0.8253
2   0.1  TRUE                         lbfgs      ovr          200       0.8200        0.8566          0.8200
3   0.1  TRUE                         lbfgs      multinomial  200       0.8253        0.8667          0.8253
4   0.1  TRUE                         liblinear  ovr          200       0.8327        0.8679          0.8327
5   0.1  TRUE                         sag        ovr          200       0.8200        0.8566          0.8200