### Install libauc

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/05_Optimizing_AUROC_Loss_with_DenseNet121_on_CheXpert.ipynb

Installs the libauc library from PyPI, upgrading any existing installation.

```python
!pip install -U libauc
```

--------------------------------

### Installation from source

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/README.md

Installs the LibAUC library from its source code repository.

```bash
$ git clone https://github.com/Optimization-AI/LibAUC.git
$ cd LibAUC
$ pip install .
```

--------------------------------

### Model Initialization and CUDA Setup

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Sets random seeds, initializes the NeuMF model, applies weight initialization, and moves the model to the GPU.

```python
set_all_seeds(2022)
model = NeuMF(n_users, n_items)
model.apply(model.init_weights)
model.cuda()
```

--------------------------------

### Install libauc

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/01_Creating_Imbalanced_Benchmark_Datasets.ipynb

Installs the libauc library, pinned to version 1.2.0.

```python
!pip install libauc==1.2.0
```

--------------------------------

### Model Initialization and Training Setup (Warm-up)

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Initializes the NeuMF model, sets up the data sampler, and defines the loss criterion and optimizer for the warm-up phase, then trains for 20 epochs and evaluates on the test set.
```python
GAMMA0 = 0.1
n_users = 138493
n_items = 26744
id_mapper, num_relevant_pairs = trainSet.get_id_mapper()

# save the model and log file
RES_PATH = 'warm_up'
os.makedirs(RES_PATH, exist_ok=True)  # unlike os.mkdir, does not fail when the cell is re-run
```

```python
train_sampler = DataSampler(labels=trainSet.targets,
                            batch_size=BATCH_SIZE*(NUM_POS+NUM_NEG),
                            num_pos=NUM_POS,
                            num_tasks=BATCH_SIZE)
```

```python
set_all_seeds(2022)
model = NeuMF(n_users, n_items)
model.apply(model.init_weights)
model.cuda()
```

```python
criterion = ListwiseCE_Loss(id_mapper=id_mapper,
                            total_relevant_pairs=num_relevant_pairs,
                            num_pos=NUM_POS,
                            gamma0=GAMMA0)
optimizer = SONG(params=model.parameters(), lr=LR, weight_decay=L2, mode=OPTIMIZER_STYLE)
```

```python
EPOCH = 20
train(model, trainSet, train_sampler, valSet, optimizer)
```

```python
result_dict = evaluate(model, testSet, TOPKS, METRICS)
print("test results:" + format_metric(result_dict))
```

--------------------------------

### Install libauc

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/08_Optimizing_AUROC_Loss_with_DenseNet121_on_Melanoma.ipynb

Installs the libauc library.

```python
!pip install libauc
```

--------------------------------

### Model Initialization and Training Setup (SONG)

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Defines hyperparameters, initializes the NeuMF model, sets up the data sampler, and defines the loss criterion and optimizer for the SONG algorithm.
```python
LOSS = 'SONG'
LR = 0.001                # learning rate of model parameters, \eta in the paper
NUM_POS = 10              # number of positive items sampled per user
NUM_NEG = 300             # number of negative items sampled per user
L2 = 1e-7                 # weight_decay
OPTIMIZER_STYLE = 'adam'  # 'sgd' or 'adam'

# GAMMA0 is the moving average factor in our algo; you can tune GAMMA0 in (0.0, 1.0) for better performance
GAMMA0 = 0.1
n_users = 138493
n_items = 26744
TOPK = -1
id_mapper, num_relevant_pairs = trainSet.get_id_mapper()

# save the model and log file
RES_PATH = 'song'
os.makedirs(RES_PATH, exist_ok=True)  # unlike os.mkdir, does not fail when the cell is re-run
```

```python
train_sampler = DataSampler(labels=trainSet.targets,
                            batch_size=BATCH_SIZE*(NUM_POS+NUM_NEG),
                            num_pos=NUM_POS,
                            num_tasks=BATCH_SIZE)
```

```python
model = NeuMF(n_users, n_items)
model.apply(model.init_weights)
model.cuda()
```

--------------------------------

### Optimizer and Criterion Setup for K-SONG

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Loads a pre-trained model, resets the last layer, defines the NDCG loss criterion with K-SONG parameters, and sets up the SONG optimizer. `TOPK_V` is not defined in this snippet and must be set in an earlier cell.

```python
model.load_model('./warm_up/pretrained_model.pkl')
model.reset_last_layer()

criterion = NDCG_Loss(id_mapper, num_relevant_pairs, n_users, n_items, NUM_POS,
                      gamma0=GAMMA0, topk=TOPK, topk_version=TOPK_V)
optimizer = SONG(params=model.parameters(), lr=LR, weight_decay=L2, mode=OPTIMIZER_STYLE)
```

--------------------------------

### Installation from pip

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/README.md

Installs the LibAUC library using pip.

```bash
$ pip install -U libauc
```

--------------------------------

### Model and Loss Setup

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/11_Optimizing_pAUC_Loss_with_SOTAs_on_Imbalanced_data.ipynb

Initializes the resnet18 model, the tpAUC_KL_Loss function, and the SOTAs optimizer for training.
```python
set_all_seeds(SEED)
model = resnet18(pretrained=False, num_classes=1, last_activation=None)
model = model.cuda()

loss_fn = tpAUC_KL_Loss(pos_len=sampler.pos_len, Lambda=Lambda, tau=tau)
optimizer = SOTAs(model, loss_fn=loss_fn, mode='adam', lr=lr,
                  gammas=(gamma0, gamma1), weight_decay=weight_decay)
```

--------------------------------

### Model, Loss & Optimizer Setup

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/09_Optimizing_CompositionalAUC_Loss_with_ResNet20_on_CIFAR10.ipynb

Initializes the ResNet20 model, the CompositionalAUC Loss function, and the PDSCA optimizer with specified hyperparameters.

```python
set_all_seeds(123)
model = ResNet20(pretrained=False, last_activation=None, activations='relu', num_classes=1)
model = model.cuda()

# Compositional Training
loss_fn = CompositionalAUCLoss()
optimizer = PDSCA(model, loss_fn=loss_fn, lr=lr, beta1=beta0, beta2=beta1,
                  margin=margin, epoch_decay=epoch_decay, weight_decay=weight_decay)
```

--------------------------------

### Model Initialization and Optimizer Setup

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/12_Optimizing_AUROC_Loss_on_Tabular_Data.ipynb

Initializes the MLP model, moves it to CUDA, and sets up the PESG optimizer with AUCMLoss.

```python
set_all_seeds(SEED)
model = MLP(input_dim=29, hidden_sizes=16, num_classes=1)
model = model.cuda()
print(model)

loss_fn = AUCMLoss()
optimizer = PESG(model, loss_fn=loss_fn, lr=lr, margin=margin,
                 epoch_decay=epoch_decay, weight_decay=weight_decay)
```

--------------------------------

### Model Creation and AUC Optimizer Setup

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/02_Optimizing_AUROC_with_ResNet20_on_Imbalanced_CIFAR10.ipynb

Initializes a ResNet20 model and moves it to the CUDA device.
```python
# You can include sigmoid/l2 activations on model's outputs before computing loss
model = ResNet20(pretrained=False, last_activation=None, num_classes=1)
model = model.cuda()
```

--------------------------------

### Data Loader Setup

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/11_Optimizing_pAUC_Loss_on_Imbalanced_data_wrapper.ipynb

Creates PyTorch DataLoaders for the training and testing datasets, using a DualSampler for the training set.

```python
trainSet = ImageDataset(train_images, train_labels)
testSet = ImageDataset(test_images, test_labels, mode='test')

sampler = DualSampler(trainSet, batch_size, sampling_rate=sampling_rate)
trainloader = torch.utils.data.DataLoader(trainSet, batch_size=batch_size, sampler=sampler,
                                          shuffle=False, num_workers=1)
testloader = torch.utils.data.DataLoader(testSet, batch_size=batch_size,
                                         shuffle=False, num_workers=1)
```

--------------------------------

### Model and Loss Setup (SOPA Backend)

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/11_Optimizing_pAUC_Loss_on_Imbalanced_data_wrapper.ipynb

Sets random seeds, initializes the model, and configures the pAUC loss function with the SOPA backend and its corresponding optimizer.

```python
seed = 123
set_all_seeds(seed)
model = resnet18(pretrained=False, num_classes=1, last_activation=None)
model = model.cuda()

loss_fn = pAUCLoss(pos_len=sampler.pos_len, backend='SOPA', beta=beta, margin=margin)
optimizer = SOPA(model.parameters(), loss_fn=loss_fn.loss_fn, mode='adam',
                 lr=lr, eta=eta, weight_decay=weight_decay)
```

--------------------------------

### Import Required Packages

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Imports all necessary libraries and modules for the tutorial.
```python
import os
import sys
import time
import random
import numpy as np

import torch
from torch.utils.data import DataLoader

import libauc
from libauc.datasets import MoiveLens  # 'MoiveLens' (sic) is the spelling used throughout this tutorial
from libauc.sampler.ranking import DataSampler
from libauc.losses.ranking import NDCG_Loss, ListwiseCE_Loss
from libauc.optimizers import SONG
from libauc.models import NeuMF
from libauc.utils.helper import batch_to_gpu, adjust_lr, format_metric, get_time
```

--------------------------------

### Additional Imports

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Imports additional utility modules.

```python
import os
import sys
import time
import shutil
from tqdm import tqdm, trange
import numpy as np
```

--------------------------------

### Install LibAUC

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/11_Optimizing_pAUC_Loss_with_SOTAs_on_Imbalanced_data.ipynb

Installs version 1.2.0 of the LibAUC library.

```python
!
```

--------------------------------

### Configuration Parameters

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Defines key hyperparameters and settings for the training process.
```python
DATA_PATH = 'ml-20m'            # path for the dataset file
BATCH_SIZE = 256                # training batch size
EVAL_BATCH_SIZE = 512           # evaluation batch size
EPOCH = 120                     # total training epochs
NUM_WORKERS = 8                 # number of workers in the dataloader
LR_SCHEDULE = '[80]'            # the lr is multiplied by 0.25 at epoch 80
TOPKS = [5, 10, 20, 50]         # k values for model evaluation
METRICS = ["NDCG", "MAP"]       # the list of evaluation metrics
MAIN_METRIC = "NDCG@5"          # main metric used during evaluation
```

--------------------------------

### Model and Loss Setup

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/03_Optimizing_AUPRC_Loss_on_Imbalanced_dataset.ipynb

Initializes the ResNet18 model, the APLoss function, and the SOAP optimizer.

```python
set_all_seeds(SEED)
model = ResNet18(pretrained=False, last_activation=None)
model = model.cuda()

Loss = APLoss(pos_len=sampler.pos_len, margin=margin, gamma=gamma)
optimizer = SOAP(model.parameters(), lr=lr, mode='adam', weight_decay=weight_decay)
```

--------------------------------

### Data Sampler Initialization

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Initializes the data sampler for training, specifying batch size and sampling parameters.

```python
train_sampler = DataSampler(labels=trainSet.targets,
                            batch_size=BATCH_SIZE*(NUM_POS+NUM_NEG),
                            num_pos=NUM_POS,
                            num_tasks=BATCH_SIZE)
```

--------------------------------

### Model and Loss Setup

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/11_Optimizing_pAUC_Loss_with_SOPA_on_Imbalanced_data.ipynb

Initializes the resnet18 model, the pAUC_CVaR_Loss function, and the SOPA optimizer.
```python
set_all_seeds(SEED)
model = resnet18(pretrained=False, num_classes=1, last_activation=None)
model = model.cuda()

loss_fn = pAUC_CVaR_Loss(pos_len=sampler.pos_len, beta=beta)
optimizer = SOPA(model, loss_fn=loss_fn, mode='adam', lr=lr, eta=eta, weight_decay=weight_decay)
```

--------------------------------

### NDCG@5 on dev set for SONG and K-SONG

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Recorded NDCG@5 scores on the development set for the SONG and K-SONG methods, followed by the warm-up-only run and SONG without warm-up.

```python
song_ndcg_at_5 = [0.2128, 0.2216, 0.2642, 0.2864, 0.3003, 0.3087, 0.3163, 0.3218, 0.3273, 0.3326, 0.3365, 0.3389, 0.3418, 0.345, 0.3472, 0.349, 0.3505, 0.3525, 0.354, 0.3549, 0.3343, 0.3405, 0.346, 0.3516, 0.3555, 0.3565, 0.3601, 0.3632, 0.3646, 0.3637, 0.3679, 0.368, 0.369, 0.3697, 0.3721, 0.3724, 0.3724, 0.3733, 0.3739, 0.3741, 0.3733, 0.3754, 0.3762, 0.3761, 0.3781, 0.3782, 0.3782, 0.3792, 0.38, 0.3783, 0.3782, 0.3791, 0.3797, 0.3824, 0.3807, 0.3803, 0.3813, 0.3802, 0.38, 0.3813, 0.383, 0.3811, 0.3821, 0.3823, 0.3829, 0.3819, 0.3813, 0.3844, 0.3838, 0.3821, 0.3829, 0.3809, 0.3806, 0.3832, 0.3822, 0.3839, 0.3853, 0.385, 0.3804, 0.3857, 0.3869, 0.3885, 0.3893, 0.3881, 0.3901, 0.3914, 0.3915, 0.3929, 0.3935, 0.3923, 0.393, 0.3936, 0.3934, 0.3941, 0.3936, 0.3944, 0.3941, 0.3948, 0.3943, 0.3949, 0.3952, 0.3951, 0.3964, 0.3954, 0.3959, 0.3965, 0.3957, 0.3961, 0.3966, 0.3963, 0.3967, 0.3968, 0.3978, 0.3974, 0.3976, 0.3974, 0.3974, 0.3979, 0.3985, 0.3969]
k_song_ndcg_at_5 = [0.2128, 0.2216, 0.2642, 0.2864, 0.3003, 0.3087, 0.3163, 0.3218, 0.3273, 0.3326, 0.3365, 0.3389, 0.3418, 0.345, 0.3472, 0.349, 0.3505, 0.3525, 0.354, 0.3549, 0.3286, 0.3372, 0.3435, 0.3487, 0.3534, 0.3555, 0.3594, 0.3622, 0.3639, 0.365, 0.3676, 0.369, 0.3703, 0.3716, 0.3732, 0.3737, 0.3747, 0.3756, 0.3765, 0.377, 0.3773, 0.379, 0.3785, 0.3802, 0.3818, 0.3817, 0.3824, 0.3838, 0.384, 0.3826, 0.3833, 0.3834, 0.3842, 0.386, 0.3857, 0.3855, 0.3856, 0.3857, 0.3854,
0.3863, 0.3869, 0.3861, 0.3871, 0.3874, 0.3888, 0.3868, 0.3879, 0.3891, 0.3888, 0.3885, 0.3887, 0.3882, 0.3876, 0.3896, 0.3882, 0.3897, 0.3899, 0.39, 0.3882, 0.3918, 0.3926, 0.3937, 0.3947, 0.3936, 0.3954, 0.3953, 0.3961, 0.3974, 0.3973, 0.397, 0.3975, 0.3979, 0.3979, 0.3985, 0.3983, 0.3985, 0.3987, 0.3989, 0.3988, 0.3992, 0.3998, 0.3994, 0.3998, 0.3997, 0.3997, 0.4, 0.3996, 0.4, 0.4004, 0.4005, 0.4006, 0.4004, 0.4008, 0.4009, 0.4009, 0.4009, 0.401, 0.4014, 0.4015, 0.401]
```

```python
warmup_ndcg_at_5 = [0.2125, 0.2236, 0.2632, 0.287, 0.3016, 0.3098, 0.3178, 0.3223, 0.3274, 0.3319, 0.336, 0.3386, 0.3418, 0.3441, 0.3463, 0.3479, 0.3498, 0.351, 0.3518, 0.3539, 0.3552, 0.357, 0.3567, 0.3582, 0.3593, 0.3601, 0.3607, 0.3605, 0.3612, 0.3614, 0.3613, 0.3632, 0.3629, 0.3634, 0.3646, 0.3653, 0.3652, 0.3648, 0.3655, 0.3668, 0.3649, 0.3673, 0.3664, 0.3665, 0.3672, 0.368, 0.3679, 0.3686, 0.368, 0.3685, 0.3688, 0.3686, 0.3684, 0.3686, 0.3696, 0.3684, 0.3702, 0.3691, 0.3684, 0.3697, 0.3684, 0.3699, 0.3697, 0.3691, 0.3686, 0.3702, 0.3681, 0.3691, 0.369, 0.3707, 0.3683, 0.3702, 0.3688, 0.3697, 0.3696, 0.3696, 0.3701, 0.3686, 0.3686, 0.3691, 0.37, 0.3698, 0.3698, 0.3709, 0.3709, 0.3716, 0.3718, 0.3714, 0.3724, 0.3729, 0.3727, 0.3726, 0.3723, 0.3727, 0.3726, 0.3722, 0.3725, 0.3713, 0.3719, 0.3718, 0.3723, 0.3716, 0.3714, 0.3719, 0.3715, 0.3717, 0.372, 0.3711, 0.3708, 0.3714, 0.3711, 0.3711, 0.3709, 0.3706, 0.37, 0.371, 0.3703, 0.3707, 0.37, 0.3708]
song_wo_warmup = [0.221, 0.2212, 0.2211, 0.2265, 0.2409, 0.2604, 0.2735, 0.2844, 0.2923, 0.3002, 0.3062, 0.3107, 0.3173, 0.3203, 0.3246, 0.3275, 0.3314, 0.334, 0.3363, 0.3398, 0.3424, 0.3426, 0.3438, 0.347, 0.3486, 0.3491, 0.3511, 0.3525, 0.3545, 0.3532, 0.3567, 0.3573, 0.3578, 0.3596, 0.3603, 0.3598, 0.3616, 0.3624, 0.3621, 0.3646, 0.365, 0.3649, 0.3656, 0.3673, 0.3668, 0.3694, 0.3697, 0.37, 0.3712, 0.3711, 0.371, 0.3724, 0.3728, 0.3737, 0.3743, 0.375, 0.374, 0.3759, 0.3763, 0.3777, 0.3781, 0.3781, 0.3796, 0.3794, 0.3805, 0.3798,
0.3805, 0.3815, 0.3818, 0.3823, 0.3829, 0.3834, 0.3829, 0.3832, 0.3837, 0.3837, 0.3844, 0.3845, 0.3847, 0.3854, 0.3855, 0.3856, 0.3861, 0.3858, 0.386, 0.3865, 0.387, 0.3871, 0.387, 0.3867, 0.387, 0.3874, 0.3875, 0.3881, 0.3875, 0.388, 0.3882, 0.388, 0.3883, 0.3888, 0.3887, 0.3885, 0.3891, 0.3891, 0.389, 0.3892, 0.3893, 0.3896, 0.3898, 0.3899, 0.39, 0.39, 0.3906, 0.3905, 0.3908, 0.3907, 0.391, 0.3906, 0.3911, 0.3908]
```

--------------------------------

### DataLoader Setup with DualSampler

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/03_Optimizing_AUPRC_Loss_on_Imbalanced_dataset.ipynb

Sets up PyTorch DataLoaders for training and testing, using DualSampler for oversampling the minority class.

```python
batch_size = 64
sampling_rate = 0.5

trainSet = ImageDataset(train_images, train_labels)
trainSet_eval = ImageDataset(train_images, train_labels, mode='test')
testSet = ImageDataset(test_images, test_labels, mode='test')

sampler = DualSampler(trainSet, batch_size, sampling_rate=sampling_rate)
trainloader = torch.utils.data.DataLoader(trainSet, batch_size=batch_size, sampler=sampler, num_workers=2)
trainloader_eval = torch.utils.data.DataLoader(trainSet_eval, batch_size=batch_size, shuffle=False, num_workers=2)
testloader = torch.utils.data.DataLoader(testSet, batch_size=batch_size, shuffle=False, num_workers=2)
```

--------------------------------

### SONG Optimizer Initialization

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Loads the pre-trained warm-up model, resets the last layer, defines the NDCG loss criterion with SONG-specific parameters, and sets up the SONG optimizer.
```python
model.load_model('./warm_up/pretrained_model.pkl')
model.reset_last_layer()

SONG_GAMMA0 = 0.1
criterion = NDCG_Loss(id_mapper, num_relevant_pairs, n_users, n_items, NUM_POS,
                      gamma0=SONG_GAMMA0, topk=TOPK, topk_version='theo')
optimizer = SONG(params=model.parameters(), lr=LR, weight_decay=L2, mode=OPTIMIZER_STYLE)
```

--------------------------------

### Load Datasets

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Loads the training, validation, and testing datasets using the MoiveLens class.

```python
trainSet = MoiveLens(root=DATA_PATH, phase='train')
valSet = MoiveLens(root=DATA_PATH, phase='dev')
testSet = MoiveLens(root=DATA_PATH, phase='test')
```

--------------------------------

### Training with SONG

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Trains the model for a specified number of epochs using the SONG optimizer.

```python
EPOCH = 100
train(model, trainSet, train_sampler, valSet, optimizer)
```

--------------------------------

### Evaluating the Model

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Evaluates the trained model on the test set and prints the results.

```python
result_dict = evaluate(model, testSet, TOPKS, METRICS)
print("test results:" + format_metric(result_dict))
```

--------------------------------

### Hyper-parameters for Training

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/03_Optimizing_AUPRC_Loss_on_Imbalanced_dataset.ipynb

Example hyper-parameters for training the model: learning rate, margin, gamma, weight decay, total epochs, learning-rate decay epochs, and random seed.
```python
lr = 1e-3
margin = 0.6
gamma = 0.1
weight_decay = 0
total_epoch = 60
decay_epoch = [30]
SEED = 2022
```

--------------------------------

### Dataloader and Model Initialization for Pretraining

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/05_Optimizing_AUROC_Loss_with_DenseNet121_on_CheXpert.ipynb

Sets up the dataloaders for the CheXpert dataset and initializes a DenseNet121 model for pretraining using Adam and CrossEntropyLoss.

```python
# dataloader
root = './CheXpert/CheXpert-v1.0-small/'
# Index: -1 denotes multi-label mode including 5 diseases
traindSet = CheXpert(csv_path=root+'train.csv', image_root_path=root, use_upsampling=False,
                     use_frontal=True, image_size=224, verbose=True, mode='train', class_index=-1)
testSet = CheXpert(csv_path=root+'valid.csv', image_root_path=root, use_upsampling=False,
                   use_frontal=True, image_size=224, verbose=True, mode='valid', class_index=-1)
trainloader = torch.utils.data.DataLoader(traindSet, batch_size=32, num_workers=2, shuffle=True)
testloader = torch.utils.data.DataLoader(testSet, batch_size=32, num_workers=2, shuffle=False)

# parameters
SEED = 123
BATCH_SIZE = 32
lr = 1e-4
weight_decay = 1e-5

# model
set_all_seeds(SEED)
model = DenseNet121(pretrained=True, last_activation=None, activations='relu', num_classes=5)
model = model.cuda()

# define loss & optimizer
CELoss = CrossEntropyLoss()
optimizer = Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
```

--------------------------------

### Model and Loss Setup

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/11_Optimizing_pAUC_Loss_with_SOPAs_on_Imbalanced_data.ipynb

Sets up the resnet18 model, initializes the pAUC_DRO_Loss function, and configures the SOPAs optimizer.
```python
seed = 123
set_all_seeds(seed)
model = resnet18(pretrained=False, num_classes=1, last_activation=None)
model = model.cuda()

loss_fn = pAUC_DRO_Loss(pos_len=sampler.pos_len, margin=margin, gamma=gamma, Lambda=Lambda)
optimizer = SOPAs(model.parameters(), loss_fn=loss_fn, mode='adam', lr=lr, weight_decay=weight_decay)
```

--------------------------------

### Download and Unzip CheXpert Dataset

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/05_Optimizing_AUROC_Loss_with_DenseNet121_on_CheXpert.ipynb

Copies the CheXpert dataset zip file to the content directory and unzips it.

```python
!cp /content/drive/MyDrive/chexpert-dataset/CheXpert-v1.0-small.zip /content/
!mkdir CheXpert
!unzip CheXpert-v1.0-small.zip -d /content/CheXpert/
```

--------------------------------

### Credit Fraud Dataset and DataLoader Setup

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/12_Optimizing_AUROC_Loss_on_Tabular_Data.ipynb

Defines a custom Dataset for credit fraud data and sets up DataLoaders for training, validation, and testing, including a DualSampler for stratified sampling.
```python
sampling_rate = 0.1  # e.g., this ensures 0.1*1024 = 102 positive samples in each mini-batch

class CreditFraudDataset(Dataset):
    def __init__(self, data, target, shuffle=False):
        list_id = np.arange(len(data))
        if shuffle:
            np.random.seed(123)
            np.random.shuffle(list_id)
        self.data = data.astype(np.float32)[list_id]       # numpy array
        self.targets = target.astype(np.float32)[list_id]  # numpy array

    def __getitem__(self, index):
        data = self.data[index]
        target = self.targets[index]
        return data, target

    def __len__(self):
        return self.data.shape[0]

trainDataset = CreditFraudDataset(train_features, train_labels, shuffle=True)
valDataset = CreditFraudDataset(val_features, val_labels)
testDataset = CreditFraudDataset(test_features, test_labels)

sampler = DualSampler(trainDataset, BATCH_SIZE, sampling_rate=sampling_rate)
trainloader = torch.utils.data.DataLoader(trainDataset, batch_size=BATCH_SIZE, sampler=sampler,
                                          shuffle=False, num_workers=1, pin_memory=True)
valloader = torch.utils.data.DataLoader(valDataset, batch_size=BATCH_SIZE,
                                        shuffle=False, num_workers=1, pin_memory=False)
testloader = torch.utils.data.DataLoader(testDataset, batch_size=BATCH_SIZE,
                                         shuffle=False, num_workers=1, pin_memory=False)
```

--------------------------------

### Example training pipeline for optimizing X-risk (e.g., AUROC)

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/README.md

Demonstrates a typical training pipeline using LibAUC's AUCMLoss and PESG optimizer for X-risk optimization. The `...` lines mark steps that the README leaves out.

```python
# import our loss and optimizer
import torch  # needed for torch.sigmoid below
from libauc.losses import AUCMLoss
from libauc.optimizers import PESG

# pretrain your model through supervised learning or self-supervised learning
# load a pretrained encoder and randomly initialize the last linear layer

# define loss & optimizer
Loss = AUCMLoss()
optimizer = PESG()  # arguments are omitted in the README
...
# training
model.train()
for data, targets in trainloader:
    data, targets = data.cuda(), targets.cuda()
    logits = model(data)
    preds = torch.sigmoid(logits)
    loss = Loss(preds, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
...
# update internal parameters
optimizer.update_regularizer()
```

--------------------------------

### Set All Seeds for Reproducibility

Source: https://github.com/optimization-ai/libauc/blob/1.4.0/examples/10_Optimizing_NDCG_Loss_on_MovieLens20M.ipynb

Seeds the Python, NumPy, and PyTorch random number generators and enables deterministic cuDNN to make results reproducible.

```python
def set_all_seeds(SEED):
    import random
    random.seed(SEED)
    np.random.seed(SEED)
    torch.manual_seed(SEED)
    torch.cuda.manual_seed(SEED)
    torch.backends.cudnn.deterministic = True
```
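
--------------------------------

### Checking That Seeding Is Reproducible (sketch)

A minimal, self-contained sketch of how the seeding helper above could be checked. The names `seed_everything` and `draw_sample` are hypothetical and not part of LibAUC. Treating torch as optional, calling `torch.cuda.manual_seed_all`, and setting `cudnn.benchmark = False` are assumptions added here; the notebook's helper does none of these.

```python
import random

import numpy as np

try:  # assumption: torch is optional so the sketch also runs without it
    import torch
except ImportError:
    torch = None


def seed_everything(seed: int) -> None:
    """Seed Python, NumPy and (if installed) PyTorch RNGs. Hypothetical helper."""
    random.seed(seed)
    np.random.seed(seed)
    if torch is not None:
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # assumption: seed every GPU, not only the current one
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False  # assumption: also disable cuDNN autotuning


def draw_sample() -> tuple:
    """Draw one value from each seeded generator."""
    values = [random.random(), float(np.random.rand())]
    if torch is not None:
        values.append(torch.rand(1).item())
    return tuple(values)


# Re-seeding with the same value should reproduce the same draws.
seed_everything(2022)
first = draw_sample()
seed_everything(2022)
second = draw_sample()
print("reproducible:", first == second)
```

Comparing draws after re-seeding is a quick sanity check only; it does not guarantee bit-identical training runs, since some GPU kernels remain nondeterministic.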