Ishan Dutta

Reputation: 957

Python: tqdm progress bar stuck at 0%

I have written the following code to train a BERT model on my dataset. I use `from tqdm.notebook import tqdm` as the import for tqdm and wrap my loops with it, but when I run the program the bar stays at 0% even after the entire code has finished. How do I fix this?

Code

Model

TRANSFORMERS = {
    "bert-multi-cased": (BertModel, BertTokenizer, "bert-base-uncased"),
}

class Transformer(nn.Module):
    def __init__(self, model, num_classes=1):
        """
        Constructor

        Arguments:
            model {string} -- Transformer to build the model on. Expects "camembert-base".
            num_classes {int} -- Number of classes (default: {1})
        """
        super().__init__()
        self.name = model

        model_class, tokenizer_class, pretrained_weights = TRANSFORMERS[model]

        bert_config = BertConfig.from_json_file(MODEL_PATHS[model] + 'bert_config.json')
        bert_config.output_hidden_states = True

        self.transformer = BertModel(bert_config)

        self.nb_features = self.transformer.pooler.dense.out_features

        self.pooler = nn.Sequential(
            nn.Linear(self.nb_features, self.nb_features),
            nn.Tanh(),
        )

        self.logit = nn.Linear(self.nb_features, num_classes)

    def forward(self, tokens):
        """
        Usual torch forward function

        Arguments:
            tokens {torch tensor} -- Sentence tokens

        Returns:
            torch tensor -- Class logits
        """
        _, _, hidden_states = self.transformer(
            tokens, attention_mask=(tokens > 0).long()
        )

        # Use the representation of the first token of the last layer
        hidden_states = hidden_states[-1][:, 0]

        ft = self.pooler(hidden_states)

        return self.logit(ft)

Training

def fit(model, train_dataset, val_dataset, epochs=1, batch_size=8, warmup_prop=0, lr=5e-4):
    
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

    optimizer = AdamW(model.parameters(), lr=lr)
    
    num_warmup_steps = int(warmup_prop * epochs * len(train_loader))
    num_training_steps = epochs * len(train_loader)
    
    scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps)

    loss_fct = nn.BCEWithLogitsLoss(reduction='mean').cuda()
    
    for epoch in range(epochs):
        model.train()
        start_time = time.time()
        
        optimizer.zero_grad()
        avg_loss = 0
        
        for step, (x, y_batch) in tqdm(enumerate(train_loader), total=len(train_loader)):
            
            y_pred = model(x.to(device))
            
            loss = loss_fct(y_pred.view(-1).float(), y_batch.float().to(device))
            loss.backward()
            avg_loss += loss.item() / len(train_loader)

            xm.optimizer_step(optimizer, barrier=True)
            #optimizer.step()
            scheduler.step()
            model.zero_grad()
            optimizer.zero_grad()
                
        model.eval()
        preds = []
        truths = []
        avg_val_loss = 0.

        with torch.no_grad():
            for x, y_batch in tqdm(val_loader):                
                y_pred = model(x.to(device))
                loss = loss_fct(y_pred.detach().view(-1).float(), y_batch.float().to(device))
                avg_val_loss += loss.item() / len(val_loader)
                
                probs = torch.sigmoid(y_pred).detach().cpu().numpy()
                preds += list(probs.flatten())
                truths += list(y_batch.numpy().flatten())
            score = roc_auc_score(truths, preds)
            
        
        dt = time.time() - start_time
        lr = scheduler.get_last_lr()[0]
        print(f'Epoch {epoch + 1}/{epochs} \t lr={lr:.1e} \t t={dt:.0f}s \t loss={avg_loss:.4f} \t val_loss={avg_val_loss:.4f} \t val_auc={score:.4f}')

model = Transformer("bert-multi-cased")
device = torch.device('cuda:2')
model = model.to(device)

epochs = 3
batch_size = 32
warmup_prop = 0.1
lr = 1e-4

train_dataset = JigsawDataset(df_train)

val_dataset = JigsawDataset(df_val)
test_dataset = JigsawDataset(df_test)
fit(model, train_dataset, val_dataset, epochs=epochs, batch_size=batch_size, warmup_prop=warmup_prop, lr=lr)

Output

0%| | 0/6986 [00:00<?, ?it/s]

How do I fix this?

Upvotes: 3

Views: 9768

Answers (2)

robertspierre

Reputation: 4380

Contrary to Ishan Dutta's answer, tqdm.notebook.tqdm (not tqdm.tqdm) is the correct function to use in both Jupyter Notebook and JupyterLab.

This problem can happen if you don't have ipywidgets installed, or if you installed ipywidgets before installing JupyterLab.

What fixed it for me was reinstalling ipywidgets:

pip3 uninstall ipywidgets --yes
pip3 install --upgrade ipywidgets
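If you'd rather not worry about which frontend you're in, a minimal sketch (assuming tqdm is installed) is to import from tqdm.auto, which selects the notebook widget when ipywidgets is available and falls back to the plain text bar otherwise:

```python
# tqdm.auto picks the ipywidgets-based bar in Jupyter when available,
# and the terminal-style text bar everywhere else.
from tqdm.auto import tqdm

total = 0
for i in tqdm(range(100), desc="demo"):
    total += i
print(total)  # 4950
```

This way the same training script renders a working bar both in a notebook and when run from the command line.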

Upvotes: 3

Ishan Dutta

Reputation: 957

The import should be:

from tqdm import tqdm

The error is in the training function; correct this loop:

for x, y_batch in tqdm(val_loader, total = len(val_loader)): 
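A related pitfall in the training loop above: wrapping `enumerate(train_loader)` hides the loader's length from tqdm, so `total=` must be passed explicitly or the bar cannot compute a percentage. A minimal sketch with a plain list standing in for the DataLoader:

```python
from tqdm import tqdm

data = list(range(50))  # stand-in for a DataLoader

# enumerate() returns a generator with no __len__, so tqdm needs
# total= explicitly to render a percentage instead of "?".
acc = 0
for step, x in tqdm(enumerate(data), total=len(data)):
    acc += x
print(acc)  # 1225
```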

Upvotes: 3
