RROBINSON

Reputation: 191

HuggingFace SciBERT AutoModelForMaskedLM cannot be imported

I am trying to use the pretrained SciBERT model (https://huggingface.co/allenai/scibert_scivocab_uncased) from Huggingface to evaluate masked words in scientific/biomedical text for bias using CrowS-Pairs (https://github.com/nyu-mll/crows-pairs/). The CrowS-Pairs code works great with the built-in models like BERT.

I modified the code of metric.py with the goal of adding an option to use the SciBERT model:

import os
import csv
import json
import math
import torch
import argparse
import difflib
import logging
import numpy as np
import pandas as pd

from transformers import BertTokenizer, BertForMaskedLM
from transformers import AlbertTokenizer, AlbertForMaskedLM
from transformers import RobertaTokenizer, RobertaForMaskedLM
from transformers import AutoTokenizer, AutoModelForMaskedLM

and I get the following error:

2021-06-21 17:11:38.626413: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
  File "metric.py", line 15, in <module>
    from transformers import AutoTokenizer, AutoModelForMaskedLM
ImportError: cannot import name 'AutoModelForMaskedLM' from 'transformers' (/usr/local/lib/python3.7/dist-packages/transformers/__init__.py)

Later in the Python file, the AutoTokenizer and AutoModelForMaskedLM are instantiated as

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModelForMaskedLM.from_pretrained("allenai/scibert_scivocab_uncased") 

Installed libraries:

huggingface-hub-0.0.8
sacremoses-0.0.45
tokenizers-0.10.3
transformers-4.7.0 

The error occurs with and without GPU support.
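For reference, AutoModelForMaskedLM is expected to be available in transformers 4.7.0, so a minimal sanity check (assuming the script runs in the same environment where these packages are listed) is to confirm which transformers build the interpreter actually imports:

import transformers

# Print the version and install path of the transformers package that this
# interpreter picks up, and whether AutoModelForMaskedLM is exposed at the
# top level of the package.
print(transformers.__version__)   # expected: 4.7.0
print(transformers.__file__)      # path of the active installation
print(hasattr(transformers, "AutoModelForMaskedLM"))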

Upvotes: 0

Views: 1286

Answers (1)

mah65

Reputation: 588

Try this:

tokenizer = BertTokenizer.from_pretrained("allenai/scibert_scivocab_uncased", do_lower_case=True)

model = BertForMaskedLM.from_pretrained("allenai/scibert_scivocab_uncased")
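A minimal sketch of how these two objects could then be used for masked-token prediction (the example sentence below is only illustrative and is not taken from CrowS-Pairs):

import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("allenai/scibert_scivocab_uncased", do_lower_case=True)
model = BertForMaskedLM.from_pretrained("allenai/scibert_scivocab_uncased")
model.eval()

# Build a sentence containing the tokenizer's mask token ("[MASK]" for BERT-style tokenizers).
text = f"The patient was treated with {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and print the top predicted token for it.
mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(top_id))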

Upvotes: 1
