shahad

Reputation: 9

How to solve "TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'"

I tried to run this code:

EN_AR = load_dataset("iwslt2017", "iwslt2017-ar-en", split="train").select(range(2000))

def extract_languages(examples):
  inputs = [ex["ar"] for ex in examples['translation']]
  target = [ex["en"] for ex in examples['translation']]
  return {"inputs":inputs,"targets":target}

EN_AR = EN_AR.map(extract_languages,batched=True, remove_columns=["translation"])

from transformers import AutoTokenizer, MBart50TokenizerFast

model_name = "facebook/mbart-large-50"
tokenizer = AutoTokenizer.from_pretrained(model_name)
maxL = 128
def preprocess_func(examples):
  model_inputs = tokenizer(examples["inputs"],max_length=maxL,truncation=True)

  with tokenizer.as_target_tokenizer():
    labels = tokenizer(examples["targets"],max_length=maxL,truncation=True)

  model_inputs["labels"]= labels["input_ids"]
  return model_name

tokenized_datasets = EN_AR.map(preprocess_func, batched = True, remove_columns=["inputs","targets"])

but it keeps raising this error:


TypeError                                 Traceback (most recent call last)
in <cell line: 15>()
     13   return model_name
     14
---> 15 tokenized_datasets = EN_AR.map(preprocess_func, batched = True, remove_columns=["inputs","targets"])

10 frames
/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_fast.py in convert_ids_to_tokens(self, ids, skip_special_tokens)
    387         tokens = []
    388         for index in ids:
--> 389             index = int(index)
    390             if skip_special_tokens and index in self.all_special_ids:
    391                 continue

TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

I took this code from the Hugging Face website for translation preprocessing, but I don't know why it does not work for me.

Upvotes: 0

Views: 609

Answers (1)

Charles Merriam

Reputation: 20500

Welcome.

The question suggests you are new to Python, so forgive me if I cover some basics.

Each variable and expression in Python has a type. For example, you can run type(5) and get <class 'int'>. Functions often return the value None, of type <class 'NoneType'>, when they have no result to give.

The int() function converts a value to an integer. For example, int(3), int(3.5), and int("3") are all 3. None cannot be converted to an integer, so int(None) raises the TypeError you are seeing.
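A minimal sketch of the behaviour described above, using only built-ins. The variable names `converted` and `message` are just for illustration:

```python
# Every value has a type; None has its own type, NoneType.
print(type(5))
print(type(None))

# int() accepts integers, floats (truncated toward zero), and numeric strings.
converted = [int(3), int(3.5), int("3")]
print(converted)

# int() does not accept None; this reproduces the kind of error in the question.
try:
    int(None)
except TypeError as err:
    message = str(err)
    print(message)
```

The exact wording of the message varies slightly between Python versions, but it always names 'NoneType' as the offending type.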

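Unfortunately, the call to int() is not in your own code but deeper in the call stack: the traceback shows it inside convert_ids_to_tokens in tokenization_utils_fast.py, so one of the ids the tokenizer is handling is None. I suggest you use print statements or a debugger to look at the dataset you are loading and at the tokenizer setup, specifically for a place where an integer token id is expected and is missing. Separately, note that preprocess_func returns model_name, where it presumably should return model_inputs.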
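The kind of check suggested above can be sketched as follows. This is a stand-in that does not call the transformers library: `find_none_positions`, `to_int_ids`, and `sample_ids` are hypothetical names, and the id values are made up for illustration.

```python
def find_none_positions(ids):
    """Return the positions in `ids` that hold None instead of a value."""
    return [pos for pos, value in enumerate(ids) if value is None]


def to_int_ids(ids):
    """Convert ids to int, like the loop in the traceback, but report
    where a None is hiding instead of failing with a bare TypeError."""
    missing = find_none_positions(ids)
    if missing:
        raise ValueError(f"ids contain None at positions {missing}")
    return [int(value) for value in ids]


# Hypothetical ids: the middle entry stands for a lookup that found nothing.
sample_ids = [250001, None, 2]

print(find_none_positions(sample_ids))

try:
    to_int_ids(sample_ids)
except ValueError as err:
    print(err)
```

One candidate worth checking, offered as an assumption and not a confirmed diagnosis: the mBART-50 tokenizer in the question is loaded without src_lang and tgt_lang. If the target language is unset when the tokenizer switches to target mode, the language-code id it looks up may come back as None. Passing, for example, src_lang="ar_AR" and tgt_lang="en_XX" to from_pretrained would be a quick experiment.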

Add a comment about how it worked!

Upvotes: -1
