
python3.9 finetune.py 
Dataset size: 64
Dataset is shuffled...
Dataset is splitted...
Fetching 4 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 85163.53it/s]
/home/ubuntu/.local/lib/python3.9/site-packages/transformers/convert_slow_tokenizer.py:562: UserWarning: The sentencepiece tokenizer that you are converting to a fast tokenizer uses the byte fallback option which is not implemented in the fast tokenizers. In practice this means that the fast version of the tokenizer can produce unknown tokens whereas the sentencepiece version would have converted these unknown tokens into a sequence of byte tokens matching the original piece of text.
  warnings.warn(
Compiling transformer encoder...
Compiling RNN...
Compiling span representation layer...
Compiling prompt representation layer...
/home/ubuntu/.local/lib/python3.9/site-packages/transformers/training_args.py:1494: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use `eval_strategy` instead
  warnings.warn(
  0%|          | 0/24 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ubuntu/gliner/finetune.py", line 69, in <module>
    trainer.train()
  File "/home/ubuntu/.local/lib/python3.9/site-packages/transformers/trainer.py", line 1932, in train
    return inner_training_loop(
  File "/home/ubuntu/.local/lib/python3.9/site-packages/transformers/trainer.py", line 2230, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "/home/ubuntu/.local/lib/python3.9/site-packages/accelerate/data_loader.py", line 454, in __iter__
    current_batch = next(dataloader_iter)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/_utils.py", line 705, in reraise
    raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/transformers/trainer_utils.py", line 812, in __call__
    return self.data_collator(features)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/gliner/data_processing/collator.py", line 29, in __call__
    raw_batch = self.data_processor.collate_raw_batch(input_x, entity_types = self.entity_types)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/gliner/data_processing/processor.py", line 172, in collate_raw_batch
    class_to_ids, id_to_classes = self.batch_generate_class_mappings(batch_list, negatives)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/gliner/data_processing/processor.py", line 148, in batch_generate_class_mappings
    negatives = self.get_negatives(batch_list, 100)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/gliner/data_processing/processor.py", line 74, in get_negatives
    types = set([el[-1] for el in b['ner']])
KeyError: 'ner'

  0%|          | 0/24 [00:00<?, ?it/s]
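
The traceback ends in `get_negatives`, which reads `b['ner']` for every item in the batch, so at least one record reaching the collator has no `ner` key. A minimal sketch for locating such records before calling `trainer.train()` is below. The helper name `find_records_missing_key` and the sample data are hypothetical, and the `tokenized_text` field is an assumption about the record layout; only the `ner` key is confirmed by the log.

```python
from typing import Any, Dict, Iterable, List


def find_records_missing_key(
    records: Iterable[Dict[str, Any]], key: str = "ner"
) -> List[int]:
    """Return the positions of records that do not contain `key`."""
    return [i for i, record in enumerate(records) if key not in record]


# Hypothetical sample data; the dataset loaded by finetune.py is not shown in the log.
sample = [
    {"tokenized_text": ["Paris", "is", "nice"], "ner": [[0, 0, "city"]]},
    {"tokenized_text": ["No", "labels", "here"]},  # missing "ner"
    {"tokenized_text": ["Nothing", "tagged"], "ner": []},  # empty list still has the key
]

missing = find_records_missing_key(sample)
print(missing)
```

Running the same check over the train and eval splits separately would show whether the key is absent everywhere (for example, a differently named field) or only in a few records.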