python3.9 finetune.py
Dataset size: 64
Dataset is shuffled...
Dataset is splitted...
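The log above reports a 64-example dataset being shuffled and split before training. The script's actual logic is not shown; a minimal sketch of that step, where the seed and the 90/10 split ratio are assumptions:

```python
import random

dataset = [{"id": i} for i in range(64)]  # stand-in for the 64 loaded examples

random.seed(42)          # assumed seed, for a reproducible shuffle
random.shuffle(dataset)  # "Dataset is shuffled..."

split = int(len(dataset) * 0.9)  # assumed 90/10 train/eval split
train_data, eval_data = dataset[:split], dataset[split:]
print(len(train_data), len(eval_data))  # -> 57 7
```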
Fetching 4 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 85163.53it/s]
/home/ubuntu/.local/lib/python3.9/site-packages/transformers/convert_slow_tokenizer.py:562: UserWarning: The sentencepiece tokenizer that you are converting to a fast tokenizer uses the byte fallback option which is not implemented in the fast tokenizers. In practice this means that the fast version of the tokenizer can produce unknown tokens whereas the sentencepiece version would have converted these unknown tokens into a sequence of byte tokens matching the original piece of text.
  warnings.warn(
Compiling transformer encoder...
Compiling RNN...
Compiling span representation layer...
Compiling prompt representation layer...
/home/ubuntu/.local/lib/python3.9/site-packages/transformers/training_args.py:1494: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use `eval_strategy` instead
  warnings.warn(
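The FutureWarning above is harmless but easy to silence: since Transformers 4.41 the `TrainingArguments` parameter `evaluation_strategy` is named `eval_strategy`, and the old name will be removed in 4.46. A minimal sketch of the rename, done on a kwargs dict before constructing `TrainingArguments` (the `output_dir` value here is an assumption, not taken from the script):

```python
# Kwargs as the script might build them, using the deprecated name.
training_kwargs = {
    "output_dir": "models/finetuned",   # assumed path
    "evaluation_strategy": "epoch",     # deprecated parameter name
}

# Forward-compatible rename, as the warning suggests.
if "evaluation_strategy" in training_kwargs:
    training_kwargs["eval_strategy"] = training_kwargs.pop("evaluation_strategy")

# TrainingArguments(**training_kwargs) would now use the new name
# and emit no FutureWarning.
```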
  0%|          | 0/24 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ubuntu/gliner/finetune.py", line 69, in <module>
    trainer.train()
  File "/home/ubuntu/.local/lib/python3.9/site-packages/transformers/trainer.py", line 1932, in train
    return inner_training_loop(
  File "/home/ubuntu/.local/lib/python3.9/site-packages/transformers/trainer.py", line 2230, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "/home/ubuntu/.local/lib/python3.9/site-packages/accelerate/data_loader.py", line 454, in __iter__
    current_batch = next(dataloader_iter)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/_utils.py", line 705, in reraise
    raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/home/ubuntu/.local/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/transformers/trainer_utils.py", line 812, in __call__
    return self.data_collator(features)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/gliner/data_processing/collator.py", line 29, in __call__
    raw_batch = self.data_processor.collate_raw_batch(input_x, entity_types = self.entity_types)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/gliner/data_processing/processor.py", line 172, in collate_raw_batch
    class_to_ids, id_to_classes = self.batch_generate_class_mappings(batch_list, negatives)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/gliner/data_processing/processor.py", line 148, in batch_generate_class_mappings
    negatives = self.get_negatives(batch_list, 100)
  File "/home/ubuntu/.local/lib/python3.9/site-packages/gliner/data_processing/processor.py", line 74, in get_negatives
    types = set([el[-1] for el in b['ner']])
KeyError: 'ner'
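The root cause is the last frame: GLiNER's `get_negatives` reads `b['ner']` from every example in the batch, so each training example must be a dict that carries a `'ner'` key alongside `'tokenized_text'`, where `'ner'` is a list of `[start_token, end_token, label]` spans. At least one example in this 64-item dataset lacks that key, and the crash only surfaces once the DataLoader worker collates the offending batch. A pre-flight check can find such examples before training starts; the sample data below is invented for illustration:

```python
# Each GLiNER training example must have 'tokenized_text' and 'ner'.
# 'ner' is a list of [start_token, end_token, label] spans over the tokens.
def find_bad_examples(dataset):
    """Return indices of examples that would trigger KeyError in the collator."""
    bad = []
    for i, example in enumerate(dataset):
        if "ner" not in example or "tokenized_text" not in example:
            bad.append(i)
    return bad

dataset = [
    {"tokenized_text": ["Alice", "works", "at", "Acme"],
     "ner": [[0, 0, "person"], [3, 3, "organization"]]},
    {"tokenized_text": ["no", "labels", "here"]},  # missing 'ner' -> crash
]
print(find_bad_examples(dataset))  # -> [1]
```

An example with no entities should still carry an empty list (`"ner": []`) rather than omit the key, so the collator can iterate over it when building negatives.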
  0%|          | 0/24 [00:00<?, ?it/s]