Untitled
user_3839718
python
2 years ago
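import pandas as pd
from datasets import Dataset

# Shuffle the rows and reset the index (es_client is assumed to hold the recipe records).
df = pd.DataFrame(es_client).sample(frac=1).reset_index(drop=True)
# Copy the selected columns so the assignments below do not write to a view of df.
data = df[['title', 'ingredients', 'directions']].copy()
# Join each list of strings into a single ' | '-separated string.
data['ingredients'] = data['ingredients'].apply(' | '.join)
data['directions'] = data['directions'].apply(' | '.join)
# Keep only rows whose joined directions are at most 256 characters (not tokens) long.
data = data[data['directions'].str.len() <= 256]
dataset = Dataset.from_pandas(data)
tokenized_datasets = dataset.map(tokenize_function, batched=True, batch_size=1000)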
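
The snippet calls `tokenize_function` without defining it. With `batched=True`, `Dataset.map` passes the function a dict mapping each column name to a list of values (one entry per row) and expects a dict of lists of the same length back. Below is a minimal sketch of that contract. The function body, the whitespace split used in place of a real tokenizer, and the `tokens` / `num_tokens` column names are all hypothetical, not the author's code.

```python
def tokenize_function(batch):
    """Hypothetical stand-in: split each recipe's text on whitespace.

    `batch` maps column names to lists of values, one entry per row,
    which is the shape Dataset.map passes when batched=True.
    """
    texts = [
        f"{title} | {ingredients} | {directions}"
        for title, ingredients, directions in zip(
            batch["title"], batch["ingredients"], batch["directions"]
        )
    ]
    tokens = [text.split() for text in texts]
    # Return new columns as lists with one entry per input row.
    return {"tokens": tokens, "num_tokens": [len(t) for t in tokens]}


# A plain dict of lists mimics the batch that Dataset.map would pass in.
example_batch = {
    "title": ["Toast"],
    "ingredients": ["bread | butter"],
    "directions": ["Toast the bread. | Spread the butter."],
}
result = tokenize_function(example_batch)
```

In a real pipeline the body would more likely call a pretrained tokenizer on the batch. The whitespace split here only keeps the sketch runnable without downloading a model.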