Untitled
user_3839718
python
a year ago
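import pandas as pd
from datasets import Dataset

# es_client and tokenize_function are assumed to be defined elsewhere.
# Shuffle the rows and keep only the columns needed downstream.
df = pd.DataFrame(es_client).sample(frac=1).reset_index(drop=True)
data = df[['title', 'ingredients', 'directions']].copy()  # copy avoids SettingWithCopyWarning

# Join each list of strings into a single ' | '-separated string.
data.loc[:, 'ingredients'] = data['ingredients'].apply(lambda x: ' | '.join(x))
data.loc[:, 'directions'] = data['directions'].apply(lambda x: ' | '.join(x))

# Keep rows whose joined directions are at most 256 characters long.
data = data[data['directions'].apply(lambda x: len(x) <= 256)]

# Convert to a Hugging Face Dataset and tokenize in batches of 1000 rows.
dataset = Dataset.from_pandas(data)
tokenized_datasets = dataset.map(tokenize_function, batched=True, batch_size=1000)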