import pandas as pd
from pymystem3 import Mystem
import datetime

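# Load the dataset from the working directory, falling back to the /datasets path.
try:
    data = pd.read_csv('toxic_comments.csv')
except FileNotFoundError:
    data = pd.read_csv('/datasets/toxic_comments.csv')
data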

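m = Mystem()

# Convert the text column to a NumPy array of Unicode strings.
corpus = data['text'].values.astype('U')

def lemmatize(text):
    """Lemmatize text with Mystem and join the returned tokens into one string."""
    return "".join(m.lemmatize(text))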

kol = 10
# Note: every iteration lemmatizes corpus[0], so the same text is printed
# each time and only the timestamps differ.
for a, _ in enumerate(corpus[:kol], start=1):
    print()
    print(f"{datetime.datetime.now().strftime('%H:%M:%S.%f')[:-3]} - Lemmatized text {a}/{kol}: ")
    print(lemmatize(corpus[0]))

# Output
# 12:30:26.760 - Lemmatized text 1/10:
# Explanation
# Why the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27
# 12:30:28.366 - Lemmatized text 2/10:
# Explanation
# Why the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27
# 12:30:30.084 - Lemmatized text 3/10:
# Explanation
# Why the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27
# 12:30:31.672 - Lemmatized text 4/10:
# Explanation
# Why the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27
# 12:30:33.349 - Lemmatized text 5/10:
# Explanation
# Why the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27
# 12:30:34.980 - Lemmatized text 6/10:
# Explanation
# Why the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27
# 12:30:36.634 - Lemmatized text 7/10:
# Explanation
# Why the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27
# 12:30:38.208 - Lemmatized text 8/10:
# Explanation
# Why the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27
# 12:30:39.882 - Lemmatized text 9/10:
# Explanation
# Why the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27
# 12:30:41.584 - Lemmatized text 10/10:
# Explanation
# Why the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27
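
# The timestamps in the output above are roughly 1.6 s apart, one gap per
# lemmatize call. One possible way to make fewer calls is to join several
# texts with a separator token, lemmatize the joined string once, and split
# the result again. The block below is only a sketch under assumptions:
# lemmatize_batch, the separator "qsepq" and fake_lemmatize are hypothetical
# names introduced for illustration. Whether Mystem returns the separator as
# a single unchanged token has not been verified here, so check that before
# relying on it.

```python
import re


def lemmatize_batch(texts, lemmatize_fn, sep="qsepq"):
    """Lemmatize several texts with a single call to lemmatize_fn.

    lemmatize_fn takes one string and returns a list of string tokens.
    sep is a hypothetical marker; it must not occur in the texts and must
    come back from lemmatize_fn as one unchanged token.
    """
    joined = f" {sep} ".join(texts)
    tokens = lemmatize_fn(joined)

    results, current = [], []
    for token in tokens:
        if token == sep:
            # The separator closes the current text.
            results.append("".join(current).strip())
            current = []
        else:
            current.append(token)
    results.append("".join(current).strip())
    return results


def fake_lemmatize(text):
    """Stand-in tokenizer used only for this demonstration (not Mystem)."""
    return re.findall(r"\w+|\W+", text.lower())


batch = lemmatize_batch(["Foo Bar.", "BAZ qux"], fake_lemmatize)
print(batch)
```

# With the real lemmatizer the call would look like
# lemmatize_batch(list(corpus[:kol]), m.lemmatize), assuming the separator
# survives lemmatization as described above.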


# The same loop without the lemmatize call, for comparison.
kol = 10
for a, _ in enumerate(corpus[:kol], start=1):
    print()
    print(f"{datetime.datetime.now().strftime('%H:%M:%S.%f')[:-3]} - Lemmatized text {a}/{kol}: ")

# Output
# 12:31:34.197 - Lemmatized text 1/10:
# 12:31:34.197 - Lemmatized text 2/10:
# 12:31:34.197 - Lemmatized text 3/10:
# 12:31:34.197 - Lemmatized text 4/10:
# 12:31:34.197 - Lemmatized text 5/10:
# 12:31:34.197 - Lemmatized text 6/10:
# 12:31:34.197 - Lemmatized text 7/10:
# 12:31:34.197 - Lemmatized text 8/10:
# 12:31:34.197 - Lemmatized text 9/10:
# 12:31:34.197 - Lemmatized text 10/10:
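
# The two runs above differ only in the lemmatize call, so the gap between
# timestamps in the first run is the cost of that call. Instead of comparing
# wall-clock timestamps by eye, each call can be timed directly. This is a
# minimal sketch, not the original author's method: time_calls and
# fake_lemmatize are hypothetical names, and fake_lemmatize is a cheap
# stand-in so the block runs without Mystem installed.

```python
import time


def time_calls(fn, items):
    """Call fn on each item and return (results, durations in seconds)."""
    results, durations = [], []
    for item in items:
        start = time.perf_counter()
        results.append(fn(item))
        durations.append(time.perf_counter() - start)
    return results, durations


def fake_lemmatize(text):
    """Stand-in for the real lemmatize function (assumption for the demo)."""
    return text.lower()


results, durations = time_calls(fake_lemmatize, ["Foo Bar", "BAZ"])
for n, seconds in enumerate(durations, start=1):
    print(f"call {n}/{len(durations)} took {seconds * 1000:.3f} ms")
```

# To time the real function, pass it in place of the stand-in, for example
# time_calls(lemmatize, corpus[:kol]).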