This is the code block from our project, and we are supposed to serve an API through it:

import pandas as pd
from nltk.corpus import stopwords
import string
import pickle
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from string import digits
from sklearn.feature_extraction.text import CountVectorizer
import numpy as np
import pdb
import re

#from fuzzywuzzy import fuzz

from feature_engineering import *
from param_config import config
from model_loading import loading_model

import time
start_time = time.time()
model_loading_start = time.time()

models = loading_model()

(tfidf_matrix, tf_count_matrix, tfidf_vector, count_vector,
 df_act, Matching_data, embedding_dict, df_act_context) = models.load_models()

model_loaded_time = time.time()
print(f"Models loaded in {model_loaded_time - model_loading_start} seconds.. ")

So, if you look carefully, the model loading step is there, and it tries to load the files below:

-rw-r--r--. 1 root root   7134602 Nov  9 19:59 df_act_context.joblib
-rw-r--r--. 1 root root 104156726 Nov  9 19:59 glove_vector_dict.joblib
-rw-r--r--. 1 root root   7134602 Nov  9 19:59 raw_data.joblib
-rw-r--r--. 1 root root 475432850 Nov  9 19:59 tf_count.joblib
-rw-r--r--. 1 root root    201542 Nov  9 19:59 tf_countvector.joblib
-rw-r--r--. 1 root root 475432850 Nov  9 19:59 tfidf.joblib
-rw-r--r--. 1 root root    348059 Nov  9 19:59 tfidf_vector.joblib
-rw-r--r--. 1 root root      1899 Nov  9 19:53 unique_train_data_matrix.joblib

These models take more than 5 minutes to load, and we can't afford to spend 5 minutes on every API call. How can we avoid this?

Is there a better solution? Can you please help?
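
One direction that might help, sketched here under assumptions rather than as a definitive fix: if the API runs as a long-lived server process (not a script started fresh for every call), the models can be loaded once per process and reused by every request. In the sketch below, `slow_load_models` is a hypothetical stand-in for `loading_model().load_models()`, and `handle_request` is a hypothetical request handler; neither name comes from the code above.

```python
import time
from functools import lru_cache


def slow_load_models():
    """Hypothetical stand-in for loading_model().load_models()."""
    time.sleep(0.05)  # simulates the expensive joblib reads
    return {"tfidf_matrix": "...", "tfidf_vector": "..."}


@lru_cache(maxsize=1)
def get_models():
    """Load the models on first use, then return the same cached objects."""
    return slow_load_models()


def handle_request(query):
    """Hypothetical handler: reuses the models already held in memory."""
    models = get_models()  # only the first call pays the loading cost
    return {"query": query, "n_models": len(models)}
```

Calling `get_models()` once at server startup would move the loading cost out of the first request as well. This only helps if the process stays alive between calls. If each call launches a new Python process, the load is repeated regardless of caching.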