I added a few print statements to the code below:

import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

def input_evalution(input_processed_text, df_train_mtrx, tfidf_vector, df_act):
    print("Into Input Evaluation function")
    text = input_processed_text
    print("Text : ", text)
    print("TFIDF Vector : ", tfidf_vector)
    print("DF Train Matrix : ", df_train_mtrx)

    # Vectorize the input text with the already-fitted TF-IDF vectorizer
    input_tfidf = tfidf_vector.transform([text])
    print(input_tfidf)
    # get_feature_names() was removed in scikit-learn 1.2;
    # on scikit-learn < 1.0 use get_feature_names() instead
    df_tst = pd.DataFrame(input_tfidf.toarray(),
                          columns=tfidf_vector.get_feature_names_out(),
                          index=['test123'])
    print("Df Test Input Evaluation : ", df_tst)
    ## Appending df_tst to df_train (DataFrame.append was removed in pandas 2.0)
    df_train_mtrx = pd.concat([df_train_mtrx, df_tst])
    print("DF Train Matrix after appending : ", df_train_mtrx)
    ## Calculating Cosine Similarity
    scr = cosine_similarity(df_train_mtrx, df_tst)
    print("Cosine Similarity : ", scr)
    df_chk = pd.DataFrame()
    df_chk['ticket_id'] = df_train_mtrx.index
    df_chk['score'] = scr.ravel()  # flatten the (n, 1) result into one column
    # Keep training rows scoring above 0.50, excluding the appended test row
    score = df_chk[(df_chk['score'] > 0.50) & (df_chk['ticket_id'] != 'test123')]['score'].tolist()
    # .copy() avoids assigning the new column to a view of df_act
    df_eval = df_act[df_act['ticket_id'].isin(df_chk[df_chk['score'] > 0.50]['ticket_id'])].copy()
    df_eval['score'] = score

    return df_eval, df_tst
 
and I get the following output:

Text :  process hro - payroll benefits payments incorrect result dear sir, per attached screen shots  - default
TFIDF Vector :  TfidfVectorizer()
DF Train Matrix :        aarthis  ajinkya  akashsharda  akashshukla  akshaykulkarni  aniketpawar  \
0         0.0      0.0          0.0          0.0             0.0          0.0   
1         0.0      0.0          0.0          0.0             0.0          0.0   
2         0.0      0.0          0.0          0.0             0.0          0.0   
3         0.0      0.0          0.0          0.0             0.0          0.0   
4         0.0      0.0          0.0          0.0             0.0          0.0   
...       ...      ...          ...          ...             ...          ...   
2667      0.0      0.0          0.0          0.0             0.0          0.0   
2668      0.0      0.0          0.0          0.0             0.0          0.0   
2669      0.0      0.0          0.0          0.0             0.0          0.0   
2670      0.0      0.0          0.0          0.0             0.0          0.0   
2671      0.0      0.0          0.0          0.0             0.0          0.0   
...
[2672 rows x 93 columns]

User Recommendations: []

When we run

    input_tfidf=tfidf_vector.transform([text])
    print(input_tfidf)

nothing is printed. The TF-IDF vectorizer was fitted on the target column, so its vocabulary consists of names such as aarthis, ajinkya, akashsharda, akashshukla, akshaykulkarni, aniketpawar, ...

The input text, on the other hand, is: process hro - payroll benefits payments incorrect result dear sir, per attached screen shots  - default
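
To check whether the vocabulary mismatch alone can produce an empty transform, here is a minimal sketch. It assumes a toy list of names standing in for the real target column (the `names` list and the shortened `text` are made up for illustration, not my actual data) and counts how many tokens of the text are in the fitted vocabulary:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for the target column: each document is a single name.
names = ["aarthis", "ajinkya", "akashsharda", "akashshukla"]
vectorizer = TfidfVectorizer().fit(names)

text = "process hro - payroll benefits payments incorrect result"
query = vectorizer.transform([text])

# Tokens of the text that the fitted vocabulary actually contains.
analyzer = vectorizer.build_analyzer()
known = [tok for tok in analyzer(text) if tok in vectorizer.vocabulary_]

print("shape:", query.shape)
print("non-zero entries:", query.nnz)
print("known tokens:", known)
```

In this toy setup none of the tokens are in the vocabulary, so the transformed row has no stored non-zero entries, which would match the empty print I am seeing.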

So is the empty result caused by this mismatch, i.e. the vectorizer cannot transform the text because none of its words are in the vocabulary it was fitted on?
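
If the query row really is all zeros, that would also account for the empty recommendations list: the cosine similarity of any row with a zero vector comes out as 0 in scikit-learn, so nothing passes the 0.50 threshold. A small sketch with made-up numbers (the `train` matrix below is purely illustrative, not my real TF-IDF matrix):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Made-up training rows (3 tickets x 4 vocabulary terms), for illustration only.
train = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.6, 0.8, 0.0],
])
# An all-zero query row, as produced when no input token is in the vocabulary.
query = np.zeros((1, 4))

scores = cosine_similarity(train, query).ravel()
matches = scores[scores > 0.50]

print("scores:", scores)
print("matches above 0.50:", matches.tolist())
```

Every score is 0 here, so the filtered list is empty regardless of what the training rows contain.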

Can you help with this, please?