Page 1: Zhang first pdf

What he said might be right, but it does not include enough information, so we could regard this review as a useless review.

tokenize

POS-tagging

stopwords

lemmatize
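
The four preprocessing steps listed here can be sketched as a small pipeline. This is only an illustration under stated assumptions: the stopword set, the tag lookup, and the lemma table are toy stand-ins invented for the example, and the function names are hypothetical. A real pipeline would use a trained tagger and a dictionary-based lemmatizer from an NLP library.

```python
import re

# Toy resources (assumptions for illustration only, not from the source).
STOPWORDS = {"the", "a", "an", "is", "was", "it", "this", "and", "but", "not", "to", "of"}
TOY_POS = {
    "batteries": "NOUN", "screen": "NOUN",
    "were": "VERB", "replaced": "VERB",
    "bright": "ADJ",
}
TOY_LEMMAS = {
    ("batteries", "NOUN"): "battery",
    ("were", "VERB"): "be",
    ("replaced", "VERB"): "replace",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def pos_tag(tokens):
    """Attach a coarse tag to each token; unknown words default to NOUN."""
    return [(tok, TOY_POS.get(tok, "NOUN")) for tok in tokens]


def remove_stopwords(tagged):
    """Drop (token, tag) pairs whose token is in the stopword set."""
    return [(tok, tag) for tok, tag in tagged if tok not in STOPWORDS]


def lemmatize(tagged):
    """Map each (token, tag) pair to its lemma; fall back to the token itself."""
    return [TOY_LEMMAS.get((tok, tag), tok) for tok, tag in tagged]


def preprocess(text):
    """Run the steps in the listed order: tokenize, tag, filter, lemmatize."""
    return lemmatize(remove_stopwords(pos_tag(tokenize(text))))


if __name__ == "__main__":
    print(preprocess("The batteries were replaced and the screen is bright."))
```

Tagging runs before stopword removal in this sketch so that a context-sensitive tagger would still see the full sentence, and the lemmatizer receives the tag because the lemma of a word can depend on its part of speech.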

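$$\mathrm{score}(w_i) = \frac{c^{*}_{w_i} - \sum_{k=1,\, k \neq *}^{5} \gamma \cdot c^{k}_{w_i}}{\log\left(\sum_{k=1}^{5} c^{k}_{w_i}\right)}, \quad \text{where } * \text{ is the most frequent rating}$$

$$\mathrm{score}(r_j) = \frac{\sum_{w_i \in r_j} \mathrm{score}(w_i)}{\mathrm{len}(r_j)}$$

$\gamma$: discount rate

$c^{k}_{w_i}$: number of occurrences of word $i$ in reviews with rating $k$

$\mathrm{len}(r_j)$: number of words in review $j$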
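
The word and review scores defined above can be sketched in a few lines of Python. Several points are assumptions rather than statements from the source: the discount rate is called `gamma`, the "most frequent rating" is taken to be the rating in which the word itself occurs most often, the logarithm is natural, and a word seen only once gets a score of 0 because log(1) = 0 would otherwise cause a division by zero. The function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def count_words_by_rating(reviews):
    """Build counts[word][rating] from (tokens, rating) pairs, rating in 1..5."""
    counts = defaultdict(Counter)
    for tokens, rating in reviews:
        for tok in tokens:
            counts[tok][rating] += 1
    return counts


def word_score(rating_counts, gamma):
    """(count in top rating - gamma * counts elsewhere) / log(total count)."""
    total = sum(rating_counts.values())
    if total <= 1:
        # Assumption: log(1) = 0, so a word seen at most once is scored 0.
        return 0.0
    top = max(rating_counts.values())   # count in the word's most frequent rating
    others = total - top                # summed counts in the other ratings
    return (top - gamma * others) / math.log(total)


def review_score(tokens, counts, gamma):
    """Sum of word scores divided by the review length in words."""
    if not tokens:
        return 0.0
    total = sum(word_score(counts[t], gamma) for t in tokens if t in counts)
    return total / len(tokens)


if __name__ == "__main__":
    # Made-up toy reviews, already preprocessed into tokens.
    reviews = [
        (["great", "phone"], 5),
        (["great", "battery"], 5),
        (["great", "price"], 4),
        (["bad", "phone"], 1),
    ]
    counts = count_words_by_rating(reviews)
    print(review_score(["great", "phone"], counts, gamma=0.5))
```

With this scoring, a word concentrated in one rating scores high, while a word spread evenly across ratings is discounted toward zero or below, so a review made mostly of such words receives a low score.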
