Thesis Draft 3.0 (Ying Liang Chen)
Transcript of Thesis Draft 3.0 (Ying Liang Chen)
-
8/8/2019 3.0(Ying Liang Chen)
1/113
3.2.5 The L-KGS.............................................................................................................................................................36
3.2.6 The results of knowledge gap score (KGS)...........................................................................................................36
3.3 THE EXPERT FINDING BY KNOWLEDGE GAP......................................................................................................................37
4. PROBLEM DEFINITION AND OUR APPROACH....................................................................................................39
4.1 PROBLEM DEFINITION.....................................................................................................................................................39
4.2 THE FLOWCHART OF OUR APPROACH................................................................................................................................40
4.3 EXPERTISE MODEL.........................................................................................................................................................40
4.4 SCORES.........................................................................................................................................................................42
4.5 DIFFICULTY DEGREE MODEL...........................................................................................................................................42
4.5.1 The part of asking.................................................................................................................................................43
4.5.2 The part of answering...........................................................................................................................................44
4.6 REINFORCEMENT MODEL.................................................................................................................................................45
5. EXPERIMENTS...............................................................................................................................................................47
5.1 DATASET.......................................................................................................................................................................47
5.1.1 The basic statistics analysis..................................................................................................................................47
5.1.2 The asking and answering of users in each category...........................................................................................48
5.1.3 The number of answers in each question..............................................................................................................52
5.1.4 The level of users for each category.....................................................................................................................53
5.1.5 The level of askers for each category....................................................................................................................54
5.2 ANSWER SET..................................................................................................................................................................56
5.2.1 The number of answers in a question for answer set............................................................................................56
5.2.2 The length of question...........................................................................................................................................57
5.2.3 The length of answer.............................................................................................................................................58
5.3 BASELINE......................................................................................................................................................................59
5.4 EVALUATION METRICS......................................................................................................................................................60
5.5 PARAMETER TUNING........................................................................................................................................................61
5.5.1 The results of parameter ......................................................................................................................................61
5.5.2 The results of parameter ......................................................................................................................................62
5.5.3 The results of parameter ......................................................................................................................................64
5.6 COMPARISON TO BASELINE...............................................................................................................................................65
5.6.1 F measure of easy question and hard question.....................................................................................................66
5.6.2 ROC curve and AUC...............................................................................................................................................69
5.6.3 The analysis of the expertise of users....................................................................................................................72
5.6.4 The examples of the outputs compared with the basic approaches......................................................................75
6. CONCLUSION AND FURTHER WORK.....................................................................................................................79
7. REFERENCE....................................................................................................................................................................80
1. FIGURE LISTING
FIG 1. THE HOMEPAGE OF YAHOO! ANSWERS......................................................................................................................................16
FIG 2. THE KNOWLEDGE GAP.............................................................................................................................................................22
FIG 3. THE KNOWLEDGE GAP DIAGRAM...............................................................................................................................................30
FIG 4. THE NON-EXPERT AND EXPERT IN THE KNOWLEDGE GAP DIAGRAM..................................................................................................32
FIG 5. THE CLASSIFICATION FOR NON-EXPERT AND EXPERT IN THE KNOWLEDGE GAP DIAGRAM......................................................................33
FIG 6. THE FLOWCHART OF KGB-DR ALGORITHM..............................................................................................................................40
FIG 7. THE NETWORK IN YA.............................................................................................................................................................41
FIG 8. THE NUMBER OF ANSWERING FOR EACH USER IN CATEGORIES.........................................................................................................49
FIG 9. THE AVERAGE OF NUMBER OF TOP X% USERS IN ANSWERING.........................................................................................................50
FIG 10. THE NUMBER OF ASKING FOR EACH USER IN CATEGORIES............................................................................................................51
FIG 11. THE AVERAGE OF NUMBER OF TOP X% USERS IN ASKING.............................................................................................................51
FIG 12. THE NUMBER OF ANSWERS FOR EACH CATEGORY........................................................................................................................52
FIG 13. THE RATIO OF ANSWERS IN EACH QUESTION..............................................................................................................................53
FIG 14. THE DISTRIBUTION OF LEVEL OF ASKERS FOR EACH CATEGORY.....................................................................................................55
FIG 15. THE NUMBER OF ANSWERS IN ANSWER SET...............................................................................................................................57
FIG 16. THE QUESTION LENGTH FOR EACH QUESTION IN THE ANSWER SETS................................................................................................58
FIG 17. THE LENGTH OF ANSWERS FOR EACH QUESTION IN THE ANSWER SETS............................................................................................59
FIG 18. THE TUNING OF PARAMETER..................................................................................................................................................62
FIG 19. THE TUNING OF PARAMETER..................................................................................................................................................63
FIG 20. THE TUNING OF PARAMETER..................................................................................................................................................64
FIG 21. THE ROC CURVE FOR EACH METHOD AND EACH CATEGORY........................................................................................................70
FIG 22. THE DISTRIBUTION OF EXPERTISE OF THE USERS FOR EACH CATEGORY...........................................................................................73
FIG 23. THE MAPPING OF USER LEVEL AND THE EXPERTISE COMPUTED BY OUR APPROACH FOR EACH CATEGORY..............................................74
TABLE LISTING
TABLE 1. THE MAPPING FROM THE USER LEVEL TO POINTS IN YA............................................................................................................16
TABLE 2. THE MAPPING OF ACTION TO THE GAIN OF POINTS....................................................................................................................17
TABLE 3. EASY QUESTION VS. HARD QUESTION....................................................................................................................................18
TABLE 4. THE KNOWLEDGE GAP DIAGRAM IN HEALTH...........................................................................................................................31
TABLE 5. DO THE NORMALIZATION FROM TABLE 4...............................................................................................................................31
TABLE 6. THE FOUR ZONES IN THE KNOWLEDGE GAP DIAGRAM................................................................................................................33
TABLE 7. THE EVALUATION OF EACH FACTOR........................................................................................................................................35
TABLE 8. THE DISTRIBUTION OF ASKER FOR EACH LEVEL IN CATEGORY HEALTH.........................................................................................35
TABLE 9. THE KGS FOR EACH CATEGORY...........................................................................................................................................36
TABLE 10. DATASET STATISTICS.........................................................................................................................................................48
TABLE 11. THE STATISTICS OF LEVEL OF USERS CRAWLED FROM YA........................................................................................................53
TABLE 12. THE PERCENTAGE OF EACH LEVEL OMITTING THE MISSING DATA...............................................................................................54
TABLE 13. THE DISTRIBUTION DIVIDES TO THE TWO PARTS FROM FIG 15..................................................................................................56
TABLE 14. THE ANSWER SET FOR EACH CATEGORIES.............................................................................................................................56
TABLE 15. THE PARAMETERS FOR THE TWO MODELS.............................................................................................................................65
TABLE 16. THE COMPARISON OF METHODS FOR EACH CATEGORY.............................................................................................................66
TABLE 17. THE F MEASURE OF HARD QUESTION FOR EACH METHOD AND EACH CATEGORY...........................................................................67
TABLE 18. THE F MEASURE OF EASY QUESTION FOR EACH METHOD AND EACH CATEGORY............................................................................68
TABLE 19. THE AUC FOR EACH METHOD AND EACH CATEGORY..............................................................................................................70
ABSTRACT
Community Question Answering (CQA) services are typical Web 2.0 forums for sharing knowledge, where millions of questions are posted and resolved every day. Because of this scale and the variety of users in CQA services, question search and question ranking are among the most important research problems for these portals. In this paper, we address the problem of detecting whether a question is easy or hard by means of a probability model. We propose an approach called the knowledge-gap-based difficulty rank (KGB-DR) algorithm, which combines the user-user network with the architecture of the CQA service to solve this problem. Expert finding is an important subtask here: if we want to detect whether a question is easy or hard, the participation of experts is a major signal. Many studies on expert finding exist; unfortunately, they are not sufficient for our problem because of users' habits. That is, experts answer not only hard questions but also easy ones, a fact that prior work omits. We observe a phenomenon, which we call the knowledge gap, that reflects this habit, and we incorporate it into our KGB-DR algorithm to complete the expert finding. The KGB-DR algorithm consists of two steps: an expert finding step, which computes the expertise of users, and a difficulty degree detecting step, which computes the difficulty of questions and ranks them by difficulty. Specifically, we design two models for the KGB-DR algorithm: a local difficulty model, based on users, and a global difficulty model, based on all questions. The experimental results show that the local difficulty model is essential to our approach and that our approach outperforms all baseline approaches.
[Chinese-language summary of Sections 1–6: almost all of the Chinese text was lost in transcription. The recoverable details are as follows. The task appears to be formulated with the Bayes decomposition p(h|q) = p(q|h)p(h)/p(q), where q is a question and h denotes the event that the question is hard. The experiments crawl roughly 40,000 questions from five categories, namely Martial arts, Cycling, Health, Pets, and Software, with about 8,000 questions per category (Table 10) and answer sets of roughly 70–90 questions per category (Table 14). The baselines include the EigenRumor algorithm (Fujimura et al., 2005) and a HITS-based method, evaluated with precision, recall, F-measure, ROC curves, and AUC, following Agichtein et al. (2008). The English version of this material follows below.]
1. INTRODUCTION
1.1 Background
Recently, Web 2.0 forum systems have become increasingly popular and interesting. People can share or seek information from any place in the world. One of the most popular and useful kinds of forum system is the Community-based Question-Answering (CQA) portal. Typical communities such as Yahoo! Answers 1 in English, Naver 2 in Korean, and Baidu Knows 3 in Chinese can be regarded as variations of online forums. In this paper, we choose Yahoo! Answers, which holds approximately one hundred million resolved questions in English, for our research.
Yahoo! Answers (YA) (Fig 1. The homepage of Yahoo! Answers) is a CQA service where people can ask or answer questions on any topic, and an enormous number of questions and answers are posted on the English-language site. In YA, users can ask or answer questions in any category at will, and a weighted point system encourages users to answer questions and limits spam questions. There are also user levels (with point thresholds) that grant more site access. When users answer a question, they gain points, and they gain more points if their answer is the Best Answer, selected by the question's asker or voted for by other users. The user level system in YA is reported in Table 1: there are seven levels, and each user account has exactly one level at a time.
1 http://answers.yahoo.com
2 http://www.naver.com
3 http://zhidao.baidu.com
Promotion to a higher user level is decided by the points earned. For example, when a user first creates an account in YA, the user receives 100 points and starts at level 1. As the user earns more points by performing various actions in YA, once the total exceeds 249 points the user is promoted to level 2. In general, the more points a user earns, the higher the level, up to a maximum of seven. How many points a user earns depends on the type of action performed in YA, and the mapping from actions to points is listed in Table 2. For example, answering a question earns two points, and a user obtains ten extra points when the answer is selected as the Best Answer. In addition, some actions cost points: asking a question costs five points, and deleting one's own answer in a question costs two points.
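Taken together, these point rules define a simple bookkeeping scheme. The following sketch tallies them; the per-action values are the ones quoted above from Table 2, while the full level thresholds of Table 1 are not reproduced here, so only the level-1 to level-2 boundary (249 points) mentioned in the text is modeled.

```python
# Point values quoted in the text (a subset of Table 2).
POINT_RULES = {
    "sign_up": 100,       # one-time bonus for creating an account
    "answer": 2,          # posting an answer
    "best_answer": 10,    # extra points when an answer is chosen as Best Answer
    "ask": -5,            # asking a question costs points
    "delete_answer": -2,  # deleting one's own answer costs points
}

def total_points(actions):
    """Sum points for a sequence of action names."""
    return sum(POINT_RULES[a] for a in actions)

# A new user who asks once and posts two answers, one chosen as Best Answer:
pts = total_points(["sign_up", "ask", "answer", "answer", "best_answer"])
print(pts)        # 100 - 5 + 2 + 2 + 10 = 109
print(pts > 249)  # False: still below the level-2 threshold quoted above
```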
Fig 1. The homepage of Yahoo! Answers
Table 1. The mapping from the user level to points in YA
Table 2. The mapping of action to the gain of points
1.2 Motivation
As questions and answers accumulate, users cannot effectively find the questions they want to read or discuss. For this reason, most work on CQA services aims to improve functionality such as question ranking, question search, or question recommendation. However, the output of this prior work does not consider the expertise (or authority) of users. In question ranking, for example, a user may be an amateur in an area and find the search engine's top results too hard to understand, while an expert user searching for harder questions may see only easy ones at the top of the ranking. Thus, in this paper, we are
concerned with how to rank questions by difficulty. Our task is similar to finding high-quality questions (Agichtein et al., 2008; Jurczyk and Agichtein, 2007), which identifies a question as high-quality or low-quality; our work, however, identifies a question as easy or hard. Moreover, expert finding is strongly associated with our task. An expert is a user who is familiar with a particular topic or category and can solve most questions in that category. A non-expert is the opposite: the term covers both the amateur, who is a stranger to the category, and the novice, who has followed the topic for only a short time and is not yet very familiar with it. For example, Table 3 lists three easy questions and three hard questions about karate. From the first easy question, "What ideas do you have in my first karate class?", the asker can be identified as an amateur at karate, and the question can be solved even by an amateur who has attended karate class only a couple of times. In contrast, to solve a hard question such as "What is the difference between Traditional Okinawan Karate and the modern sport Karate?", a user must have substantial knowledge of or experience in karate; an amateur who has taken up karate only recently would not be able to answer it.
Table 3. Easy question vs. Hard question
As a result of the above description, we list the following principles from observation, without considering the textual content that an easy or hard question must include:
1. The closer the asker of a question is to an expert, the higher the probability that the question is hard.
2. If most of the repliers to a question are amateurs, the question is probably easy; conversely, the more experts reply to a question, the harder it is.
3. If both experts and amateurs have answered a question and the asker chooses an amateur's reply as the best answer, the question is easy enough that an amateur can give an answer the asker believes solves it.
In short, the difference between an easy question and a hard question lies in the ratio of non-experts to experts participating in the question, and we can separate the task into two parts: expert finding and question ranking.
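The three principles can be combined into a single score. The following toy sketch does so, assuming each participant already has an expertise value in [0, 1] computed elsewhere; the weights are illustrative and are not the thesis's actual KGB-DR formula.

```python
# A toy combination of the three observed principles. The 0.3/0.4/0.3
# weights are made up for illustration only.
def difficulty_score(asker, answerers, best_answerer):
    """Combine asker expertise, mean answerer expertise, and the
    expertise of the chosen best answerer into one difficulty value."""
    avg_answerer = sum(answerers) / len(answerers) if answerers else 0.0
    # Principle 1: expert askers tend to ask harder questions.
    # Principle 2: expert-heavy reply threads suggest a harder question.
    # Principle 3: a low-expertise best answer suggests an easy question.
    return 0.3 * asker + 0.4 * avg_answerer + 0.3 * best_answerer

print(round(difficulty_score(0.9, [0.8, 0.7], 0.8), 2))  # 0.81: expert-dominated
print(round(difficulty_score(0.1, [0.2, 0.3], 0.2), 2))  # 0.19: amateur-dominated
```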
Determining whether a question is easy or hard is not trivial, because there is noise in the real world. First, whether a question is hard or easy is judged by people: one user may think a question hard because he/she has never seen it before, while another thinks it easy because of his/her experience; that is, the judgment is subjective
for everyone. Second, some askers just want to share or discuss something rather than to ask something, and it is hard to call such a question easy or hard; the same applies to questions containing flame wars or abusive content. Third, although one might expect hard questions to contain more difficult terms than easy ones, people can use easy terms to ask or answer hard questions, so we cannot solve the problem from term significance alone; instead, we exploit the relationships between users and the framework of YA. Finally, note that we take the difficulty degree of a question to be determined by how many experts participate in it; in the real world, however, experts answer not only hard questions but also easy ones, and this complicates deciding whether a question is easy or hard.
Among prior works, some rank blogs using the PageRank (Brin and Page, 1998) or HITS (Kleinberg, 1999) algorithms, such as the EigenRumor algorithm (Fujimura et al., 2005). The EigenRumor algorithm uses the interactions among posters, repliers, and blogs to rank blogs; users then receive feedback from the blog scores to adjust their authorities, and the iteration terminates when the blog scores no longer change significantly. The nodes and architecture in the EigenRumor algorithm are similar to ours: a question with an asker and answerers corresponds to a blog with a poster and repliers. However, the EigenRumor algorithm cannot be applied to finding hard questions in a CQA service, because it omits the relationship between hard questions and users. That is, if we used the EigenRumor algorithm to compute question scores, the feedback from those scores would flow back to the users who asked or answered the questions. But since experts also answer easy questions in the real world, the feedback from an expert may inflate the expertise of non-experts who answered the same questions, so it is not a reasonable way to detect the expertise of users and the difficulty degree of questions.
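The feedback problem can be seen in a minimal example. The following sketch runs a naive HITS-style mutual reinforcement on a two-question, two-user graph with made-up data: when the expert answers an easy question, the question's score rises, and that score feeds back to every other answerer on it, inflating the novice's estimated expertise.

```python
# Naive mutual reinforcement between answerers and questions,
# with no correction for question difficulty.
answered_by = {                     # question -> its answerers
    "hard_q": ["expert"],
    "easy_q": ["expert", "novice"],
}

expertise = {"expert": 1.0, "novice": 1.0}

for _ in range(20):                 # plain power iteration
    # question score = sum of its answerers' expertise
    q_score = {q: sum(expertise[u] for u in us) for q, us in answered_by.items()}
    # new expertise = sum of scores of questions the user answered
    new_e = {u: 0.0 for u in expertise}
    for q, us in answered_by.items():
        for u in us:
            new_e[u] += q_score[q]
    norm = max(new_e.values())      # normalize so the top user scores 1.0
    expertise = {u: v / norm for u, v in new_e.items()}

# The novice inherits score from the easy question shared with the expert.
print(round(expertise["novice"], 3))  # 0.618: far above a novice's true level
```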
1.3 Method Abstract
Despite the above difficulties, we observe a phenomenon in CQA services that we call the knowledge gap. The knowledge gap is a phenomenon associated with users, illustrated in Fig 2. The stairs in Fig 2 represent difficulty in a specific topic, and a user standing on a higher stair has more knowledge of the topic. For example, the expert in Fig 2 stands at the top of the stairs, indicating familiarity with everything about the topic, whereas the amateur and the novice stand at the bottom owing to their unfamiliarity with it; the knowledge gap is the gap between any two users. For example, gap A in Fig 2 represents the knowledge gap between the expert and the novice, and gap B represents the knowledge gap between the expert and the amateur. The gap can bring about two situations. One is that a user standing on a lower stair lacks the ability to answer a question asked by a user standing on a higher stair. The other is that a user on a higher stair may have no interest in answering a question asked by a lower-stair user, even though he/she has enough knowledge. For instance, consider one of the hard questions in Table 3, "What are advantages of karate over other martial arts?": to answer it, a user must know a great deal about both karate and other martial arts in order to compare them, and this is hard for a novice or an amateur because of their limited knowledge. On the other hand, one of the easy questions in Table 3, "What ideas do you have in my first karate class?", is so easy for experts with long experience in karate that they may have little interest in answering it, since they can see the same questions every day. In addition, the boundary in Fig 2 represents the line separating hard questions from easy questions. In short, there are two main principles of the knowledge gap. First,
non-experts lack the ability to answer questions beyond their knowledge. Second, experts have little interest in questions below a certain difficulty degree. In this paper, we investigate how, and how much, the knowledge gap affects CQA services, and we propose the knowledge-gap-based difficulty rank (KGB-DR) algorithm to decide whether a question is easy or hard. Specifically, KGB-DR makes use of the structure of the CQA service and the relationships between users, and it iterates two steps, namely an expert finding step and a difficulty degree detecting step.
(1) Expert finding step:
This step computes the probability that a user is an expert in asking or in answering. We first compute it with a graph-based algorithm and then adjust it using feedback from the difficulty degree detecting step, in order to fit the knowledge gap between users.
(2) Difficulty degree detecting step:
The goal of this step is to compute the probability that a question is hard and to rank the questions by their difficulty degree score.
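The alternation of the two steps can be sketched structurally. The update rules themselves are defined later in the thesis, so both steps appear here as stand-in callables and only the control flow is shown; the function names and the fixed iteration count are illustrative assumptions, not the thesis's specification.

```python
# A structural sketch of the KGB-DR iteration: alternate expert finding
# and difficulty detection, then rank questions hardest-first.
def kgb_dr(questions, users, expert_step, difficulty_step, n_iter=10):
    """expert_step(users, difficulty) -> expertise per user
    difficulty_step(questions, expertise) -> difficulty per question"""
    difficulty = {q: 0.5 for q in questions}   # neutral starting difficulty
    expertise = {}
    for _ in range(n_iter):
        # (1) expert finding step, adjusted by difficulty feedback
        expertise = expert_step(users, difficulty)
        # (2) difficulty degree detecting step
        difficulty = difficulty_step(questions, expertise)
    ranking = sorted(questions, key=lambda q: difficulty[q], reverse=True)
    return expertise, difficulty, ranking

# Toy stand-ins just to exercise the control flow:
flat_experts = lambda users, diff: {u: 1.0 for u in users}
fixed_difficulty = lambda qs, exp: {"q1": 0.2, "q2": 0.9}
_, _, ranking = kgb_dr(["q1", "q2"], ["u1"], flat_experts, fixed_difficulty)
print(ranking)  # ['q2', 'q1'] -- hardest question first
```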
Fig 2. The knowledge gap
1.4 Application
Given a set of questions, ranking them by difficulty degree has several potential applications. First, a CQA service or discussion board can show questions of appropriate difficulty to each type of user; for example, when an amateur enters a particular category, route him/her to the easy questions. Second, it can improve search engine performance by adding a difficulty-degree factor, so that the results differ for each type of user. Third, ranking questions by difficulty finds suitable questions for each type of user, who can select the difficulty level they want; this can also be applied to commercial advertising. For example, in cycling, most novices just need a cheap, usable bicycle, whereas an expert may want features such as lighter or tougher, or even high-end, bicycle parts.
1.2 Paper Organization
The rest of this paper is structured as follows. Section 2 briefly discusses related work. Section 3 describes the
knowledge gap in more detail. Section 4 defines our task with a probability model and presents our approach, the
KGB-DR algorithm, to solve this task. Next, experimental results are reported in the Experiments section, and the
last section concludes the paper and outlines our future work.
2. RELATED WORK
The task of identifying whether a question is easy or hard is similar to prior work such as question ranking,
question search, and question recommendation. The common goal of these tasks is to help users find the questions
they want to know about effectively; the difference is that our goal is to rank the questions in a particular
category by difficulty, or to recommend hard questions to experts and easy questions to non-experts. Besides, the
most important subtask here is to find the experts in a particular category, and we model it with the user-user
interactions. The related work referred to above is detailed in this section.
2.1 Link Analysis
PageRank (Brin & Page, 1998) is a link analysis algorithm that calculates an importance score for each element of
a group of hyperlinked documents. By using the linking structure of the web pages, PageRank interprets the links
as indicators of the importance of pages. Basically, a link from page A to page B is regarded as a vote from page
A for page B. The algorithm estimates the probability distribution of a user randomly clicking on links. In
addition, a page linked from a page with a high PageRank score is given a high score. Kleinberg (1999) proposed
the Hyperlink-Induced Topic Search (HITS) algorithm, another prominent algorithm for ranking web pages. Its
critical concepts are hubs and authorities: for each web page, two scores are calculated based on these concepts.
The hub score represents the quality of the page's links to other pages about a topic, while the authority score
indicates the quality of the page content. Besides web pages, Yupeng et al. [1] investigated the co-occurrences of
people in web pages and the communication patterns in emails to discover the relationships among people.
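The vote-passing idea behind PageRank can be illustrated with a minimal power-iteration sketch (a simplified illustration, not Brin and Page's production implementation):

```python
# Minimal PageRank by power iteration over an adjacency list,
# illustrating the "link as vote" idea described above.

def pagerank(links, damping=0.85, n_iters=50):
    """links: {page: [pages it links to]}."""
    pages = set(links) | {p for outs in links.values() for p in outs}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(n_iters):
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if outs:                      # distribute p's rank as votes
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:                         # dangling page: spread evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

# A page linked from highly ranked pages gets a high score:
r = pagerank({"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]})
```

Here page A, which receives links from C and D, ends up with a higher score than D, which receives no links at all.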
2.2 Expert Finding in Social Media
Several researches focus on how to find experts by modeling and computing user-user graphs (McCallum et al.,
2005; Jurczyk & Agichtein, 2007a; Zhang et al., 2007). McCallum et al. (2005) utilize the user-user graph to
find the experts for particular topics, and Zhang et al. (2007) use the network-based ranking algorithms HITS
(Kleinberg, 1999) and PageRank (Brin & Page, 1998) to identify users with high expertise. Their results show a
high correlation between link-based metrics and answer quality. Next, Liu et al. (2005) use features such as
author activity and the number of clicks to find the best answers for a given question. Then Jurczyk and
Agichtein (2007a; 2007b) use the HITS algorithm to find the authority of users in a question/answering network
in which an asker is linked to an answerer if the answerer replies to a question the asker has posted. Their
experiments show that the obtained authority score is better than simply counting the number of answers an
answerer has given. Although the results demonstrate that the HITS algorithm is a good approach to finding
experts, some potential factors, such as the users' habits in asking questions and replying with answers, may
reduce its performance. Unlike Jurczyk and Agichtein (2007a) and Zhang et al. (2007), Zhou et al. (2009) utilize
the structural relations among users in the forum system and use a content-based probability model to find the
experts for a particular question.
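The asker-to-answerer linking used in these studies can be sketched with the standard HITS updates; the toy edge data below is invented. Authority then reflects answering expertise, while the hub score reflects asking activity.

```python
# HITS over a question/answer network: an asker links to every
# answerer who replied to one of his/her questions.

def hits(edges, n_iters=30):
    """edges: list of (asker, answerer) pairs."""
    users = {u for e in edges for u in e}
    hub = {u: 1.0 for u in users}
    auth = {u: 1.0 for u in users}
    for _ in range(n_iters):
        # authority: sum of hub scores of askers pointing at you
        auth = {u: sum(hub[a] for a, b in edges if b == u) for u in users}
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {u: v / norm for u, v in auth.items()}
        # hub: sum of authority scores of answerers you point at
        hub = {u: sum(auth[b] for a, b in edges if a == u) for u in users}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {u: v / norm for u, v in hub.items()}
    return hub, auth

# "e" answers everyone's questions, so "e" gains high authority:
hub, auth = hits([("u1", "e"), ("u2", "e"), ("u3", "e"), ("u3", "u1")])
```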
In other social media, Balog et al. (2006) propose extended language models to address the expert finding problem
in enterprise corpora. Since 2005, the Text REtrieval Conference (TREC) has provided a platform, the Enterprise
Search Track, for researchers to empirically assess their methods for expert finding (Craswell et al., 2005). In
addition, some researches find experts over e-mail corpora (Campbell et al., 2003; Dom et al., 2003).
2.3 Question Ranking
Question ranking is to rank the results for the purpose of decreasing browsing time; the typical approach uses the
information of the Q&A content, such as best ratings (Adamic et al., 2008; Bian et al., 2008). Jeon et al. (2006)
addressed the answer quality problem in a community QA portal and tried to estimate quality using a set of
non-textual features such as answer length, number of points received, etc. Agichtein et al. (2008) expand on
Jeon et al. (2006) by exploring a larger range of features, including structural, textual, and community features.
Su et al. (2007) analyzed how the quality of individual answers varies significantly. Bian et al. (2008) proposed
to solve collaborative QA by considering both answer quality and relevance, but used content-based answer
quality without considering user expertise. Because of the answer quality problem in Bian et al. (2008), Jeon et
al. (2006), and Su et al. (2007), Suryanto et al. (2009) propose a quality-aware framework that considers both
answer relevance and answer quality derived from answer features and the expertise of answerers.
Wang et al. (2009), instead of modeling questions and their answers as independent information, consider them as
relational data and propose a link-based algorithm for evaluating the quality of answers.
2.4 Question Search
The research on question search was first conducted using FAQ data (Burke et al., 1997; Lai et al., 2002;
Sneiders, 2002). FAQ Finder (Burke et al., 1997) heuristically combines statistical similarities and semantic
similarities between questions and FAQs. Sneiders (2002) proposed template-based FAQ retrieval systems. Lai et
al. (2002) proposed an approach to automatically mine FAQs from the web.
Recently, the research on question search has been further extended to CQA. For example, Jeon et al. (2005a;
2005b) compared four different retrieval methods, i.e., cosine similarity, Okapi, a language model (LM), and a
statistical machine translation (SMT) model, for automatically bridging the lexical chasm between questions in
question search. They found that the SMT-based method performed the best.
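For illustration, the simplest of the four baselines, plain cosine similarity over word counts, can be sketched as follows. The lexical chasm is exactly what this baseline cannot bridge, since paraphrased questions that share few words score near zero.

```python
# Cosine-similarity question search over bag-of-words vectors,
# the weakest of the four retrieval baselines discussed above.
import math
from collections import Counter

def cosine(q1, q2):
    v1, v2 = Counter(q1.lower().split()), Counter(q2.lower().split())
    dot = sum(v1[w] * v2[w] for w in v1)
    n1 = math.sqrt(sum(c * c for c in v1.values()))
    n2 = math.sqrt(sum(c * c for c in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def question_search(query, archive):
    """Rank archived questions by word-overlap similarity to the query."""
    return sorted(archive, key=lambda q: cosine(query, q), reverse=True)
```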
2.5 Question Recommendation
Question recommendation is like question search: the goal of both is to find the questions that users prefer to
know about or reply to. The question recommendation problem was first addressed by Cao et al. (2008), who
consider that a good recommendation should provide alternative aspects around the user's interest, and who use an
MDL-based tree cut model to tackle this problem. Sun et al. (2009) address the question recommendation problem
using user ratings; they propose a majority-based perceptron algorithm that avoids the influence of noisy
instances by emphasizing learning over data instances from majority users, and they show the effectiveness of
their approach through intensive experiments.
Question recommendation and question ranking are similar to what we discuss in this paper, but they have the
same problem as most of the related work on question search discussed above: they focus on content-based
analysis and ignore the interactions between users. Recent works about the interactions of users, such as Nam et
al. (2009) in Naver, Adamic et al. (2008) in YA, and Yang and Wei (2009) in Baidu, analyzed the activity of users
in Q&A forums and found that different types of users, such as experts and non-experts, have different habits in
asking and answering.
3. THE KNOWLEDGE GAP
In this section, we investigate how, and how much, the knowledge gap affects YA. First, we describe the
phenomenon called the knowledge gap in YA and other CQA services and explain the relationship between the
knowledge gap and hard questions in Section 3.1. Second, we show how to quantify this phenomenon for each
category in YA. Finally, we discuss some problems in expert finding and what type of expert we want to extract
in Section 3.3.
3.1 The Knowledge Gap in YA
According to Table 1 and Table 2, we can assume that the higher the level, the more expertise a user currently
has, and that a question asked by a higher-level user has a high probability of being a hard question; for
convenience, we merge level 6 and level 7 into level 6. Fig 3 shows the proportion of each level of askers
replied to by each level of answerers in five categories. A zone (i,j) is darker than a zone (k,j) if the
proportion of level-j askers replied to by level-i answerers is higher than the proportion of level-j askers
replied to by level-k answerers. For example, in Fig 3(a), zone (1,1) is darker than zones (2,1) through (6,1),
which indicates that the questions asked by level-1 users are replied to by level-1 users more than by users of
the other levels. In Fig 3(a), we can also see that for questions asked by higher-level users, the ratio of
replies from higher-level users is higher, while the questions asked by low-level users show the opposite
pattern. There are two explanations for this situation. First, high-level users tend to reply to or participate
in the challenging questions where they can contribute their knowledge to other people. Second, low-level users
can only reply to the easy questions that other low-level users ask because of their poor knowledge; accordingly,
level-1 and level-2 answerers concentrate on the questions asked by level-1 to level-4 users. The observation of
Fig 3 tells us that whether a question q is hard is related to the expertise of the users who participate in q.
We call this phenomenon a knowledge gap, and the graph corresponding to the knowledge gap is called a knowledge
gap diagram. Besides, the degree of the knowledge gap differs across categories. For example, the knowledge gap
is strong in categories such as Martial arts, Pets, and Health, but weak in categories such as Cycling and
Software. In fact, in Fig 3(b) (Cycling) we can see that the questions asked by low-level users are answered by
high-level users more than by low-level users, which indicates that the experts in Cycling are enthusiastic about
solving the questions that amateurs ask. On the contrary, in Fig 3(e) (Software), the low-level users are also
able to answer the questions that high-level users ask, which suggests that most questions in Software are easy
enough for the low-level users to answer. In addition, Fig 3(f) is a perfect knowledge gap diagram, in which
low-level users answer only the questions that low-level users ask and high-level users answer only the questions
that high-level users ask. Therefore, the knowledge gap is central to detecting whether a question is easy or
hard, and we show how to utilize it in the next section.
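How such a diagram could be tabulated from (asker level, answerer level) pairs can be sketched as follows; the paper's exact counting and normalization (Tables 4 and 5) may differ in detail.

```python
# Building the knowledge gap diagram as a matrix: entry (i, j) is the
# proportion of answers to level-(j+1) askers that come from
# level-(i+1) answerers, with each asker-level column summing to 1.
from collections import defaultdict

def knowledge_gap_diagram(qa_pairs, n_levels=6):
    """qa_pairs: list of (asker_level, answerer_level) tuples, 1-based."""
    counts = defaultdict(int)
    for asker, answerer in qa_pairs:
        counts[(answerer, asker)] += 1
    diagram = [[0.0] * n_levels for _ in range(n_levels)]
    for j in range(1, n_levels + 1):              # asker level (column)
        col_total = sum(counts[(i, j)] for i in range(1, n_levels + 1))
        if col_total:
            for i in range(1, n_levels + 1):      # answerer level (row)
                diagram[i - 1][j - 1] = counts[(i, j)] / col_total
    return diagram
```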
Fig 3. The knowledge gap diagram
3.2 The Quantification of Knowledge Gap Diagram
3.2.1 The preliminary for quantifying the knowledge gap diagram
Table 4 is the prototype of the knowledge gap diagram in the category Health, and we show in this paragraph how
to quantify the knowledge gap diagram from this table. First, we normalize each asker level in Table 4, and the
result is shown in Table 5. Note that there are two principles behind the knowledge gap: one is that the
non-experts have no ability to answer questions that are beyond their knowledge, and the other is that the
experts are less interested in questions that are below a certain difficulty degree. Following these two
principles, we formulate the knowledge gap diagram as:

KGS(C) = Expert(C) + Nonexpert(C)

where C is a category in YA, and Expert(C) and Nonexpert(C) are the knowledge gap scores for the experts and
non-experts in category C.
Table 4. The knowledge gap diagram in Health
Table 5. The normalization of Table 4
3.2.2 The four zones in the knowledge gap diagram
Now assume that a user whose level is from 1 to 3 is a non-expert, and a user whose level is from 4 to 6 is an
expert. Fig 4 illustrates the non-experts and experts in the knowledge gap diagram: Fig 4(a) shows the areas in
which non-experts and experts move respectively, and Fig 4(b) shows the forbidden areas into which non-experts
and experts do not move frequently, i.e., the non-experts rarely answer the questions that experts ask, and the
experts rarely answer the questions that non-experts ask. According to Fig 4, we can divide the knowledge gap
diagram into four zones, illustrated in Fig 5. First, the zone of regular non-experts (RN) gathers the users who
are non-experts and focus on answering easy questions because of their poor knowledge. Second, the zone of
promising non-experts (PN) gathers the users who are non-experts at present but are able to answer hard
questions. Third, the zone of regular experts (RE) gathers the users who are experts and are able to answer the
hard questions that experts ask. Finally, the zone of enthusiastic experts (EE) gathers the users who are experts
but focus on answering easy questions out of enthusiasm. Besides, Table 6 is an example of the four zones mapped
from Table 5.
Fig 4. The non-expert and expert in the knowledge gap diagram
Fig 5. The classification for non-expert and expert in the knowledge gap diagram
Table 6. The four zones in the knowledge gap diagram
3.2.3 The quantification of non-expert and expert
Recall that the knowledge gap score is based on the answering behavior of the experts and non-experts, and we
quantify these two factors from four viewpoints. First, answering questions can be viewed as a responsibility of
the non-experts and experts. That is to say, the non-experts cannot answer hard questions because of their
inexperience, so answering easy questions is the responsibility of the non-experts. In this paper, we evaluate
the responsibility of the non-experts by F(RN) - F(PN), where F(RN) is the flow of non-experts in zone RN and
F(PN) is the flow of non-experts in zone PN. The larger the value of F(RN) - F(PN), the more the non-experts
focus on answering easy questions rather than hard questions. In addition, we quantify F(RN) and F(PN) by
summing all the ratios in zones RN and PN respectively; for example, F(RN) = 1.51 and F(PN) = 1.23 in Table 6.
Second, the experts have more knowledge and experience than the non-experts, so answering hard questions can be
seen as their responsibility. In the same way, we quantify the responsibility of the experts by F(RE) - F(EE).
The remaining viewpoints concern the asking behavior of the non-experts and experts. We assumed earlier that a
question asked by an expert is a hard question; however, we want to know whether its difficulty is credible. In
general, a hard question should be answered by experts, and if non-experts can also answer it, then the
difficulty of the hard question is in doubt. We utilize F(RE) - F(PN) to evaluate the credibility of the hard
questions that experts ask, where F(RE) is the flow of experts in zone RE and F(PN) is the flow of non-experts in
zone PN; a larger value of F(RE) - F(PN) means the hard questions that experts ask are more credible. In the same
way, we quantify the credibility of the easy questions that non-experts ask by F(RN) - F(EE), where F(EE) is the
flow of experts in zone EE; a larger value of F(RN) - F(EE) means the easy questions that non-experts ask are
more credible. Furthermore, Table 7 summarizes these factors.
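The four flows can be computed by summing the quadrants of the diagram. A minimal sketch, assuming a 6x6 normalized matrix with rows as answerer levels and columns as asker levels, and the level split of Section 3.2.2 (levels 1-3 non-expert, 4-6 expert):

```python
# The four zone flows: RN, PN, EE, RE. Each flow sums the ratios
# inside its quadrant of the knowledge gap diagram.

def zone_flows(diagram):
    """diagram[i][j]: ratio of level-(j+1) askers answered by
    level-(i+1) answerers (a 6x6 normalized knowledge gap diagram)."""
    def block(rows, cols):
        return sum(diagram[i][j] for i in rows for j in cols)
    non, exp = range(0, 3), range(3, 6)
    return {
        "RN": block(non, non),  # non-experts answering easy questions
        "PN": block(non, exp),  # non-experts answering hard questions
        "EE": block(exp, non),  # experts answering easy questions
        "RE": block(exp, exp),  # experts answering hard questions
    }
```

The responsibility and credibility measures of Table 7 then follow directly as F(RN) - F(PN), F(RE) - F(EE), F(RE) - F(PN), and F(RN) - F(EE).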
Table 7. The evaluation of each factor
3.2.4 The weight for non-expert and expert
In general, because the number of askers at each level varies, setting weights is necessary for the expert and
non-expert scores, and we use the ratio of asker levels to set the weights. The distribution of askers for each
level in the category Health is shown in Table 8: Table 8(a) is the distribution from level 1 to level 7, and
Table 8(b) is the ratio of non-experts (levels 1 to 3) and experts (levels 4 to 7); we use the ratios in Table
8(b) to weight the experts and non-experts. The expert and non-expert scores are computed as:

Expert(C) = Wex * (2*Fc(RE) - Fc(EE) - Fc(PN))

Nonexpert(C) = Wnon * (2*Fc(RN) - Fc(EE) - Fc(PN))

where Wex is the ratio of the questions experts have asked, Wnon is the ratio of the questions non-experts have
asked, and Fc(RE), Fc(EE), Fc(PN), and Fc(RN) are the flows of the four zones of category C defined in Table 6.
Finally,
the value of KGS(Health) is 0.296 * 0.9023 + 0.796 * 0.0977 ≈ 0.34.
Table 8. The distribution of asker for each level in category Health
3.2.5 The L-KGS
Note that the knowledge gap diagram is based on the levels of users; however, we do not know whether a user's
level was gained in this category or in other categories. Thus, we must add another level index to adjust it. In
general, from Table 1 and Table 2 we know that the more questions a user answers, the more points and the higher
the level the user obtains. For this reason, we utilize the average number of answers per asker to adjust the
user level. Additionally, we set the weight to the square root of this number, and the formulas are as follows:

L-KGS(C) = Expert(C) + Nonexpert(C)

Expert(C) = Wex * Lex * (2*Fc(RE) - Fc(EE) - Fc(PN))

Nonexpert(C) = Wnon * Lnon * (2*Fc(RN) - Fc(EE) - Fc(PN))

where Lex is the square root of the average number of replies per asker who is an expert and Lnon is the square
root of the average number of replies per asker who is a non-expert.
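Putting the weighted formulas together, a sketch of the KGS and L-KGS computation. The flow and weight values below are illustrative (only F(RN) and F(PN) come from the text), and the Expert(C) term follows the responsibility and credibility definitions of Section 3.2.3:

```python
import math

# KGS and L-KGS per category: F holds the four zone flows, W the
# asker-ratio weights, and L the square root of the average number
# of answers per asker, as in Sections 3.2.4 and 3.2.5.

def kgs(F, w_ex, w_non):
    expert = w_ex * (2 * F["RE"] - F["EE"] - F["PN"])
    nonexpert = w_non * (2 * F["RN"] - F["EE"] - F["PN"])
    return expert + nonexpert

def l_kgs(F, w_ex, w_non, avg_ans_ex, avg_ans_non):
    l_ex, l_non = math.sqrt(avg_ans_ex), math.sqrt(avg_ans_non)
    expert = w_ex * l_ex * (2 * F["RE"] - F["EE"] - F["PN"])
    nonexpert = w_non * l_non * (2 * F["RN"] - F["EE"] - F["PN"])
    return expert + nonexpert

# Hypothetical flows: F(RN) and F(PN) from Table 6, RE/EE invented.
F = {"RN": 1.51, "PN": 1.23, "RE": 1.60, "EE": 1.10}
score = kgs(F, w_ex=0.0977, w_non=0.9023)
```

With these numbers, a category whose expert askers answer more questions on average (larger Lex) receives a higher L-KGS than KGS, which is how Martial arts overtakes Software in Section 3.2.6.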
3.2.6 The results of knowledge gap score (KGS)
The results for all categories can be seen in Table 9. The order of KGS in descending order is Pets > Health >
Software > Martial arts > Cycling. However, after considering the number of questions askers reply to, the
ranking of L-KGS is Pets > Health > Martial arts > Software > Cycling. Martial arts rises in the L-KGS ranking
because its askers answer more questions per asker than those in the category Software.
Table 9 . The KGS for each category
3.3 The Expert Finding By Knowledge Gap
In general, we regard the higher-level users as the experts. But there are some noise problems in QA portals,
i.e., there exist some people whose user level is not high but who are experts. The following cases illustrate
the noise in expert finding by user level in YA:
Case 1. The levels of users come from gained points (Table 2), and the points a user gains may come from
different categories. Hence, a level-7 user is not necessarily an expert in every category in which he/she
has asked or replied.
Case 2. If you reply to questions in YA frequently, your level increases quickly; however, your level increases
slowly if you reply to questions rarely. In the real world, some real experts do not have much time to reply
in YA, so their levels are not very high.
Case 3. Some users are novices in a category, and their knowledge is less than that of a real expert, but they
enjoy replying to the questions that other amateurs or novices ask in the category. Although they do not have
the ability to reply to the hard questions, they reply to a lot of easy questions instead. The result is that
their user level increases quickly, but they are not real experts.
Case 4. Some users have more than one account ID in YA; they reply to question q1 with account 1 and to another
question q2 with account 2, which makes their expertise ambiguous. Additionally, some users maliciously
create many accounts to ask questions and reply with other accounts in order to increase their levels
quickly.
The methods proposed in prior works (Jurczyk & Agichtein, 2007a; 2007b; Zhang et al., 2007) can solve most of
these problems, but they are not enough if we want to use the expertise of users to detect whether a question is
easy or hard. In fact, the habits of users in asking and answering are an important factor, reflected in the
knowledge gap diagram of each category, but none of the previous works consider it. For example, the non-experts
could answer so many easy questions that the expertise of the non-experts increases abnormally; conversely, the
experts could answer easy questions even though the quality of those questions is low. In this paper, the expert
we want to find is different from the prior works: according to the two principles of the knowledge gap discussed
previously, we want to find the experts who answer hard questions frequently. That is to say, we want to find the
users who are not only experts but also focus on answering hard questions more than the other users do.
4. PROBLEM DEFINITION AND OUR APPROACH
In this section, we formulate our problem as a probability model and propose an algorithm, called knowledge-gap-based difficulty rank (KGB-DR), to predict whether the difficulty degree of a question is easy or hard. KGB-DR consists of two steps: an expert finding step and a difficulty degree detecting step. The expert finding step not only finds experts but also adjusts their expertise by the knowledge gap, while the difficulty degree detecting step computes the difficulty degree of the questions. Moreover, once the difficulty degree scores of the questions have been computed, the expert finding step obtains feedback from the difficulty degree model, and the difficulty degree detecting step then computes the scores of the questions again. We detail how the two steps work below.
4.1 Problem Definition
In this paper, we state the problem of the difficulty degree of a question q by means of a probability model. Probability models have been widely used in recent expert finding work (Balog, Azzopardi and de Rijke, 2006; Fang and Zhai, 2007; Zhou, Cong, Cui, Jensen and Yao, 2009), and we also use a probability model to formulate our task; furthermore, we add link analysis and the knowledge gap to solve this problem. Given a question q, the probability of the difficulty degree being hard is estimated as follows:
$$ p(h|q) = \frac{p(q|h)\,p(h)}{p(q)} \qquad (1) $$
where p(h) is the probability of the difficulty degree being hard and p(q) is the probability of a question, which is the same for all questions. In our task, for convenience we simply identify whether a question is easy or hard and set a threshold t: if the difficulty rank of a question is less than t, the question is hard. So p(h) is estimated by t/n, where n is the total number of questions in the category. Thus, the ranking of questions is proportional to the probability of the question given that the difficulty degree is hard. In this study, our task is therefore to capture how hard the question q is by p(q|h).
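Since p(q) is the same for all questions and p(h) = t/n is a constant, ranking by p(h|q) reduces to ranking by p(q|h). A minimal sketch of this reduction, assuming the p(q|h) scores have already been computed (the values below are illustrative only):

```python
# Ranking by Eq. (1): p(q) and p(h) are constant across questions, so sorting
# by p(h|q) is the same as sorting by p(q|h); the top-t questions are hard.

def rank_by_hardness(p_q_given_h, t):
    """Sort question ids by p(q|h) descending; label the top-t as hard."""
    ranked = sorted(p_q_given_h, key=p_q_given_h.get, reverse=True)
    hard = set(ranked[:t])
    return ranked, hard

scores = {"q1": 0.05, "q2": 0.40, "q3": 0.15}   # illustrative p(q|h) values
ranked, hard = rank_by_hardness(scores, t=1)
```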
4.2 The Flowchart of Our Approach
The flowchart of our approach is illustrated in Fig 6. Given a particular category or a set of categories, we first use the expertise model on the asking-answering network to capture the expertise of users in both asking and answering. The difficulty degree model calculates the difficulty degree scores of the input questions via the framework in YA, and question ranking ranks the questions by their difficulty degree scores. The reinforcement model reinforces the expertise of users according to the phenomenon of the knowledge gap via the DDR index computed in the prior step. In short, our system has only two steps: the expert finding step and the difficulty degree detecting step. The expert finding step finds the answer-hard-question-frequently experts, while the difficulty degree detecting step determines the difficulty degree scores of the questions. The loop terminates when the difficulty degree scores of the questions no longer change significantly.
Fig 6. The flowchart of KGB-DR algorithm
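The two-step loop described above can be sketched as follows. This shows only the control flow under our own naming: score_difficulty stands in for the difficulty degree detecting step and update_expertise for the expert finding step with knowledge-gap feedback; their real definitions are the models of the following subsections.

```python
# A control-flow sketch of the KGB-DR loop in Fig 6. The two callables are
# placeholders for the difficulty degree model and the reinforced expert
# finding step; only the alternation and the convergence test are shown.

def kgb_dr_loop(questions, expertise, score_difficulty, update_expertise,
                eps=1e-4, max_iter=50):
    """Alternate the two steps until the difficulty scores stop changing."""
    scores = {q: 0.0 for q in questions}
    for _ in range(max_iter):
        new_scores = score_difficulty(questions, expertise)
        expertise = update_expertise(expertise, new_scores)
        delta = max(abs(new_scores[q] - scores[q]) for q in questions)
        scores = new_scores
        if delta < eps:  # converged: scores no longer change significantly
            break
    return scores, expertise
```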
4.3 Expertise Model
We utilize the prior work of Jurczyk and Agichtein (2007a) as our expertise model and introduce it briefly in this paragraph. The link structure of the expertise model is shown in Fig 7: a particular question has a number of answers, each replied by a single user. An edge from a user to a question means that the user asked the question, and an edge from an answer to a user means that the answer was posted by this user. For example, in Fig 7(a), user 1 has posted question 1 and user 2 has posted question 2, but neither of them has ever
and are updated iteratively using the equation above. After each iteration, the values in the H and A vectors are
normalized, so that the highest hub and the highest authority values are 1.
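The iterative update with per-iteration normalization can be sketched as follows. For brevity this collapses the question and answer nodes of Fig 7 into direct asker-to-answerer edges, which is a simplification of the model; all names are our own.

```python
# A minimal HITS sketch over the asking-answering network: an edge (a, b)
# means user a asked a question that user b answered. Hub scores capture
# asking and authority scores capture answering; after each iteration both
# vectors are normalized so their highest values are 1, as described above.

def hits(edges, users, iters=20):
    hub = {u: 1.0 for u in users}
    auth = {u: 1.0 for u in users}
    for _ in range(iters):
        auth = {u: sum(hub[a] for a, b in edges if b == u) for u in users}
        hub = {u: sum(auth[b] for a, b in edges if a == u) for u in users}
        for vec in (hub, auth):            # scale maxima to 1
            m = max(vec.values())
            if m > 0:
                for v in vec:
                    vec[v] /= m
    return hub, auth
```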
4.4 Scores
The KGB-DR algorithm involves three major scores: two belong to users and are used to combine the expertise of users with the knowledge gap, while the other belongs to the question.
p_ask(u|h) (the asker-part score of users)
It indicates the probability that user u is an asker-expert given the difficulty-degree model h. This represents the user's ability to ask hard questions; the higher the score, the stronger the probability that the user asks hard questions. The score is initialized to the hub value referred to in Section 4.3 (Expertise Model).
p_ans(u|h) (the answerer-part score of users)
It indicates the probability that user u is an answerer-expert given the difficulty-degree model h. This represents the user's ability to answer hard questions; the higher the score, the stronger the probability that the user answers hard questions. The score is initialized to the authority value referred to in Section 4.3 (Expertise Model).
p(q|h) (the difficulty degree score of questions)
It indicates the probability of the question given the difficulty degree being hard. This score is the goal we want to evaluate in our task. It represents how hard the question is; the higher the score, the stronger the probability that the question is hard.
4.5 Difficulty Degree Model
Recall that our goal is to estimate the value of p(q|h). A question can be divided into two major parts, the part of asking and the part of answering, and each part is assumed to be generated independently. Thus, the probability of question q given the hard difficulty degree is obtained by taking a weighted combination of the two parts of the question:
$$ p(q|h) = \alpha\, p(c_q^a|h) + (1-\alpha)\, p(c_q^r|h) \qquad (2) $$
where p(c_q^a|h) is the probability of the asking part given the hard difficulty degree, and p(c_q^r|h) is the probability of the answering part given the hard difficulty degree. Additionally, we set a parameter α ∈ [0,1] to adjust the weight of the two parts. We discuss the two parts respectively below.
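Equation (2) can be expressed directly as code. This is a one-line mixture; the two part scores passed in here are illustrative values, not outputs of the real asking and answering models:

```python
# Eq. (2) as code: the difficulty of a question mixes an asking-part score
# and an answering-part score with a weight alpha in [0, 1].

def difficulty_score(p_ask_part, p_ans_part, alpha=0.5):
    """p(q|h) = alpha * p(c_q^a|h) + (1 - alpha) * p(c_q^r|h)."""
    assert 0.0 <= alpha <= 1.0
    return alpha * p_ask_part + (1.0 - alpha) * p_ans_part
```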
4.5.1 The part of asking
In the part of asking, we estimate p(c_q^a|h) by means of three aspects of the architecture of a CQA service. The first is the relationship between the content the asker posted on the question and the question itself, which we formalize by the length of the content. The second is the relationship between the question and the other questions the asker had posted before. The third is the relationship between the asker and the difficulty degree model. It can be expressed as:
$$ p(c_q^a|h) = p(c_q^a|q) \cdot p(q|a) \cdot p_{ask}(a|h) \qquad (3) $$
where p(c_q^a|q) is the probability of the content that asker a posts on question q, p(q|a) is how much knowledge asker a contributes to question q, and p_ask(a|h) is the probability that asker a is an asker-expert given the difficulty degree model h. The value of p_ask(a|h) is initialized to the hub value referred to in Section 4.3, and we discuss how this value changes later in this section.
To compute an estimate of p(c_q^u|q), we use the length of the content and normalize it by the square root of the total length, as follows:

$$ p(c_q^u|q) = \frac{l_q^u}{\sqrt{\sum_{i \in q} l_q^i}} \qquad (4) $$

where c_q^u is the content user u has posted on question q, l_q^u is the length of c_q^u, and the denominator sums the lengths of all posts on question q. Based on the above estimation, the estimate of p(q|u) is defined by:

$$ p(q|u) = \frac{l_q^u}{\max_{q'} l_{q'}^u} \qquad (5) $$

where max_{q'} l_{q'}^u is the maximum length among the questions user u has asked before. The estimate of p(q|u) is similar to that of p(c_q^u|q), but we use the maximum instead of the summation. This is reasonable because hard questions tend to require more words to ask while easy questions require fewer, and, for the same asker, the question with the maximum length has a stronger probability of being hard than the other questions this asker has asked.
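Both length-based estimates can be sketched in a few lines; the function and variable names below are our own, and the inputs are raw post lengths:

```python
import math

# Eqs. (4) and (5) as code: both estimates use only post lengths. The square
# root in Eq. (4) damps the influence of very long threads.

def p_content_given_question(length_u, all_lengths):
    """Eq. (4): l_q^u divided by the square root of the total thread length."""
    return length_u / math.sqrt(sum(all_lengths))

def p_question_given_user(length_u, user_max_length):
    """Eq. (5): l_q^u divided by the longest question the user ever asked."""
    return length_u / user_max_length
```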
4.5.2 The part of answering
In the part of answering, we further divide the part into two: the best answerer and the other repliers, so the probability of the answering part given the hard difficulty degree is obtained as follows:

$$ p(c_q^r|h) = \beta\, p(c_q^b|h) + (1-\beta)\, p(c_q^{r'}|h) \qquad (6) $$

where p(c_q^b|h) is the probability of the content that the best answerer b replies on question q, p(c_q^{r'}|h) is the probability of the content that the other answerers reply on question q, and β ∈ [0,1] is a weighting parameter. The computation of the two factors is similar to equation (3), but p_ask(u|h) changes to p_ans(u|h), as follows:

$$ p(c_q^b|h) = p(c_q^b|q) \cdot p(q|b) \cdot p_{ans}(b|h) \qquad (7) $$

$$ p(c_q^{r'}|h) = \sum_{i \in r'} p(c_q^i|q) \cdot p(q|i) \cdot p_{ans}(i|h) \qquad (8) $$
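Equations (6)-(8) can be sketched together as follows. Each reply is represented as a triple of the three factors in equation (7); the weight β and all input values below are illustrative only:

```python
# Eqs. (6)-(8) as code: the answering part mixes the best answer with the
# remaining replies. A reply is a (content, contribution, expertise) triple.

def reply_score(p_content, p_contrib, p_expert):
    """Factor product of Eqs. (7)/(8) for a single reply."""
    return p_content * p_contrib * p_expert

def answering_part(best, others, beta=0.5):
    """Eq. (6): beta * best-answer score + (1 - beta) * sum over the rest."""
    rest = sum(reply_score(*o) for o in others)
    return beta * reply_score(*best) + (1.0 - beta) * rest
```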
where H is the total number of hard questions in category C; we call this equation the global difficulty probability model. Because the above scores may contain zero probabilities, we must smooth them when computing the expertise of users. Thus, we introduce the probability model h into our calculation and construct two models from equations (9) and (10). The local difficulty model of p_ask(u|h) and p_ans(u|h) is represented as:

$$ p_{ask}(u|h)^{(k+1)} = \delta\, p_{ask}(u|h) + (1-\delta)\, p_{ask}(u|h)^{(k)} \qquad (11) $$

$$ p_{ans}(u|h)^{(k+1)} = \delta\, p_{ans}(u|h) + (1-\delta)\, p_{ans}(u|h)^{(k)} \qquad (12) $$

where p_ask(u|h)^(k) and p_ans(u|h)^(k) are the scores of p_ask(u|h) and p_ans(u|h) at the kth iteration, p_ask(u|h)^(0) and p_ans(u|h)^(0) are the hub and authority values computed by the base HITS algorithm in the prior section, and δ ∈ [0,1]. The other way is the global difficulty model:

$$ p_{ask}(u|h)^{(k+1)} = \delta\, p_{ask}(h|u) + (1-\delta)\, p_{ask}(u|h)^{(k)} \qquad (13) $$

$$ p_{ans}(u|h)^{(k+1)} = \delta\, p_{ans}(h|u) + (1-\delta)\, p_{ans}(u|h)^{(k)} \qquad (14) $$
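One smoothing step of this reinforcement update, common to equations (11)-(14), can be sketched as follows. The weight δ and the inputs are illustrative; iteration 0 starts from the HITS hub or authority score as described above:

```python
# Eqs. (11)-(14) as code: the next expertise score mixes a freshly computed
# difficulty-based score with the previous iteration's value, per user.

def reinforce(new_scores, prev_scores, delta=0.7):
    """score^{k+1}(u) = delta * new(u) + (1 - delta) * score^k(u)."""
    return {u: delta * new_scores[u] + (1.0 - delta) * prev_scores[u]
            for u in prev_scores}
```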
1. EXPERIMENTS
In this section, we introduce and analyze our dataset crawled from YA in Section Dataset and the answer set in Section Answer set. Next, the baselines compared against our approach are listed in Section Baseline, and the evaluation metrics are described in Section Evaluation metrics. The results of parameter tuning of our approach are shown in Section Parameter tuning. Finally, the performance results of all methods are exhibited in Section Comparison to baseline.
1.1 Dataset
1.1.1 The basic statistics analysis
We crawled 40,000 resolved questions in English from the Yahoo! Answers service for our experiments. The questions come from five categories, namely Martial arts, Cycling, Health, Pets, and Software, with 8,000 questions from each category. The dataset statistics are shown in Table 10; we analyze and discuss these data before the experiments. From the attributes # of answers and avg answers per question, we can see that the number of answers per question in the two categories Martial arts and Pets is higher than in the other categories; furthermore, Martial arts has fewer users and fewer askers than the other categories. This shows that the same users repeatedly ask and answer with each other; in short, activity between users is highest in the category Martial arts. The category Cycling, however, is the opposite of Martial arts: it has the fewest answers and users among the categories, and the ratio of questions with only one answer is 25%. Furthermore, its number of answerers and the length of content that users have posted are also the lowest, indicating low activity between users in this category. The category Health resembles Cycling in that its ratio of questions with only one answer is higher than in the other categories; however, the numbers of users and answerers in Health are much higher than in the other categories. This shows that most users in Health posted content only once. The average number of answers per question in the category Software is higher only than that of Cycling, but its number of questions with one answer is the lowest among the categories. Additionally, the number of askers in Software is 7,232, while the total number of questions is 8,000. This means that many distinct users each ask only once in this category, which is one reason for the shape of the knowledge gap diagram of Software in Fig 3(e), i.e., most questions in this category are easy.
Specifically, among the attributes in Table 10, we find that some are probably related to the knowledge gap. First, consider the average number of answers per question: the categories with a strong knowledge gap, such as Martial arts, Health, and Pets, have more average answers per question than the categories with a weak knowledge gap, such as Cycling and Software. That is to say, more answers to a question indicate a higher probability of the question being hard. Second, consider the average length of content that users post: this makes sense because the harder a question is, the more words we use to describe it; in addition, the length of asking is more related to the knowledge gap than the length of answering.
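Statistics like those in Table 10 can be derived from the raw crawl with a single pass. The record fields used below ('category', 'asker', 'answerers') are our own assumed schema, not the actual crawler output:

```python
from collections import defaultdict

# A sketch of computing per-category dataset statistics: question count,
# average answers per question, and number of distinct askers.

def category_stats(questions):
    acc = defaultdict(lambda: {"questions": 0, "answers": 0, "askers": set()})
    for q in questions:
        s = acc[q["category"]]
        s["questions"] += 1
        s["answers"] += len(q["answerers"])
        s["askers"].add(q["asker"])
    return {c: {"questions": s["questions"],
                "avg_answers": s["answers"] / s["questions"],
                "askers": len(s["askers"])}
            for c, s in acc.items()}
```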
Table 10. Dataset statistics
1.1.2 The asking and answering of users in each category
The number of answers per user in each category is shown in Fig 8, and the average number of answers for the top x%~y% of users is shown in Fig 9, where y-x=5; from these we can see how many users regularly answer questions in each category. First, the tails of the three categories (a) Martial arts, (b) Health, and (c) Pets are shorter than those of the other two categories in Fig 8, and a shorter tail indicates that fewer users answer just once in the category. The length of the head in Fig 9, on the other hand, represents the number of users who answer in the category more than once. Interestingly, the two longest heads belong to Martial arts and Cycling, but the tail of Cycling in Fig 8 is also long. This shows that many users answer questions over a long period of time while many passing visitors answer a question just once. The categories Health and Pets are the opposite of Cycling: although their tails in Fig 8 are shorter, their heads are also shorter, which indicates fewer passing visitors who answer a question just once and also fewer users who answer questions frequently.
Fig 8. The number of answers per user in each category
Fig 9. The average number of answers of the top x% of users
We now investigate the asking behavior of users in each category. Fig 10 shows the number of questions asked per user in each category, and Fig 11 shows the average number of questions asked by the top x%~y% of users, where y-x=5; from these we can see how many users regularly ask questions in each category. In Fig 10, the tails of the categories are of similar length, with the category Software having the longest tail, and the tail for asking is shorter than that for answering in most categories. However, Software is the only category in which the tail of asking is longer than that of answering, and we can infer that this is why most questions in this category are easy: most users ask few questions, so they cannot ask harder questions, and most users answer few questions, so they cannot gain more experience in this category. Besides, we can regard this feature as characteristic of the type of the knowledge gap diagram of the category Software. The heads of the two categories Martial arts and Cycling are longer than those of the other categories in Fig 11, which shows that there are stable users who ask questions in these two categories.
Fig 10. The number of questions asked per user in each category
Martial arts and Pets, and another group contains the others. From the first group and Fig 13, the ratio for the two categories Martial arts and Pets is lower when the number of answers per question is between one and three, but the ratio grows as the number of answers increases, which indicates that these two categories are ones in which most questions are answered by many users. Second, on the contrary, the questions of the other three categories, Cycling, Health and Software, are frequently answered by only a few users, and questions answered by more than ten users are rare; intuitively, we can infer that in these three categories most questions are easy, so that they can be resolved by a few users.
Fig 12. The number of answers for each category
Fig 13. The ratio of answers in each question
1.1.4 The level of users for each category
The statistics of each level of users crawled from YA are shown in Table 11, and the percentage of each level, omitting the missing data, is reported in Table 12.
Table 11. The statistics of level of users crawled from YA
Table 12. The percentage of each level omitting the missing data
1.1.5 The level of askers for each category
Although the level of users cannot equal the expertise of users, the level is similar to the experience of users in YA, and under this assumption we still regard the level of the asker as an indicator of the difficulty degree of the question. The distribution of asker levels for each category is shown in Fig 14. The lv1 askers occupy the largest percentage in all categories, which shows that easy questions outnumber hard questions in every category. The category Pets has the lowest ratio of lv1 askers, 39.05%, and the other categories have mo