Thesis Draft 3.0 (Ying Liang Chen)


3.2.5 The L-KGS
3.2.6 The results of knowledge gap score (KGS)
3.3 The Expert Finding by Knowledge Gap
4. PROBLEM DEFINITION AND OUR APPROACH
4.1 Problem Definition
4.2 The Flowchart of Our Approach
4.3 Expertise Model
4.4 Scores
4.5 Difficulty Degree Model
4.5.1 The part of asking
4.5.2 The part of answering
4.6 Reinforcement Model
5. EXPERIMENTS
5.1 Dataset
5.1.1 The basic statistics analysis
5.1.2 The asking and answering of users in each category


5.1.3 The number of answers in each question
5.1.4 The level of users for each category
5.1.5 The level of askers for each category
5.2 Answer Set
5.2.1 The number of answers in a question for answer set
5.2.2 The length of question
5.2.3 The length of answer
5.3 Baseline
5.4 Evaluation Metrics
5.5 Parameter Tuning
5.5.1 The results of parameter
5.5.2 The results of parameter
5.5.3 The results of parameter
5.6 Comparison to Baseline
5.6.1 F-measure of easy question and hard question
5.6.2 ROC curve and AUC
5.6.3 The analysis of the expertise of users
5.6.4 The examples of the outputs compared with the basic approaches
6. CONCLUSION AND FURTHER WORK


7. REFERENCES


FIGURE LISTING

Fig 1. The homepage of Yahoo! Answers
Fig 2. The knowledge gap
Fig 3. The knowledge gap diagram
Fig 4. The non-expert and expert in the knowledge gap diagram
Fig 5. The classification for non-expert and expert in the knowledge gap diagram
Fig 6. The flowchart of the KGB-DR algorithm
Fig 7. The network in YA
Fig 8. The number of answering for each user in categories
Fig 9. The average number of top X% users in answering
Fig 10. The number of asking for each user in categories
Fig 11. The average number of top X% users in asking
Fig 12. The number of answers for each category
Fig 13. The ratio of answers in each question
Fig 14. The distribution of level of askers for each category
Fig 15. The number of answers in the answer set
Fig 16. The question length for each question in the answer sets
Fig 17. The length of answers for each question in the answer sets
Fig 18. The tuning of parameter


Fig 19. The tuning of parameter
Fig 20. The tuning of parameter
Fig 21. The ROC curve for each method and each category
Fig 22. The distribution of expertise of the users for each category
Fig 23. The mapping of user level and the expertise computed by our approach for each category


TABLE LISTING

Table 1. The mapping from the user level to points in YA
Table 2. The mapping of action to the gain of points
Table 3. Easy question vs. hard question
Table 4. The knowledge gap diagram in Health
Table 5. The normalization of Table 4
Table 6. The four zones in the knowledge gap diagram
Table 7. The evaluation of each factor
Table 8. The distribution of askers for each level in category Health
Table 9. The KGS for each category
Table 10. Dataset statistics
Table 11. The statistics of level of users crawled from YA
Table 12. The percentage of each level omitting the missing data
Table 13. The distribution of Fig 15 divided into two parts
Table 14. The answer set for each category
Table 15. The parameters for the two models
Table 16. The comparison of methods for each category
Table 17. The F-measure of hard questions for each method and each category


Table 18. The F-measure of easy questions for each method and each category
Table 19. The AUC for each method and each category


    ABSTRACT

The CQA service is a typical Web 2.0 forum for sharing knowledge among people, and millions of questions are posted and resolved on it every day. Because of this volume and the variety of users in CQA services, question search and question ranking are among the most important research topics for these portals. In this paper, we address the problem of detecting whether a question is easy or hard by means of a probability model. In addition, we propose an approach called the knowledge-gap-based difficulty rank (KGB-DR) algorithm, which combines the user-user network with the architecture of the CQA service to solve this problem. Expert finding is an important subtask of the problem: if we want to detect whether a question is easy or hard, the participation of experts is a major factor. Much research exists on expert finding; unfortunately, it is not sufficient for our problem because of the habits of users. That is to say, experts answer not only hard questions but also easy questions, and prior work entirely omits this. We observe a phenomenon, called the knowledge gap, that is related to this habit of users, and we add it to our KGB-DR algorithm to complete the expert finding. The KGB-DR algorithm includes two steps: an expert finding step, used to compute the expertise of users, and a difficulty degree detecting step, used to compute the difficulty of questions and rank them by difficulty. Specifically, we design two models for the KGB-DR algorithm: one, called the local difficulty model, is based on users; the other, called the global difficulty model, is based on all questions. The experimental results show that the local difficulty model is essential to our approach and that our approach achieves the best performance over all baseline approaches.



    1. INTRODUCTION

    1.1 Background

Recently, Web 2.0 forum systems have become more and more popular and interesting: people can share or seek any information from any place in the world. One of the most popular and useful kinds of forum system is the Community-based Question-Answering (CQA) portal. Typical communities such as Yahoo! Answers [1] in English, Naver [2] in Korean, and Baidu Knows [3] in Chinese can be regarded as variations of online forums. In this paper, we choose Yahoo! Answers, which contains approximately one hundred million resolved questions in English, for our research.

Yahoo! Answers (YA), whose homepage is shown in Fig 1, is a CQA service where people can ask or answer questions on any topic, and an enormous number of questions and answers are posted on the English-language site. In YA, users can ask or answer any question in any category at will, and a weighted points system exists to encourage users to answer questions and to limit spam questions. There are also levels (with point thresholds) that grant more site access. When users answer a question, they gain some points, and they gain more points if their answer becomes the Best Answer, selected by the question's asker or voted on by the other users. The user level system in YA is reported in Table 1: there are seven levels of users in YA, and each user account holds exactly one level at a time.

[1] http://answers.yahoo.com
[2] http://www.naver.com
[3] http://zhidao.baidu.com


A user's level is promoted by earning points. For example, when a user first creates an account in YA, the user receives 100 points and starts at level 1; once the points this user obtains exceed 249 through various actions in YA, the user is promoted to level 2. In general, the more points a user earns, the higher the level the user obtains; however, level 7 is the maximum. How many points a user earns depends on the type of action the user performs in YA, and the mapping of actions to points is listed in Table 2. For example, answering a question earns two points, and users obtain ten extra points when their answer is selected as the Best Answer. In addition, some actions cost points: asking a question costs five points, and deleting one's own answer to a question costs two points.
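As an illustration of this bookkeeping, here is a minimal sketch in Python. Only the 100-point starting balance, the 249-point threshold for level 2, and the four action values above come from the text; the higher level thresholds are hypothetical placeholders, since Table 1 is not reproduced here.

```python
# Sketch of YA's points/level bookkeeping as described above.
# Only the 100-point start and the >249 threshold for level 2 come from
# the text; the remaining thresholds are placeholders for Table 1.
ACTION_POINTS = {
    "answer": +2,         # answering a question
    "best_answer": +10,   # extra points when selected as Best Answer
    "ask": -5,            # asking a question
    "delete_answer": -2,  # deleting one's own answer
}

# (min_points, level); thresholds beyond level 2 are assumptions.
LEVEL_THRESHOLDS = [(0, 1), (250, 2), (1000, 3), (2500, 4),
                    (5000, 5), (10000, 6), (25000, 7)]

def level_for(points: int) -> int:
    """Return the user level implied by a point total."""
    level = 1
    for threshold, lvl in LEVEL_THRESHOLDS:
        if points >= threshold:
            level = lvl
    return level

points = 100  # a new account starts with 100 points at level 1
for action in ["answer", "answer", "best_answer", "ask"]:
    points += ACTION_POINTS[action]
print(points, level_for(points))  # 109 points, still level 1
```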


    Fig 1. The homepage of Yahoo! Answers


    Table 1. The mapping from the user level to points in YA

    Table 2. The mapping of action to the gain of points


    1.2 Motivation

As questions and answers increase progressively, users cannot effectively find the questions they want to know about or discuss; for this reason, most work on CQA services aims to improve functionality such as question ranking, question search, or question recommendation. But the output of this prior work does not consider the expertise (or authority) of users. For question ranking, a user may be an amateur in the area while the outputs of the search engine are so hard that the user cannot understand them, or a user who is an expert may want to search for harder questions while easy ones sit at the top of the ranking. Thus, in this paper, we are


concerned with how to rank questions by difficulty. The work we want to do is similar to finding high-quality questions (Agichtein et al., 2008; Jurczyk & Agichtein, 2007), which identifies a question as high-quality or low-quality; our work, however, labels a question as easy or hard. Moreover, the work of expert finding is strongly associated with our task. An expert is a user who is familiar with a particular topic or category and can solve most of the questions about that category. The non-expert is the opposite, including both the amateur, who is a stranger to the category, and the novice, who has been attached to the topic for a little while but is not yet very familiar with it. For example, we list three easy questions and three hard questions about karate in Table 3. From the first easy question, "What ideas do you have in my first karate class?", the asker can be identified as an amateur at karate, and the question can also be solved by an amateur who has been to karate class just a couple of times. On the contrary, for a hard question such as "What is the difference between Traditional Okinawan Karate and the modern sport Karate?", a user who wants to solve or answer it must have certain knowledge of or experience in karate, and an amateur who took up karate only a little while ago will not be able to answer it.

    Table 3. Easy question vs. Hard question


As a result of the above description, we list the following principles, observed without considering textual significance, that an easy or hard question must satisfy:

1. If the asker of a question is closer to an expert, the question has a higher probability of being hard.

2. If most of the repliers to a question are amateurs, then the question must be easy; conversely, the more experts reply to a question, the harder it is.

3. If both experts and amateurs have answered a question and the asker chooses an amateur's reply as the best answer, the question is so easy that the amateur supplied content the asker believed could tackle his/her question.

In short, the difference between an easy question and a hard question is the ratio of non-experts to experts participating in the question, and we can separate the task into two parts: expert finding and question ranking.
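To make these principles concrete, the following toy sketch, which is not the KGB-DR model, combines hypothetical expertise scores in [0, 1] for the asker, the repliers, and the best-answerer into a rough difficulty signal; the equal weights are arbitrary placeholders.

```python
# Toy difficulty signal from the three principles above (not KGB-DR).
# All expertise scores are hypothetical values in [0, 1].
def toy_difficulty(asker, repliers, best_answerer):
    asker_signal = asker                            # principle 1: expert asker
    replier_signal = sum(repliers) / len(repliers)  # principle 2: expert repliers
    best_signal = best_answerer                     # principle 3: expert best answer
    return (asker_signal + replier_signal + best_signal) / 3

# A near-expert asker answered mostly by experts -> likely hard:
print(toy_difficulty(0.8, [0.9, 0.7, 0.8], 0.9))  # ~0.83
# An amateur asker with an amateur best answer -> likely easy:
print(toy_difficulty(0.1, [0.2, 0.3], 0.2))       # ~0.18
```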

Determining whether a question is easy or hard is not trivial; there is noise in the real world. First, whether a question is hard or easy is judged by people: one user may think a question is hard because he/she has never seen it before, while another thinks it is easy because of his/her experience, i.e., the judgment is subjective


for everyone. Second, some askers just want to share or discuss something rather than ask something, and it is hard to call such a question easy or hard; the same happens in questions containing flame wars or abusive content. Third, we might expect more hard terms in hard questions than in easy ones, but people can also use easy terms to ask or answer questions, so we cannot rely on the significance of terms; on the contrary, we utilize the relationships between users and the framework of YA to tackle this problem. Finally, note that we consider the difficulty degree of a question to be determined by how many experts participate in it; in the real world, however, experts answer not only hard questions but also easy questions, and this is what makes determining whether a question is easy or hard difficult. Among prior works, some rank blogs using the PageRank (Brin & Page, 1998) and HITS (Kleinberg, 1999) algorithms, such as the EigenRumor algorithm (Fujimura et al., 2005).


The EigenRumor algorithm uses the interactions among posters, repliers, and blogs to rank blogs; users then receive feedback from the blog scores to adjust their authorities, and the iteration terminates when the blog scores stop changing significantly. The nodes and architecture in the EigenRumor algorithm are similar to ours: a question with an asker and answerers corresponds to a blog with a poster and repliers. But the EigenRumor algorithm cannot be applied to find hard questions in a CQA service, because it omits the relationship between hard questions and users. That is to say, if we used the EigenRumor algorithm to compute question scores, the feedback would flow back to the users who asked or answered each question; since experts also answer easy questions in the real world, the feedback from an expert may increase the expertise of non-experts who answered the same question, which makes it unreasonable for detecting the expertise of users and the difficulty degree of questions.


1.3 Method Abstract

Although the above difficulties attend this task, we observe a phenomenon, which we call the knowledge gap, in CQA services. The knowledge gap is a phenomenon associated with users, illustrated in Fig 2. The stairs in Fig 2 represent difficulty in a specific topic, and a user standing on a higher stair has more knowledge of the topic. For example, the expert in Fig 2 stands at the top of the stairs, indicating that the expert is familiar with everything about the topic, whereas the amateur and the novice stand at the bottom owing to their unfamiliarity with it; the knowledge gap is the gap between any two users. For example, gap A in Fig 2 represents the knowledge gap between the expert and the novice, and gap B the knowledge gap between the expert and the amateur. The gap may bring about two situations. One is that the user on the lower stair lacks the ability to answer a question asked by a user on a higher stair. The other is that the user on the higher stair may have no interest in answering a question asked by a lower-stair user even though he/she has enough knowledge. For instance, consider one of the hard questions in Table 3, "What are advantages of karate over other martial arts?": to answer it, a user must know a great deal about karate and other martial arts in order to compare them, a process that is hard for a novice or an amateur with poor knowledge. On the other hand, one of the easy questions in Table 3, "What ideas do you have in my first karate class?", is so easy for experts with long experience in karate that they may have little interest in answering it, since they can see the same questions every day. In addition, the boundary in Fig 2 represents the line that separates hard questions from easy questions. In short, there are two main principles in the knowledge gap. First,


non-experts lack the ability to answer questions beyond their knowledge. Second, experts have little interest in questions below a certain difficulty degree. In this paper, we investigate how, and how much, the knowledge gap affects CQA services, and we propose the knowledge-gap-based difficulty rank (KGB-DR) algorithm to tackle the task of determining whether a question is easy or hard. Specifically, KGB-DR makes use of the structure of the CQA service and the relationships between users, and it iterates two steps, namely an expert finding step and a difficulty degree detecting step.

(1) Expert finding step:

This step computes the probability of a user being an expert in asking or answering: we first compute it with a graph-based algorithm and then adjust it with feedback from the difficulty degree detecting step, so as to fit the knowledge gap between users.

(2) Difficulty degree detecting step:

The goal of this step is to compute the probability of a question being hard and to rank the questions by their difficulty degree score.
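The two models behind these steps are defined in Section 4; as an orientation only, here is a structural sketch of the iteration, in which the update rules are simplified placeholders rather than the paper's actual expertise and difficulty models.

```python
# Structural sketch of the KGB-DR iteration described above. The update
# rules are placeholders, not the paper's models (defined in Section 4).
# Each question is a dict with "id", "asker", and "answerers".
def kgb_dr(questions, users, iterations=20):
    expertise = {u: 1.0 / len(users) for u in users}  # expert finding state
    difficulty = {q["id"]: 0.5 for q in questions}    # difficulty state

    for _ in range(iterations):
        # Difficulty degree detecting step: a question looks harder when
        # its participants have higher expertise.
        for q in questions:
            members = [q["asker"]] + q["answerers"]
            difficulty[q["id"]] = sum(expertise[u] for u in members) / len(members)
        # Expert finding step: difficulty scores feed back into expertise.
        for u in users:
            joined = [q["id"] for q in questions
                      if u == q["asker"] or u in q["answerers"]]
            if joined:
                expertise[u] = sum(difficulty[qid] for qid in joined) / len(joined)
    return expertise, difficulty

questions = [
    {"id": "q1", "asker": "u1", "answerers": ["u2", "u3"]},
    {"id": "q2", "asker": "u2", "answerers": ["u3"]},
]
expertise, difficulty = kgb_dr(questions, ["u1", "u2", "u3"])
```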


    Fig 2. The knowledge gap

1.4 Application

Given a set of questions, ranking them by difficulty degree has several potential applications. First, a CQA service or discussion board can show questions of the right difficulty to each type of user; for example, if an amateur enters a particular category, route him/her the easy questions. Second, it can improve the performance of a search engine by adding a difficulty-degree factor, so that the outputs differ for each type of user. Third, ranking questions by difficulty finds suitable questions for each type of user, and users can select the difficulty they want; this can also apply to commercial advertising. For example, in cycling, most novices just need a cheap, usable bicycle, while an expert may want comparatively more specialized equipment such as lighter, tougher, or even high-end bicycle parts.

1.5 Paper Organization

The rest of this paper is structured as follows. Section 2 briefly discusses related work. Section 3 describes the knowledge gap in more detail. Section 4 defines our task with a probability model and presents our approach, the KGB-DR algorithm, to solve it. Experimental results are reported in Section 5, and Section 6 concludes the paper and outlines future work.


    2. RELATED WORK

The task of identifying whether a question is easy or hard is similar to prior work such as question ranking, question search, and question recommendation. The common goal of those tasks is to help users find the questions they want to know about efficiently; the difference is that our goal is to rank the questions in a particular category by difficulty, or to recommend hard questions to experts and easy questions to non-experts. Besides, the most important subtask in this work is finding the experts in a particular category, and we utilize user-user interactions to model it. The related work mentioned above is detailed in this section.

    2.1 Link Analysis

PageRank (Brin & Page, 1998) is a link analysis algorithm that calculates an importance score for each element of a group of hyperlinked documents. Using the linking structure of web pages, PageRank interprets links as indicators of page importance: a link from page A to page B is regarded as a vote from page A for page B. The algorithm intends to estimate the probability distribution of a user randomly clicking on links, and a page linked from a page with a high PageRank score is itself given a high score.


Kleinberg (1999) proposed the Hyperlink-Induced Topic Search (HITS) algorithm, another prominent algorithm for ranking web pages. Its critical concepts are hubs and authorities: for each web page, two scores are calculated based on these concepts, where the hub score presents the quality of links to other pages about that topic, while the authority score indicates the quality of the page content. Besides web pages, Yupeng et al. [1] investigated the co-occurrences of people from web pages and communication patterns from emails to discover the relationships among people.
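For reference, here is a minimal sketch of HITS by power iteration on the kind of asker-to-answerer graph used later in this thesis; the edge list is invented for illustration, and an edge (a, b) means user b answered a question user a asked.

```python
# Minimal HITS power iteration on a directed asker -> answerer graph
# (an edge (a, b) means user b answered a question user a asked).
import math

edges = [("u1", "u2"), ("u1", "u3"), ("u4", "u2"), ("u5", "u2")]
nodes = {n for e in edges for n in e}
hub = {n: 1.0 for n in nodes}
auth = {n: 1.0 for n in nodes}

for _ in range(50):
    # Authority: sum of hub scores of nodes pointing at you.
    auth = {n: sum(hub[a] for a, b in edges if b == n) for n in nodes}
    # Hub: sum of authority scores of nodes you point at.
    hub = {n: sum(auth[b] for a, b in edges if a == n) for n in nodes}
    # Normalize to keep the scores bounded.
    an = math.sqrt(sum(v * v for v in auth.values()))
    hn = math.sqrt(sum(v * v for v in hub.values()))
    auth = {n: v / an for n, v in auth.items()}
    hub = {n: v / hn for n, v in hub.items()}

# u2 answers several users' questions, so it gets the top authority score.
print(max(auth, key=auth.get))  # -> "u2"
```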

2.2 Expert Finding in Social Media

Several studies focus on how to find experts by modeling and computing a user-user graph (McCallum et al., 2005; Jurczyk & Agichtein, 2007; Zhang et al., 2007). McCallum et al. (2005) utilize the user-user graph to


find experts for particular topics. Zhang et al. (2007) use the network-based ranking algorithms HITS (Kleinberg, 1999) and PageRank (Brin & Page, 1998) to identify users with high expertise; their results show a high correlation between link-based metrics and answer quality. Next, Liu et al. (2005) use features such as an author's activity and the number of clicks to find the best answers for a given question. Jurczyk and Agichtein (2007) use the HITS algorithm to find the authority of users in a question-answering network in which an asker is linked to an answerer if the answerer replies to a question the asker has posted; their experiments show that the obtained authority score is better than simply counting the number of answers an answerer has given. Although these results demonstrate that HITS is a good approach to finding experts, some latent factors, such as users' habits of asking questions and posting answers, may reduce its performance. Unlike


Jurczyk and Agichtein (2007) and Zhang et al. (2007), Zhou et al. (2009) utilize the structural relations among users in the forum system and a content-based probability model to find the experts for a particular question.

In other social media, Balog et al. (2006) propose extended language models to address the expert finding problem in enterprise corpora. Since 2005, the Text REtrieval Conference (TREC) has provided a platform, the Enterprise Search Track, for researchers to empirically assess their methods for expert finding (Craswell et al., 2005). In addition, some research finds experts over e-mail corpora (Campbell et al., 2003; Dom et al., 2003).

    2.3 Question Ranking

Question ranking orders results so as to decrease browsing time. The typical approach uses information from the Q&A content, such as best-answer ratings (Adamic et al., 2008; Bian et al., 2008). Jeon et al. (2006) addressed the answer quality problem in a community QA portal and tried to estimate quality using a set of non-textual features such as answer length and the number of points received. Agichtein et al. (2008) expand on Jeon et al. (2006)


by exploring a larger range of features, including structural, textual, and community features. Su et al. (2007) analyzed how significantly the quality of answers varies. Bian et al. (2008) proposed to solve collaborative QA by considering both answer quality and relevance, using content-based quality signals without considering user expertise. Because of the answer quality problem observed in Bian et al. (2008), Jeon et al. (2006), and Su et al. (2007), Suryanto et al. (2009) propose a quality-aware framework that considers both answer relevance and answer quality, derived from answer features and the expertise of answerers.


Wang et al. (2009) consider questions and their answers as relational data, rather than modeling them as independent information, and propose a link-based algorithm for evaluating the quality of answers.

    2.4 Question Search

Research on question search was first conducted on FAQ data (Burke et al., 1997; Lai et al., 2002; Sneiders, 2002). FAQ Finder (Burke et al., 1997) heuristically combines statistical similarities and semantic similarities between questions and FAQs. Sneiders (2002) proposed template-


based FAQ retrieval systems, and Lai et al. (2002) proposed an approach to automatically mine FAQs from the web.

Recently, research on question search has been further extended to CQA. For example, Jeon et al. (2005a, 2005b), together with a technical report on learning translation-based language models from Q&A archives, compared four different retrieval methods, i.e., cosine similarity, Okapi, a language model (LM), and a statistical machine translation (SMT) model, for automatically bridging the lexical chasm between questions in question search. They found that the SMT-based method performed the best.

    2.5 Question Recommendation

Question recommendation is like question search: the goal of both is to find the questions that a user prefers to know about or reply to. The question recommendation problem was first addressed by Cao et al. (2008), who consider that a good


recommendation should provide alternative aspects around the user's interest, and who use an MDL-based tree cut model to tackle this problem. Sun et al. (2009) also address question recommendation: they use user ratings and propose a majority-based perceptron algorithm that avoids the influence of noisy instances by emphasizing learning over data instances from majority users, and they show the effectiveness of their approach through intensive experiments.

Question recommendation and question ranking are similar to what we discuss in this paper, but they share the same problem as most of the related work on question search discussed above: they focus on content-based analysis and ignore the interaction between users. Recent work on user interaction, such as Nam et al. (2009) on Naver, Adamic et al. (2008) on YA, and Yang and Wei (2009) on Baidu, analyzed the activity of users in Q&A forums and found that different types of users, such as experts and non-experts, have different habits in asking and answering.


    3. THE KNOWLEDGE GAP

In this section, we investigate how, and how much, the knowledge gap affects YA. First, we present the phenomenon called the knowledge gap in YA and other CQA services, and explain the relationship between the knowledge gap and hard questions in Section 3.1. Second, we show how to quantify this phenomenon for each category in YA. Finally, we discuss some problems with expert finding and what type of expert we want to extract in Section 3.3.

    3.1 The Knowledge Gap in YA

According to Table 1 and Table 2, we can assume for the moment that the higher the level, the more expertise a user has, so a question asked by a higher-level user has a high probability of being hard; for convenience, we merge level 6 and level 7 into level 6. Fig 3 shows the proportion of each level of askers replied to by each level of answerers in five categories. A zone (i, j) is darker than a zone (k, j) if the proportion of level-j askers replied to by level-i answerers is higher than the proportion replied to by level-k answerers; for example, in Fig 3(a), zone (1,1) is blacker than zones (2~6,1), indicating that the questions level-1 users asked are replied to by level-1 users more than by users of other levels. In Fig 3(a), we can also see that for questions asked by


higher-level users, the ratio replied to by higher-level users is higher, while questions asked by low-level users show the opposite. Two explanations support this situation. First, high-level users tend to reply to or participate in challenging questions through which they can contribute their knowledge to other people. Second, low-level users can only reply to the easy questions that other low-level users asked, because of their poor knowledge; hence we can see that the questions replied to by level-1 and level-2 answerers concentrate on questions asked by level-1 to level-4 users. The observation of Fig 3 tells us that a hard question q is related to the expertise of the users who participate in q; we call this phenomenon a knowledge gap, and the corresponding graph a knowledge gap diagram. Besides, the degree of the knowledge gap differs across categories. It is strong in categories such as Martial arts, Pets, and Health, but weak in Cycling and Software. In fact, in Fig 3(b) (Cycling) we can see that questions asked by low-level users are answered by high-level users more than by low-level users, indicating that the experts in Cycling are enthusiastic about solving the questions that amateurs asked. On the contrary, in Fig 3(e) (Software) we can see that low-level users also have the ability to answer questions asked by high-level users, showing that most questions in Software are easy enough for low-level users to answer. In addition, Fig 3(f) is a perfect knowledge gap diagram, in which low-level users answer only the questions that low-level users asked and high-level users answer only the questions that high-level users asked. Therefore, using the knowledge gap is crucial for detecting whether a question is easy or hard, and we show how to utilize the knowledge gap in the next section.
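As a small illustration, the following sketch tabulates such a diagram from (asker level, answerer level) pairs and normalizes each asker level, as is done to produce Table 5 from Table 4; the sample pairs are invented.

```python
# Build a knowledge gap diagram: counts of (answerer level, asker level)
# pairs, with each asker-level column normalized to sum to 1 (as done to
# produce Table 5 from Table 4). Sample pairs are invented.
from collections import Counter

LEVELS = range(1, 7)  # levels 1..6 (levels 6 and 7 merged into 6)

def knowledge_gap_diagram(pairs):
    """pairs: iterable of (asker_level, answerer_level) tuples."""
    counts = Counter(pairs)
    diagram = {}
    for asker in LEVELS:
        column_total = sum(counts[(asker, ans)] for ans in LEVELS)
        for answerer in LEVELS:
            ratio = counts[(asker, answerer)] / column_total if column_total else 0.0
            diagram[(answerer, asker)] = ratio
    return diagram

sample = [(1, 1), (1, 1), (1, 2), (4, 5), (4, 6), (5, 5), (2, 1)]
diagram = knowledge_gap_diagram(sample)
print(diagram[(1, 1)])  # 2 of the 3 answers to level-1 askers come from level 1
```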


    Fig 3. The knowledge gap diagram

    3.2 The Quantification of Knowledge Gap Diagram

    3.2.1 The preliminary for quantifying the knowledge gap diagram

Table 4 is the prototype of the knowledge gap diagram in the category Health, and in this paragraph we show how to quantify the knowledge gap diagram from this table. First, we normalize each asker level in Table 4; the result is shown in Table 5. Note that there are two principles in the knowledge gap: one is that non-experts have no ability to answer questions beyond their knowledge, and the other is that experts have little interest in questions below a certain difficulty degree. We then formulate the knowledge gap diagram according to these two principles as follows:

KGS(C) = Expert(C) + Nonexpert(C)

where C is a category in YA, and Expert(C) and Nonexpert(C) are the knowledge gap scores for the experts and non-experts in category C.

Table 4. The knowledge gap diagram in Health


Table 5. The normalization of Table 4

    3.2.2 The four zones in the knowledge gap diagram

Now we assume that a user whose level is from 1 to 3 is a non-expert, and a user whose level is from 4 to 6 is an expert. Fig 4 illustrates non-experts and experts in the knowledge gap diagram: Fig 4(a) shows the areas in which non-experts and experts normally move, and Fig 4(b) shows the forbidden areas they rarely enter, i.e., non-experts rarely answer the questions that experts ask, and experts rarely answer the questions that non-experts ask. According to Fig 4, we can divide the knowledge gap diagram into four zones, illustrated in Fig 5. First, the zone of regular non-experts (RN) gathers users who are non-experts and focus on answering easy questions because of their limited knowledge. Second, the zone of promising non-experts (PN) gathers users who are non-experts at present but have the ability to answer hard questions. Third, the zone of regular experts (RE) gathers users who are experts and have the ability to answer the hard questions that other experts ask. Finally, the zone of enthusiastic experts (EE) gathers users who are experts but focus on answering easy questions out of enthusiasm. Table 6 gives an example of the four zones mapped from Table 5.

    Fig 4. The non-expert and expert in the knowledge gap diagram


    Fig 5. The classification for non-expert and expert in the knowledge gap diagram

    Table 6. The four zones in the knowledge gap diagram

    3.2.3 The quantification of non-expert and expert

Recall that the knowledge gap score is based on the answering behavior of experts and non-experts; we quantify these two factors from four viewpoints. First, the answering behavior of non-experts can be viewed as a responsibility: non-experts cannot answer hard questions because of their inexperience, so answering easy questions is the responsibility of non-experts. We evaluate this responsibility by F(RN) - F(PN), where F(RN) is the flow of non-experts in zone RN and F(PN) is the flow of non-experts in zone PN. The larger the value of F(RN) - F(PN), the more the non-experts focus on answering easy questions rather than hard ones. We quantify F(RN) and F(PN) by summing all the ratios in zones RN and PN respectively; for example, F(RN) = 1.51 and F(PN) = 1.23 in Table 6. Second, experts have more knowledge and experience than non-experts, so answering hard questions can be seen as their responsibility; in the same way, we quantify the responsibility of experts by F(RE) - F(EE). The remaining two viewpoints concern the asking of non-experts and experts. We assumed earlier that a question asked by an expert is a hard question; now we want to know whether that difficulty is credible. In general, a hard question should be answered by experts, and if non-experts can also answer it, the difficulty of the question is in doubt. We use F(RE) - F(PN) to evaluate the credibility of the hard questions that experts ask, where F(RE) is the flow of experts in zone RE and F(PN) is the flow of non-experts in zone PN; the larger the value of F(RE) - F(PN), the more credible the hard questions that experts ask. In the same way, we quantify the credibility of the easy questions that non-experts ask by F(RN) - F(EE), where F(RN) is the flow of non-experts in zone RN and F(EE) is the flow of experts in zone EE; the larger the value of F(RN) - F(EE), the more credible the easy questions that non-experts ask. We summarize these factors in Table 7.


    Table 7. The evaluation of each factor
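To make the zone bookkeeping concrete, the following sketch (our own illustration, not the thesis code; the toy matrix values, the zone boundaries of levels 1-3 for non-experts and 4-6 for experts, and all names are assumptions) computes the four flows and the four difference factors of Table 7 from a normalized asker-level x answerer-level table such as Table 5:

    import numpy as np

    # Toy stand-in for Table 5: ratio[i, j] is the proportion of questions from
    # level-(j+1) askers answered by level-(i+1) answerers, with each
    # asker-level column normalized to sum to 1.
    ratio = np.random.default_rng(0).random((6, 6))
    ratio = ratio / ratio.sum(axis=0)

    NON = slice(0, 3)   # assumed non-expert levels 1-3 (rows/cols 0-2)
    EXP = slice(3, 6)   # assumed expert levels 4-6 (rows/cols 3-5)

    def flow(answerer_zone, asker_zone):
        # F(zone): the sum of all ratios inside the zone.
        return float(ratio[answerer_zone, asker_zone].sum())

    F = {
        "RN": flow(NON, NON),  # easy questions answered by non-experts
        "PN": flow(NON, EXP),  # hard questions answered by non-experts
        "RE": flow(EXP, EXP),  # hard questions answered by experts
        "EE": flow(EXP, NON),  # easy questions answered by experts
    }

    factors = {
        "non-expert responsibility": F["RN"] - F["PN"],
        "expert responsibility": F["RE"] - F["EE"],
        "credibility of hard questions": F["RE"] - F["PN"],
        "credibility of easy questions": F["RN"] - F["EE"],
    }
    print(F)
    print(factors)

Note that because each asker-level column sums to 1, the flows satisfy F(RN) + F(EE) = 3 and F(PN) + F(RE) = 3 under this zone split, which matches the example values F(RN) = 1.51 and F(PN) = 1.23 quoted above.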

    3.2.4 The weight for non-expert and expert

In general, because the number of askers varies across levels, weights are necessary for the expert and non-expert terms, and we set the weights by the ratio of asker levels. The distribution of askers over levels in the category Health is shown in Table 8: Table 8(a) is the distribution from level 1 to level 7, and Table 8(b) is the ratio of non-experts (levels 1 to 3) to experts (levels 4 to 7); we use the ratios in Table 8(b) to weight the expert and non-expert terms. The expert and non-expert terms are computed as:

Expert(C) = W_ex * (2*F_C(RE) - F_C(EE) - F_C(PN))

Nonexpert(C) = W_non * (2*F_C(RN) - F_C(EE) - F_C(PN))

where W_ex is the ratio of the questions that experts have asked, W_non is the ratio of the questions that non-experts have asked, and F_C(RE), F_C(EE), F_C(PN), and F_C(RN) are the flows of the four zones of category C defined in Table 6. Finally, the value of KGS(Health) is 0.296*0.9023 + 0.796*0.0977, which is approximately 0.34.

    Table 8. The distribution of asker for each level in category Health
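Continuing the sketch above (same assumed flow dictionary; the pairing of the non-expert term 0.296 with W_non = 0.9023 and the expert term 0.796 with W_ex = 0.0977 is our reading of the Health example), the KGS combination might look like:

    def kgs(F, w_ex, w_non):
        # KGS(C) = Expert(C) + Nonexpert(C), each term weighted by the
        # ratio of questions asked by that side.
        expert = w_ex * (2 * F["RE"] - F["EE"] - F["PN"])
        nonexpert = w_non * (2 * F["RN"] - F["EE"] - F["PN"])
        return expert + nonexpert

    # The Health numbers quoted in the text:
    print(0.9023 * 0.296 + 0.0977 * 0.796)   # ~0.34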

    3.2.5 The L-KGS

Note that the knowledge gap diagram is based on the level of users; however, we do not know whether a user's level was gained in this category or in other categories, so we must add another level index to adjust it. From Table 1 and Table 2 we know that the more questions a user answers, the more points, and hence the higher level, the user obtains. For this reason, we use the average number of answers per asker to adjust the user level, and we set the weight to the square root of that number. The formulas are as follows:

L-KGS(C) = Expert(C) + Nonexpert(C)

Expert(C) = W_ex * L_ex * (2*F_C(RE) - F_C(EE) - F_C(PN))

Nonexpert(C) = W_non * L_non * (2*F_C(RN) - F_C(EE) - F_C(PN))

where L_ex is the square root of the average number of replies per expert asker and L_non is the square root of the average number of replies per non-expert asker.
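A minimal sketch of this adjusted score, under the same assumptions as the previous snippets (the flow dictionary and all parameter names are illustrative):

    import math

    def l_kgs(F, w_ex, w_non, avg_replies_ex, avg_replies_non):
        # L-KGS(C): KGS(C) with each term damped by the square root of the
        # average number of answers per asker on that side.
        l_ex = math.sqrt(avg_replies_ex)
        l_non = math.sqrt(avg_replies_non)
        expert = w_ex * l_ex * (2 * F["RE"] - F["EE"] - F["PN"])
        nonexpert = w_non * l_non * (2 * F["RN"] - F["EE"] - F["PN"])
        return expert + nonexpert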

    3.2.6 The results of knowledge gap score (KGS)

The results for all categories are shown in Table 9. In descending order of KGS, the ranking is Pets > Health > Software > Martial arts > Cycling. However, after considering the number of questions askers reply to, the L-KGS ranking is Pets > Health > Martial arts > Software > Cycling. Martial arts rises in the L-KGS ranking because its number of questions answered per asker is larger than that of the category Software.

Table 9. The KGS for each category


    3.3 The Expert Finding By Knowledge Gap

In general, we regard a higher-level user as an expert. But there are noisy problems in a QA portal, i.e., some users whose level is not high are nevertheless experts. The following cases illustrate the noise that arises when finding experts by user level in YA:

Case 1. User levels come from accumulated points, as in Table 2, and the points a user gains may come from different categories. Hence, even if a user's level is 7, it is not true that the user is an expert in every category in which he/she has asked or replied.

Case 2. If you reply to questions in YA frequently, your level increases quickly; however, your level increases slowly if you rarely reply. In the real world, some real experts do not have much time to reply in YA, so their levels are not very high.

Case 3. Some users are novices in a category whose knowledge is less than that of real experts, but they enjoy replying to the questions that other amateurs or novices ask. Although they do not have the ability to reply to hard questions, they reply to a lot of easy questions instead. The result is that their user level increases quickly even though they are not real experts.

Case 4. Many users have more than one account id in YA; a user may reply to question q1 using account id 1 and to another question q2 using account id 2, which makes his/her expertise ambiguous. Additionally, some users maliciously create many accounts, asking questions with one account and replying with the others to increase their level quickly.

The methods proposed in prior works (Jurczyk and Agichtein, 2007a; Jurczyk and Agichtein, 2007b; Zhang et al., 2007) can solve most of these problems, but they are not enough if we want to use the expertise of users to detect whether a question is easy or hard. In fact, the asking and answering habits of users are an important factor, reflected in the knowledge gap diagram of each category, but none of the previous works consider them. For example, non-experts may answer so many easy questions that their measured expertise increases abnormally, while experts may answer easy questions even though the quality of those questions is low. The expert we want to find in this paper therefore differs from the prior works: following the two principles of the knowledge gap discussed previously, we want to find the expert who frequently answers hard questions, that is, a user who is not only an expert but also focuses on answering hard questions more than other users do.


    4. PROBLEM DEFINITION AND OUR APPROACH

In this section, we formulate our problem as a probability model and propose an algorithm, the knowledge-gap-based difficulty rank (KGB-DR) algorithm, to predict whether the difficulty degree of a question is easy or hard. KGB-DR consists of two steps: an expert finding step and a difficulty degree detecting step. The expert finding step not only finds experts but also adjusts their expertise by the knowledge gap, while the difficulty degree detecting step computes the difficulty degree of the questions. Moreover, once the difficulty degree scores of the questions have been computed, the expert finding step obtains feedback from the difficulty degree model, and the difficulty degree detecting step then recomputes the scores of the questions. We detail how the two steps work below.

    4.1 Problem Definition

In this paper, we state the problem of the difficulty degree of a question q by means of a probability model. Probability models have been used widely in recent expert finding work (Balog et al., 2006; Fang and Zhai, 2007; Zhou et al., 2009), and we also use a probability model to formulate our task; furthermore, we add link analysis and the knowledge gap to solve this problem. Given a question q, the probability of the difficulty degree being hard is estimated as follows:

p(h|q) = p(q|h) * p(h) / p(q)    (1)

where p(h) is the probability of the difficulty degree being hard and p(q) is the probability of a question, which is the same for all questions. For convenience, our task only decides whether a question is easy or hard, so we set a threshold t: if the difficulty rank of a question is less than t, the question is hard. Hence p(h) is estimated by t/n, where n is the total number of questions in the category. Thus the ranking of questions is proportional to the probability of the question given the difficulty degree being hard, and our task is to capture how hard question q is through p(q|h).
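To make the rank equivalence explicit, the constants can be factored out; a one-line derivation, added here for clarity in LaTeX notation:

\[
p(h \mid q) \;=\; \frac{p(q \mid h)\,p(h)}{p(q)}
           \;=\; \frac{t}{n}\cdot\frac{p(q \mid h)}{p(q)}
           \;\propto\; p(q \mid h)
\]

since t/n is a constant and p(q) is identical for every question, ranking questions by p(h|q) is equivalent to ranking them by p(q|h).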

    4.2 The Flowchart of Our Approach

The flowchart of our approach is illustrated in Fig 6. Given a particular category or a set of categories, we first apply the expertise model to the asking-answering network to capture the expertise, in both asking and answering, of the users. The difficulty degree model then calculates the difficulty degree scores of the input questions within the YA framework, and question ranking orders the questions by their difficulty degree scores. The reinforcement model reinforces the expertise of users, based on the knowledge gap phenomenon, via the DDR index calculated in the prior step. In short, our system has only two steps, the expert finding step and the difficulty degree detecting step: expert finding locates the experts who frequently answer hard questions, while difficulty degree detecting determines the difficulty degree scores of the questions. The loop terminates when the difficulty degree scores of the questions no longer change significantly.

    Fig 6. The flowchart of KGB-DR algorithm

    4.3 Expertise Model

We utilize the prior work of Jurczyk and Agichtein (2007) as our expertise model, and we briefly introduce this model in this paragraph. The link structure of the expertise model is shown in Fig 7: a particular question has a number of answers, each posted by a single user. An edge from a user to a question means that the user asked the question, and an edge from an answer to a user means that the answer was posted by that user. For example, in Fig 7(a), user 1 has posted question 1 and user 2 has posted question 2, but both of them have never


    and are updated iteratively using the equation above. After each iteration, the values in the H and A vectors are

    normalized, so that the highest hub and the highest authority values are 1.
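As a concrete illustration of this base HITS computation, the following minimal sketch (our own illustration, not the authors' code; the collapsed asker-to-answerer link construction and all names are assumptions) iterates hub and authority updates and normalizes both vectors so their maxima are 1:

    import numpy as np

    def base_hits(edges, n_users, n_iter=50):
        # edges: (asker_id, answerer_id) pairs, one per posted answer.
        # Returns (hub, auth); hub later seeds the asker-part score and
        # auth seeds the answerer-part score.
        A = np.zeros((n_users, n_users))
        for asker, answerer in edges:
            A[asker, answerer] += 1.0          # one link per answer
        hub = np.ones(n_users)
        auth = np.ones(n_users)
        for _ in range(n_iter):
            auth = A.T @ hub                   # answerers gather hub mass
            hub = A @ auth                     # askers gather authority mass
            auth /= max(auth.max(), 1e-12)     # highest authority value is 1
            hub /= max(hub.max(), 1e-12)       # highest hub value is 1
        return hub, auth

    # Toy usage: user 0 asks twice, users 1 and 2 answer, user 3 asks once.
    hub, auth = base_hits([(0, 1), (0, 2), (3, 1)], n_users=4)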

    4.4 Scores

The KGB-DR algorithm involves three major scores. Two belong to users and are used to combine the expertise of users with the knowledge gap, while the third belongs to the question.

p_ask(u|h) (the asker-part score of a user)

This is the probability that user u is an asker-expert given the difficulty degree model h. It represents the user's ability to ask hard questions: the higher the score, the stronger the probability that the user asks hard questions. The score is initialized to the hub value introduced in the section Expertise Model.

p_ans(u|h) (the answerer-part score of a user)

This is the probability that user u is an answerer-expert given the difficulty degree model h. It represents the user's ability to answer hard questions: the higher the score, the stronger the probability that the user answers hard questions. The score is initialized to the authority value introduced in the section Expertise Model.

p(q|h) (the difficulty degree score of a question)

This is the probability of the question given that the difficulty degree is hard. This score is the goal of our task: it represents how hard the question is, and the higher the score, the stronger the probability that the question is hard.

    4.5 Difficulty Degree Model

Recall that our goal is to estimate p(q|h). A question can be divided into two major parts, the asking part and the answering part, and each part is assumed to be generated independently. Thus the probability of question q given the hard difficulty degree is obtained by combining the two parts of the question:

p(q|h) = a * p(c_q^a|h) + (1 - a) * p(c_q^r|h)    (2)

where p(c_q^a|h) is the probability of the asking part given the hard difficulty degree and p(c_q^r|h) is the probability of the answering part given the hard difficulty degree. Additionally, we set a parameter a in [0,1] to adjust the weight of the two parts, and we discuss the two parts respectively below.

    4.5.1 The part of asking

In the asking part, we estimate p(c_q^a|h) by means of three relationships in the architecture of a CQA service. The first is the relationship between the question and the content the asker posts on it, which we formalize by the length of the content. The second is the relationship between the question and the other questions the asker has posted before. The third is the relationship between the asker and the difficulty degree model. It can be expressed as:

p(c_q^a|h) = p(c_q^a|q) * p(q|a) * p_ask(a|h)    (3)

where p(c_q^a|q) is the probability of the content that asker a posts on question q, p(q|a) measures how much knowledge asker a contributes to question q, and p_ask(a|h) is the probability that asker a is an asker-expert given the difficulty degree model h. The value of p_ask(a|h) is initialized to the hub value introduced in the section Expertise Model, and we discuss how it is updated later in this section.

To estimate p(c_q^u|q), we use the length of the content and normalize it with a square root, as follows:

p(c_q^u|q) = sqrt(l_q^u) / Σ_{i in q} sqrt(l_q^i)    (4)

where c_q^u is the content that user u has posted on question q, l_q^u is the length of c_q^u, and the denominator sums over all the posts on question q. Following the above estimation, p(q|u) is defined by:

p(q|u) = l_q^u / max_{q' in Q_u} l_{q'}^u    (5)

where Q_u is the set of questions user u has asked before and the denominator is the maximum content length among them. The estimation of p(q|u) is similar to p(c_q^u|q), but we use the maximum instead of the summation. The intuition is that a hard question requires more words to ask while an easy question requires fewer; hence, for the same asker, the question on which the asker posted the maximum length has a stronger probability of being hard than the other questions that asker has asked.
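A minimal sketch of the asking-part computation of equations (3)-(5), under our reading of the reconstructed formulas (square-root length normalization in (4), maximum over the asker's previous questions in (5); all function and parameter names are illustrative):

    import math

    def p_content_given_q(length_u, all_lengths):
        # Eq (4): square-root-normalized share of u's content length on q.
        return math.sqrt(length_u) / sum(math.sqrt(l) for l in all_lengths)

    def p_q_given_user(length_u, past_lengths):
        # Eq (5): length of this question relative to the longest question
        # the same user has asked before.
        return length_u / max(past_lengths)

    def p_asking_part(length_a, all_lengths, past_lengths, p_ask_a):
        # Eq (3): p(c_q^a|h) = p(c_q^a|q) * p(q|a) * p_ask(a|h).
        return (p_content_given_q(length_a, all_lengths)
                * p_q_given_user(length_a, past_lengths)
                * p_ask_a)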

    4.5.2 The part of answering

In the answering part, we divide the answerers into two groups, the best answerer and the other repliers, so the probability of the answering part given the hard difficulty degree is obtained as follows:

p(c_q^r|h) = b * p(c_q^b|h) + (1 - b) * p(c_q^r'|h)    (6)

where p(c_q^b|h) is the probability of the content that the best answerer b posts on question q, p(c_q^r'|h) is the probability of the content that the other answerers post on question q, and b in [0,1] weights the two groups. The two factors are computed as in equation (3), but with p_ask(u|h) replaced by p_ans(u|h):

p(c_q^b|h) = p(c_q^b|q) * p(q|b) * p_ans(b|h)    (7)

p(c_q^r'|h) = Σ_{i in r'(q)} p(c_q^i|q) * p(q|i) * p_ans(i|h)    (8)

where r'(q) denotes the answerers on question q other than the best answerer.
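Putting equations (2) and (6)-(8) together, a sketch of the overall difficulty score (the parameter names alpha and beta stand in for the weights whose symbols were lost in extraction; each tuple in other_terms carries the three eq-(8) factors for one replier, and p_best is the eq-(7) product for the best answerer):

    def p_answering_part(p_best, other_terms, beta=0.5):
        # Eq (8): sum over the other repliers of p(c_q^i|q) * p(q|i) * p_ans(i|h),
        # then Eq (6): weighted mix of the best answerer and the rest.
        p_rest = sum(p_c * p_q * p_ans for (p_c, p_q, p_ans) in other_terms)
        return beta * p_best + (1 - beta) * p_rest

    def difficulty_score(p_ask_part, p_ans_part, alpha=0.5):
        # Eq (2): p(q|h) = alpha * asking part + (1 - alpha) * answering part.
        return alpha * p_ask_part + (1 - alpha) * p_ans_part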


where H is the total number of hard questions in category C, and we call this equation the global difficulty probability module. Because the above scores may contain zero probabilities, we must smooth them when computing the expertise of users. Thus we introduce the probability model h into the calculation and construct two models from equations (9) and (10). The local difficulty model for p_ask(u|h) and p_ans(u|h) is represented as:

p_ask(u|h)^(k+1) = g * p̂_ask(u|h) + (1 - g) * p_ask(u|h)^(k)    (11)

p_ans(u|h)^(k+1) = g * p̂_ans(u|h) + (1 - g) * p_ans(u|h)^(k)    (12)

where p_ask(u|h)^(k) and p_ans(u|h)^(k) are the scores at iteration k, p_ask(u|h)^(0) and p_ans(u|h)^(0) are the hub and authority values computed by the base HITS algorithm in the prior section, p̂_ask(u|h) and p̂_ans(u|h) are the estimates of equation (9), and g is in [0,1]. The other way is the global difficulty model:

p_ask(u|h)^(k+1) = g * p̂_ask(h|u) + (1 - g) * p_ask(u|h)^(k)    (13)

p_ans(u|h)^(k+1) = g * p̂_ans(h|u) + (1 - g) * p_ans(u|h)^(k)    (14)

where p̂_ask(h|u) and p̂_ans(h|u) are the estimates of equation (10).
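A sketch of the smoothing update of equations (11)-(14): each iteration blends the fresh estimate from the difficulty model with the previous score, starting from the HITS hub or authority value (gamma is our stand-in symbol for the interpolation weight, and the numbers are toy values):

    def reinforce(prev_score, estimate, gamma=0.5):
        # Eqs (11)-(14): blend the fresh difficulty-model estimate with the
        # previous score; at k = 0 the score is the HITS hub/authority value.
        return gamma * estimate + (1 - gamma) * prev_score

    p_ask = 0.8                  # initial hub value from base HITS (toy number)
    for _ in range(10):          # iterate until the scores stabilize
        estimate = 0.6           # in practice, from equation (9) or (10)
        p_ask = reinforce(p_ask, estimate)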


5. EXPERIMENTS

In this section, we introduce and analyze the dataset crawled from YA in Section Dataset and the answer set in Section Answer set. Next, the baselines compared against our approach are listed in Section Baseline, and the evaluation metrics are described in Section Evaluation metrics. The results of parameter tuning for our approach are shown in Section Parameter tuning. Finally, the performance results of all methods are presented in Section Comparison to baseline.

5.1 Dataset

5.1.1 The basic statistics analysis

We crawled 40,000 resolved questions from the English Yahoo! Answers service for our experiments. The questions come from five categories, Martial arts, Cycling, Health, Pets, and Software, with 8,000 questions from each category. The dataset statistics are shown in Table 10, and we analyze and discuss these data before the experiments. From the attributes "# of answers" and "avg answers per question", we can see that the number of answers per question in the two categories Martial arts and Pets is higher than in the other categories; furthermore, Martial arts has fewer users and fewer askers than the other categories, which shows that the same users repeatedly ask and answer with one another. In short, Martial arts has the highest activity between users. The category Cycling is the opposite of Martial arts: its numbers of answers and users are the lowest of all categories, and the ratio of questions with only one answer is 25%. Furthermore, the number of answerers and the length of the content users post are also the lowest, which indicates low activity between users in this category. The category Health is similar to Cycling in that its ratio of questions with only one answer is higher than in the other categories; however, the numbers of users and answerers in Health are much larger than elsewhere, which shows that users in Health mostly post content only once. The average answers per question in the category Software is higher only than Cycling, yet Software has the fewest questions with only one answer. In addition, the number of askers in Software is 7,232 against 8,000 total questions, which means a lot of unique users ask in this category and do so only once; this is one reason for the shape of the knowledge gap diagram of Software in Fig 3(e), i.e., most questions in this category are easy. Specifically, among the attributes in Table 10, we find some that are probably related to the knowledge gap. First, the average answers per question: the categories with a strong knowledge gap, such as Martial arts, Health, and Pets, have more average answers per question than the categories with a weak knowledge gap, such as Cycling and Software. That is to say, more answers to a question represent a higher probability of the question being hard. Second, the average length of the content users post: this makes sense because the harder a question is, the more words we use to describe it; moreover, the length of the asking content is more related to the knowledge gap than the length of the answering content.


    Table 10. Dataset statistics

5.1.2 The asking and answering of users in each category

The number of answers for each user in the categories is shown in Fig 8, and the average number of answers for the top x%~y% users, where y-x=5, is shown in Fig 9; from these we can see how many users regularly answer questions in each category. First, in Fig 8 the tails of the three categories (a) Martial arts, (b) Health, and (c) Pets are shorter than those of the other two categories, and a shorter tail indicates that fewer users answer just once in the category. In Fig 9, the length of the head represents the number of users who usually answer in the category more than once. Interestingly, the two longest heads belong to Martial arts and Cycling, yet the tail of Cycling in Fig 8 is also long: there are many users who answer questions over a long period of time and also many passing visitors who answer a question just once. The categories Health and Pets are the opposite of Cycling: although their tails in Fig 8 are shorter, their heads are also shorter, which indicates fewer passing visitors who answer a question just once and also fewer users who answer questions frequently.

    Fig 8. The number of answering for each user in categories


Fig 9. The average number of answers of the top x%~y% users

Now we investigate the asking of users in each category. Fig 10 shows the number of questions asked by each user, and Fig 11 shows the average number of questions asked by the top x%~y% users, where y-x=5; from these we can see how many users regularly ask questions in each category. In Fig 10, the tails of the categories are of similar length, with Software having the longest tail, and for most categories the asking tail is shorter than the answering tail. However, Software is the only category whose asking tail is longer than its answering tail, and we can infer that this is why most questions in this category are easy: most users ask few questions, so they cannot ask harder questions, and most users answer few questions, so they cannot accumulate experience in this category. We can regard this feature as characteristic of the knowledge gap diagram of the category Software. In Fig 11, the heads of the two categories Martial arts and Cycling are longer than those of the other categories, which shows that these two categories have stable users who ask questions.

    Fig 10. The number of asking for each user in categories

5.1.3 The number of answers in each question

The number of answers per question divides the categories into two groups: one group contains Martial arts and Pets, and another group contains the others. For the first group, Fig 13 shows that the ratio of questions with between one and three answers is lower for Martial arts and Pets, but the ratio becomes higher as the number of answers increases, which indicates that in these two categories most questions are answered by many users. Second, on the contrary, the questions of the other three categories, Cycling, Health, and Software, are usually answered by only a few users, and questions answered by more than ten users are rare; intuitively, we can infer that in these three categories most questions are easy, so that they can be resolved by a few users.

    Fig 12. The number of answers for each category

    Fig 13. The ratio of answers in each question


5.1.4 The level of users for each category

The statistics of the user levels crawled from YA are shown in Table 11, and the percentage of each level, omitting the missing data, is reported in Table 12.

    Table 11. The statistics of level of users crawled from YA


    Table 12. The percentage of each level omitting the missing data

5.1.5 The level of askers for each category

Although the level of a user is not equal to the expertise of the user, it is similar to the user's experience in YA, and under this assumption we still regard the level of an asker as an indicator of the difficulty degree of the question. The distribution of asker levels for each category is shown in Fig 14. Level-1 askers occupy the largest share in all categories, which shows that in every category the percentage of easy questions is higher than that of hard questions. The category Pets has the lowest ratio of lv1 askers, 39.05%, and the other categories have mo