Thesis Draft 3.0 (Ying Liang Chen)


3.2.5 The L-KGS
3.2.6 The results of knowledge gap score (KGS)
3.3 The Expert Finding by Knowledge Gap
4. PROBLEM DEFINITION AND OUR APPROACH
4.1 Problem Definition
4.2 The Flowchart of Our Approach
4.3 Expertise Model
4.4 Scores
4.5 Difficulty Degree Model
4.5.1 The part of asking
4.5.2 The part of answering
4.6 Reinforcement Model
5. EXPERIMENTS
5.1 Dataset
5.1.1 The basic statistics analysis
5.1.2 The asking and answering of users in each category


5.1.3 The number of answers in each question
5.1.4 The level of users for each category
5.1.5 The level of askers for each category
5.2 Answer Set
5.2.1 The number of answers in a question for answer set
5.2.2 The length of question
5.2.3 The length of answer
5.3 Baseline
5.4 Evaluation Metrics
5.5 Parameter Tuning
5.5.1 The results of parameter
5.5.2 The results of parameter
5.5.3 The results of parameter
5.6 Comparison to Baseline
5.6.1 F-measure of easy question and hard question
5.6.2 ROC curve and AUC
5.6.3 The analysis of the expertise of users
5.6.4 The examples of the outputs compared with the basic approaches
6. CONCLUSION AND FURTHER WORK


7. REFERENCES


FIGURE LISTING

Fig 1. The homepage of Yahoo! Answers
Fig 2. The knowledge gap
Fig 3. The knowledge gap diagram
Fig 4. The non-expert and expert in the knowledge gap diagram
Fig 5. The classification for non-expert and expert in the knowledge gap diagram
Fig 6. The flowchart of the KGB-DR algorithm
Fig 7. The network in YA
Fig 8. The number of answering for each user in categories
Fig 9. The average number of top X% users in answering
Fig 10. The number of asking for each user in categories
Fig 11. The average number of top X% users in asking
Fig 12. The number of answers for each category
Fig 13. The ratio of answers in each question
Fig 14. The distribution of level of askers for each category
Fig 15. The number of answers in the answer set
Fig 16. The question length for each question in the answer sets
Fig 17. The length of answers for each question in the answer sets
Fig 18. The tuning of parameter


Fig 19. The tuning of parameter
Fig 20. The tuning of parameter
Fig 21. The ROC curve for each method and each category
Fig 22. The distribution of expertise of the users for each category
Fig 23. The mapping of user level and the expertise computed by our approach for each category


TABLE LISTING

Table 1. The mapping from the user level to points in YA
Table 2. The mapping of action to the gain of points
Table 3. Easy question vs. hard question
Table 4. The knowledge gap diagram in Health
Table 5. The normalization of Table 4
Table 6. The four zones in the knowledge gap diagram
Table 7. The evaluation of each factor
Table 8. The distribution of askers for each level in category Health
Table 9. The KGS for each category
Table 10. Dataset statistics
Table 11. The statistics of level of users crawled from YA
Table 12. The percentage of each level omitting the missing data
Table 13. The distribution of Fig 15 divided into two parts
Table 14. The answer set for each category
Table 15. The parameters for the two models
Table 16. The comparison of methods for each category
Table 17. The F-measure of hard questions for each method and each category


Table 18. The F-measure of easy questions for each method and each category
Table 19. The AUC for each method and each category


    ABSTRACT

The CQA service is a typical Web 2.0 forum for sharing knowledge among people, and millions of questions are posted and resolved on it every day. Because of this volume and the variety of users in CQA services, question search and question ranking are among the most important research topics for these portals. In this paper, we address the problem of detecting whether a question is easy or hard by means of a probability model. In addition, we propose an approach called the knowledge-gap-based difficulty rank (KGB-DR) algorithm, which combines the user-user network with the architecture of the CQA service to solve this problem. Expert finding is an important subtask of the problem: if we want to detect whether a question is easy or hard, the participation of experts is a major factor. Much research exists on expert finding; unfortunately, it is not sufficient for our problem because of the habits of users. That is to say, experts answer not only hard questions but also easy questions, and prior work entirely omits this. We observe a phenomenon, called the knowledge gap, that is related to this habit of users, and we add it to our KGB-DR algorithm to complete the expert finding. The KGB-DR algorithm includes two steps: an expert finding step, used to compute the expertise of users, and a difficulty degree detecting step, used to compute the difficulty of questions and rank them by difficulty. Specifically, we design two models for the KGB-DR algorithm: one, called the local difficulty model, is based on users; the other, called the global difficulty model, is based on all questions. The experimental results show that the local difficulty model is essential to our approach and that our approach achieves the best performance over all baseline approaches.



    1. INTRODUCTION

    1.1 Background

Recently, Web 2.0 forum systems have become more and more popular and interesting: people can share or seek any information from any place in the world. One of the most popular and useful kinds of forum system is the Community-based Question-Answering (CQA) portal. Typical communities such as Yahoo! Answers [1] in English, Naver [2] in Korean, and Baidu Knows [3] in Chinese can be regarded as variations of online forums. In this paper, we choose Yahoo! Answers, which contains approximately one hundred million resolved questions in English, for our research.

Yahoo! Answers (YA), whose homepage is shown in Fig 1, is a CQA service where people can ask or answer questions on any topic, and an enormous number of questions and answers are posted on the English-language site. In YA, users can ask or answer any question in any category at will, and a weighted points system exists to encourage users to answer questions and to limit spam questions. There are also levels (with point thresholds) that grant more site access. When users answer a question, they gain some points, and they gain more points if their answer becomes the Best Answer, selected by the question's asker or voted on by the other users. The user level system in YA is reported in Table 1: there are seven levels of users in YA, and each user account holds exactly one level at a time.

[1] http://answers.yahoo.com
[2] http://www.naver.com
[3] http://zhidao.baidu.com


A user's level is promoted by earning points. For example, when a user first creates an account in YA, the user receives 100 points and starts at level 1; once the points this user obtains exceed 249 through various actions in YA, the user is promoted to level 2. In general, the more points a user earns, the higher the level the user obtains; however, level 7 is the maximum. How many points a user earns depends on the type of action the user performs in YA, and the mapping of actions to points is listed in Table 2. For example, answering a question earns two points, and users obtain ten extra points when their answer is selected as the Best Answer. In addition, some actions cost points: asking a question costs five points, and deleting one's own answer to a question costs two points.
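As an illustration of this bookkeeping, here is a minimal sketch in Python. Only the 100-point starting balance, the 249-point threshold for level 2, and the four action values above come from the text; the higher level thresholds are hypothetical placeholders, since Table 1 is not reproduced here.

```python
# Sketch of YA's points/level bookkeeping as described above.
# Only the 100-point start and the >249 threshold for level 2 come from
# the text; the remaining thresholds are placeholders for Table 1.
ACTION_POINTS = {
    "answer": +2,         # answering a question
    "best_answer": +10,   # extra points when selected as Best Answer
    "ask": -5,            # asking a question
    "delete_answer": -2,  # deleting one's own answer
}

# (min_points, level); thresholds beyond level 2 are assumptions.
LEVEL_THRESHOLDS = [(0, 1), (250, 2), (1000, 3), (2500, 4),
                    (5000, 5), (10000, 6), (25000, 7)]

def level_for(points: int) -> int:
    """Return the user level implied by a point total."""
    level = 1
    for threshold, lvl in LEVEL_THRESHOLDS:
        if points >= threshold:
            level = lvl
    return level

points = 100  # a new account starts with 100 points at level 1
for action in ["answer", "answer", "best_answer", "ask"]:
    points += ACTION_POINTS[action]
print(points, level_for(points))  # 109 points, still level 1
```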


    Fig 1. The homepage of Yahoo! Answers


    Table 1. The mapping from the user level to points in YA

    Table 2. The mapping of action to the gain of points


    1.2 Motivation

As questions and answers increase progressively, users cannot effectively find the questions they want to know about or discuss; for this reason, most work on CQA services aims to improve functionality such as question ranking, question search, or question recommendation. But the output of this prior work does not consider the expertise (or authority) of users. For question ranking, a user may be an amateur in the area while the outputs of the search engine are so hard that the user cannot understand them, or a user who is an expert may want to search for harder questions while easy ones sit at the top of the ranking. Thus, in this paper, we are


concerned with how to rank questions by difficulty. The work we want to do is similar to finding high-quality questions (Agichtein et al., 2008; Jurczyk & Agichtein, 2007), which identifies a question as high-quality or low-quality; our work, however, labels a question as easy or hard. Moreover, the work of expert finding is strongly associated with our task. An expert is a user who is familiar with a particular topic or category and can solve most of the questions about that category. The non-expert is the opposite, including both the amateur, who is a stranger to the category, and the novice, who has been attached to the topic for a little while but is not yet very familiar with it. For example, we list three easy questions and three hard questions about karate in Table 3. From the first easy question, "What ideas do you have in my first karate class?", the asker can be identified as an amateur at karate, and the question can also be solved by an amateur who has been to karate class just a couple of times. On the contrary, for a hard question such as "What is the difference between Traditional Okinawan Karate and the modern sport Karate?", a user who wants to solve or answer it must have certain knowledge of or experience in karate, and an amateur who took up karate only a little while ago will not be able to answer it.

    Table 3. Easy question vs. Hard question


As a result of the above description, we list the following principles, observed without considering textual significance, that an easy or hard question must satisfy:

1. If the asker of a question is closer to an expert, the question has a higher probability of being hard.

2. If most of the repliers to a question are amateurs, then the question must be easy; conversely, the more experts reply to a question, the harder it is.

3. If both experts and amateurs have answered a question and the asker chooses an amateur's reply as the best answer, the question is so easy that the amateur supplied content the asker believed could tackle his/her question.

In short, the difference between an easy question and a hard question is the ratio of non-experts to experts participating in the question, and we can separate the task into two parts: expert finding and question ranking.
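To make these principles concrete, the following toy sketch, which is not the KGB-DR model, combines hypothetical expertise scores in [0, 1] for the asker, the repliers, and the best-answerer into a rough difficulty signal; the equal weights are arbitrary placeholders.

```python
# Toy difficulty signal from the three principles above (not KGB-DR).
# All expertise scores are hypothetical values in [0, 1].
def toy_difficulty(asker, repliers, best_answerer):
    asker_signal = asker                            # principle 1: expert asker
    replier_signal = sum(repliers) / len(repliers)  # principle 2: expert repliers
    best_signal = best_answerer                     # principle 3: expert best answer
    return (asker_signal + replier_signal + best_signal) / 3

# A near-expert asker answered mostly by experts -> likely hard:
print(toy_difficulty(0.8, [0.9, 0.7, 0.8], 0.9))  # ~0.83
# An amateur asker with an amateur best answer -> likely easy:
print(toy_difficulty(0.1, [0.2, 0.3], 0.2))       # ~0.18
```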

Determining whether a question is easy or hard is not trivial; there is noise in the real world. First, whether a question is hard or easy is judged by people: one user may think a question is hard because he/she has never seen it before, while another thinks it is easy because of his/her experience, i.e., the judgment is subjective


for everyone. Second, some askers just want to share or discuss something rather than ask something, and it is hard to call such a question easy or hard; the same happens in questions containing flame wars or abusive content. Third, we might expect more hard terms in hard questions than in easy ones, but people can also use easy terms to ask or answer questions, so we cannot rely on the significance of terms; on the contrary, we utilize the relationships between users and the framework of YA to tackle this problem. Finally, note that we consider the difficulty degree of a question to be determined by how many experts participate in it; in the real world, however, experts answer not only hard questions but also easy questions, and this is what makes determining whether a question is easy or hard difficult. Among prior works, some rank blogs using the PageRank (Brin & Page, 1998) and HITS (Kleinberg, 1999) algorithms, such as the EigenRumor algorithm (Fujimura et al., 2005).


The EigenRumor algorithm uses the interactions among posters, repliers, and blogs to rank blogs; users then receive feedback from the blog scores to adjust their authorities, and the iteration terminates when the blog scores stop changing significantly. The nodes and architecture in the EigenRumor algorithm are similar to ours: a question with an asker and answerers corresponds to a blog with a poster and repliers. But the EigenRumor algorithm cannot be applied to find hard questions in a CQA service, because it omits the relationship between hard questions and users. That is to say, if we used the EigenRumor algorithm to compute question scores, the feedback would flow back to the users who asked or answered each question; since experts also answer easy questions in the real world, the feedback from an expert may increase the expertise of non-experts who answered the same question, which makes it unreasonable for detecting the expertise of users and the difficulty degree of questions.


1.3 Method Abstract

Although the above difficulties attend this task, we observe a phenomenon, which we call the knowledge gap, in CQA services. The knowledge gap is a phenomenon associated with users, illustrated in Fig 2. The stairs in Fig 2 represent difficulty in a specific topic, and a user standing on a higher stair has more knowledge of the topic. For example, the expert in Fig 2 stands at the top of the stairs, indicating that the expert is familiar with everything about the topic, whereas the amateur and the novice stand at the bottom owing to their unfamiliarity with it; the knowledge gap is the gap between any two users. For example, gap A in Fig 2 represents the knowledge gap between the expert and the novice, and gap B the knowledge gap between the expert and the amateur. The gap may bring about two situations. One is that the user on the lower stair lacks the ability to answer a question asked by a user on a higher stair. The other is that the user on the higher stair may have no interest in answering a question asked by a lower-stair user even though he/she has enough knowledge. For instance, consider one of the hard questions in Table 3, "What are advantages of karate over other martial arts?": to answer it, a user must know a great deal about karate and other martial arts in order to compare them, a process that is hard for a novice or an amateur with poor knowledge. On the other hand, one of the easy questions in Table 3, "What ideas do you have in my first karate class?", is so easy for experts with long experience in karate that they may have little interest in answering it, since they can see the same questions every day. In addition, the boundary in Fig 2 represents the line that separates hard questions from easy questions. In short, there are two main principles in the knowledge gap. First,


non-experts lack the ability to answer questions beyond their knowledge. Second, experts have little interest in questions below a certain difficulty degree. In this paper, we investigate how, and how much, the knowledge gap affects CQA services, and we propose the knowledge-gap-based difficulty rank (KGB-DR) algorithm to tackle the task of determining whether a question is easy or hard. Specifically, KGB-DR makes use of the structure of the CQA service and the relationships between users, and it iterates two steps, namely an expert finding step and a difficulty degree detecting step.

(1) Expert finding step:

This step computes the probability of a user being an expert in asking or answering: we first compute it with a graph-based algorithm and then adjust it with feedback from the difficulty degree detecting step, so as to fit the knowledge gap between users.

(2) Difficulty degree detecting step:

The goal of this step is to compute the probability of a question being hard and to rank the questions by their difficulty degree score.
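The two models behind these steps are defined in Section 4; as an orientation only, here is a structural sketch of the iteration, in which the update rules are simplified placeholders rather than the paper's actual expertise and difficulty models.

```python
# Structural sketch of the KGB-DR iteration described above. The update
# rules are placeholders, not the paper's models (defined in Section 4).
# Each question is a dict with "id", "asker", and "answerers".
def kgb_dr(questions, users, iterations=20):
    expertise = {u: 1.0 / len(users) for u in users}  # expert finding state
    difficulty = {q["id"]: 0.5 for q in questions}    # difficulty state

    for _ in range(iterations):
        # Difficulty degree detecting step: a question looks harder when
        # its participants have higher expertise.
        for q in questions:
            members = [q["asker"]] + q["answerers"]
            difficulty[q["id"]] = sum(expertise[u] for u in members) / len(members)
        # Expert finding step: difficulty scores feed back into expertise.
        for u in users:
            joined = [q["id"] for q in questions
                      if u == q["asker"] or u in q["answerers"]]
            if joined:
                expertise[u] = sum(difficulty[qid] for qid in joined) / len(joined)
    return expertise, difficulty

questions = [
    {"id": "q1", "asker": "u1", "answerers": ["u2", "u3"]},
    {"id": "q2", "asker": "u2", "answerers": ["u3"]},
]
expertise, difficulty = kgb_dr(questions, ["u1", "u2", "u3"])
```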


    Fig 2. The knowledge gap

1.4 Application

Given a set of questions, ranking them by difficulty degree has several potential applications. First, a CQA service or discussion board can show questions of the right difficulty to each type of user; for example, if an amateur enters a particular category, route him/her the easy questions. Second, it can improve the performance of a search engine by adding a difficulty-degree factor, so that the outputs differ for each type of user. Third, ranking questions by difficulty finds suitable questions for each type of user, and users can select the difficulty they want; this can also apply to commercial advertising. For example, in cycling, most novices just need a cheap, usable bicycle, while an expert may want comparatively more specialized equipment such as lighter, tougher, or even high-end bicycle parts.

1.5 Paper Organization

The rest of this paper is structured as follows. Section 2 briefly discusses related work. Section 3 describes the knowledge gap in more detail. Section 4 defines our task with a probability model and presents our approach, the KGB-DR algorithm, to solve it. Experimental results are reported in Section 5, and Section 6 concludes the paper and outlines future work.


    2. RELATED WORK

The task of identifying whether a question is easy or hard is similar to prior work such as question ranking, question search, and question recommendation. The common goal of those tasks is to help users find the questions they want to know about efficiently; the difference is that our goal is to rank the questions in a particular category by difficulty, or to recommend hard questions to experts and easy questions to non-experts. Besides, the most important subtask in this work is finding the experts in a particular category, and we utilize user-user interactions to model it. The related work mentioned above is detailed in this section.

    2.1 Link Analysis

PageRank (Brin & Page, 1998) is a link analysis algorithm that calculates an importance score for each element of a group of hyperlinked documents. Using the linking structure of web pages, PageRank interprets links as indicators of page importance: a link from page A to page B is regarded as a vote from page A for page B. The algorithm intends to estimate the probability distribution of a user randomly clicking on links, and a page linked from a page with a high PageRank score is itself given a high score.


Kleinberg (1999) proposed the Hyperlink-Induced Topic Search (HITS) algorithm, another prominent algorithm for ranking web pages. Its critical concepts are hubs and authorities: for each web page, two scores are calculated based on these concepts, where the hub score presents the quality of links to other pages about that topic, while the authority score indicates the quality of the page content. Besides web pages, Yupeng et al. [1] investigated the co-occurrences of people from web pages and communication patterns from emails to discover the relationships among people.
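For reference, here is a minimal sketch of HITS by power iteration on the kind of asker-to-answerer graph used later in this thesis; the edge list is invented for illustration, and an edge (a, b) means user b answered a question user a asked.

```python
# Minimal HITS power iteration on a directed asker -> answerer graph
# (an edge (a, b) means user b answered a question user a asked).
import math

edges = [("u1", "u2"), ("u1", "u3"), ("u4", "u2"), ("u5", "u2")]
nodes = {n for e in edges for n in e}
hub = {n: 1.0 for n in nodes}
auth = {n: 1.0 for n in nodes}

for _ in range(50):
    # Authority: sum of hub scores of nodes pointing at you.
    auth = {n: sum(hub[a] for a, b in edges if b == n) for n in nodes}
    # Hub: sum of authority scores of nodes you point at.
    hub = {n: sum(auth[b] for a, b in edges if a == n) for n in nodes}
    # Normalize to keep the scores bounded.
    an = math.sqrt(sum(v * v for v in auth.values()))
    hn = math.sqrt(sum(v * v for v in hub.values()))
    auth = {n: v / an for n, v in auth.items()}
    hub = {n: v / hn for n, v in hub.items()}

# u2 answers several users' questions, so it gets the top authority score.
print(max(auth, key=auth.get))  # -> "u2"
```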

2.2 Expert Finding in Social Media

Several studies focus on how to find experts by modeling and computing a user-user graph (McCallum et al., 2005; Jurczyk & Agichtein, 2007; Zhang et al., 2007). McCallum et al. (2005) utilize the user-user graph to


find experts for particular topics. Zhang et al. (2007) use the network-based ranking algorithms HITS (Kleinberg, 1999) and PageRank (Brin & Page, 1998) to identify users with high expertise; their results show a high correlation between link-based metrics and answer quality. Next, Liu et al. (2005) use features such as an author's activity and the number of clicks to find the best answers for a given question. Jurczyk and Agichtein (2007) use the HITS algorithm to find the authority of users in a question-answering network in which an asker is linked to an answerer if the answerer replies to a question the asker has posted; their experiments show that the obtained authority score is better than simply counting the number of answers an answerer has given. Although these results demonstrate that HITS is a good approach to finding experts, some latent factors, such as users' habits of asking questions and posting answers, may reduce its performance. Unlike


Jurczyk and Agichtein (2007) and Zhang et al. (2007), Zhou et al. (2009) utilize the structural relations among users in the forum system and a content-based probability model to find the experts for a particular question.

In other social media, Balog et al. (2006) propose extended language models to address the expert finding problem in enterprise corpora. Since 2005, the Text REtrieval Conference (TREC) has provided a platform, the Enterprise Search Track, for researchers to empirically assess their methods for expert finding (Craswell et al., 2005). In addition, some research finds experts over e-mail corpora (Campbell et al., 2003; Dom et al., 2003).

    2.3 Question Ranking

Question ranking orders results so as to decrease browsing time. The typical approach uses information from the Q&A content, such as best-answer ratings (Adamic et al., 2008; Bian et al., 2008). Jeon et al. (2006) addressed the answer quality problem in a community QA portal and tried to estimate quality using a set of non-textual features such as answer length and the number of points received. Agichtein et al. (2008) expand on Jeon et al. (2006)


by exploring a larger range of features, including structural, textual, and community features. Su et al. (2007) analyzed how significantly the quality of answers varies. Bian et al. (2008) proposed to solve collaborative QA by considering both answer quality and relevance, using content-based quality signals without considering user expertise. Because of the answer quality problem observed in Bian et al. (2008), Jeon et al. (2006), and Su et al. (2007), Suryanto et al. (2009) propose a quality-aware framework that considers both answer relevance and answer quality, derived from answer features and the expertise of answerers.


Wang et al. (2009) consider questions and their answers as relational data, rather than modeling them as independent information, and propose a link-based algorithm for evaluating the quality of answers.

    2.4 Question Search

Research on question search was first conducted on FAQ data (Burke et al., 1997; Lai et al., 2002; Sneiders, 2002). FAQ Finder (Burke et al., 1997) heuristically combines statistical similarities and semantic similarities between questions and FAQs. Sneiders (2002) proposed template-


based FAQ retrieval systems, and Lai et al. (2002) proposed an approach to automatically mine FAQs from the web.

Recently, research on question search has been further extended to CQA. For example, Jeon et al. (2005a, 2005b), together with a technical report on learning translation-based language models from Q&A archives, compared four different retrieval methods, i.e., cosine similarity, Okapi, a language model (LM), and a statistical machine translation (SMT) model, for automatically bridging the lexical chasm between questions in question search. They found that the SMT-based method performed the best.

    2.5 Question Recommendation

Question recommendation is like question search: the goal of both is to find the questions that a user prefers to know about or reply to. The question recommendation problem was first addressed by Cao et al. (2008), who consider that a good


recommendation should provide alternative aspects around the user's interest, and who use an MDL-based tree cut model to tackle this problem. Sun et al. (2009) also address question recommendation: they use user ratings and propose a majority-based perceptron algorithm that avoids the influence of noisy instances by emphasizing learning over data instances from majority users, and they show the effectiveness of their approach through intensive experiments.

Question recommendation and question ranking are similar to what we discuss in this paper, but they share the same problem as most of the related work on question search discussed above: they focus on content-based analysis and ignore the interaction between users. Recent work on user interaction, such as Nam et al. (2009) on Naver, Adamic et al. (2008) on YA, and Yang and Wei (2009) on Baidu, analyzed the activity of users in Q&A forums and found that different types of users, such as experts and non-experts, have different habits in asking and answering.


    3. THE KNOWLEDGE GAP

In this section, we investigate how, and how much, the knowledge gap affects YA. First, we present the phenomenon called the knowledge gap in YA and other CQA services, and explain the relationship between the knowledge gap and hard questions in Section 3.1. Second, we show how to quantify this phenomenon for each category in YA. Finally, we discuss some problems with expert finding and what type of expert we want to extract in Section 3.3.

    3.1 The Knowledge Gap in YA

According to Table 1 and Table 2, we can assume for the moment that the higher the level, the more expertise a user has, so a question asked by a higher-level user has a high probability of being hard; for convenience, we merge level 6 and level 7 into level 6. Fig 3 shows the proportion of each level of askers replied to by each level of answerers in five categories. A zone (i, j) is darker than a zone (k, j) if the proportion of level-j askers replied to by level-i answerers is higher than the proportion replied to by level-k answerers; for example, in Fig 3(a), zone (1,1) is blacker than zones (2~6,1), indicating that the questions level-1 users asked are replied to by level-1 users more than by users of other levels. In Fig 3(a), we can also see that for questions asked by


higher-level users, the ratio replied to by higher-level users is higher, while questions asked by low-level users show the opposite. Two explanations support this situation. First, high-level users tend to reply to or participate in challenging questions through which they can contribute their knowledge to other people. Second, low-level users can only reply to the easy questions that other low-level users asked, because of their poor knowledge; hence we can see that the questions replied to by level-1 and level-2 answerers concentrate on questions asked by level-1 to level-4 users. The observation of Fig 3 tells us that a hard question q is related to the expertise of the users who participate in q; we call this phenomenon a knowledge gap, and the corresponding graph a knowledge gap diagram. Besides, the degree of the knowledge gap differs across categories. It is strong in categories such as Martial arts, Pets, and Health, but weak in Cycling and Software. In fact, in Fig 3(b) (Cycling) we can see that questions asked by low-level users are answered by high-level users more than by low-level users, indicating that the experts in Cycling are enthusiastic about solving the questions that amateurs asked. On the contrary, in Fig 3(e) (Software) we can see that low-level users also have the ability to answer questions asked by high-level users, showing that most questions in Software are easy enough for low-level users to answer. In addition, Fig 3(f) is a perfect knowledge gap diagram, in which low-level users answer only the questions that low-level users asked and high-level users answer only the questions that high-level users asked. Therefore, using the knowledge gap is crucial for detecting whether a question is easy or hard, and we show how to utilize the knowledge gap in the next section.
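As a small illustration, the following sketch tabulates such a diagram from (asker level, answerer level) pairs and normalizes each asker level, as is done to produce Table 5 from Table 4; the sample pairs are invented.

```python
# Build a knowledge gap diagram: counts of (answerer level, asker level)
# pairs, with each asker-level column normalized to sum to 1 (as done to
# produce Table 5 from Table 4). Sample pairs are invented.
from collections import Counter

LEVELS = range(1, 7)  # levels 1..6 (levels 6 and 7 merged into 6)

def knowledge_gap_diagram(pairs):
    """pairs: iterable of (asker_level, answerer_level) tuples."""
    counts = Counter(pairs)
    diagram = {}
    for asker in LEVELS:
        column_total = sum(counts[(asker, ans)] for ans in LEVELS)
        for answerer in LEVELS:
            ratio = counts[(asker, answerer)] / column_total if column_total else 0.0
            diagram[(answerer, asker)] = ratio
    return diagram

sample = [(1, 1), (1, 1), (1, 2), (4, 5), (4, 6), (5, 5), (2, 1)]
diagram = knowledge_gap_diagram(sample)
print(diagram[(1, 1)])  # 2 of the 3 answers to level-1 askers come from level 1
```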


    Fig 3. The knowledge gap diagram

    3.2 The Quantification of Knowledge Gap Diagram

    3.2.1 The preliminary for quantifying the knowledge gap diagram

Table 4 is the prototype of the knowledge gap diagram in the category Health, and in this paragraph we show how to quantify the knowledge gap diagram from this table. First, we normalize each asker level in Table 4; the result is shown in Table 5. Note that there are two principles in the knowledge gap: one is that non-experts have no ability to answer questions beyond their knowledge, and the other is that experts have little interest in questions below a certain difficulty degree. We then formulate the knowledge gap diagram according to these two principles as follows:

KGS(C) = Expert(C) + Nonexpert(C)

where C is a category in YA, and Expert(C) and Nonexpert(C) are the knowledge gap scores for the experts and non-experts in category C.

Table 4. The knowledge gap diagram in Health


Table 5. The normalization of Table 4

    3.2.2 The four zones in the knowledge gap diagram

Now we assume that a user whose level is from 1 to 3 is a non-expert, and a user whose level is from 4 to 6 is an expert. Fig 4 illustrates non-experts and experts in the knowledge gap diagram: Fig 4(a) shows the areas in which non-experts and experts normally move, and Fig 4(b) shows the forbidden areas they rarely enter, i.e., non-experts rarely answer the questions that experts ask, and experts rarely answer the questions that non-experts ask. According to Fig 4, we can divide the knowledge gap diagram into four zones, illustrated in Fig 5. First, the zone of regular non-experts (RN) gathers users who are non-experts and focus on answering easy questions because of their limited knowledge. Second, the zone of promising non-experts (PN) gathers users who are non-experts at present but have the ability to answer hard questions. Third, the zone of regular experts (RE) gathers users who are experts and have the ability to answer the hard questions that other experts ask. Finally, the zone of enthusiastic experts (EE) gathers users who are experts but focus on answering easy questions out of enthusiasm. Table 6 gives an example of the four zones mapped from Table 5.

    Fig 4. The non-expert and expert in the knowledge gap diagram


    Fig 5. The classification for non-expert and expert in the knowledge gap diagram

    Table 6. The four zones in the knowledge gap diagram

    3.2.3 The quantification of non-expert and expert

Recall that the knowledge gap score is based on the answering behavior of experts and non-experts; we quantify these two factors from four viewpoints. First, the answering behavior of non-experts can be viewed as a responsibility: non-experts cannot answer hard questions because of their inexperience, so answering easy questions is the responsibility of non-experts. We evaluate this responsibility by F(RN) - F(PN), where F(RN) is the flow of non-experts in zone RN and F(PN) is the flow of non-experts in zone PN. The larger the value of F(RN) - F(PN), the more the non-experts focus on answering easy questions rather than hard ones. We quantify F(RN) and F(PN) by summing all the ratios in zones RN and PN respectively; for example, F(RN) = 1.51 and F(PN) = 1.23 in Table 6. Second, experts have more knowledge and experience than non-experts, so answering hard questions can be seen as their responsibility; in the same way, we quantify the responsibility of experts by F(RE) - F(EE). The remaining two viewpoints concern the asking of non-experts and experts. We assumed earlier that a question asked by an expert is a hard question; now we want to know whether that difficulty is credible. In general, a hard question should be answered by experts, and if non-experts can also answer it, the difficulty of the question is in doubt. We use F(RE) - F(PN) to evaluate the credibility of the hard questions that experts ask, where F(RE) is the flow of experts in zone RE and F(PN) is the flow of non-experts in zone PN; the larger the value of F(RE) - F(PN), the more credible the hard questions that experts ask. In the same way, we quantify the credibility of the easy questions that non-experts ask by F(RN) - F(EE), where F(RN) is the flow of non-experts in zone RN and F(EE) is the flow of experts in zone EE; the larger the value of F(RN) - F(EE), the more credible the easy questions that non-experts ask. We summarize these factors in Table 7.


    Table 7. The evaluation of each factor
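To make the zone bookkeeping concrete, the following sketch (our own illustration, not the thesis code; the toy matrix values, the zone boundaries of levels 1-3 for non-experts and 4-6 for experts, and all names are assumptions) computes the four flows and the four difference factors of Table 7 from a normalized asker-level x answerer-level table such as Table 5:

    import numpy as np

    # Toy stand-in for Table 5: ratio[i, j] is the proportion of questions from
    # level-(j+1) askers answered by level-(i+1) answerers, with each
    # asker-level column normalized to sum to 1.
    ratio = np.random.default_rng(0).random((6, 6))
    ratio = ratio / ratio.sum(axis=0)

    NON = slice(0, 3)   # assumed non-expert levels 1-3 (rows/cols 0-2)
    EXP = slice(3, 6)   # assumed expert levels 4-6 (rows/cols 3-5)

    def flow(answerer_zone, asker_zone):
        # F(zone): the sum of all ratios inside the zone.
        return float(ratio[answerer_zone, asker_zone].sum())

    F = {
        "RN": flow(NON, NON),  # easy questions answered by non-experts
        "PN": flow(NON, EXP),  # hard questions answered by non-experts
        "RE": flow(EXP, EXP),  # hard questions answered by experts
        "EE": flow(EXP, NON),  # easy questions answered by experts
    }

    factors = {
        "non-expert responsibility": F["RN"] - F["PN"],
        "expert responsibility": F["RE"] - F["EE"],
        "credibility of hard questions": F["RE"] - F["PN"],
        "credibility of easy questions": F["RN"] - F["EE"],
    }
    print(F)
    print(factors)

Note that because each asker-level column sums to 1, the flows satisfy F(RN) + F(EE) = 3 and F(PN) + F(RE) = 3 under this zone split, which matches the example values F(RN) = 1.51 and F(PN) = 1.23 quoted above.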

    3.2.4 The weight for non-expert and expert

In general, because the number of askers varies across levels, weights are necessary for the expert and non-expert terms, and we set the weights by the ratio of asker levels. The distribution of askers over levels in the category Health is shown in Table 8: Table 8(a) is the distribution from level 1 to level 7, and Table 8(b) is the ratio of non-experts (levels 1 to 3) to experts (levels 4 to 7); we use the ratios in Table 8(b) to weight the expert and non-expert terms. The expert and non-expert terms are computed as:

Expert(C) = W_ex * (2*F_C(RE) - F_C(EE) - F_C(PN))

Nonexpert(C) = W_non * (2*F_C(RN) - F_C(EE) - F_C(PN))

where W_ex is the ratio of the questions that experts have asked, W_non is the ratio of the questions that non-experts have asked, and F_C(RE), F_C(EE), F_C(PN), and F_C(RN) are the flows of the four zones of category C defined in Table 6. Finally, the value of KGS(Health) is 0.296*0.9023 + 0.796*0.0977, which is approximately 0.34.

    Table 8. The distribution of asker for each level in category Health
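Continuing the sketch above (same assumed flow dictionary; the pairing of the non-expert term 0.296 with W_non = 0.9023 and the expert term 0.796 with W_ex = 0.0977 is our reading of the Health example), the KGS combination might look like:

    def kgs(F, w_ex, w_non):
        # KGS(C) = Expert(C) + Nonexpert(C), each term weighted by the
        # ratio of questions asked by that side.
        expert = w_ex * (2 * F["RE"] - F["EE"] - F["PN"])
        nonexpert = w_non * (2 * F["RN"] - F["EE"] - F["PN"])
        return expert + nonexpert

    # The Health numbers quoted in the text:
    print(0.9023 * 0.296 + 0.0977 * 0.796)   # ~0.34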

    3.2.5 The L-KGS

Note that the knowledge gap diagram is based on the level of users; however, we do not know whether a user's level was gained in this category or in other categories, so we must add another level index to adjust it. From Table 1 and Table 2 we know that the more questions a user answers, the more points, and hence the higher level, the user obtains. For this reason, we use the average number of answers per asker to adjust the user level, and we set the weight to the square root of that number. The formulas are as follows:

L-KGS(C) = Expert(C) + Nonexpert(C)

Expert(C) = W_ex * L_ex * (2*F_C(RE) - F_C(EE) - F_C(PN))

Nonexpert(C) = W_non * L_non * (2*F_C(RN) - F_C(EE) - F_C(PN))

where L_ex is the square root of the average number of replies per expert asker and L_non is the square root of the average number of replies per non-expert asker.
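A minimal sketch of this adjusted score, under the same assumptions as the previous snippets (the flow dictionary and all parameter names are illustrative):

    import math

    def l_kgs(F, w_ex, w_non, avg_replies_ex, avg_replies_non):
        # L-KGS(C): KGS(C) with each term damped by the square root of the
        # average number of answers per asker on that side.
        l_ex = math.sqrt(avg_replies_ex)
        l_non = math.sqrt(avg_replies_non)
        expert = w_ex * l_ex * (2 * F["RE"] - F["EE"] - F["PN"])
        nonexpert = w_non * l_non * (2 * F["RN"] - F["EE"] - F["PN"])
        return expert + nonexpert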

    3.2.6 The results of knowledge gap score (KGS)

The results for all categories are shown in Table 9. In descending order of KGS, the ranking is Pets > Health > Software > Martial arts > Cycling. However, after considering the number of questions askers reply to, the L-KGS ranking is Pets > Health > Martial arts > Software > Cycling. Martial arts rises in the L-KGS ranking because its number of questions answered per asker is larger than that of the category Software.

Table 9. The KGS for each category


    3.3 The Expert Finding By Knowledge Gap

In general, we regard a higher-level user as an expert. But there are noisy problems in a QA portal, i.e., some users whose level is not high are nevertheless experts. The following cases illustrate the noise that arises when finding experts by user level in YA:

Case 1. User levels come from accumulated points, as in Table 2, and the points a user gains may come from different categories. Hence, even if a user's level is 7, it is not true that the user is an expert in every category in which he/she has asked or replied.

Case 2. If you reply to questions in YA frequently, your level increases quickly; however, your level increases slowly if you rarely reply. In the real world, some real experts do not have much time to reply in YA, so their levels are not very high.

Case 3. Some users are novices in a category whose knowledge is less than that of real experts, but they enjoy replying to the questions that other amateurs or novices ask. Although they do not have the ability to reply to hard questions, they reply to a lot of easy questions instead. The result is that their user level increases quickly even though they are not real experts.

Case 4. Many users have more than one account id in YA; a user may reply to question q1 using account id 1 and to another question q2 using account id 2, which makes his/her expertise ambiguous. Additionally, some users maliciously create many accounts, asking questions with one account and replying with the others to increase their level quickly.

The methods proposed in prior works (Jurczyk and Agichtein, 2007a; Jurczyk and Agichtein, 2007b; Zhang et al., 2007) can solve most of these problems, but they are not enough if we want to use the expertise of users to detect whether a question is easy or hard. In fact, the asking and answering habits of users are an important factor, reflected in the knowledge gap diagram of each category, but none of the previous works consider them. For example, non-experts may answer so many easy questions that their measured expertise increases abnormally, while experts may answer easy questions even though the quality of those questions is low. The expert we want to find in this paper therefore differs from the prior works: following the two principles of the knowledge gap discussed previously, we want to find the expert who frequently answers hard questions, that is, a user who is not only an expert but also focuses on answering hard questions more than other users do.


    4. PROBLEM DEFINITION AND OUR APPROACH

In this section, we formulate our problem as a probability model and propose an algorithm, the knowledge-gap-based difficulty rank (KGB-DR) algorithm, to predict whether the difficulty degree of a question is easy or hard. KGB-DR consists of two steps: an expert finding step and a difficulty degree detecting step. The expert finding step not only finds experts but also adjusts their expertise by the knowledge gap, while the difficulty degree detecting step computes the difficulty degree of the questions. Moreover, once the difficulty degree scores of the questions have been computed, the expert finding step obtains feedback from the difficulty degree model, and the difficulty degree detecting step then recomputes the scores of the questions. We detail how the two steps work below.

    4.1 Problem Definition

In this paper, we state the problem of the difficulty degree of a question q by means of a probability model. Probability models have been used widely in recent expert finding work (Balog et al., 2006; Fang and Zhai, 2007; Zhou et al., 2009), and we also use a probability model to formulate our task; furthermore, we add link analysis and the knowledge gap to solve this problem. Given a question q, the probability of the difficulty degree being hard is estimated as follows:

p(h|q) = p(q|h) * p(h) / p(q)    (1)

where p(h) is the probability of the difficulty degree being hard and p(q) is the probability of a question, which is the same for all questions. For convenience, our task only decides whether a question is easy or hard, so we set a threshold t: if the difficulty rank of a question is less than t, the question is hard. Hence p(h) is estimated by t/n, where n is the total number of questions in the category. Thus the ranking of questions is proportional to the probability of the question given the difficulty degree being hard, and our task is to capture how hard question q is through p(q|h).
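To make the rank equivalence explicit, the constants can be factored out; a one-line derivation, added here for clarity in LaTeX notation:

\[
p(h \mid q) \;=\; \frac{p(q \mid h)\,p(h)}{p(q)}
           \;=\; \frac{t}{n}\cdot\frac{p(q \mid h)}{p(q)}
           \;\propto\; p(q \mid h)
\]

since t/n is a constant and p(q) is identical for every question, ranking questions by p(h|q) is equivalent to ranking them by p(q|h).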

    4.2 The Flowchart of Our Approach

The flowchart of our approach is illustrated in Fig 6. Given a particular category or a set of categories, we first apply the expertise model to the asking-answering network to capture the expertise, in both asking and answering, of the users. The difficulty degree model then calculates the difficulty degree scores of the input questions within the YA framework, and question ranking orders the questions by their difficulty degree scores. The reinforcement model reinforces the expertise of users, based on the knowledge gap phenomenon, via the DDR index calculated in the prior step. In short, our system has only two steps, the expert finding step and the difficulty degree detecting step: expert finding locates the experts who frequently answer hard questions, while difficulty degree detecting determines the difficulty degree scores of the questions. The loop terminates when the difficulty degree scores of the questions no longer change significantly.

    Fig 6. The flowchart of KGB-DR algorithm

    4.3 Expertise Model

We utilize the prior work of Jurczyk and Agichtein (2007) as our expertise model, and we briefly introduce this model in this paragraph. The link structure of the expertise model is shown in Fig 7: a particular question has a number of answers, each posted by a single user. An edge from a user to a question means that the user asked the question, and an edge from an answer to a user means that the answer was posted by that user. For example, in Fig 7(a), user 1 has posted question 1 and user 2 has posted question 2, but both of them have never


    and are updated iteratively using the equation above. After each iteration, the values in the H and A vectors are

    normalized, so that the highest hub and the highest authority values are 1.
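As a concrete illustration of this base HITS computation, the following minimal sketch (our own illustration, not the authors' code; the collapsed asker-to-answerer link construction and all names are assumptions) iterates hub and authority updates and normalizes both vectors so their maxima are 1:

    import numpy as np

    def base_hits(edges, n_users, n_iter=50):
        # edges: (asker_id, answerer_id) pairs, one per posted answer.
        # Returns (hub, auth); hub later seeds the asker-part score and
        # auth seeds the answerer-part score.
        A = np.zeros((n_users, n_users))
        for asker, answerer in edges:
            A[asker, answerer] += 1.0          # one link per answer
        hub = np.ones(n_users)
        auth = np.ones(n_users)
        for _ in range(n_iter):
            auth = A.T @ hub                   # answerers gather hub mass
            hub = A @ auth                     # askers gather authority mass
            auth /= max(auth.max(), 1e-12)     # highest authority value is 1
            hub /= max(hub.max(), 1e-12)       # highest hub value is 1
        return hub, auth

    # Toy usage: user 0 asks twice, users 1 and 2 answer, user 3 asks once.
    hub, auth = base_hits([(0, 1), (0, 2), (3, 1)], n_users=4)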

    4.4 Scores

The KGB-DR algorithm involves three major scores. Two belong to users and are used to combine the expertise of users with the knowledge gap, while the third belongs to the question.

p_ask(u|h) (the asker-part score of a user)

This is the probability that user u is an asker-expert given the difficulty degree model h. It represents the user's ability to ask hard questions: the higher the score, the stronger the probability that the user asks hard questions. The score is initialized to the hub value introduced in the section Expertise Model.

p_ans(u|h) (the answerer-part score of a user)

This is the probability that user u is an answerer-expert given the difficulty degree model h. It represents the user's ability to answer hard questions: the higher the score, the stronger the probability that the user answers hard questions. The score is initialized to the authority value introduced in the section Expertise Model.

p(q|h) (the difficulty degree score of a question)

This is the probability of the question given that the difficulty degree is hard. This score is the goal of our task: it represents how hard the question is, and the higher the score, the stronger the probability that the question is hard.

    4.5 Difficulty Degree Model

Recall that our goal is to estimate p(q|h). A question can be divided into two major parts, the asking part and the answering part, and each part is assumed to be generated independently. Thus the probability of question q given the hard difficulty degree is obtained by combining the two parts of the question:

p(q|h) = a * p(c_q^a|h) + (1 - a) * p(c_q^r|h)    (2)

where p(c_q^a|h) is the probability of the asking part given the hard difficulty degree and p(c_q^r|h) is the probability of the answering part given the hard difficulty degree. Additionally, we set a parameter a in [0,1] to adjust the weight of the two parts, and we discuss the two parts respectively below.

    4.5.1 The part of asking

In the asking part, we estimate p(c_q^a|h) by means of three relationships in the architecture of a CQA service. The first is the relationship between the question and the content the asker posts on it, which we formalize by the length of the content. The second is the relationship between the question and the other questions the asker has posted before. The third is the relationship between the asker and the difficulty degree model. It can be expressed as:

p(c_q^a|h) = p(c_q^a|q) * p(q|a) * p_ask(a|h)    (3)

where p(c_q^a|q) is the probability of the content that asker a posts on question q, p(q|a) measures how much knowledge asker a contributes to question q, and p_ask(a|h) is the probability that asker a is an asker-expert given the difficulty degree model h. The value of p_ask(a|h) is initialized to the hub value introduced in the section Expertise Model, and we discuss how it is updated later in this section.

To estimate p(c_q^u|q), we use the length of the content and normalize it with a square root, as follows:

p(c_q^u|q) = sqrt(l_q^u) / Σ_{i in q} sqrt(l_q^i)    (4)

where c_q^u is the content that user u has posted on question q, l_q^u is the length of c_q^u, and the denominator sums over all the posts on question q. Following the above estimation, p(q|u) is defined by:

p(q|u) = l_q^u / max_{q' in Q_u} l_{q'}^u    (5)

where Q_u is the set of questions user u has asked before and the denominator is the maximum content length among them. The estimation of p(q|u) is similar to p(c_q^u|q), but we use the maximum instead of the summation. The intuition is that a hard question requires more words to ask while an easy question requires fewer; hence, for the same asker, the question on which the asker posted the maximum length has a stronger probability of being hard than the other questions that asker has asked.
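A minimal sketch of the asking-part computation of equations (3)-(5), under our reading of the reconstructed formulas (square-root length normalization in (4), maximum over the asker's previous questions in (5); all function and parameter names are illustrative):

    import math

    def p_content_given_q(length_u, all_lengths):
        # Eq (4): square-root-normalized share of u's content length on q.
        return math.sqrt(length_u) / sum(math.sqrt(l) for l in all_lengths)

    def p_q_given_user(length_u, past_lengths):
        # Eq (5): length of this question relative to the longest question
        # the same user has asked before.
        return length_u / max(past_lengths)

    def p_asking_part(length_a, all_lengths, past_lengths, p_ask_a):
        # Eq (3): p(c_q^a|h) = p(c_q^a|q) * p(q|a) * p_ask(a|h).
        return (p_content_given_q(length_a, all_lengths)
                * p_q_given_user(length_a, past_lengths)
                * p_ask_a)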

    4.5.2 The part of answering

In the answering part, we divide the answerers into two groups, the best answerer and the other repliers, so the probability of the answering part given the hard difficulty degree is obtained as follows:

p(c_q^r|h) = b * p(c_q^b|h) + (1 - b) * p(c_q^r'|h)    (6)

where p(c_q^b|h) is the probability of the content that the best answerer b posts on question q, p(c_q^r'|h) is the probability of the content that the other answerers post on question q, and b in [0,1] weights the two groups. The two factors are computed as in equation (3), but with p_ask(u|h) replaced by p_ans(u|h):

p(c_q^b|h) = p(c_q^b|q) * p(q|b) * p_ans(b|h)    (7)

p(c_q^r'|h) = Σ_{i in r'(q)} p(c_q^i|q) * p(q|i) * p_ans(i|h)    (8)

where r'(q) denotes the answerers on question q other than the best answerer.
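Putting equations (2) and (6)-(8) together, a sketch of the overall difficulty score (the parameter names alpha and beta stand in for the weights whose symbols were lost in extraction; each tuple in other_terms carries the three eq-(8) factors for one replier, and p_best is the eq-(7) product for the best answerer):

    def p_answering_part(p_best, other_terms, beta=0.5):
        # Eq (8): sum over the other repliers of p(c_q^i|q) * p(q|i) * p_ans(i|h),
        # then Eq (6): weighted mix of the best answerer and the rest.
        p_rest = sum(p_c * p_q * p_ans for (p_c, p_q, p_ans) in other_terms)
        return beta * p_best + (1 - beta) * p_rest

    def difficulty_score(p_ask_part, p_ans_part, alpha=0.5):
        # Eq (2): p(q|h) = alpha * asking part + (1 - alpha) * answering part.
        return alpha * p_ask_part + (1 - alpha) * p_ans_part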


where H is the total number of hard questions in category C, and we call this equation the global difficulty probability module. Because the above scores may contain zero probabilities, we must smooth them when computing the expertise of users. Thus we introduce the probability model h into the calculation and construct two models from equations (9) and (10). The local difficulty model for p_ask(u|h) and p_ans(u|h) is represented as:

p_ask(u|h)^(k+1) = g * p̂_ask(u|h) + (1 - g) * p_ask(u|h)^(k)    (11)

p_ans(u|h)^(k+1) = g * p̂_ans(u|h) + (1 - g) * p_ans(u|h)^(k)    (12)

where p_ask(u|h)^(k) and p_ans(u|h)^(k) are the scores at iteration k, p_ask(u|h)^(0) and p_ans(u|h)^(0) are the hub and authority values computed by the base HITS algorithm in the prior section, p̂_ask(u|h) and p̂_ans(u|h) are the estimates of equation (9), and g is in [0,1]. The other way is the global difficulty model:

p_ask(u|h)^(k+1) = g * p̂_ask(h|u) + (1 - g) * p_ask(u|h)^(k)    (13)

p_ans(u|h)^(k+1) = g * p̂_ans(h|u) + (1 - g) * p_ans(u|h)^(k)    (14)

where p̂_ask(h|u) and p̂_ans(h|u) are the estimates of equation (10).
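A sketch of the smoothing update of equations (11)-(14): each iteration blends the fresh estimate from the difficulty model with the previous score, starting from the HITS hub or authority value (gamma is our stand-in symbol for the interpolation weight, and the numbers are toy values):

    def reinforce(prev_score, estimate, gamma=0.5):
        # Eqs (11)-(14): blend the fresh difficulty-model estimate with the
        # previous score; at k = 0 the score is the HITS hub/authority value.
        return gamma * estimate + (1 - gamma) * prev_score

    p_ask = 0.8                  # initial hub value from base HITS (toy number)
    for _ in range(10):          # iterate until the scores stabilize
        estimate = 0.6           # in practice, from equation (9) or (10)
        p_ask = reinforce(p_ask, estimate)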


5. EXPERIMENTS

In this section, we introduce and analyze the dataset crawled from YA in Section Dataset and the answer set in Section Answer set. Next, the baselines compared against our approach are listed in Section Baseline, and the evaluation metrics are described in Section Evaluation metrics. The results of parameter tuning for our approach are shown in Section Parameter tuning. Finally, the performance results of all methods are presented in Section Comparison to baseline.

5.1 Dataset

5.1.1 The basic statistics analysis

We crawled 40,000 resolved questions from the English Yahoo! Answers service for our experiments. The questions come from five categories, Martial arts, Cycling, Health, Pets, and Software, with 8,000 questions from each category. The dataset statistics are shown in Table 10, and we analyze and discuss these data before the experiments. From the attributes "# of answers" and "avg answers per question", we can see that the number of answers per question in the two categories Martial arts and Pets is higher than in the other categories; furthermore, Martial arts has fewer users and fewer askers than the other categories, which shows that the same users repeatedly ask and answer with one another. In short, Martial arts has the highest activity between users. The category Cycling is the opposite of Martial arts: its numbers of answers and users are the lowest of all categories, and the ratio of questions with only one answer is 25%. Furthermore, the number of answerers and the length of the content users post are also the lowest, which indicates low activity between users in this category. The category Health is similar to Cycling in that its ratio of questions with only one answer is higher than in the other categories; however, the numbers of users and answerers in Health are much larger than elsewhere, which shows that users in Health mostly post content only once. The average answers per question in the category Software is higher only than Cycling, yet Software has the fewest questions with only one answer. In addition, the number of askers in Software is 7,232 against 8,000 total questions, which means a lot of unique users ask in this category and do so only once; this is one reason for the shape of the knowledge gap diagram of Software in Fig 3(e), i.e., most questions in this category are easy. Specifically, among the attributes in Table 10, we find some that are probably related to the knowledge gap. First, the average answers per question: the categories with a strong knowledge gap, such as Martial arts, Health, and Pets, have more average answers per question than the categories with a weak knowledge gap, such as Cycling and Software. That is to say, more answers to a question represent a higher probability of the question being hard. Second, the average length of the content users post: this makes sense because the harder a question is, the more words we use to describe it; moreover, the length of the asking content is more related to the knowledge gap than the length of the answering content.


    Table 10. Dataset statistics

5.1.2 The asking and answering of users in each category

The number of answers for each user in the categories is shown in Fig 8, and the average number of answers for the top x%~y% users, where y-x=5, is shown in Fig 9; from these we can see how many users regularly answer questions in each category. First, in Fig 8 the tails of the three categories (a) Martial arts, (b) Health, and (c) Pets are shorter than those of the other two categories, and a shorter tail indicates that fewer users answer just once in the category. In Fig 9, the length of the head represents the number of users who usually answer in the category more than once. Interestingly, the two longest heads belong to Martial arts and Cycling, yet the tail of Cycling in Fig 8 is also long: there are many users who answer questions over a long period of time and also many passing visitors who answer a question just once. The categories Health and Pets are the opposite of Cycling: although their tails in Fig 8 are shorter, their heads are also shorter, which indicates fewer passing visitors who answer a question just once and also fewer users who answer questions frequently.

    Fig 8. The number of answering for each user in categories


Fig 9. The average number of answers of the top x%~y% users

Now we investigate the asking of users in each category. Fig 10 shows the number of questions asked by each user, and Fig 11 shows the average number of questions asked by the top x%~y% users, where y-x=5; from these we can see how many users regularly ask questions in each category. In Fig 10, the tails of the categories are of similar length, with Software having the longest tail, and for most categories the asking tail is shorter than the answering tail. However, Software is the only category whose asking tail is longer than its answering tail, and we can infer that this is why most questions in this category are easy: most users ask few questions, so they cannot ask harder questions, and most users answer few questions, so they cannot accumulate experience in this category. We can regard this feature as characteristic of the knowledge gap diagram of the category Software. In Fig 11, the heads of the two categories Martial arts and Cycling are longer than those of the other categories, which shows that these two categories have stable users who ask questions.

    Fig 10. The number of asking for each user in categories

5.1.3 The number of answers in each question

The number of answers per question divides the categories into two groups: one group contains Martial arts and Pets, and another group contains the others. For the first group, Fig 13 shows that the ratio of questions with between one and three answers is lower for Martial arts and Pets, but the ratio becomes higher as the number of answers increases, which indicates that in these two categories most questions are answered by many users. Second, on the contrary, the questions of the other three categories, Cycling, Health, and Software, are usually answered by only a few users, and questions answered by more than ten users are rare; intuitively, we can infer that in these three categories most questions are easy, so that they can be resolved by a few users.

    Fig 12. The number of answers for each category

    Fig 13. The ratio of answers in each question


5.1.4 The level of users for each category

The statistics of the user levels crawled from YA are shown in Table 11, and the percentage of each level, omitting the missing data, is reported in Table 12.

    Table 11. The statistics of level of users crawled from YA


    Table 12. The percentage of each level omitting the missing data

5.1.5 The level of askers for each category

Although the level of a user is not equal to the expertise of the user, it is similar to the user's experience in YA, and under this assumption we still regard the level of an asker as an indicator of the difficulty degree of the question. The distribution of asker levels for each category is shown in Fig 14. Level-1 askers occupy the largest share in all categories, which shows that in every category the percentage of easy questions is higher than that of hard questions. The category Pets has the lowest ratio of lv1 askers, 39.05%, and the other categories have mo