Page:The World Within Wikipedia: An Ecology of Mind.pdf/18

Information 2012, 3

non-presented target word, yielding a similarity score. Table 13 presents the correlation between the similarity scores for each model and backward associative strength, including correlations previously reported for the LSA and LDA models described above[1].

Although none of the comparisons in Table 13 are statistically significant, the W3C3 model has a higher correlation with the human data, as reflected by backward associative strength, than the constituent models. Because the previous LDA and LSA results used a different corpus (TASA), a direct comparison is not warranted. Nevertheless, the W3C3 correlation (0.34) is not as strong as the one previously reported for LDA (0.44). This result was surprising, particularly the low correlation for WLM (0.24). We therefore undertook a qualitative analysis to determine whether Wikipedia’s link structure corresponds more closely to the associative strength behind the DRM paradigm than the WLM metric reflects.

Table 13. Spearman rank correlations with backward associative strength for DRM lists (N = 55).

Model     Correlation
LSA *     0.30
LDA *     0.44
W3C3      0.34
COALS     0.27
ESA       0.30
WLM       0.24

Notes: * From Griffiths et al.[2]; LSA and LDA results include only 52 of 55 lists.
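
As an illustration of the statistic reported in Table 13, the following is a minimal sketch of a Spearman rank correlation, computed as the Pearson correlation of average ranks. The function names and the toy scores are hypothetical and are not the data or code used in the paper; in the setting described here, one list would hold a model's similarity score per DRM list and the other the backward associative strength per list.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation between the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical toy values, for illustration only.
model_similarity = [0.1, 0.4, 0.2, 0.9, 0.5]
associative_strength = [0.2, 0.3, 0.1, 0.8, 0.7]
rho = spearman(model_similarity, associative_strength)
```

Because only ranks enter the computation, the statistic is insensitive to the different scales on which the models express similarity.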

Table 14 suggests that the raw link structure of Wikipedia might be more strongly related to backward associative strength than the gist-like WLM metric reveals. Each word in Table 14 is from the DRM list for sleep[3]. As shown in the table, most words (11/15) have sleep as an outlink or are used equivalently to mean sleep. In other words, this pattern of links is consistent with the backward associative strength reported by Roediger et al.[4].

Table 14. DRM list for target sleep (outlink ◦, inlink •, redirect/anchor *).

bed ◦ •      wake ◦         snore ◦
rest *       snooze *       nap *
awake ◦      blanket ◦ •    peace
tired        doze           yawn
dream ◦      slumber *      drowsy ◦
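
The 11/15 tally for Table 14 can be checked with a short sketch. The markers are transcribed from the table; the dictionary encoding, the label names, and the function name are our own illustrative choices, not part of the original analysis.

```python
# Table 14 transcribed: each DRM list word for the target "sleep",
# mapped to the set of link markers shown next to it.
TABLE_14 = {
    "bed": {"outlink", "inlink"},
    "wake": {"outlink"},
    "snore": {"outlink"},
    "rest": {"redirect_anchor"},
    "snooze": {"redirect_anchor"},
    "nap": {"redirect_anchor"},
    "awake": {"outlink"},
    "blanket": {"outlink", "inlink"},
    "peace": set(),
    "tired": set(),
    "doze": set(),
    "yawn": set(),
    "dream": {"outlink"},
    "slumber": {"redirect_anchor"},
    "drowsy": {"outlink"},
}


def linked_count(table, kinds=("outlink", "redirect_anchor")):
    """Count words carrying at least one of the given link markers."""
    wanted = set(kinds)
    hits = sum(1 for marks in table.values() if marks & wanted)
    return hits, len(table)


# Words that link out to "sleep" or are used equivalently to it.
hits, total = linked_count(TABLE_14)
```

Restricting `kinds` to `("inlink",)` counts only the words that the sleep article links to, which in this table are bed and blanket.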

To assess more rigorously whether raw Wikipedia link structure might better reflect backward associative strength, we recomputed the correlations from Table 13 with separate measures for Wikipedia inlinks and outlinks. Recall from Section 2.3 that WLM has two separate measures for inlinks

  1. Griffiths, T.L.; Steyvers, M.; Tenenbaum, J.B. Topics in semantic representation. Psychol. Rev. 2007, 114, 211–244.
  2. Griffiths, T.L.; Steyvers, M.; Tenenbaum, J.B. Topics in semantic representation. Psychol. Rev. 2007, 114, 211–244.
  3. Roediger, H.L.; Watson, J.M.; McDermott, K.B.; Gallo, D.A. Factors that determine false recall: A multiple regression analysis. Psychon. Bull. Rev. 2001, 8, 385–407.
  4. Roediger, H.L.; Watson, J.M.; McDermott, K.B.; Gallo, D.A. Factors that determine false recall: A multiple regression analysis. Psychon. Bull. Rev. 2001, 8, 385–407.