non-presented target word, yielding a similarity score. Table 13 presents the correlation between the similarity scores for each model and backward associative strength, including correlations previously reported for the LSA and LDA models described above[1].
Although none of the comparisons in Table 13 are statistically significant, the W3C3 model correlates
more strongly with the human data, as reflected by backward associative strength, than any of its
constituent models. Because the previously reported LSA and LDA results were obtained with a different
corpus (TASA), a direct comparison is not warranted. Nevertheless, the correlation for the W3C3 model
(0.34) is weaker than the one previously reported for LDA (0.44). This result was surprising,
particularly given the low correlation for WLM (0.24). We undertook a qualitative analysis to determine
whether Wikipedia’s link structure corresponds more closely to the associative strength behind the DRM
paradigm than the WLM metric reflects.
Table 13. Spearman rank correlations with backward associative strength for DRM lists (N = 55).
Model | Correlation |
LSA * | 0.30 |
LDA * | 0.44 |
W3C3 | 0.34 |
COALS | 0.27 |
ESA | 0.30 |
WLM | 0.24 |
Notes: * From Griffiths et al.[2]; LSA and LDA results include only 52 of 55 lists.
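As an illustration of how a correlation of the kind reported in Table 13 can be computed, the following is a minimal sketch of a Spearman rank correlation in plain Python. The function names and the five-item toy data are hypothetical and are not taken from the study; the actual analysis uses one similarity score and one backward associative strength value per DRM list (N = 55).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data, NOT values from the study.
model_similarity = [0.62, 0.15, 0.40, 0.33, 0.51]
backward_strength = [0.43, 0.10, 0.22, 0.35, 0.30]
rho = spearman(model_similarity, backward_strength)
print(f"Spearman rho = {rho:.2f}")
```

Ranking before correlating makes the measure sensitive only to the ordering of the lists, not to the scale of the similarity scores, which differs across models.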
Table 14 suggests that the raw link structure of Wikipedia might be more strongly
related to backward associative strength than the gist-like WLM metric reveals. Each word in Table 14 is
from the DRM list for sleep[3]. As shown in the table, most words (11/15) either have sleep as an outlink
or are used equivalently to mean sleep (as a redirect or anchor). In other words, this pattern of links
is consistent with the backward associative strength reported in[4].
Table 14. DRM list for target sleep (outlink ◦, inlink •, redirect/anchor *).
bed ◦ • | wake ◦ | snore ◦ |
rest * | snooze * | nap * |
awake ◦ | blanket ◦ • | peace |
tired | doze | yawn |
dream ◦ | slumber * | drowsy ◦ |
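The 11/15 figure can be checked with a short tally of the annotations in Table 14. This is only a sketch: the dictionary encoding and the helper name `count_with` are illustrative assumptions, while the marks themselves are transcribed from the table.

```python
# Link annotations transcribed from Table 14 (target word: sleep).
# "outlink" = ◦, "inlink" = •, "redirect" = * (redirect/anchor).
SLEEP_LIST = {
    "bed": {"outlink", "inlink"},
    "wake": {"outlink"},
    "snore": {"outlink"},
    "rest": {"redirect"},
    "snooze": {"redirect"},
    "nap": {"redirect"},
    "awake": {"outlink"},
    "blanket": {"outlink", "inlink"},
    "peace": set(),
    "tired": set(),
    "doze": set(),
    "yawn": set(),
    "dream": {"outlink"},
    "slumber": {"redirect"},
    "drowsy": {"outlink"},
}


def count_with(marks, wanted):
    """Count words carrying at least one of the wanted annotations."""
    return sum(1 for word_marks in marks.values() if word_marks & wanted)


linked = count_with(SLEEP_LIST, {"outlink", "redirect"})
inlinked = count_with(SLEEP_LIST, {"inlink"})
print(f"outlink or redirect/anchor: {linked}/{len(SLEEP_LIST)}")
print(f"inlink: {inlinked}/{len(SLEEP_LIST)}")
```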
To assess more rigorously whether raw Wikipedia link structure better reflects
backward associative strength, we recomputed the correlations from Table 13 with separate measures for
Wikipedia inlinks and outlinks. Recall from Section 2.3 that WLM has two separate measures for inlinks
- [1] Griffiths, T.L.; Steyvers, M.; Tenenbaum, J.B. Topics in semantic representation. Psychol. Rev. 2007, 114, 211–244.
- [2] Griffiths, T.L.; Steyvers, M.; Tenenbaum, J.B. Topics in semantic representation. Psychol. Rev. 2007, 114, 211–244.
- [3] Roediger, H.L.; Watson, J.M.; McDermott, K.B.; Gallo, D.A. Factors that determine false recall: A multiple regression analysis. Psychon. Bull. Rev. 2001, 8, 385–407.
- [4] Roediger, H.L.; Watson, J.M.; McDermott, K.B.; Gallo, D.A. Factors that determine false recall: A multiple regression analysis. Psychon. Bull. Rev. 2001, 8, 385–407.