Page:The World Within Wikipedia: An Ecology of Mind.pdf/15

Information 2012, 3

predicted responses for all stimulus words are some ordering of the ranks 1–5, then the median rank will be 3. The second task is simply the probability that the model’s first response matches the human first response. The results from this previous work as well as the results from our three constituent and W3C3 models on this task are presented in Table 10.
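The two measures can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and toy word lists are hypothetical, and pooling the ranks across all stimulus words before taking the median is an assumption, since the aggregation step is not spelled out on this page.

```python
from statistics import median


def model_ranks(model_ranking, human_top5):
    """Return the 1-based rank the model gives to each human associate."""
    position = {word: i + 1 for i, word in enumerate(model_ranking)}
    # Associates the model never ranks are skipped (an assumption).
    return [position[w] for w in human_top5 if w in position]


def median_rank(model_rankings, human_associates):
    """Median model rank of the first five human associates, pooled over stimuli."""
    pooled = []
    for stimulus, top5 in human_associates.items():
        pooled.extend(model_ranks(model_rankings[stimulus], top5))
    return median(pooled)


def first_associate_rate(model_rankings, human_associates):
    """Proportion of stimuli whose model first response is the human first response."""
    hits = sum(
        1
        for stimulus, top5 in human_associates.items()
        if model_rankings[stimulus][0] == top5[0]
    )
    return hits / len(human_associates)


# Hypothetical toy data: human associates ordered by forward strength,
# model responses ordered by model score.
human = {
    "dog": ["cat", "bone", "bark", "pet", "puppy"],
    "sun": ["moon", "hot", "light", "shine", "sky"],
}
model = {
    "dog": ["cat", "pet", "bark", "puppy", "bone", "leash"],
    "sun": ["hot", "moon", "sky", "shine", "light", "star"],
}

print(median_rank(model, human))
print(first_associate_rate(model, human))
```

In the toy data the model's ranks for each stimulus are a permutation of 1–5, which is the case described above in which the median rank is 3.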


Table 10. Median rank of each model's first five associates compared to human associates and proportion of model first associates that are the human first associate (N = 5,019).


Model            Median Rank   First Associate
LSA *            31            0.12
Topics Model *   18            0.16
W3C3             5             0.24
COALS            6             0.22
ESA              6             0.19
WLM              6             0.15

Notes: * From Griffiths et al. [1].


As in Studies 1 and 2, the W3C3 model agrees more closely with the human data than either its three constituent models or the previous models. It should be noted that the previous results reported in Table 10 are based on models that used a corpus two orders of magnitude smaller than Wikipedia and that were tested against only about 90% of the NMS forward associative strength data. The present study thus used a larger corpus but was also tested against 100% of the NMS forward strength data.


The results in Table 10 treat the NMS dataset as a collection of lists: each stimulus word is matched to a list of response words ranked by forward associative strength. However, it is also informative to consider each triple (stimulus word/response word/forward associative strength) individually, as was done in Study 2. Accordingly, Table 11 presents the correlations of model scores for each stimulus/response pair with the associated forward associative strength.
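The per-triple analysis amounts to a Pearson correlation between two aligned lists: the model's score for each stimulus/response pair and that pair's forward associative strength. A minimal sketch follows, with the formula written out for clarity. The triples and model scores are hypothetical toy values, not NMS data or output from any of the models discussed here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical triples: (stimulus, response, forward associative strength).
triples = [
    ("dog", "cat", 0.60),
    ("dog", "bone", 0.10),
    ("sun", "moon", 0.35),
    ("sun", "hot", 0.20),
]

# Hypothetical model relatedness scores for the same pairs.
model_score = {
    ("dog", "cat"): 0.82,
    ("dog", "bone"): 0.40,
    ("sun", "moon"): 0.75,
    ("sun", "hot"): 0.30,
}

strengths = [strength for _, _, strength in triples]
scores = [model_score[(stim, resp)] for stim, resp, _ in triples]
r = pearson(scores, strengths)
print(round(r, 2))
```

Each pair contributes one point to the correlation, so a stimulus with many recorded responses weighs more heavily than one with few.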


Table 11. Pearson correlations with NMS (N = 72,176).


Model   Correlation
W3C3    0.28
COALS   0.26
ESA     0.15
WLM     0.20


The W3C3 model has a significantly higher correlation with the human data than each of its three constituent models, p < 0.001. However, these correlations are less than half as high for the NMS dataset as they were for the MCSM dataset presented in Table 8. This difference suggests that the underlying assumptions of these models may not be well aligned with the constraints of the word association task.

  1. Griffiths, T.L.; Steyvers, M.; Tenenbaum, J.B. Topics in semantic representation. Psychol. Rev. 2007, 114, 211–244.