Measures of Semantic Relatedness   [http://cwl-projects.cogsci.rpi.edu/msr]
Abstract

    Measures of Semantic Relatedness (MSRs) are computational means for calculating the association strength between terms. MSRs have been used to produce models of human web-browsing behavior (Pirolli, 2005), augmented search engine technology (Dumais, 2003), and essay-grading algorithms for ETS (Landauer, Foltz, & Laham, 1998), and they could be useful for any cognitive model or AI agent that has to deal with text.

    http://cwl-projects.cogsci.rpi.edu/msr

Description

    To evaluate the semantic saliency of an interface, to predict the information-foraging behavior of human users, or to rearrange users’ information environment in a meaningful fashion, we must employ statistical methods to measure semantic distances between task goals and the words and phrases found in the interface. We have surveyed the currently available measures of semantic relatedness (MSRs), looking closely at the possible practical uses and limitations of each method, as well as at the methods used to evaluate MSRs. We have also implemented some MSRs locally for research and evaluation purposes, and set up an experiment that directly compares two MSRs at a time in those instances where the two MSRs make opposing predictions.

    Theoretical distinctions may be made between MSRs based on the practical concerns of particular applications of these tools. For example, the long training time of LSA may be a disadvantage when our MSRs must learn new terms, but LSA’s ability to compare whole paragraphs and documents may be essential for constructing semantic saliency maps (SSMs) of user interfaces. In practice, however, the most popular experimental methods for evaluating MSRs seem to be predicting human browsing behavior and answering synonym questions from standard examinations like TOEFL. Both evaluation methods are forced-choice comparisons of option terms given some target term. We designed an elaborate filtering system for selecting such target-option test cases, such that all terms are familiar to human subjects, all terms are within the vocabularies of the given MSRs, and the evaluated MSRs make differing predictions on the task.
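
    The three filtering criteria above can be illustrated with a minimal sketch. It is not the filtering system used in our experiments: the `MSR` function interface, the lookup-table MSRs, and all scores and word lists below are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Set

# Assumption: an MSR is a function that scores the relatedness of two terms,
# where a higher score means "more related".
MSR = Callable[[str, str], float]


def predict(msr: MSR, target: str, options: List[str]) -> str:
    """Forced choice: return the option the MSR rates most related to the target."""
    return max(options, key=lambda option: msr(target, option))


def keep_test_case(target: str, options: List[str],
                   msr_a: MSR, msr_b: MSR,
                   vocab_a: Set[str], vocab_b: Set[str],
                   familiar: Set[str]) -> bool:
    """Apply the three criteria: familiar terms, in-vocabulary terms, differing predictions."""
    terms = [target, *options]
    if not all(term in familiar for term in terms):
        return False  # some term may be unknown to human subjects
    if not all(term in vocab_a and term in vocab_b for term in terms):
        return False  # some term is missing from an MSR vocabulary
    # Keep only the cases where the two MSRs disagree on the best option.
    return predict(msr_a, target, options) != predict(msr_b, target, options)


def table_msr(scores: Dict[FrozenSet[str], float]) -> MSR:
    """Build a toy MSR from a lookup table of made-up pair scores."""
    return lambda a, b: scores.get(frozenset((a, b)), 0.0)


# Two toy MSRs that disagree about which option is closer to "doctor".
msr_a = table_msr({frozenset(("doctor", "nurse")): 0.9,
                   frozenset(("doctor", "hospital")): 0.4})
msr_b = table_msr({frozenset(("doctor", "nurse")): 0.3,
                   frozenset(("doctor", "hospital")): 0.8})
vocab = {"doctor", "nurse", "hospital"}

selected = keep_test_case("doctor", ["nurse", "hospital"],
                          msr_a, msr_b, vocab, vocab, familiar=vocab)
```

    In this toy case the two MSRs pick different options, so the test case would be kept; a case on which both MSRs agree carries no information about which one predicts human choices better.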

    We selected some of the most prevalent MSRs for our initial attempt at evaluating these tools, including LSA (Landauer & Dumais, 1997), PMI-IR (Turney, 2001), GLSA (http://glsa.parc.com/), WordNet (Budanitsky & Hirst, 2001, as cited by Kaur & Hornof, 2005), NGD (Cilibrasi & Vitanyi, 2005), and MHC (Veksler & Gray, in progress). Although in Experiment 1 we found that some MSRs may have predictive advantages over others, our preliminary results from Experiment 2, which used a different set of test cases, did not show the same advantages.
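
    To make the co-occurrence-based measures concrete, the sketch below computes pointwise mutual information (the quantity underlying PMI-IR) and the Normalized Google Distance from raw counts, following their standard textbook definitions. It is an illustration rather than our in-house implementation: the function names are ours, and the counts would in practice come from search-engine hits or a corpus index.

```python
import math


def _validate(count_x: int, count_y: int, count_xy: int, total: int) -> None:
    """Counts must be consistent: 0 <= co-occurrences <= each term count < total."""
    if not (0 <= count_xy <= min(count_x, count_y) and max(count_x, count_y) < total):
        raise ValueError("inconsistent counts")


def pmi(count_x: int, count_y: int, count_xy: int, total: int) -> float:
    """Pointwise mutual information in bits: log2(p(x, y) / (p(x) * p(y))).

    Higher values mean the terms co-occur more often than chance predicts.
    """
    _validate(count_x, count_y, count_xy, total)
    if count_xy == 0:
        return float("-inf")  # the terms never co-occur
    return math.log2((count_xy * total) / (count_x * count_y))


def ngd(count_x: int, count_y: int, count_xy: int, total: int) -> float:
    """Normalized Google Distance:

    (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y)))

    Lower values mean the terms are more closely related.
    """
    _validate(count_x, count_y, count_xy, total)
    if count_xy == 0:
        return float("inf")  # the terms never co-occur
    log_x, log_y = math.log(count_x), math.log(count_y)
    log_xy = math.log(count_xy)
    return (max(log_x, log_y) - log_xy) / (math.log(total) - min(log_x, log_y))


# Hypothetical counts: each term appears in 100 of 10,000 documents.
close_pair = ngd(100, 100, 90, 10_000)    # the terms usually appear together
distant_pair = ngd(100, 100, 50, 10_000)  # the terms appear together half the time
```

    Note that the two measures point in opposite directions: PMI is a relatedness score (higher is closer), whereas NGD is a distance (lower is closer), so one of them must be negated or inverted before their forced-choice predictions are compared.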

    Our preliminary findings suggest that the particular group of test terms selected affects MSR predictive power. The test environment may have such effects as well. Furthermore, the data corpus used to train each MSR most certainly has an effect (e.g., LSA may do better given a large number of books, while PMI may do better given Wikipedia to train on; perhaps all MSRs would do better given a large corpus of emails to train on), as do the various free parameters used in each MSR algorithm. A large-scale experiment is the next step in thorough MSR evaluation (MSRs x MSR-parameters x Training-Corpora x Test-Task x Test-Terms).
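
    The size of such a fully crossed design grows multiplicatively with each factor, which a short sketch makes visible. The factor levels below are placeholders chosen for illustration (the parameter settings and term-set names in particular are hypothetical), not a planned experimental design.

```python
from itertools import product
from typing import Dict, List

# Placeholder levels for each factor of the crossed design.
design: Dict[str, List[str]] = {
    "msr": ["LSA", "PMI-IR", "NGD"],
    "parameters": ["default", "tuned"],        # hypothetical parameter settings
    "corpus": ["books", "Wikipedia", "emails"],
    "task": ["synonym-questions", "browsing-behavior"],
    "term_set": ["set-1", "set-2"],            # hypothetical test-term samples
}


def conditions(design: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Enumerate every combination of factor levels (a full factorial design)."""
    names = list(design)
    return [dict(zip(names, levels)) for levels in product(*design.values())]


all_conditions = conditions(design)
print(len(all_conditions), "conditions; first:", all_conditions[0])
```

    Even with these few levels per factor the design has 3 x 2 x 3 x 2 x 2 = 72 cells, and each cell with a distinct MSR, parameter setting, and corpus requires its own training run.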

    Another important step in MSR evaluation will be to compare MSR learning rates (as opposed to performance) with those of human subjects. Detailed examination of where MSRs succeed and fail in matching human learning rates will give more precise insight into where the algorithms may be improved.

    Some remaining issues include the collection and in-house implementation of all viable MSRs, the collection of large and varied data corpora, and finding the large-scale computing power needed to train MSRs on these corpora.

  • Cilibrasi, R. & Vitanyi, P. (2005). Automatic meaning discovery using Google.
  • Kaur, I. & Hornof, A. J. (2005). A Comparison of LSA, WordNet and PMI-IR for Predicting User Click Behavior. In Proceedings of the Conference on Human Factors in Computing (CHI 2005) (pp. 51-60).
  • Landauer, T. K. & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211-240.
  • Turney, P. (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L. De Raedt & P. Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
  • Veksler, V. D. & Gray, W. D. (2006). Test Case Selection for Evaluating Measures of Semantic Distance.

Library
  • Veksler, V. D. & Gray, W. D. (2006). Test Case Selection for Evaluating Measures of Semantic Distance.
  • Veksler, V. D., Grintsvayg, A., Lindsey, R., & Gray, W. D. (2007). A Proxy for All Your Semantic Needs. Proceedings of the 29th Annual Meeting of the Cognitive Science Society, CogSci2007, Nashville, TN.

Copyright ©2003–2008, Rensselaer Polytechnic Institute: CogWorks Lab.