Scientists Tested Natural Language Models To Predict Human Language Judgments

Natural language models can be used to test computational hypotheses about how people understand language. A team of scientists from Columbia University in New York, led by Tal Golan and Matthew Siegelman, assessed the model-human consistency of several language models using a distinctive experimental approach: controversial sentence pairs. For each controversial sentence pair, two language models disagree about which sentence is more likely to occur in the real world. The study covered nine language models, including n-gram models, recurrent neural networks, and transformers.

The researchers generated hundreds of such controversial sentence pairs, either by selecting sentences from a corpus or by synthetically optimizing sentence pairs to be highly controversial. Human volunteers then judged which of the two sentences in each pair was more plausible. The controversial sentence pairs proved effective at exposing model flaws and at identifying the models most closely aligned with human judgments. GPT-2 was the most human-consistent model tested, but even its alignment with human perception showed serious deficiencies.
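The corpus-selection idea can be sketched in a few lines: a pair is controversial when two models disagree about which sentence is more probable. The toy scorers below are purely illustrative stand-ins (a real run would query, say, an n-gram model and GPT-2 for log-probabilities):

```python
# Two toy stand-ins for language models: each maps a sentence to a
# log-probability-like score. These scorers are hypothetical; real
# experiments would use trained models instead.
def model_a_logprob(sentence):
    # Illustrative scorer: prefers sentences with fewer words.
    return -len(sentence.split())

def model_b_logprob(sentence):
    # Illustrative scorer: prefers sentences with shorter words.
    words = sentence.split()
    return -sum(len(w) for w in words) / len(words)

def controversiality(s1, s2, score_a, score_b):
    """A pair is controversial when the models disagree on which
    sentence is more probable: model A prefers one sentence while
    model B prefers the other. Returns the size of the smaller
    preference margin, or 0.0 if the models agree."""
    a_margin = score_a(s1) - score_a(s2)
    b_margin = score_b(s1) - score_b(s2)
    if a_margin * b_margin < 0:  # opposite signs -> disagreement
        return min(abs(a_margin), abs(b_margin))
    return 0.0

corpus = [
    "the dog barked at the mailman",
    "a tiny cat slept quietly",
    "colorless green ideas sleep furiously",
    "he ate the entire extraordinarily complicated sandwich",
]

# Score every sentence pair and rank by controversiality.
pairs = [(s1, s2) for i, s1 in enumerate(corpus) for s2 in corpus[i + 1:]]
ranked = sorted(
    pairs,
    key=lambda p: controversiality(p[0], p[1], model_a_logprob, model_b_logprob),
    reverse=True,
)
best = ranked[0]  # the most controversial pair under these toy scorers
```

The key design point is that a controversial pair is informative no matter how the human decides: whichever sentence the human picks, at least one of the two models is contradicted.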

Natural Language Models

The researchers put nine models from three classes to the test: n-gram models, recurrent neural networks, and transformers. The n-gram models were trained with the Natural Language Toolkit's open-source code; the recurrent neural networks were trained using PyTorch architectures and optimization procedures; and the transformers were obtained from HuggingFace, an open-source repository. The team gathered judgments from 100 native English speakers in an online experiment. In each experimental session, participants were asked to decide which of two sentences they would be "more likely to encounter in the world, as either speech or written text" and to rate their confidence in their response on a 3-point scale.

Despite the consistency in model ranking between these findings and earlier work, GPT-2's severe failure to predict human responses to natural-versus-synthetic controversial pairs shows that GPT-2 does not adequately imitate the computations underlying human processing of even short sentences. This result is somewhat predictable, because GPT-2 is an off-the-shelf machine learning model that was not designed with human psycholinguistic and physiological constraints in mind. And although the researchers found considerable model-human inconsistency, a recent GPT-2 study reported that almost all of the explainable variance in human responses to natural sentences could be accounted for.

Natural And Synthetic Sentence Pairs

The researchers arranged 90 sentence pairs into ten sets of nine pairs each and assigned each set to a different group of ten participants. To assess model-human alignment, they calculated the percentage of trials in which a model and a participant agreed on which sentence was more likely. All nine language models predicted human choices for randomly sampled natural sentence pairs better than chance (50% accuracy). Between-model differences were analyzed statistically with a Wilcoxon signed-rank test across the ten participant groups, accounting for both participants and sentence pairs as random variables.
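The alignment metric described above is simply percent agreement between model choices and human choices, scored against a 50% chance level. A minimal sketch, with made-up trial data in place of the study's actual judgments:

```python
def percent_agreement(model_choices, human_choices):
    """Fraction of trials on which the model's preferred sentence
    matches the sentence the human judged more likely. Choices are
    encoded as 0 or 1: the index of the preferred sentence in the pair."""
    if len(model_choices) != len(human_choices):
        raise ValueError("choice lists must be the same length")
    agree = sum(m == h for m, h in zip(model_choices, human_choices))
    return agree / len(model_choices)

# Illustrative data: 10 trials, the model matches the human on 8 of them.
human = [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
model = [0, 1, 1, 0, 1, 1, 0, 1, 0, 0]
acc = percent_agreement(model, human)  # 0.8, versus a 0.5 chance level

# Per-group accuracies for two models could then be compared with a
# paired test such as scipy.stats.wilcoxon across participant groups.
```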

The researchers also developed a procedure for synthesizing controversial sentence pairs, in which naturally occurring sentences serve both as initializations for the synthetic sentences and as reference points that steer the synthesis. Starting from a naturally occurring sentence, they repeatedly replaced words in it with words from a predefined vocabulary, making the synthetic sentence less probable according to one language model while ensuring that it remained at least as probable as the original according to another model.
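That replacement loop can be sketched as a greedy search. This is an illustration of the idea, not the authors' exact optimization procedure, and the two scorers below are hypothetical stand-ins for real language models:

```python
import random

def optimize_controversial(sentence, vocab, score_down, score_keep,
                           steps=200, seed=0):
    """Greedy sketch of the synthesis loop: repeatedly try replacing one
    word with a vocabulary word, accepting the change only if it lowers
    the sentence's score under one model (score_down) while keeping it
    at least as high as the ORIGINAL sentence's score under the other
    model (score_keep)."""
    rng = random.Random(seed)
    words = sentence.split()
    floor = score_keep(sentence)  # the natural sentence is the reference point
    for _ in range(steps):
        i = rng.randrange(len(words))
        candidate = words[:i] + [rng.choice(vocab)] + words[i + 1:]
        cand = " ".join(candidate)
        if (score_down(cand) < score_down(" ".join(words))
                and score_keep(cand) >= floor):
            words = candidate
    return " ".join(words)

# Hypothetical stand-in scorers; a real run would query two language models.
score_down = lambda s: -s.count("z")    # model whose probability is pushed down
score_keep = lambda s: -len(s.split())  # model whose probability must not drop

natural = "the cat sat on the mat"
synthetic = optimize_controversial(natural, ["zzz", "buzz", "jazz"],
                                   score_down, score_keep)
```

By construction the output is controversial relative to the starting sentence: one model now rates it strictly lower, while the other still rates it at least as highly as the natural original.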

Human participants rated ten controversial synthetic-sentence pairs for each model pair. The researchers then assessed how well each model predicted human sentence choices across all of the controversial synthetic-sentence pairs in which it was one of the two models being compared.

Conclusion

The tests proved that:

  • There are multiple ways to generate controversial sentence pairs for natural language processing models: pairs can be selected from a corpus, or natural sentences can be modified to elicit controversial predictions.
  • Controversial sentence pairs make it possible to quickly discriminate between models that otherwise appear equally human-consistent.
  • All of the existing natural language processing model classes mistakenly assign high probability to certain non-natural sentences: a natural sentence can be modified so that its likelihood according to a given model does not diminish, yet according to human judgments the modified sentence becomes much less plausible.
  • This method of comparing and testing models may yield new insights into which kinds of models align best with human language perception and which kinds should be developed in the future.

