Ten-Years Research Progress of Natural Language Understanding Based on Perceptual Formalization

This paper reviews research progress on true machine understanding of natural language from three aspects. First, it explains why data or features should be described by perceptual structures. Second, it summarizes the main understanding algorithms proposed since the theory of true machine understanding was put forward, with emphasis on recent progress. Finally, in view of the current state of research, it outlines some future research directions for natural language understanding.


Introduction
The problem of natural language understanding has long been present in the field of natural language processing. In 1949, Warren Weaver [1] of the United States put forward the idea of machine translation, and IBM and other companies followed up with research. Seventeen years later, in 1966, the U.S. National Academy of Sciences submitted the ALPAC report [2], which argued that machine translation had run into semantic barriers; the semantic barrier is essentially a natural language understanding problem. People hope to process natural language through "understanding". Is there a way to turn raw text that is not understood into an understandable structure? Many attempts at conceptual understanding have given positive answers and inspiration. Today, perception theory is no longer confined to biology, and cognitive science [13,27,34] is known to many scholars in computer science. Since [28] was published in 2007 and put forward a theoretical framework for natural language understanding, research on machine understanding of natural language oriented to perceptual formalization has gone on for ten years and has produced some representative papers [28][29][30][31]. To help colleagues quickly grasp the relevant content, this paper presents the main results on real understanding of natural language, then analyzes future research directions and discusses the development trend of machine understanding.
By the definition of the function, the semantic effect of M on S can be represented by a mapping in which x is the external identification of s. The perceptual pattern set M includes the intrinsic perceptual patterns $M_i$ and the acquired perceptual pattern set $M_l$. A perceptual element pattern satisfies $m_i \in M_i$, and the acquired perceptual pattern set $M_l$, a perceptual set composed of perceptions p, is introduced by formula 1). The perceptual pattern set M corresponds to the external stimulus or external marker H.
Definition 3 [28] (Learning). This definition is stated for the cognitive system; see [28].
Perception is selected to perform machine understanding tasks because it can realize natural representation. In this paper, such perceptual semantic computation is referred to as p-Semantic Computation. Perceptual semantics forms as follows: under the action of the learning axioms, real-world images form perceptual pattern sets that are marked with natural language symbols, yielding language and its corresponding perceptual semantics. This is also the reason for using perception to express semantics. A semantic representation by symbols or concepts does not make clear what the cognitive object means, so image perception computation is needed to make up for the limitations of symbolic computation, and image thinking [34] must be combined with symbolic logical thinking to solve the problem.

Mechanisms of understanding
Natural language understanding is a complex problem, and psychology is still far from clear enough about it to formalize its laws or processes.
The studies [28][29][30][31] based on perceptual formalization address this problem by constructing an understanding formula for natural language on the basis of the what-why understanding effect, in order to reveal the laws of natural language understanding.

Understanding
The understanding formula [28] involves the following quantities: the external stimulus x, its corresponding perception pattern $m_x$, the perception subset P for x (indexed by x and g), the feeling of certainty $b_t$, and the matching function w.

Comprehensive understanding
For an external stimulus x, let P be the perception produced for it, together with the corresponding aggregatable perception set. The comprehensive understanding of x is then defined over these sets; see [28] for the formula. Proof: omitted; see [28] for the relevant proof.

Understanding effect
As is well known, physical theory rests on physical laws that are verified by experiment. In the same way, the perception-based theory of natural language understanding first established the what-why understanding effect through experiments [31], then deduced the whole understanding theory from it, and proved the completeness theorem of the theory.
What-why understanding effect. Paper [31] states the hypothesis and the expected outcome: if what-why factors cause the understanding effect, then supplying the values of these variables to a subject in a state of not understanding should produce the expected understanding effect, whereas withholding them should produce none. For variable control, the study controlled factors such as what-variables, why-variables, true and false words, and variable complexity, and took into account the operational definition of the variables for the subjects. The experiment found that the what-factor and the why-factor jointly lead to the understanding effect in natural language processing. See [31] for details of the experiments, results, and discussion.
Reliability. Perception-based natural language understanding research ensures its logical soundness through the what-why understanding effect verification above and a reliability proof: for any external stimulus x, if x can be derived as understandable from the axiom system, then it is correct that x can be understood. See [29] for the proof.
Completeness. Literature [29] proved completeness: for any external stimulus x, if $P_x$ can be understood by the machine, then its understanding can be derived from the axiom system, where $P_x$ denotes the perceptual set (i.e., the poly-perception set) corresponding to x. Proof: omitted; see [29].
Example 1. The following is the understanding of the stimulus "蓝" (blue). Readers can use this example to deepen their grasp of the definition of understanding.
(1) First, the perception set of the shape of "蓝" is matched with the perception pattern set in the cognitive system.
Example 2 [30]. This example analyzes the understanding process of "bright moonlight in front of the bed", a line from a poem by the Tang Dynasty poet Li Bai. The process is omitted here; see [30] for details.

Several algorithms related to the understanding mechanism
See [28] for experiments, experimental results, and discussion.

Pragmatic meaning derivation algorithm of natural language machine understanding
Definition 4 [29] (Pragmatic Meaning). Let G be a context made up of sentences, and let $S_j \in G$. The pragmatic meaning of a sentence $S_j$ in the corresponding context can be uniquely determined. Proof: omitted; see [29].
3) From the correspondence, obtain the pragmatic meaning part p (indexed by s and h); the correspondence follows pragmatics formula 6).
4) Repeat steps 1-3, output the pragmatic meaning part, then exit.
Experiment: omitted; see [29] for experimental results and discussion.
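The steps of the pragmatic meaning derivation algorithm can be illustrated with a minimal Python sketch. It is not the implementation from [29]: the `Rule` structure, the `locate` helper, and the position-by-position alignment between sentences and meanings are assumptions made only for illustration, and the rules are passed in directly rather than produced by a text understanding process.

```python
from typing import Dict, List, NamedTuple, Optional


class Rule(NamedTuple):
    """Hypothetical rule: sentences of a context aligned with their pragmatic meanings."""
    sentences: List[str]
    meanings: List[str]


def locate(sentence: str, rule: Rule) -> Optional[int]:
    """Step 2: position of the source sentence inside the rule, or None if absent."""
    try:
        return rule.sentences.index(sentence)
    except ValueError:
        return None


def derive_pragmatic_meaning(G: List[str], rules: List[Rule]) -> Dict[str, Optional[str]]:
    """Map every sentence of the context G to its pragmatic meaning (None if no rule applies)."""
    result: Dict[str, Optional[str]] = {}
    for sentence in G:  # step 4: repeat steps 1-3 for each sentence
        result[sentence] = None
        for rule in rules:  # step 1: rules are assumed to come from text understanding
            position = locate(sentence, rule)
            if position is not None:
                # Step 3: read the pragmatic meaning off the correspondence.
                result[sentence] = rule.meanings[position]
                break
    return result


# Toy context and rule, invented for this example.
context = ["It is cold in here.", "The window is open."]
rules = [Rule(sentences=context,
              meanings=["Please close the window.", "The open window explains the cold."])]
meanings = derive_pragmatic_meaning(context, rules)
```

In this sketch a sentence that no rule covers is mapped to `None`, which keeps the output aligned with the input context instead of silently dropping sentences.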

Deductive reasoning algorithm guided by natural language understanding
Definition 5 (Difficulty Element) [33]. From the user's point of view, a Difficulty Element is a sentence, or an element of a sentence, that is difficult to resolve and must be resolved by reasoning. In this article, a Difficulty Element is denoted dd.
For any conclusion K that can be expressed as a reasoning sequence $S_k$ over a fact F and the rule $R_m$, where $F_j$ is a fact in F, the solution ends when testknown($S_k$) = True. testknown(x) = False means that x contains a Difficulty Element dd; otherwise testknown(x) = True. See [33] for the algorithm and its expression, as well as the related experiments, results, and discussion. The three real understanding algorithms above are evaluated mainly by the what-why understanding effect, and the understanding indicator is the degree of understanding, which measures how much has been understood.

Related work and prospects
The main contribution of perception-based natural language understanding research is a set of laws and theorems about what understanding is and what natural language understanding is. Papers [35,36] resemble the present line of work [28][29][30][31][33][37] in linking perception with natural language to study the language grounding problem, which further supports the correctness and effectiveness of this research direction. However, perception in [35,36] remains at the conceptual level rather than the level of perceptual units.
Natural language understanding research at home and abroad can be classified into two types. (1) The first holds that understanding is the analysis of grammar, semantics, and pragmatics, as in systemic grammar [14], case grammar [3,4], full information theory [15], and so on. The statistical method is essentially lexical or syntactic analysis and can be placed in this category. Winograd (1983) [14] completed the SHRDLU system in a closed blocks world, using systemic grammar within a limited vocabulary, and its human-computer dialogue experiments were successful.
(2) The second holds that understanding is the mapping of concepts. R. Schank (1975) of Yale University and his colleagues put forward conceptual dependency (CD) theory [5], which holds that there is a conceptual basis in the human brain and that understanding natural language is a process of mapping concepts. Many typical theories of language understanding were influenced by this idea; for example, WordNet [6][7][8][9][10], HNC [17], HowNet [19], and ontology theory [11][12][13] are based on conceptual understanding.
Both approaches can be attributed to natural language understanding at the conceptual level. Natural language understanding based on perceptual formalization differs from both: it starts from the smallest element of semantics and builds on the what-why understanding effect obtained by the experimental method of physics, and in this sense it is a true understanding of natural language. Natural language understanding at the conceptual level [38] is different from real understanding.
In addition, this paper offers some research leads on machine understanding of natural language for readers' reference.

The natural language understanding basis of machine translation
At present, more and more algorithms are devoted to the problem of intermediate language representation of data. The KBMT and KANT systems [39] of Carnegie Mellon University are knowledge-based translation systems that use a restricted intermediate language. They are among the most important machine translation systems using the interlingua translation method.
Inspired by the interlingua translation model, we can consider constructing a machine translation model based on perceptual semantics. Since the source text and the translated text share the same meaning and the same perceptual elements, we can take the perceptual element as a sememe and use the perceptual semantic set as the interlingua. Paper [28] divides sentence groups into sentences and words, constructs understandable mature sentences, and then logically analyzes and generates the structure of the aggregative perception set. Paper [29] studies semantic representation for dynamic sentence meaning and the derivability of sentence understanding within sentence groups. Using these mathematical formulas, the idea of perceptual hierarchical representation can be combined with existing interlingua translation models.
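The idea of using a perceptual semantic set as the interlingua can be made concrete with a toy Python sketch. Everything in it is an assumption for illustration: the two small lexicons, the element names such as `hue:blue`, and the word-by-word lookup are invented here and are not taken from [28] or [29]. The sketch only shows the shape of the idea, namely that two words translate into each other when they carry the same set of perceptual elements.

```python
from typing import Dict, FrozenSet, List, Optional

PerceptualSet = FrozenSet[str]

# Hypothetical toy lexicons: each word is labelled with a set of perceptual elements.
LEXICON: Dict[str, Dict[str, PerceptualSet]] = {
    "zh": {"蓝": frozenset({"hue:blue"}), "月": frozenset({"object:moon"})},
    "en": {"blue": frozenset({"hue:blue"}), "moon": frozenset({"object:moon"})},
}


def to_interlingua(word: str, lang: str) -> Optional[PerceptualSet]:
    """Map a word to its perceptual semantic set, or None if the word is unknown."""
    return LEXICON[lang].get(word)


def from_interlingua(semantics: Optional[PerceptualSet], lang: str) -> Optional[str]:
    """Find a word in `lang` whose perceptual semantic set equals `semantics`."""
    if semantics is None:
        return None
    for word, candidate in LEXICON[lang].items():
        if candidate == semantics:
            return word
    return None


def translate(words: List[str], source: str, target: str) -> List[Optional[str]]:
    """Translate word by word through the shared perceptual semantic sets."""
    return [from_interlingua(to_interlingua(w, source), target) for w in words]
```

For example, `translate(["蓝", "月"], "zh", "en")` looks up each Chinese word's perceptual set and returns the English words carrying the same sets; a word missing from the lexicon yields `None`. Word order, grammar, and context are deliberately left out of this sketch.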

Machine learning based on natural language understanding
Big data provides the data basis for machine learning, in which deep learning generates the hidden layers of a neural network automatically. Inspired by this, deep learning research that uses perceptual elements as the hidden layer, based on perceptual expression, is worth carrying out. Papers [28,29] provide the relevant definitions, common premises, and theorem proofs for representation, understanding, and learning in a machine understanding and learning system. Papers [30,37] provide a machine understanding and learning method based on text analysis. On this basis, we can consider further applications of understanding theory in machine learning. Machine learning that combines perceptual elements with logic is a route toward human-level machine learning and an important small-data learning method in human-like learning.

The physiological basis of the invariance of perceptual properties
In formalizing natural language understanding, papers [28,29] formalize perception and separate its qualitative part from its quantitative part, thereby completing the formalization of the language understanding process. In this formalization the invariant qualitative part is represented by symbols. The qualitative part itself, however, belongs to physiology, and its study has physiological significance [29]; this is also a forward-looking topic that, according to [40], suggests a possible way to achieve human longevity.

Visual Turing test and intelligence definition
Problems of knowledge representation in traditional expert systems and knowledge engineering arise because the relevant knowledge that needs to be used is often insufficient. Paper [40], in combination with the visual Turing test, gave a proof framework for the relationship between the definition of intelligence and the Turing test [38]. We can therefore consider schemes to improve the traditional expert system: on the basis of the structure of conceptual space, combine it with the theory of understanding to address the shortage of common sense and the problem of knowledge representation. There are of course many other future research directions for natural language understanding based on perceptual formalization; those given here are only a few for readers' reference.

Example 1 (continued). (1) Matching against the perception pattern set in the cognitive system cS obtains semantics, including the perception pattern set and the certainty feeling of its label, so as to know what the stimulus is and why. (2) Each perceptual subset of these perceptual pattern sets is matched and understood, so as to know what these perceptual subsets are and why. (3) Each perception element p is then matched and disjoined, so as to know what the perception element for the color 'blue' is and why (the corresponding truth value). (4) Finally, the external stimulus "蓝" is understood by comprehensive matching and disjunction, so as to know fully what the various parts involved in the stimulus are and why.
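The layered matching in this example, from pattern set to perceptual subsets to perception elements, can be sketched in Python. The sketch makes several assumptions that are not in [28]: the `PATTERNS` table, the element names, and in particular the summary score, which is simply the fraction of matched elements and stands in for, rather than reproduces, the comprehensive understanding formula.

```python
from typing import Dict, Set, Tuple

# Hypothetical acquired pattern set: stimulus label -> perceptual subsets -> perception elements.
PATTERNS: Dict[str, Dict[str, Set[str]]] = {
    "蓝": {
        "shape": {"glyph:蓝"},
        "color": {"hue:blue"},
    }
}


def understand(stimulus: str, perceived: Set[str]) -> Tuple[Dict[str, Dict[str, bool]], float]:
    """Match a stimulus level by level and report truth values plus a matched fraction."""
    pattern = PATTERNS.get(stimulus, {})  # level 1: pattern set for the stimulus
    detail = {
        subset: {element: element in perceived for element in sorted(elements)}
        for subset, elements in pattern.items()  # levels 2 and 3: subsets, then elements
    }
    values = [ok for truth in detail.values() for ok in truth.values()]
    degree = sum(values) / len(values) if values else 0.0  # assumed summary score
    return detail, degree
```

Calling `understand("蓝", {"glyph:蓝", "hue:blue"})` returns a truth value for every element (the "why") grouped under the subset it belongs to (the "what"); if only the glyph is perceived, half of the elements match.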

Sentence understanding. Words are matched against the possible poly-perception sets of the sentence according to semantic constraints, yielding the understanding result w.
Context understanding. The understanding results of each poly-perception set at the sentence-group level are $y^c_j$, subject to the corresponding constraint.
Sentence arrangement. The understanding results of each poly-perception set at the level of generalization are $y^k_j$, and they conform to the constraints.

Algorithm PA.
Input: a set of linguistic material sequences G in a context.
Output: all sentences within the context G and their implications (pragmatic meanings) in the context G.
1) Obtain the rules $M_r$ from the text understanding process.
2) Get the location of the source sentence in the target rule $M_r$.

Pragmatic meaning derivation algorithm
Here, the meaning of the sentence is denoted s.