Method using a programmed digital computer system for translation between natural languages
A computerized translation method with universal application to all natural languages is provided. With this method, parameters are changed only when the source or target language is changed. The computerized method can be regarded as a self-contained system, developed to accept input texts in the source language and to look up individual text words (or sequences of text words) in various dictionaries. On the basis of the dictionary information, sequences of operations are carried out which gradually generate the multiplicity of computer codes needed to express all the syntactic and semantic functions of the words in the sentence. On the basis of all the codes and target meanings in the dictionary, plus synthesis codes for such meanings, translation is carried out automatically. Procedures which generate and easily update main dictionaries, idiom dictionaries, high-frequency dictionaries and compound dictionaries are integral parts of the system.
This invention relates to a method utilizing a digital computer for translating between natural languages.
Attempts have been made to utilize digital computers for translating from one language to another, i.e., from a source language to a target language. The translation systems involve a programmable digital computer system along with a program for effecting the translation. The approaches used were theoretical. The theoretical language approach to syntactical analysis has not been acceptable because it starts out from linguistic assumptions instead of considering the capabilities of the programmable digital computer and approaching the translation from the computer's point of view.
The idea of machine translation was conceived in 1946 by Warren Weaver and A. D. Booth. Many attempts have been made to achieve a machine translation system and put it into operation. The projects were directed toward developing linguistic theories encompassing the whole natural language and only then going to the computer. This approach inevitably failed because the human mind cannot encompass the totality of the language.
The following is a brief summary of these approaches and the theories behind them:
The Fulcrum theory approach, developed from 1959 to 1967 by the Bunker-Ramo Corporation, USA, was directed toward solving, with a relatively small dictionary, the problems occurring in a limited Russian text. No attempt was made to introduce resolution of multiple meanings; instead, several meanings were printed in the output, separated by slashes.
A predictive syntax system was developed by the National Bureau of Standards and the Massachusetts Institute of Technology from 1960 to 1964. This approach failed because it considered only one limited path through the sentence. The system was never implemented on a larger scale, but was used only within a limited experimental environment.
Transformational grammar was another approach. However, this approach turned out to be absolutely incompatible with computer translation requirements. Only small experimental systems have been developed on the basis of this theory, and they had to be discontinued before any significant translation was produced.
As compared to the prior art, it is an object of the invention to enable accurate, fast, almost instantaneous translation of large volumes of text from one natural language (the source language) into another natural language (the target language).
Instead of a natural language, an artificial language such as Esperanto can serve as the source or target language, as the case may be.
To achieve this purpose the subject matter of the present invention is a method using a programmable digital computer system, the steps comprising
Advantageously, the method comprises a step wherein a special subroutine of the dictionary lookup process searches a list of words (lexical list) to determine which words require the incorporation of a special lexical subroutine into the dictionary file, and a separate lexical control program analyzes the words of the source language sentence for those cases where only the results of the syntactic/semantic analysis of the sentence, or the membership of a word in a grammatical or semantic class, can determine whether or not a lexical subroutine must be called in at the time of translation of a specific word or expression. The lexical subroutine determines the meaning of that word or expression by examining the syntactic relationships which have been established for it and by utilizing syntactic and/or semantic rules which apply only to that word or to the class of words to which it belongs.
There is advantageously provided a method of resolving semantic ambiguities, wherein each word in the stem (single-word) dictionary is assigned a unique limited semantic (LS) number. The specialized multi-word expressions of the LS compound dictionary (LS expressions), which are composed of the LS number representations of their individual words in the stem dictionary, are then automatically grouped into dictionary records according to the principal word of the expression, one record existing for each unique principal word. The record is searched whenever the principal word occurs in the sentence to be translated, in order to determine whether a match exists between a group of contiguous words in the text and any of the expressions in that record, the longest match or the highest-priority expression being used to determine the specialized meaning of that group of words, which differs from the sum of the meanings of the individual words. A subset of the LS expressions is the conditional limited semantic (CLS) expression, which permits the inclusion, in the LS expression itself (i.e. in the dictionary), of all definable syntactic and semantic rules, as well as simple programming instructions which can be used to change the information stored in the bits and bytes of the sentence analysis area. The dictionary expression is then matched only when the syntactic and/or semantic conditions expressed in the rule(s) have been met by the word(s) in the text, this fact being determinable only after a complete syntactic and semantic analysis of the sentence has been carried out by the linguistic analysis programs.
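The LS compound lookup described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sample words, the LS numbers, the tuple layout of a compound entry and the function names are hypothetical and are not taken from the patent. Priority handling and CLS conditions are omitted; only the grouping by principal word and the longest-match rule are shown.

```python
from collections import defaultdict

# Hypothetical stem dictionary: each word maps to a unique LS number.
STEM_LS = {"the": 104, "hot": 101, "dog": 102, "stand": 103, "sells": 105}

# Hypothetical LS compound entries:
# (sequence of LS numbers, index of the principal word, specialized meaning)
COMPOUNDS = [
    ((101, 102), 1, "sausage in a bun"),
    ((101, 102, 103), 1, "sausage stall"),
]

def build_records(compounds):
    """Group compound expressions into one record per principal word."""
    records = defaultdict(list)
    for ls_seq, principal_idx, meaning in compounds:
        records[ls_seq[principal_idx]].append((ls_seq, principal_idx, meaning))
    return records

def match_at(ls_text, pos, records):
    """Return (start, end, meaning) for the longest expression whose
    principal word is the word at position pos, or None if nothing matches."""
    best = None
    for ls_seq, principal_idx, meaning in records.get(ls_text[pos], []):
        start = pos - principal_idx
        end = start + len(ls_seq)
        if start < 0 or end > len(ls_text):
            continue  # expression would run past the sentence boundary
        if tuple(ls_text[start:end]) == ls_seq:
            if best is None or (end - start) > (best[1] - best[0]):
                best = (start, end, meaning)
    return best

RECORDS = build_records(COMPOUNDS)
sentence = ["the", "hot", "dog", "stand", "sells"]
ls_text = [STEM_LS[w] for w in sentence]
result = match_at(ls_text, 2, RECORDS)  # "dog" is the principal word
```

Here both compounds match around "dog", and the three-word expression wins because it is the longer match.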
In the method, all source language words can advantageously be supplied with semantic category codes in a variable-length format, which are then interrogated by the source language analysis programs, the lexical routines, and the CLS dictionary lookup routines as an aid in resolving semantic ambiguities. These codes are expressed in a hierarchical taxonomy (a set of tree structures) in such a way that the coding of a category which exists at a lower level of a semantic tree causes the system itself to code the appropriate higher-level categories automatically.
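The automatic generation of higher-level codes from a lower-level category can be sketched as a walk up a parent table. The category names and the `PARENT` table below are hypothetical; the patent does not list its actual semantic categories here.

```python
# Hypothetical taxonomy: child category -> parent category.
PARENT = {
    "dog": "mammal",
    "mammal": "animal",
    "animal": "animate",
    "hammer": "tool",
    "tool": "artifact",
}

def expand_codes(coded):
    """Return the explicitly coded categories plus every ancestor
    implied by the tree structure."""
    result = set()
    for code in coded:
        # Climb toward the root, stopping early if this branch is already covered.
        while code is not None and code not in result:
            result.add(code)
            code = PARENT.get(code)
    return result

codes = expand_codes({"dog"})
```

A coder who marks a word only as "dog" thus obtains the "mammal", "animal" and "animate" codes without entering them by hand.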
The invention involves a method of operating a digital computer to translate from a source natural language, e.g. Russian, to a target natural language, e.g. English. The method involves three phases. The dictionary look-up phase establishes the target language meaning of each word or expression in the source text, attaching grammatical codes and target language equivalents to each word and expression in the source language. The syntactical analysis phase identifies syntactical information on the basis of the grammatical information attached to the words and expressions, and also utilizes the inflection of each word and its position in the source text. The synthesis phase takes the meaning and syntactical information of all the words of a sentence in the source text and forms a sentence in the target language.
More specifically, the method begins by loading the source text into the memory of a computer. Each source text word is then transformed into a converted source text word, which consists of the source text word and coded information. The coded information may include a memory offset address linkage which provides access to a memory location that contains syntactical information and a translation for the source text word. The converted source text words which derive from a source text sentence are then synthesized into a target language translation of that sentence. The synthesis correctly establishes both word meaning and word position in the target language sentence.
An important aspect of the invention is the separate treatment given to high-frequency versus low-frequency words. In order to maximize the effective capacity of the core memory of the computer, the low-frequency words carry their translation information along with them, while each of the high-frequency words carries a memory offset address linkage which allows easy access to its translation information, which is stored in the core memory. Thus, the translation information for frequently used words is held once in an easily accessible place in the computer rather than being repeated with every occurrence of the word, as is done for low-frequency words.
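The difference between the two word classes can be sketched as follows. This is an illustration under stated assumptions, not the patent's actual record layout: the `ConvertedWord` class, the `HF_TABLE` and `MAIN_DICT` contents, and the transliterated sample words are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical resident table: translation information for each
# high-frequency word is stored once and addressed by offset.
HF_TABLE = [
    {"target": "and", "pos": "conjunction"},
    {"target": "in", "pos": "preposition"},
]
HF_OFFSET = {"i": 0, "v": 1}  # transliterated source word -> offset into HF_TABLE

# Hypothetical main dictionary for low-frequency words.
MAIN_DICT = {"perevodchik": {"target": "translator", "pos": "noun"}}

@dataclass
class ConvertedWord:
    source: str
    hf_offset: Optional[int] = None  # linkage, set only for high-frequency words
    info: Optional[dict] = None      # carried inline by low-frequency words

def convert(word):
    """Attach either an offset linkage or the full translation information."""
    if word in HF_OFFSET:
        return ConvertedWord(word, hf_offset=HF_OFFSET[word])
    return ConvertedWord(word, info=MAIN_DICT.get(word))

def translation_info(cw):
    """Resolve the translation information regardless of word class."""
    if cw.hf_offset is not None:
        return HF_TABLE[cw.hf_offset]
    return cw.info

words = [convert(w) for w in ["perevodchik", "i", "v"]]
targets = [translation_info(cw)["target"] for cw in words]
```

The point of the design is that a frequent word costs only a small offset per occurrence, while the bulky information sits in memory once.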
While the above description portrays a human analogy of how the claimed invention functions, it must be understood that the actual operation of the process by the computer is quite different. From the time that the source text is converted to machine-readable input data until the time that the machine-readable output data is converted to human-readable translation text, the claimed process proceeds under the control of a computer program. While it is convenient to describe the steps of the program as if they were being performed by a human translator, in fact nothing of the kind is happening. Rather, the computer is carrying out a series of unthinking, abstract mathematical operations on the abstract values stored in the memory of the computer. The program functions independently of the meaning or significance of the data on which it is acting. The fact that the program is formed in a high-level programming language, which makes the program appear to give significance to the machine operation, does not change the fact that the machine is actually carrying out a series of abstract steps which have nothing to do with translating between natural languages. If a different kind of information were fed into the computer, the program used in this invention could conceivably perform a function totally different from translating.
The invention also comprises the process by which information is extracted from the computer, including printing out the translation, i.e. the step of converting the target language sequence from computer-intelligible binary coded signals back to visual indicia.
It is of great advantage that, by means of the present invention, any source language can be translated into a target language by applying the proposed method and using the same equipment, changing only the language-dependent parts, such as the dictionaries and the linguistic programs, which include the full set of syntactic and semantic rules of both languages.
Another advantage of the method according to the invention is the consistency of the translation, i.e. a particular expression in the same environment is always translated from the source language into the target language in the same way; if the environment changes, the method changes the meaning of the expression correspondingly, with the same consistency. A further advantage is that all the recognized syntactic and semantic features of the source language are reflected in the target language.
Specific embodiments of the invention will now be described by way of example with reference to the accompanying drawings, in which:
As shown in Figure 1, texts are input to the translation system either on-line via a terminal or off-line via key-to-disk or key-to-tape devices. (Input via punch cards is also possible.) The input text and the translation dictionaries and programs reside on disk files accessible by the computer, which performs the translation. The results are then printed out in the desired format on the printer, taking into consideration any format codes in the input text.
Post-editing of the output text and any resulting dictionary update can take place, if desired, via an on-line interactive terminal (or it can be performed off-line by post-editors and dictionary coders).
Diagrams and tables which must be included in the same form in the translated output can be incorporated in the text via a photocomposition device.
Figure 2 illustrates the program (LOADTXT) which loads the input text into computer memory, identifies the high-frequency words and the idioms in the text, and assigns each word of the text a unique serial number. It immediately attaches a translation to the idioms and ensures that the information for all the high-frequency words is immediately available in computer core memory during the translation process. The remaining (low-frequency) words are then sorted into alphabetic order to facilitate the process of main dictionary look-up (MDL). Main dictionary look-up attaches to each low-frequency word the necessary grammatical information as well as a cross-reference to the record of multi-word limited semantic (LS) compounds containing that word. After main dictionary look-up, all the words in the text are re-sorted into their original sequence.
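The serial-number, sort, look-up and re-sort sequence described above can be sketched in a few lines of Python. This is a simplified illustration, not the LOADTXT or MDL programs themselves: idiom handling and LS cross-references are left out, and the dictionaries, grammatical codes and the function name `load_and_lookup` are hypothetical.

```python
# Hypothetical data; the real system reads its dictionaries from disk files.
HIGH_FREQ = {"the": "ART", "of": "PREP"}                       # word -> grammatical code
MAIN_DICT = {"computer": "N", "text": "N", "translates": "V"}  # word -> grammatical code

def load_and_lookup(text):
    # Assign every word a unique serial number recording its position.
    words = [{"serial": n, "word": w} for n, w in enumerate(text.split())]

    # High-frequency words are resolved at once from the resident table.
    high = [w for w in words if w["word"] in HIGH_FREQ]
    for w in high:
        w["gram"] = HIGH_FREQ[w["word"]]

    # Low-frequency words are sorted alphabetically so that the main
    # dictionary could be consulted in a single ordered pass.
    low = [w for w in words if w["word"] not in HIGH_FREQ]
    low.sort(key=lambda w: w["word"])
    for w in low:
        w["gram"] = MAIN_DICT.get(w["word"])

    # Re-sort by serial number to restore the original text sequence.
    return sorted(high + low, key=lambda w: w["serial"])

result = load_and_lookup("the computer translates the text")
```

The serial number is what makes the alphabetic detour safe: however the words are reordered for look-up, the original sequence can always be restored.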
According to Figure 3, the program INITCALL controls the actual translation process. It first calls the program GETSENTN, which establishes a sentence analysis area in computer memory, consisting of 160 bytes of fixed-length information for each word, with cross-references to additional variable-length information areas.
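A sentence analysis area of this shape can be sketched as one contiguous buffer of 160-byte records plus a side store for variable-length data. Only the 160-byte record size comes from the text; the `SentenceArea` class, the use of bytes 0 and 1 of each record as the cross-reference field, and the method names are assumptions made for illustration.

```python
WORD_RECORD_SIZE = 160  # fixed-length bytes per word, as stated in the text

class SentenceArea:
    """Hypothetical layout: one fixed 160-byte record per word, plus a
    list of variable-length areas referenced from the record."""

    def __init__(self, n_words):
        self.fixed = bytearray(n_words * WORD_RECORD_SIZE)
        self.variable = []

    def record(self, i):
        """Writable view of the fixed-length record of word i."""
        start = i * WORD_RECORD_SIZE
        return memoryview(self.fixed)[start:start + WORD_RECORD_SIZE]

    def attach(self, i, data):
        """Store variable-length data and write a 1-based cross-reference
        into bytes 0-1 of the record (0 means no attachment)."""
        self.variable.append(data)
        ref = len(self.variable)
        self.record(i)[0:2] = ref.to_bytes(2, "big")

    def attached(self, i):
        """Follow the cross-reference of word i, if any."""
        ref = int.from_bytes(bytes(self.record(i)[0:2]), "big")
        return self.variable[ref - 1] if ref else None

area = SentenceArea(3)
area.attach(1, b"extra grammatical detail")
```

Because every record has the same size, the record of word i is found by simple multiplication, which is the practical benefit of keeping the fixed part fixed.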
At first the sentence analysis area contains only the dictionary information for each word. During the translation process, additional information is added by the source language analysis programs (referred to as "passes"), which resolve syntactic and semantic ambiguities, establish clause boundaries, describe basic syntactical relationships between words, identify subject/predicate relationships and analyze the function of any prepositions.
Limited semantic (LS) and conditional limited semantic (CLS) dictionary look-up takes place at appropriate points in the analysis, conditional limited semantic look-up being possible only after the entire sentence has been analyzed.
The remaining four steps, which synthesize the target language on the basis of the total information contained in the sentence analysis area, are: the translation of prepositions, the solving of word-specific problems by lexical routines, the actual synthesis of the target language words, and the rearrangement of the sentence into the word order appropriate to the target language.
The final step in the translation process is the printing of the translated output by the program TRPRINT.