Thursday, February 10, 2011

“The history of AI is a history of fantasies, possibilities, demonstrations, and promise” - AI in T9 (Predictive Text)



I find the most outstanding feature of T9 predictive text to be its ability to make predictions (Planning) and to find patterns in a stream of input (Learning).


Firstly, what is AI?

Artificial intelligence (AI) is the intelligence of machines and the branch of computer science that aims to create it. It is the study and design of an intelligent agent that perceives its environment and takes actions that maximize its chances of success.
The field was founded on the claim that a central property of humans, intelligence—the sapience of Homo sapiens—can be so precisely described that it can be simulated by a machine. Artificial intelligence has been the subject of optimism, but has also suffered setbacks and, today, has become an essential part of the technology industry, providing the heavy lifting for many of the most difficult problems in computer science.


What is T9 (Predictive Text)?

T9, which stands for Text on 9 keys, is a patented predictive text technology for mobile phones, originally developed by Tegic Communications, now part of Nuance Communications.

Design
  • T9's objective is to make it easier to type text messages. It allows words to be entered by a single keypress for each letter, as opposed to the multi-tap approach used in the older generation of mobile phones in which several letters are associated with each key, and selecting one letter often requires multiple keypresses.
  • It combines the groups of letters on each phone key with a fast-access dictionary of words. It looks up in the dictionary all words corresponding to the sequence of keypresses and orders them by frequency of use.
  • Learning: After seeing a number of text messages, it determines which category (frequently or rarely used) each word belongs in. As it gains familiarity with the words and phrases the user commonly uses, it speeds up the process by offering the most frequently used words first and then lets the user access other choices with one or more presses of a predefined Next key.
  • New Word additions: The dictionary can be expanded by adding missing words, enabling them to be recognized in the future. After introducing a new word, the next time the user tries to produce that word T9 will add it to the predictive dictionary.
  • Replacing Text
  • Reordering of New Words
  • Prioritizing: The prioritizing function is used to manage the number of user-defined words. These words are prioritized according to their frequency of usage. Words that are not used frequently can be deleted, which keeps the size of the list bounded. The permanent words have NULL priority.
  • Word Editing: When an already typed word is modified, the dictionary pointer is moved to the most probable complete word present in the dictionary. If it exists, the word is replaced by the modified, extended word. If it does not exist, the word is treated as a new word and the New Word addition procedure is followed.
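
The lookup, frequency ordering, learning and new-word steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not T9's actual (proprietary) implementation: the class name PredictiveDictionary, its methods and the frequency counts are all made up for the example, and only the letters a to z are handled.

```python
from collections import defaultdict

# Standard phone keypad letter groups.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def word_to_keys(word):
    """Return the digit sequence that spells `word`, one keypress per letter."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


class PredictiveDictionary:
    """Hypothetical dictionary: digit sequence -> words, ordered by usage."""

    def __init__(self, frequencies):
        self.freq = dict(frequencies)          # word -> usage count (assumed data)
        self.index = defaultdict(list)         # digit sequence -> words
        for word in self.freq:
            self.index[word_to_keys(word)].append(word)

    def candidates(self, keys):
        """All words matching the keypresses, most frequently used first."""
        return sorted(self.index.get(keys, []), key=lambda w: -self.freq[w])

    def select(self, word):
        """Record that the user chose `word`, adding it first if it is new."""
        if word not in self.freq:
            self.freq[word] = 0
            self.index[word_to_keys(word)].append(word)
        self.freq[word] += 1


d = PredictiveDictionary({"good": 5, "home": 4, "gone": 2, "hood": 1})
print(d.candidates("4663"))   # "good" is offered first
d.select("home")
d.select("home")
print(d.candidates("4663"))   # "home" has now overtaken "good"
```

Each press of the Next key would simply step to the following entry of the list returned by candidates().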



The graphs in Fig.1 give the relation between the Skip Number and Total Count. Skip Number denotes the number by which the pointer moves in one attempt during the searching of the word in the dictionary. The Total number of moves by the pointer in the process is given by Total Count. Based on the graphs for a few sample words, the optimum Skip Number can be obtained.


 


                               Fig. 1(a...e).  Skip Number vs. Total Count



The graph in Fig. 2 shows the number of movements of the Dictionary Position before reaching the word.



Features
  • Some T9 implementations feature smart punctuation. This feature allows the user to insert sentence and word punctuation using the '1'-key. Depending on the context, smart punctuation inserts sentence punctuation (period) or embedded punctuation (period or hyphen) or word punctuation (apostrophe in can't, won't, isn't as well as the possessive 's). Depending on the language, T9 also supports word breaking after punctuation to support clitics such as l' and n' in French and 's in English.
  • The UDB is an optional feature which allows words that were explicitly entered by the user to be stored for future reference. The number of words stored depends on the implementation as well as the language.
  • In later versions of T9, the order of the words presented adapts to the usage pattern. For instance, in English, 4663 matches "good", "home", "gone", "hood", etc. Such combinations are known as textonyms; e.g., "home" is referred to as a textonym of "good". When the user uses "home" more often than "good", eventually the two words will switch position. Information about common word combinations can also be learned from the user and stored for future predictions.
  • For words entered by the user, word completion can be enabled. When the user enters matching key-presses, in addition to words and stems, the system will also provide completions.
  • In later versions of T9, the user can select a primary and secondary language and matches from both languages are presented. This enables users to write messages in their native as well as a foreign language.
  • Some implementations also learn commonly used word pairs and provide word prediction (e.g. if you often write "eat food", after entering "eat" the phone will suggest "food" and it can be confirmed by simply pressing next).
  • Auto correction: Another powerful feature is its ability to automatically recognise and correct typing/texting errors, by looking at neighbouring keys on the keypad to ascertain an incorrect keypress. For example, the word "testing" would be entered with the key combination "8378464". Entering the same number but with two incorrect keypresses of neighbouring keys, e.g., "8278494" still results in T9 suggesting the words "tasting" (8278464), "testing" (8378464), and "tapping" (8277464).
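
The auto-correction example above can be reproduced with a small sketch. It assumes that two keys count as neighbours when they sit side by side or one above the other on the 3x3 digit grid, and that at most two slips are tolerated; the function names and the tiny word list are hypothetical, and a real T9 implementation may use different rules.

```python
# Digit keys laid out as on a phone keypad (the 0, * and # keys are ignored).
ROWS = ["123", "456", "789"]
POS = {d: (r, c) for r, row in enumerate(ROWS) for c, d in enumerate(row)}


def are_neighbours(a, b):
    """True if keys a and b are horizontally or vertically adjacent."""
    (r1, c1), (r2, c2) = POS[a], POS[b]
    return abs(r1 - r2) + abs(c1 - c2) == 1


def count_slips(typed, target):
    """Number of neighbouring-key slips turning `target` into `typed`,
    or None if the sequences cannot be explained that way."""
    if len(typed) != len(target):
        return None
    slips = 0
    for t, g in zip(typed, target):
        if t == g:
            continue
        if not are_neighbours(t, g):
            return None
        slips += 1
    return slips


def suggest(typed, words_by_keys, max_slips=2):
    """Words whose key sequence is within `max_slips` slips, fewest first."""
    scored = []
    for keys, words in words_by_keys.items():
        slips = count_slips(typed, keys)
        if slips is not None and slips <= max_slips:
            scored.extend((slips, word) for word in words)
    return [word for _, word in sorted(scored)]


words_by_keys = {
    "8278464": ["tasting"],
    "8378464": ["testing"],
    "8277464": ["tapping"],
    "4663": ["good", "home"],
}
print(suggest("8278494", words_by_keys))
# → ['tasting', 'tapping', 'testing']
```

"tasting" comes first because it needs only one corrected keypress, while the other two need two each.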

The data structure used in T9 Dictionary

A trie for keys "A", "to", "tea", "ted", "ten", "i", "in", and "inn"

In computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are usually strings.
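
A generic trie holding the keys from the figure might look like the sketch below. It shows the data structure in general, not the compressed layout T9 actually uses, and the names Trie, TrieNode, contains and with_prefix are chosen only for this example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # next character -> child node
        self.is_word = False   # True if a stored key ends at this node


class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def _find(self, prefix):
        """Return the node reached by following `prefix`, or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._find(word)
        return node is not None and node.is_word

    def with_prefix(self, prefix):
        """All stored keys that start with `prefix`, in sorted order."""
        start = self._find(prefix)
        if start is None:
            return []
        found, stack = [], [(start, prefix)]
        while stack:
            node, text = stack.pop()
            if node.is_word:
                found.append(text)
            for ch, child in node.children.items():
                stack.append((child, text + ch))
        return sorted(found)


trie = Trie(["A", "to", "tea", "ted", "ten", "i", "in", "inn"])
print(trie.contains("tea"))      # → True
print(trie.contains("te"))       # → False ("te" is only a stem)
print(trie.with_prefix("te"))    # → ['tea', 'ted', 'ten']
```

Because words sharing a prefix share a path from the root, stems such as "te" are stored only once.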

Algorithm
In order to achieve compression ratios of close to 1 byte per word, T9 uses an optimized algorithm which maintains the order of words and partial words (also known as stems). Because of this compression, however, it over-generates words, which are sometimes visible to the user as 'junk words'.
The algorithm has three modules: the Dictionary Module, the Temporary File Module and the Window Module (http://www.acadjournal.com/2003/v10/part1/p1/).

Intelligence
  • Deduction, reasoning, problem solving: The algorithm imitates the step-by-step reasoning that humans were often assumed to use when they solve puzzles, play board games or make logical deductions. To deal with uncertain or incomplete information, it employs concepts from probability and economics.
  • Knowledge representation: The algorithm will require extensive knowledge about the world. 
    “Knowledge about knowledge (what we know about what other people know)”
  • Planning: It periodically checks if the word matches its predictions and changes its plan as this becomes necessary, requiring the agent to reason under uncertainty.
  • Natural language processing: Gives machines the ability to read and understand the languages that humans speak.
  • Social intelligence: For good human-computer interaction, an intelligent machine also needs to display emotions; smileys ( :) :( :| ) are used here.

Relationship with Users
On a typical telephone keypad, groups of letters in alphabetical order are associated with numeric keys. The user presses the numeric keys that represent the letters one after another (one keypress per letter), thus building each word sequentially. The algorithm also searches sequentially, moving from one possibility to another in the word list each time the user presses the next numeric key to represent the succeeding letter in the word. Since the user cannot type in more than three words a second, the search is fast enough that the program appears to take no time at all. Moreover, the program exploits the human limitation of sequential text entry to perform a sequential search, thereby keeping pace with, and even outpacing, human typing speed. This is faster than the word guessing algorithm, in which words are formed and cross-checked against the words present in the word list. The word guessing algorithm also waits for the user to finish typing the entire word and only then searches for the most probable combination.
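
The keypress-by-keypress search described above can be illustrated with a short sketch in which every keypress narrows the remaining possibilities instead of waiting for the whole word. The EntrySession class and the sample word list are assumptions made for the example; the actual pointer-based search in the referenced algorithm may differ.

```python
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


class EntrySession:
    """Hypothetical entry session: narrows the word list on every keypress."""

    def __init__(self, words):
        self.candidates = list(words)
        self.position = 0   # index of the letter the next keypress stands for

    def press(self, key):
        """Keep only words whose next letter is on the pressed key."""
        letters = KEYPAD[key]
        self.candidates = [
            w for w in self.candidates
            if len(w) > self.position and w[self.position] in letters
        ]
        self.position += 1
        return self.candidates

    def matches(self):
        """Complete words of exactly the length typed so far."""
        return [w for w in self.candidates if len(w) == self.position]


session = EntrySession(["good", "home", "gone", "hood", "in", "golf", "honey", "tea"])
for key in "4663":
    print(key, session.press(key))
print(session.matches())   # → ['good', 'home', 'gone', 'hood']
```

The work done per keypress shrinks as the candidate list shrinks, which is why the search can keep up with the user's typing.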

By claiming to be able to recreate the capabilities of the human mind, the T9 Predictive Text proves to be a good AI project.

- MANEKA
  BT09B009
