Tuesday, February 16, 2010

Predicting User Intent Using Semantic Analysis

As far back as early 2000, the Kontera founding team sought to answer a fundamental question: how could we provide users with relevant and useful information without requiring them to actively look for it? This was, and still is, the fundamental quest behind Kontera’s vision: connecting the web’s information.

The answer to this question is what we today call the Third Generation of Online Information Interaction. The first generation was built on human-mapped categorical directories. The second generation uses search engines, where users type keywords that represent the information they are seeking and receive links to sites that relate to those keywords. Both methods require active thought and effort from the user to find the information they want.

Third Generation Online Information Interaction is based on Kontera’s ability to understand the true meaning of content, coupled with the ability to predict users’ intent. Kontera selects the most relevant keyword phrases and turns them into hyperlinks that connect users to relevant information.

Kontera’s patented In-Text Semantic Analysis technology predicts the user’s information intent based on the content that the user is currently browsing, coupled with real-time information, extracted from thousands of web sites, about the topics, keywords, content, and ads that are available and developing online.
Relevance - Accuracy - Interest
The Kontera system performs the following process, in a split second, for every page:
· Extraction: A typical contextual analysis process begins by extracting all the relevant publisher and page content and attributes, such as text, HTML properties, structure, location on page, URL, title, meta tags, and custom meta tags. Each such feature has a weight that is used by the machine learning algorithms that analyze the data.
· Discovery: Using Natural Language Processing, Machine Learning, and other proprietary linguistic, semantic, and statistical algorithms, keyword phrases are discovered and classified based on semantic meaning and potential semantic relationships.
· Page classification: Using a proprietary Dynamic Taxonomy that continues to expand and refine itself autonomously, topical classes and clusters are dynamically computed for the given page. In addition, the page's sensitivity, sentiment, and commercial value are analyzed.
· Information Clustering: Kontera uses several proprietary content extraction and classification engines that scour the web continuously for the most up-to-date relevant content, information, and contextual ads. Each information type, such as articles, blog posts, videos, and ads, is analyzed differently in order to ensure maximum relevancy. The potential matches are scored relative to the page and the keyword phrases that were discovered on the page.
· Selection: Out of a potential pool of tens of keyword phrases and hundreds of ads and other related content objects, typically three to five keyword phrases are selected together with the best matching ads and information. This selection will rotate automatically over time due to the dynamic nature of online content and the system’s self-learning optimization algorithms.
· Online Learning & Optimization: The online learning and optimization module automatically performs yield management, optimization, and tuning. This real-time analysis of users’ interactions with specific keywords, contextual advertising, and information, as they relate to specific web sites, pages, and topics, is used to increase the yield, relevancy, and usefulness of Kontera’s different products.
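To make the extraction and selection steps more concrete, here is a minimal Python sketch of the general idea: page fields carry weights, candidate keyword phrases are scored by those weights, and a handful of top phrases is selected. This is not Kontera's algorithm, which is proprietary. The field names, the weight values, the stopword list, and the helper functions `candidate_phrases`, `score_phrases`, and `select_phrases` are all assumptions made up for illustration, and simple n-gram counting stands in for the semantic analysis described above.

```python
import re
from collections import Counter

# Hypothetical per-field weights. The post says each extracted feature
# has a weight but does not publish any values.
FIELD_WEIGHTS = {"title": 3.0, "meta": 2.0, "body": 1.0}

# A tiny illustrative stopword list, not a real linguistic resource.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on", "with"}


def candidate_phrases(text, max_len=2):
    """Yield lowercase word n-grams (1..max_len words) containing no stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if not any(w in STOPWORDS for w in gram):
                yield " ".join(gram)


def score_phrases(page):
    """Add the field weight to a phrase's score for each occurrence in that field."""
    scores = Counter()
    for field, text in page.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        for phrase in candidate_phrases(text):
            scores[phrase] += weight
    return scores


def select_phrases(scores, k=3):
    """Return the k highest-scoring phrases, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [phrase for phrase, _ in ranked[:k]]


# An invented example page, split into the three weighted fields.
page = {
    "title": "Digital camera reviews",
    "meta": "digital camera, lens reviews",
    "body": "Our digital camera guide compares each camera and lens.",
}
top = select_phrases(score_phrases(page), k=3)
print(top)
```

In this toy version, a phrase that appears in the title counts three times as much as one in the body, which loosely mirrors the idea that location on the page affects a feature's weight. A real system would replace the n-gram counter with the classification, clustering, and learning stages listed above.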
Our technology and product innovations do not stop with linking relevant information using keyword phrases that best represent the user’s intent. Our research, software development, user experience, design, and engineering teams continue to develop advanced information solutions that combine text analysis algorithms using natural language processing, machine learning, and predictive models with cutting-edge design and user interaction elements. One interesting innovation, developed as a by-product of our content classification and meaning analysis, is the Real Time Interest Index, which dynamically discovers and surfaces the most interesting concepts online. Another exciting development is our modular In-Text advertising widget, which allows us to offer advertisers and publishers the most engaging methods to interact with users.
