ABSTRACT:
Recent studies have shown that a
noticeable percentage of web search traffic is about social events. While
traditional websites can only show human-edited events, in this paper we
present a novel system to automatically detect events from search log data and
generate storyboards where the events are arranged chronologically. We chose
image search logs as the data source for event mining, as search logs directly
reflect people’s interests. To discover events from log data, we present a
Smooth Nonnegative Matrix Factorization (SNMF) framework, which combines
information from query semantics, temporal correlations, search logs, and time
continuity. Moreover, we treat the time factor as an important element, since
different events develop along different temporal trends. In addition, to
provide a media-rich and visually appealing storyboard, each event is
associated with a set of representative photos arranged along a timeline. These
relevant photos are automatically selected from image search results by
analyzing image content features. We use celebrities as our test domain, which
accounts for a large percentage of image search traffic.
Experiments on web search traffic for 200 celebrities, over a period
of six months, show very encouraging results compared with handcrafted
editorial storyboards.
EXISTING SYSTEM:
·
The research topic most closely related to this paper is event/topic detection
from the Web. Quite a few works have examined related directions.
The most typical data sources for event/topic mining are news articles and
weblogs. Various statistical methods have been proposed to group documents
sharing the same stories. Temporal analysis has also been involved to recover
the development trend of an event.
·
The representative work for event/topic detection is the DARPA-sponsored
research program called TDT (topic detection and tracking), which focuses on
discovering events from streams of news documents. With the development of Web
2.0, weblogs have become another data source for event detection. Some of these
research efforts develop new statistical methods, while others focus on recovering
the temporal structure of events.
DISADVANTAGES OF EXISTING SYSTEM:
·
First, the coverage of human-centered domains is small. Typically, one
website focuses only on celebrities in one or two domains (mostly
entertainment and sports), and to the best of our knowledge, there are no
general services yet for tracing celebrities across various domains.
·
Second, these existing services are not scalable. Even for specific
domains, only a few top stars are covered, as the editing effort to cover more
celebrities is not financially viable.
·
Third, reported event news may be biased by editors’ interests.
·
Discovering events from a search log is not a trivial task.
·
Existing work on log event mining mostly focuses on merging similar
queries into groups and investigating whether these groups are related to
semantic events like “Japan Earthquake” or “American Idol”. Basically, the
goal is to distinguish salient topics from noisy queries. Directly applying
these approaches to our task would fail, because the discovered topics tend to be
broad, common topics that are already familiar to most users.
PROPOSED SYSTEM:
·
In this paper, we aim to build a scalable and unbiased solution to
automatically detect social events, especially those related to celebrities, along a timeline.
This could be an attractive supplement that enriches the existing event descriptions
in search result pages.
·
In this paper, we focus on events that happen at a certain time and are
favored by users, and treat them as our celebrity-related social events. We
would like to detect the more interesting social events to entertain users
and fit their browsing tastes, which could supplement some current
knowledge bases.
·
We propose a novel approach using Smooth Nonnegative
Matrix Factorization (SNMF) for event detection that fully leverages
information from query semantics, temporal correlations, and search log
records. We use SNMF rather than standard NMF or other matrix factorization
(MF) methods because it both guarantees non-negative weights for each topic and
accounts for the time factor in event development.
·
The basic idea is two-fold: 1) promote event queries by
strengthening their connections based on all available features; 2)
differentiate events from popular queries according to their temporal
characteristics.
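To make the idea of a temporally smoothed nonnegative factorization more concrete, the sketch below factorizes a small query-by-day count matrix X into nonnegative factors W (query-to-topic weights) and H (topic strength per day), while penalizing abrupt changes of H between consecutive days. This is only a minimal illustration under stated assumptions: the exact SNMF objective is not given in this summary, so the sketch substitutes a generic smoothness-regularized NMF with multiplicative updates. The class name SmoothNmfSketch, the penalty weight lambda, and the toy counts are all hypothetical.

```java
import java.util.Random;

// Hypothetical sketch: NMF with a temporal smoothness penalty on H.
// Assumption: this is NOT the paper's exact SNMF formulation.
class SmoothNmfSketch {
    static final double EPS = 1e-9; // keeps denominators away from zero

    static double[][] mul(double[][] a, double[][] b) {
        double[][] c = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++)
            for (int p = 0; p < b.length; p++)
                for (int j = 0; j < b[0].length; j++)
                    c[i][j] += a[i][p] * b[p][j];
        return c;
    }

    static double[][] transpose(double[][] a) {
        double[][] t = new double[a[0].length][a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                t[j][i] = a[i][j];
        return t;
    }

    static double[][] randomMatrix(int rows, int cols, long seed) {
        Random rnd = new Random(seed);
        double[][] m = new double[rows][cols];
        for (double[] row : m)
            for (int j = 0; j < cols; j++)
                row[j] = 0.1 + rnd.nextDouble(); // strictly positive start
        return m;
    }

    // ||X - WH||_F^2 + lambda * sum_t ||H[:,t] - H[:,t+1]||^2
    static double objective(double[][] x, double[][] w, double[][] h, double lambda) {
        double[][] wh = mul(w, h);
        double total = 0;
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < x[0].length; j++)
                total += (x[i][j] - wh[i][j]) * (x[i][j] - wh[i][j]);
        for (double[] row : h)
            for (int t = 0; t + 1 < row.length; t++)
                total += lambda * (row[t] - row[t + 1]) * (row[t] - row[t + 1]);
        return total;
    }

    // One round of multiplicative updates; W and H are modified in place.
    static void step(double[][] x, double[][] w, double[][] h, double lambda) {
        double[][] ht = transpose(h);
        double[][] xht = mul(x, ht);
        double[][] whht = mul(w, mul(h, ht));
        for (int i = 0; i < w.length; i++)
            for (int k = 0; k < w[0].length; k++)
                w[i][k] *= xht[i][k] / (whht[i][k] + EPS);

        double[][] wt = transpose(w);
        double[][] wtx = mul(wt, x);
        double[][] wtwh = mul(mul(wt, w), h);
        int slots = h[0].length;
        for (int k = 0; k < h.length; k++) {
            double[] updated = new double[slots];
            for (int t = 0; t < slots; t++) {
                // Neighbouring time slots pull H[k][t] toward their values.
                double left = t > 0 ? h[k][t - 1] : 0;
                double right = t + 1 < slots ? h[k][t + 1] : 0;
                int degree = (t > 0 ? 1 : 0) + (t + 1 < slots ? 1 : 0);
                updated[t] = h[k][t] * (wtx[k][t] + lambda * (left + right))
                        / (wtwh[k][t] + lambda * degree * h[k][t] + EPS);
            }
            h[k] = updated;
        }
    }

    public static void main(String[] args) {
        // Toy counts (made up): rows are queries, columns are days.
        double[][] x = {{0, 0, 8, 9, 1, 0}, {0, 1, 7, 8, 0, 0},
                        {5, 5, 5, 5, 5, 5}, {0, 0, 0, 1, 6, 7}};
        double[][] w = randomMatrix(x.length, 2, 42L);
        double[][] h = randomMatrix(2, x[0].length, 7L);
        for (int it = 0; it < 200; it++) step(x, w, h, 0.5);
        System.out.println("objective = " + objective(x, w, h, 0.5));
    }
}
```

Because every update multiplies a nonnegative entry by a nonnegative ratio, W and H stay nonnegative throughout. A larger lambda yields smoother topic curves over time, at the cost of a looser fit to the raw counts.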
ADVANTAGES OF PROPOSED SYSTEM:
·
To provide a comprehensive and vivid storyboard, in this paper, we also
introduce an automatic way to attach a set of relevant photos to each piece of
event news.
·
We propose a novel framework to detect interesting events by mining
users’ search log data. The framework consists of two components, i.e., Smooth
Nonnegative Matrix Factorization event detection and representative
event-related photo selection.
·
We have conducted comprehensive evaluations on large-scale real-world
click-through data to validate the effectiveness of the framework.
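The photo selection component can be pictured with a small sketch as well. This summary states only that representative photos are chosen by analyzing image content features, so the code below is an assumption rather than the paper's method: it treats each candidate photo as a feature vector and keeps the k photos closest to the centroid of all candidates, on the premise that photos near the centroid are typical of the event. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch: choose the k photos whose feature vectors lie
// closest to the centroid of all candidates for one event.
// Assumption: a stand-in for the paper's (unspecified here) selection method.
class RepresentativePhotoSketch {

    static double[] centroid(double[][] features) {
        double[] c = new double[features[0].length];
        for (double[] f : features)
            for (int d = 0; d < c.length; d++)
                c[d] += f[d] / features.length;
        return c;
    }

    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int d = 0; d < a.length; d++)
            sum += (a[d] - b[d]) * (a[d] - b[d]);
        return Math.sqrt(sum);
    }

    // Returns indices of the k photos nearest to the centroid, nearest first.
    static int[] selectRepresentative(double[][] features, int k) {
        final double[] center = centroid(features);
        Integer[] order = new Integer[features.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order,
                Comparator.comparingDouble((Integer i) -> distance(features[i], center)));
        int[] picked = new int[Math.min(k, order.length)];
        for (int i = 0; i < picked.length; i++) picked[i] = order[i];
        return picked;
    }

    public static void main(String[] args) {
        // Toy 2-D features (made up); real content features would be much longer.
        double[][] features = {{0, 0}, {1, 1}, {0.9, 1.1}, {1.1, 0.9}, {5, 5}};
        System.out.println(Arrays.toString(selectRepresentative(features, 3)));
    }
}
```

In this toy input the two outlying photos (indices 0 and 4) are left out, which is the behavior one would want from a selector that filters out off-topic search results.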
SYSTEM ARCHITECTURE:
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
·
System : Pentium Dual Core.
·
Hard Disk : 120 GB.
·
Monitor : 15" LED
·
Input Devices : Keyboard, Mouse
·
RAM : 1 GB
SOFTWARE REQUIREMENTS:
·
Operating system : Windows 7.
·
Coding Language : Java/J2EE
·
Tool : NetBeans 7.2.1
·
Database : MySQL
REFERENCE:
Jun Xu, Tao Mei, Senior Member, IEEE, Rui Cai, Member, IEEE, Houqiang Li,
Senior Member, IEEE, and Yong Rui, Fellow, IEEE, “Automatic Generation of
Social Event Storyboard from Image Click-through Data”, IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2017.