About

BlogScope is an analysis and visualization tool for the blogosphere, being developed as part of a research project at the Department of Computer Science, University of Toronto.

Motivation: The explosive growth of the internet and the massive adoption of social media have created new ways for individuals to express their opinions online. Millions of bloggers across the globe write daily, producing one of the richest pools of information: the blogosphere. Bloggers write about diverse topics including their personal lives, product reviews, political opinions, technology trends, tourism experiences, sports events, and the entertainment industry. Without a doubt, blogging is a social phenomenon. This trend will persist and grow as our lives become more heavily dependent on internet technologies. Given such trends, there is a pressing need to monitor these online forums continuously and to extract useful and actionable information regarding "public opinion" on a variety of topics. BlogScope tries to help users discover knowledge from blogs by providing hints in the form of bursts and correlations.

In 2007, Sysomos was founded as a spinoff of this project. Today, Sysomos is the leading provider of social media monitoring and social media analytics for enterprises across the world. The Sysomos platform, based on BlogScope technology, is tracking over 50 billion conversations, processing more than 10,000 per second.

NOTE: BlogScope is discontinued as of April 2012.

Members

Collaborators Summer Interns Past Contributors

Publications

Michael Mathioudakis, Nilesh Bansal, Nick Koudas, Remembrance of Things Past: Towards Computational History, To appear in Proceedings of the 36th International Conference on Very Large Data Bases, VLDB 2010, Singapore, Sept 2010.

Nikos Sarkas, Nilesh Bansal, Gautam Das, Nick Koudas, Measure-driven Keyword-Query Expansion, In Proceedings of the 35th International Conference on Very Large Data Bases, VLDB 2009, Lyon, France, Aug 24-28 2009.

Manos Papagelis, Nilesh Bansal, Nick Koudas, Information Cascades in the Blogosphere: A Look Behind the Curtain, In Proceedings of the 3rd International AAAI Conference on Weblogs and Social Media, ICWSM 2009, San Jose, California, USA, May 17-20 2009, Poster.

Yin Yang, Nilesh Bansal, Wisam Dakka, Panagiotis Ipeirotis, Nick Koudas, Dimitris Papadias, Query by Document, To appear in Proceedings of the 2nd ACM International Conference on Web Search and Data Mining, WSDM 2009, Barcelona, Spain, Feb 9-12 2009.

Nilesh Bansal, Sudipto Guha, Nick Koudas, Ad-Hoc Aggregations of Ranked Lists in the Presence of Hierarchies, In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, Canada, June 9-12 2008. (slides)

Nilesh Bansal, Fei Chiang, Nick Koudas, Frank Wm. Tompa, Seeking Stable Clusters in the Blogosphere, In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB 2007, Vienna, Austria, Sept 23-28 2007. (slides)

Nilesh Bansal, Nick Koudas, BlogScope: A System for Online Analysis of High Volume Text Streams, In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB 2007, Vienna, Austria, Sept 23-28 2007, Demonstration Proposal.

Nilesh Bansal, Nick Koudas, Searching the Blogosphere, In Proceedings of the 10th International Workshop on Web and Databases, WebDB 2007 (co-located with SIGMOD), Beijing, China, June 15 2007. (slides)

Nilesh Bansal, Nick Koudas, BlogScope: Spatio-temporal Analysis of the Blogosphere, In Proceedings of the 16th International Conference on World Wide Web, WWW 2007, Banff, Canada, May 8-12 2007, Poster.

System

BlogScope is written in Java and runs on four Sun V40z server machines with RedHat Linux AS4. The main components are: a multi-threaded crawler with a spam analyzer, an indexing and searching module, a statistics collection and access framework, a popularity curve generator, a correlation discovery module, a natural language processor, and the web interface. The figure below summarizes the high-level system architecture.

High-level architecture of BlogScope
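
To give a rough feel for what a popularity curve and a burst hint could look like, here is a minimal Java sketch. It is not BlogScope's implementation: the class name `BurstSketch`, both method names, and the "mean plus k standard deviations" burst rule are assumptions made purely for illustration. The curve is simply the number of posts per day that mention a keyword.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the BlogScope code base.
class BurstSketch {

    /** For each day, counts the posts that contain the keyword (case-insensitive). */
    public static int[] popularityCurve(List<List<String>> postsPerDay, String keyword) {
        String needle = keyword.toLowerCase();
        int[] curve = new int[postsPerDay.size()];
        for (int day = 0; day < postsPerDay.size(); day++) {
            for (String post : postsPerDay.get(day)) {
                if (post.toLowerCase().contains(needle)) {
                    curve[day]++;
                }
            }
        }
        return curve;
    }

    /**
     * Assumed burst rule: a day is a burst if its count exceeds
     * mean + k * (population standard deviation) of the whole curve.
     */
    public static List<Integer> burstDays(int[] curve, double k) {
        List<Integer> bursts = new ArrayList<>();
        if (curve.length == 0) {
            return bursts;
        }
        double mean = 0;
        for (int c : curve) {
            mean += c;
        }
        mean /= curve.length;
        double variance = 0;
        for (int c : curve) {
            variance += (c - mean) * (c - mean);
        }
        double std = Math.sqrt(variance / curve.length);
        for (int day = 0; day < curve.length; day++) {
            if (curve[day] > mean + k * std) {
                bursts.add(day);
            }
        }
        return bursts;
    }

    public static void main(String[] args) {
        int[] curve = {3, 4, 2, 3, 20, 4, 3};   // made-up daily counts
        System.out.println(burstDays(curve, 2.0)); // prints [4]
    }
}
```

A global threshold like this is only the simplest possible choice; a production system working on a continuous stream would more plausibly compare each day against a sliding window of recent history.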

BlogScope is built using many great open source libraries and utilities, which must be acknowledged.

Libraries

Platform

Development