Time Series Prediction Approaches

Time Series Journal

Sentiment analysis finds trouble in the Enron emails

The Enron email dataset, collected during the FERC investigation of the Enron financial scandal, represents the largest publicly available set of emails. This makes it an ideal testbed for sentiment analysis algorithms. Ikanow's Andrew Strite used the open-source Infinit.e framework and a Hadoop cluster to generate sentiment scores for all of the Enron emails, and then used R to manipulate and analyze the resulting data.

His visualization of just a few of the email accounts uses red marks to flag emails where the sender's sentiment suddenly turned sharply negative (and which would therefore be a good place to start looking for evidence).

Andrew used the rjson package to interface with the Ikanow REST API, the plyr package to restructure the incoming data, and the ggplot2 package to visualize the results. In a subsequent analysis he also used the zoo package to interpolate and analyze time series of sentiment scores, which you can read about in the full blog post below.

Ikanow blog: Making the most of sentiment scores using Ikanow and R
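
The post does not include Andrew's code, and his analysis was done in R with zoo. As a rough illustration of the two ideas described above (interpolating an irregular series of sentiment scores, then flagging sudden negative turns), here is a minimal Python sketch using only the standard library. The function names, the sample scores, and the 0.75 threshold are all hypothetical; the actual scoring scale and flagging rule used in the original analysis are not stated here.

```python
from datetime import date, timedelta


def interpolate_daily(observations):
    """Linearly interpolate (date, score) pairs onto a one-day grid."""
    obs = sorted(observations)
    if not obs:
        return []
    daily = []
    for (d0, s0), (d1, s1) in zip(obs, obs[1:]):
        gap = (d1 - d0).days
        # Emit d0 and every missing day before d1; d1 starts the next pair.
        for k in range(gap):
            daily.append((d0 + timedelta(days=k), s0 + (s1 - s0) * k / gap))
    daily.append(obs[-1])
    return daily


def flag_sharp_drops(series, threshold):
    """Return dates whose score fell by more than `threshold` since the previous point."""
    return [d1 for (_, s0), (d1, s1) in zip(series, series[1:]) if s0 - s1 > threshold]


# Hypothetical scores for one sender; real values would come from the scoring step.
scores = [
    (date(2001, 10, 1), 0.5),
    (date(2001, 10, 3), 0.0),
    (date(2001, 10, 4), -1.0),
    (date(2001, 10, 6), -0.5),
]

daily = interpolate_daily(scores)
flagged = flag_sharp_drops(daily, threshold=0.75)  # threshold is an arbitrary choice
for day, score in daily:
    marker = "  <-- flagged" if day in flagged else ""
    print(f"{day}  {score:+.2f}{marker}")
```

One design caveat: linear interpolation spreads a change evenly across any gap, so a large drop that spans several days without email can fall below a day-to-day threshold. Running the flagging step on the raw observations instead of the interpolated series is a reasonable alternative.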

More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire), overseeing the product management of S-PLUS and other statistical and data mining products.

David Smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on Twitter at @RevoDavid.