Blog
How we add new words
The way we talk is both beautiful and messy, because people are quite inventive communicators. New words are constantly emerging, just as old ones fall out of use. So there’s really no point in trying to list all the words in the entire English language. (But in case you were interested, the Global Language Monitor estimated, for January 1, 2012, that there are 1,013,913 words in the English language…and counting.) To find out how OpenAmplify keeps up with changes in the English language, I chatted with Alexandra Stålnacke, OpenAmplify’s Director of Linguistics. This is how she summed it up:
Language is alive and forever changing. Every year about 25,000 new words are introduced into English. To keep up with its evolvement, our linguists do daily updates to the resource files that support the text analysis system in the core of OpenAmplify. So, based on lists of known words, endings and entities, and the context they appear in, OpenAmplify can correctly recognize an innumerable set of words we use today, plus ones that have not yet been “invented.”
For the task of daily additions, we’ve examined how language users instinctively follow specific rules when being creative in their forming of new words and expressions. We take people’s lead to find new words or phrases that evolve naturally. The OpenAmplify discovery process makes it possible to catch the correct part-of-speech of an unknown word, based on its textual context and its ending.
For example, you can most likely guess that a made-up word like “grubbly” is an adjective, used to describe a person, place, or thing. Or that “klomping” is most likely, but not necessarily, an action word. OpenAmplify’s engine examines its context:
- The boys have been klomping all day - most likely a verb (an action word)
- My new car has a beautiful klomping - quite probably a noun (a thing)
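To make the idea concrete, here is a toy sketch in Python of guessing a part of speech from a word's ending and the word just before it. It is not OpenAmplify's engine: the function name `guess_pos` and the tiny word lists are made up for this illustration, and a real system would look at far more context.

```python
# Toy heuristic only; the word lists and function name are assumptions
# made for this example, not part of OpenAmplify.
AUXILIARIES = {"be", "been", "being", "is", "are", "was", "were", "am"}
NOUN_CUES = {"a", "an", "the", "my", "your", "beautiful", "new"}


def guess_pos(sentence, unknown):
    """Guess the part of speech of `unknown` from its ending and left neighbor."""
    words = sentence.lower().rstrip(".!?").split()
    unknown = unknown.lower()
    if unknown not in words:
        return "unknown"
    idx = words.index(unknown)
    prev = words[idx - 1] if idx > 0 else ""

    if unknown.endswith("ing"):
        # "-ing" words are usually verbs, unless a determiner or
        # adjective right before them suggests a noun.
        if prev in AUXILIARIES:
            return "verb"
        if prev in NOUN_CUES:
            return "noun"
        return "verb"
    if unknown.endswith("ly"):
        # Follows the "grubbly" example in the text.
        return "adjective"
    return "unknown"


print(guess_pos("The boys have been klomping all day", "klomping"))
print(guess_pos("My new car has a beautiful klomping", "klomping"))
```

The first call should come back as a verb and the second as a noun, matching the two example sentences.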
So as alive as the English language is, OpenAmplify’s NLP and text analysis engine is built to keep up with this dynamism. And that's a beautiful thing.
Webinar: The Evolution of Social Media Monitoring
Listen in and chime in on the conversation this Thursday, Feb. 23, at 2 PM EST during a webinar hosted by our friends at Radian6, a Salesforce company. OpenAmplify's CEO, Mark Redgrave, has been invited to join this panel to discuss what's next in the rapidly changing social listening space. Mark will be speaking more specifically about how OpenAmplify's NLP engine delivers unique insights on "intent," which is what your social customers express about how they want to engage with your brands and products.
For more details and how to register, please see below.
The Evolution of Social Media Monitoring: Integrating Deeper Intelligence into Listening Solutions
Sign up: https://radian6.adobeconnect.com/_a995981730/evolutionofsmm/event/event_info.html
Panelists:
Mark Redgrave, CEO, OpenAmplify
Dana Martin, Supervisor, Social Media, McDonald's Corporation
Andrew Grill, CEO, Kred
Host: Genevieve Coates, Radian6 Community Manager
Webinar Description:
Social Monitoring has evolved beyond volume and trends, even beyond simple sentiment analysis. The next generation of social media monitoring involves enhancing conversations with meaningful insights – on the authors, sources, and content itself – to get even more value from the social web. Join Radian6 and some of our partners for an online roundtable discussion on how innovative social data is helping organizations like yours overcome real obstacles. OpenAmplify, the market leader in natural language processing and semantic web analysis, will discuss how computers can mimic human understanding of language and even recognize subtleties like people’s intentions. Kred, a sophisticated social scoring system, will explain how someone’s ability to inspire action online can be married with “real life” for a more complete representation of influence. All that and much more in this 1-hour webinar on February 23rd at 2pm ET. Sign up now to learn how revolutionary social insights like these can help you make more sense of social conversations and redefine what social media can bring to your business.
Vampires vs. Jerry Seinfeld
We recently completed an analysis, in collaboration with Kantar Video, on social conversations around Super Bowl game-day commercials. For OpenAmplify’s contribution, we focused on one of the largest categories of advertisers: automotive. As part of their annual study on what’s dubbed “Big Game Advertising,” Kantar Video examined online video viewership and social activity to determine earned media value. Building on important viewership metrics, a closer look at what people are actually saying about the commercials seemed like the logical next step. What could we learn through these conversations and how do they inform the creative process for next year’s round of commercials? To find out, we dove into the conversations using SocialView, a new and powerful data analysis tool that has the full force of OpenAmplify’s NLP engine under its hood (automotive pun intended).
Through SocialView, we were able to see plainly how consumers naturally referred to products and brands, plus what the top associated topics of conversation were. We looked at a total of 36 themes among the six commercials. Some were obvious as intended by the advertisers—“Ferris Bueller’s Day Off” in the Honda commercial, for example. In other cases, a topic like “couch” or “sofa” bubbled up within the larger theme of Toyota’s “Reinvented” which contained several sub-themes in their vignettes. Kantar Video published a chart of top performers in the area of “Social Recall” by theme and a ranking of the six automotive advertisers in the order of their relative success at getting commenters to talk specifically about and mention their respective brands.
There certainly is a ton to learn by examining, at the topic level, the conversations generated by millions of views. As noted in a post by Trevor Wolfe of Kantar Video, discovering that the topic of “vampire” outperformed “Jerry Seinfeld” tells advertisers something about balancing investment in a celebrity against compelling creative that resonates with their target audience. Now let’s look at the Chrysler ad, which did very well in the earned media ROI metric. But why was it last in the ranking for social recall of brand? The clue is in the conversation. One of the commercial’s themes, “America,” was in the top five, reflecting the political slant of the majority of the conversations. Quite possibly, in a highly charged election year, taking on that theme was enough to sideline “Chrysler” as a topic, hence its ranking in social recall by brand.
What do you think? Share your opinions, hunches or hypotheses around this year’s crop of Super Bowl commercials and we can go back to the comments to see how they play out.
Go west... Socialize West
Clearly, for businesses gone social, the pressure is on to measure the efforts and investments around it. This will be a hot topic at next week’s conference Socialize West (Oct 20-21 in San Francisco), which OpenAmplify is sponsoring. The conference has a terrific lineup of speakers around the four themes: gamify, mobilize, optimize and monetize.
I expect loads of discussions around very social behaviors like playing (gaming), being on the go (mobile), and shopping (monetization). At OpenAmplify, we're particularly interested in what people are sharing when they play, move/travel, and buy on the social web, and in how we can help social businesses by making sense of it.
Watch this space for some nuggets from the sessions “Social Media: Harvesting Intent for Better ROI” and “Measuring Social Media Success.” I think I’ll also sit in on “Your Tweets Hurt my Facebook when I am LinkedIn: How to Monetize Social Media for Business Results”—which wins the funniest title award.
Keep the 'human' in social media - no need to scrub it out!
Social media provides us with a rich source of information about people - our friends, our customers, everyone. Up until recently, large data volumes had to be 'scrubbed' down to work within our analysis capabilities - well no longer.
Mark Redgrave's recent blog post on Radian6, Do your Scrubbing in the Shower, highlights how you can keep the real human element in the data and thus not lose a key component of the social media content stream.
Semantic Analysis for Facebook Open Graph
Facebook announced yesterday the Open Graph Protocol, which enables you to integrate your Web pages into the social graph. This means when a user clicks a Like button on your page, a connection is made between your page and the user. Your page will appear in the "Likes and Interests" section of the user's profile, and you will have the ability to publish updates to the user.
So basically, the structured data that you provide via the Open Graph Protocol defines how your page will be represented on Facebook. So:
- What is this structured data?
- How do I get it and make it scalable?
Structured data is nothing but the metadata about that page, which summarizes the page using tags/topics. The more information you provide, the more opportunities your web pages have to be surfaced within Facebook. To answer the second question, the first thought that comes to mind is this: with people talking about the semantic web, Web 3.0, linked data and so on, there should be some automated way of analyzing web pages, extracting the topics from them, and using those as metadata. And you are right: one way to semantically analyze those pages is to pass them to OpenAmplify.
OpenAmplify is all about understanding the meaning of content. It will analyze the content and list all the topics along with the sentiments and guidance, as well as the domain/category that the page belongs to. The information is readily consumable and can be used as metadata. Now that you know what the user likes, because the page has been semantically analyzed, you can publish similar stories and target ads catering to that.
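As a rough illustration of that last step, here is a small Python sketch that turns a page title, URL and a list of topics into Open Graph `<meta>` tags. The topic list is assumed to have already been extracted by some analysis step; this sketch does not call OpenAmplify, and the function name `open_graph_tags` is made up for the example. The `og:title`, `og:type`, `og:url` and `article:tag` properties come from the Open Graph protocol itself.

```python
from html import escape


def open_graph_tags(title, url, topics, og_type="article"):
    """Render Open Graph meta tags for a page.

    `topics` is a plain list of strings, assumed to come from whatever
    semantic analysis you ran on the page (hypothetical input here).
    """
    properties = [
        ("og:title", title),
        ("og:type", og_type),
        ("og:url", url),
    ]
    # One article:tag entry per extracted topic.
    properties += [("article:tag", topic) for topic in topics]

    return "\n".join(
        '<meta property="{}" content="{}" />'.format(
            escape(prop, quote=True), escape(content, quote=True)
        )
        for prop, content in properties
    )


print(open_graph_tags(
    "Caribbean cruises on a budget",
    "http://example.com/cruises",
    ["travel", "cruise", "Martinique"],
))
```

Escaping the values matters because titles and topics often contain quotes or ampersands that would otherwise break the markup.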
Is it that simple? Really?
Grab a free API key and try it out!
Mike Petit Previews Digiday Panel: Engagement and Monetization of Social Communities
Catch Michael Petit, Founder and CIO, OpenAmplify at the 2011 Digiday Social event, September 22 in New York.
Q: Mike, please give us a sneak preview of your panel session this Thursday…
A: We're excited to be at Digiday Social again this year. It will be great fun to see how much the conversation has evolved from one year to the next. I will be facilitating a discussion around "Engagement and Monetization of Social Communities." I'm happy to be joined by Maribel Sierra of Dell, Mike Dunn of Hearst Interactive and Bryan Jennewein of Radian6.
Q: Tell us what the Digiday Social attendees can look forward to--what will you be discussing during your session?
A: A major theme of discussion will be the role of social as a channel for the voice of the customer. End customers are generating a lot of content that gives businesses, large and small, a huge opportunity to connect with them in a relevant way.
Q: What do you think are some of the biggest challenges ahead?
A: I think it's a three-fold challenge: massive scale, a short shelf life for the content, and an expectation that each singular, individual voice will be heard. I will be asking our panelists to give their perspectives on this and how their organizations are meeting unique challenges for their respective brands or on behalf of clients.
Q: Will you give us your thoughts on what Digiday dubs as "The Social Operating System"?
A: We are seeing enterprises build out their social presence in a serious way. This requires a shift organizationally and culturally as the company as a whole takes a giant step towards being social. I have a feeling our panelists will have a lot to say about this.
Q: Sounds like the Digiday attendees are in for a treat! For those who want to see you in person at Digiday, whom should they contact? And for those who miss the event, can they have access to the full content of your discussion?
A: For brand or agency executives who are going to be in NYC or locally-based, please contact Monika Jo, OpenAmplify's Director of Marketing. She may still be able to get you into the event as our guest for this 3:45pm panel discussion. For those who can't be there in person, watch #openamplify and #digiday posts. And we will also be sharing more content, including videos, after the event.
Content leads to outcomes, but how can outcomes lead to content?
If you just read a lovely travelogue on the joys of Martinique, you might go ahead and book a vacation cruise. If you just found a really luscious sounding recipe for Crème Brûlée, you might go ahead and order a kitchen torch. As publishers, we intuitively understand how content leads to outcomes. But how can we work in the other direction? Given an outcome, booking a vacation cruise or ordering a kitchen torch, how can we create or curate content which will produce that outcome? I could cut to the chase, and tell you that the way to do it is to use OpenAmplify to create content profiles, but where is the fun in that?
Here’s an example using the access_log from an e-commerce site. I’m assuming that each record in the access_log contains a UserID and a URI (a page view), and that you know the URI of a desired outcome (the purchase page). Split the access_log records by UserID to get each user's clickstream. If the clickstream contains the URI of your purchase page, score the purchase relevance of each page view (UserID + URI) in it to be 1. Otherwise, score the purchase relevance of each page view to be 0.
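Here is a minimal Python sketch of that binary scoring step. It assumes the access_log has already been parsed into `(user_id, uri)` pairs; the function name `score_page_views` and the sample URIs are made up for the example.

```python
from collections import defaultdict


def score_page_views(records, purchase_uri):
    """Score each page view 1 if its user ever hit the purchase page, else 0.

    `records` is an iterable of (user_id, uri) pairs, assumed to be
    already parsed out of the access_log.
    """
    # Split the log into one clickstream per user.
    clickstreams = defaultdict(list)
    for user_id, uri in records:
        clickstreams[user_id].append(uri)

    scored = []
    for user_id, uris in clickstreams.items():
        score = 1 if purchase_uri in uris else 0
        scored.extend((user_id, uri, score) for uri in uris)
    return scored


records = [
    ("u1", "/recipes/creme-brulee"),
    ("u2", "/recipes/creme-brulee"),
    ("u1", "/purchase"),
    ("u2", "/about"),
]
for row in score_page_views(records, "/purchase"):
    print(row)
```

In this sample, every page view by `u1` scores 1 because `u1` reached `/purchase`, and every page view by `u2` scores 0.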
We could use a much more sophisticated scoring algorithm; we could say that only page views immediately preceding a visit to the purchase page are relevant, or that the relevance of a page view to visiting the purchase page goes down with the log of the number of intermediate pages, but this simple binary score should be good enough for this example.
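For the curious, here is one possible reading of the log-based variant, sketched in Python. The exact formula, `1 / (1 + ln(1 + n))` where `n` is the number of pages between a page view and the purchase page, is my own assumption, as is the function name `decayed_scores`; other decay curves would fit the description just as well.

```python
import math


def decayed_scores(uris, purchase_uri):
    """Score one user's clickstream with relevance decaying by distance.

    Assumed formula: 1 / (1 + ln(1 + n)), with n the number of
    intermediate pages before the first visit to the purchase page.
    Pages viewed after the purchase score 0.
    """
    if purchase_uri not in uris:
        return [0.0] * len(uris)

    target = uris.index(purchase_uri)
    scores = []
    for position in range(len(uris)):
        if position > target:
            scores.append(0.0)
        else:
            intermediate = max(target - position - 1, 0)
            scores.append(1.0 / (1.0 + math.log1p(intermediate)))
    return scores


print(decayed_scores(["/home", "/recipes", "/torch", "/purchase", "/thanks"], "/purchase"))
```

The page right before the purchase page keeps a score of 1.0, and earlier pages trail off slowly rather than dropping straight to zero.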
Now, calculate the purchase relevance of each URI by getting the sum of the purchase relevance scores for each page view and grouping by URI. Again, we could have a more sophisticated scoring algorithm; we could normalize the scores based on the relative frequency of pageviews for a given URI, but again, this simple grouping query should be good enough for this example.
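The grouping step could look something like this in Python. It takes `(user_id, uri, score)` rows as input, which is an assumed shape for the scored page views, and the function name and the optional `normalize` flag are made up for this sketch.

```python
from collections import defaultdict


def purchase_relevance_by_uri(scored_page_views, normalize=False):
    """Sum purchase relevance scores per URI.

    `scored_page_views` is an iterable of (user_id, uri, score) rows.
    With normalize=True, each sum is divided by that URI's page view
    count, so heavily visited pages don't win on volume alone.
    """
    totals = defaultdict(float)
    views = defaultdict(int)
    for _user_id, uri, score in scored_page_views:
        totals[uri] += score
        views[uri] += 1

    if normalize:
        return {uri: totals[uri] / views[uri] for uri in totals}
    return dict(totals)


scored = [
    ("u1", "/recipes/creme-brulee", 1),
    ("u1", "/purchase", 1),
    ("u2", "/recipes/creme-brulee", 0),
    ("u2", "/about", 0),
]
print(purchase_relevance_by_uri(scored))
print(purchase_relevance_by_uri(scored, normalize=True))
```

If your log lives in a database, the same thing is a one-line `SUM(...) GROUP BY uri` query.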
Get the OpenAmplify analysis for each of these URIs and then use a machine learning toolkit such as Weka to build a classifier that captures your content profile for purchase relevance. Now that you have a content profile, you can evaluate new content by getting the OpenAmplify analysis for the new content and seeing how closely the content profile for this prospective content matches your content profile for purchase relevance. If the new content profile matches your purchase relevant content profile, go ahead and feature the new content on your site.
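If you want a feel for the matching step without setting up a classifier, here is a much simpler stand-in in Python: build a relevance-weighted topic profile and compare new content to it with cosine similarity. This is not what Weka does, and the topic-to-weight dictionaries are an assumed, simplified shape for the analysis results rather than OpenAmplify's actual output format.

```python
import math


def build_profile(page_topics, relevance):
    """Sum each page's topic weights, scaled by that page's purchase relevance.

    `page_topics` maps uri -> {topic: weight} (assumed analysis output);
    `relevance` maps uri -> purchase relevance score.
    """
    profile = {}
    for uri, topics in page_topics.items():
        weight = relevance.get(uri, 0)
        for topic, value in topics.items():
            profile[topic] = profile.get(topic, 0.0) + weight * value
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {topic: weight} vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


page_topics = {
    "/travel/martinique": {"travel": 1.0, "beach": 0.5},
    "/news/election": {"politics": 1.0},
}
relevance = {"/travel/martinique": 2, "/news/election": 0}
profile = build_profile(page_topics, relevance)

new_content = {"travel": 0.8, "beach": 0.6}
print(cosine_similarity(profile, new_content))
```

A score close to 1 means the new content looks a lot like the pages that led to purchases; where to draw the "feature it" line is something you would tune on your own data.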
Second by second Sentiment Analysis of Twitter
Tweet Sentiments is mesmerizing, as it analyzes sentiment on the Twitter feed. It uses OpenAmplify's Natural Language based Semantic Analysis engine to measure the positive (or negative) emotions associated with tweets. Watch as tweets roll by at a rate of 1 per second with the sentiment shown in red/yellow/green icons. It also has pie charts, gauges, and trending topic lists with associated sentiment.
Intridea is a leading web and mobile design and development company based in Washington, D.C. They specialize in web application development using Ruby on Rails and highly usable mobile applications for every major platform including iOS and Android. Intridea designed Tweet Sentiments as a showcase for their Twitter API development abilities. Worth checking out.
Using Semantic Analysis to help your site visitors find related content
SimpleReach is a NYC-based company at the intersection of content and advertising. They've partnered with OpenAmplify to help power the targeting technology behind their content recommendation unit, The Slide, and social influence measurement tool, LinkCurrent.
The Slide: Helps readers discover more of your content by recommending related posts on a widget that "slides" in at the bottom of the page -- increasing pageviews, time on site, and ad revenue (optional).
LinkCurrent: Measures the current and future social value of your content through an in-depth analysis of Tweets, Likes, Klout scores, and other variables.
Facebook grouping similar posts
Facebook yesterday officially announced that it is grouping similar news feed stories based on keywords. So if any of your friends talk about the iPhone, their posts will be grouped under the topic "iPhone" with a link to the Fan Page.
Facebook has said that it will be using Natural Language Processing to match words in status updates to brand pages, but it will not be using human editors or sentiment analysis to filter out or cluster negative or positive stories, or corral unrelated elements. That makes me worried, because if you accidentally post something that has the word “iPhone” in it, it may be grouped together with a completely random other story about an iPhone. To make things worse, posts on the same topic with different sentiment would be grouped together.
What do you guys think? Should it be more than just one topic? Do they have to consider features other than just topic/keyword? Should they be using a more sophisticated way of grouping related posts?
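To show the difference, here is a tiny Python sketch that groups posts by topic alone and then by topic plus sentiment. It says nothing about how Facebook actually implements grouping; the posts, their topic and sentiment labels, and the function name `group_posts` are all made up for the illustration.

```python
from collections import defaultdict


def group_posts(posts, use_sentiment=False):
    """Group post texts by topic, or by (topic, sentiment) if requested.

    Each post is a dict with "text", "topic" and "sentiment" keys,
    assumed to have been labeled by some earlier analysis step.
    """
    groups = defaultdict(list)
    for post in posts:
        if use_sentiment:
            key = (post["topic"], post["sentiment"])
        else:
            key = post["topic"]
        groups[key].append(post["text"])
    return dict(groups)


posts = [
    {"text": "Love my new iPhone", "topic": "iPhone", "sentiment": "positive"},
    {"text": "My iPhone broke again", "topic": "iPhone", "sentiment": "negative"},
]
print(group_posts(posts))
print(group_posts(posts, use_sentiment=True))
```

With topic alone, the happy post and the unhappy post land in the same bucket; adding sentiment to the key keeps them apart.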
Any thoughts?
ColdFusion and OpenAmplify
Raymond Camden is a ColdFusion, Flex, AIR, mobile hacker working as an Evangelist for Adobe. He has a bit of a geek crush on OpenAmplify and has put together some nice blog entries on us so we thought we would return the favor. :-)
In this blog entry Raymond discusses how he makes use of ColdFusion to parse RSS feeds for keywords. Instead of simply reporting on matched keywords, he makes use of the OpenAmplify API to enhance those results by determining if the match was a positive or negative discussion.
In an earlier blog entry, he has written about real time textual analysis via OpenAmplify and jQuery and put up an online demo of our service.
So, check out his blog, and let us know if you are also working in ColdFusion.
Ready? Let's Go.
I'm preparing to keynote the Semantic Web Media Summit in New York in September, http://bit.ly/nri4IR. Yesterday, I compared notes with another keynote speaker, to get into sync. I was gratified to find that we share many perceptions of the marketplace and the state of semantic technology. One thing in particular stands out: the time for action has come.
In recent years, we have all discussed the potential of semantic technology to advance our various missions. We have debated the standards, compared the offerings, theorized about ROI. These are all valid issues to discuss, especially when real solutions are in short supply. Three years ago, there was very little from which to choose.
That's no longer the case. Offerings from OpenAmplify, Calais, Klout and the like can deliver real ROI, here and now, alone and in combination. Great companies like Radian6, Salesforce.com, Dell and Best Buy are making real moves. The sun is shining; it's time to make hay.
My co-speaker proposed a simple roadmap: Pick a finite project that moves the needle, then assess the impact upon workflow, costs, revenues, etc. Build a real case for a moderately disruptive initiative with breakthrough potential.
Then, and only then, find the right tools to execute. The tools are there, mature, differentiated and reasonably priced.
Ready? Let's go.
