Guest post by Seth Duncan

As a researcher in the PR industry, I couldn’t be more excited about some of the advances I’ve seen around automation in media measurement over the past several years.

Image: Tinkerbots via Flickr, Creative Commons

Tools are beginning to do a solid job with the most mind-numbing parts of research (e.g., collecting media coverage, putting it into a database, and matching the articles and posts up with other metrics).

But when it comes to the most difficult parts of research, where domain expertise is required, even the best tools tend to fall short.

Traditional and social media measurement tools usually provide three broad types of automation:

1. Aggregation and de-duplication: crawling web data, aggregating multiple media databases, removing spam and duplicate articles

2. Integrating media and web metrics: matching specific posts to web metrics (e.g., votes, comments, inbound links) and web analytics data (e.g., website visitors and sales leads)

3. Artificial Intelligence: sentiment, topic or emerging theme identification, and identifying influential sites and individuals

Good, fair and nonsense

At the risk of over-simplification, I would say that media monitoring software has gotten to be very useful for #1, does a serviceable job with #2, and usually produces either nonsense or the obvious when it comes to #3.

Unfortunately, most public relations and marketing practitioners really want the insights from artificial intelligence. Consequently, when they choose to use a tool, they end up fairly disappointed with its immediate results.

If you’re enthusiastic about automation, here are a few things you should consider before taking the plunge:

What is the strategic goal of your measurement program?

Don’t collect the number of positive, negative, and neutral mentions of your brand in thousands of posts just because that’s what the monitoring tool was designed to do.

Tracking sentiment around a brand and its competitors might be important for a product manager.

But if you are a corporate communications manager, you should really think about whether or not you might be better served by tracking the presence (or absence) of specific campaign messages instead of plain old sentiment.

Does your brand’s volume of coverage necessitate artificial intelligence?

If you have a huge volume of coverage to contend with, automated tools might be the only way you can collect data on every single mention you receive in the media. But if you’re not one of those fortunate brands, consider using higher-quality (probably lower-cost) human coding.

And even if you are one of those lucky brands, consider reading a sample of your coverage to collect crucial qualitative data like sentiment or message accuracy.

You can use a common random sampling technique to pare down the volume of coverage to analyze (sampling always carries a margin of error, which you can calculate with this sample size calculator).
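To make the sampling idea concrete, here is a minimal Python sketch. It assumes a simple random sample and the standard normal-approximation formulas with a finite population correction, which is what many online sample size calculators use; it is not necessarily the method behind the calculator mentioned above. The function names and the 95% confidence default (z = 1.96) are my own illustrative choices.

```python
import math
import random


def required_sample_size(population_size, margin=0.05, proportion=0.5, z=1.96):
    """Items to read for a given margin of error (assumes simple random sampling).

    proportion=0.5 is the most conservative guess; z=1.96 is roughly 95% confidence.
    """
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin ** 2)
    # Finite population correction: fewer items are needed when coverage is limited.
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)


def margin_of_error(sample_size, population_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion measured on a random sample."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    if population_size > 1:
        correction = math.sqrt((population_size - sample_size) / (population_size - 1))
    else:
        correction = 0.0  # a population of one item is fully observed
    return z * standard_error * correction


def draw_sample(items, sample_size, seed=None):
    """Pick sample_size items at random, without replacement."""
    return random.Random(seed).sample(list(items), sample_size)


if __name__ == "__main__":
    coverage = [f"article-{i}" for i in range(10_000)]  # hypothetical coverage list
    n = required_sample_size(len(coverage), margin=0.05)
    to_read = draw_sample(coverage, n, seed=42)
    print(n, round(margin_of_error(n, len(coverage)), 4))
```

Under these assumptions, a brand with 10,000 pieces of coverage would need to read about 370 of them to estimate something like "share of negative coverage" to within plus or minus five percentage points.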

Another popular technique is to only read posts/stories from a list of pre-identified influencers, since these are the people who are ultimately driving earned media conversations.

Are you planning on using automated sentiment to alert you to potential crises?

Traditional and social media monitoring vendors often report the quality of their sentiment analysis in terms of “accuracy;” that is, the percentage of stories or posts that the tool’s algorithm and a human reader agree on.

Vendors often report “accuracy” percentages in the 70%-90% range, but these numbers are deceptively high.

The problem is that the vast majority of traditional and social media coverage on brands is neutral – and it’s this neutral coverage that humans and algorithms tend to agree on.

Negative and positive posts are often identified at about chance level by social media monitoring tools (a controversial paper by Freshminds found that many popular social media monitoring tools perform even worse than chance level). This means that automated tools are likely to miss actual crises.

If you’re going to use an automated tool for alerting you to negative posts, make sure to ask potential vendors for specific information about their false alarm rates (i.e., the percentage of time that the tool mistakenly identifies a positive or neutral post as negative) and miss rates (i.e., the percentage of time that the tool mistakenly identifies a negative post as positive or neutral).

When it comes to alert systems, a high false alarm rate is much more acceptable than a high miss rate.
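A small Python sketch shows why a headline “accuracy” figure can hide a poor miss rate. The counts below are entirely made up for illustration, and the function name is hypothetical. I am also assuming one particular reading of the two rates: the miss rate is taken over posts a human coded as negative, and the false alarm rate over posts a human coded as positive or neutral.

```python
def sentiment_rates(pairs):
    """Compare human and tool labels.

    pairs: iterable of (human_label, tool_label) tuples, with labels
    "positive", "neutral" or "negative".
    """
    pairs = list(pairs)
    agreements = sum(1 for human, tool in pairs if human == tool)

    # Tool labels for posts the human reader coded as negative / not negative.
    tool_on_negative = [tool for human, tool in pairs if human == "negative"]
    tool_on_other = [tool for human, tool in pairs if human != "negative"]

    misses = sum(1 for tool in tool_on_negative if tool != "negative")
    false_alarms = sum(1 for tool in tool_on_other if tool == "negative")

    return {
        "accuracy": agreements / len(pairs) if pairs else 0.0,
        "miss_rate": misses / len(tool_on_negative) if tool_on_negative else 0.0,
        "false_alarm_rate": false_alarms / len(tool_on_other) if tool_on_other else 0.0,
    }


# Hypothetical 1,000 posts: mostly neutral, as the article describes.
example_pairs = (
    [("neutral", "neutral")] * 850
    + [("neutral", "negative")] * 20
    + [("neutral", "positive")] * 30
    + [("positive", "positive")] * 20
    + [("positive", "neutral")] * 30
    + [("negative", "negative")] * 15
    + [("negative", "neutral")] * 35
)

if __name__ == "__main__":
    for name, value in sentiment_rates(example_pairs).items():
        print(f"{name}: {value:.1%}")
```

With these invented counts, the tool and the human reader agree on 88.5% of posts, yet the tool misses 70% of the negative ones. That is the pattern to ask vendors about.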

If you must use automated sentiment or topic identification, be prepared to train the tool.

If you are planning to use automated sentiment to measure or monitor your reputation in the media, be ready to spend a lot of time working with the tool or your vendor to get it right. I’ve seen some pretty good automated sentiment analyses, but these always involve a high degree of human “training.” This process can take weeks (or months).

When done well, it can be worth the time investment – but be warned that getting automated sentiment to work well doesn’t come without time costs.

Do you want to identify meaningful coverage themes? Or are you looking for lists of commonly mentioned words in the media?

Most complaints about automated tools focus on sentiment, but I’ve been much more surprised by the poor quality of the topic identification. Most existing social media monitoring software doesn’t go much further than presenting the user with a set of two- or three-word phrases that appear most often in posts.

That’s very useful for SEO, but I can’t imagine that this level of detail would ever be sufficient to inform a marketing or communications strategy.
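For a sense of how little this kind of phrase counting involves, here is a rough Python sketch. It is my own simplification, not any vendor's actual algorithm, and the sample posts and the `top_phrases` name are invented for illustration.

```python
import re
from collections import Counter


def top_phrases(posts, n=2, k=5):
    """Return the k most frequent n-word phrases across posts, with counts."""
    counts = Counter()
    for post in posts:
        # Crude tokenization: lowercase runs of letters, digits and apostrophes.
        words = re.findall(r"[a-z0-9']+", post.lower())
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts.most_common(k)


# Hypothetical posts about a hypothetical product.
sample_posts = [
    "The new phone has great battery life",
    "Battery life on the new phone is terrible",
    "I returned the new phone because of battery life",
]

if __name__ == "__main__":
    for phrase, count in top_phrases(sample_posts, n=2, k=3):
        print(count, phrase)
```

On these three posts, the top phrases are "the new", "new phone" and "battery life", each appearing three times. The list says people are talking about battery life, but not that one post loves it, one hates it, and one returned the phone over it.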

I look at it this way: if I had a life-threatening disease and my physician didn’t have the time to read the latest medical journal articles on the disease and instead had an intern collect a list of the 50 most-mentioned words in those articles to create a treatment program, I would probably die.

I don’t know why marketers and PR professionals would want to do the same thing to their brands.

At some point, any good strategy will depend on rolling up your sleeves and reading some coverage, to prevent automated intelligence from ruining your actual intelligence.

Want to hear more from Seth on the pros and cons of PR measurement automation? He’ll grace the bi-weekly #measurePR Twitterchat from 12-1 pm ET this coming Tuesday, November 23. RSVP for #measurePR and add it to your calendar… now!

Seth Duncan is Director of Research and Development at Beyond Analytics. Prior to joining Beyond, Seth worked as a statistician at the University of Pittsburgh Medical Center and University of California at San Francisco, conducting statistical analyses for Phase III clinical trials and acting as the lead statistician for numerous research studies. He has also taught statistics and research methodology at San Francisco State University. Seth received a B.A. in psychology from Reed College and an M.A. in psychological research from San Francisco State University, and completed graduate-level coursework in advanced statistical analysis at Boston College and the Massachusetts Institute of Technology.