Sentiment corpus

25 Jan

Found this and thought I’d pass it along to folks interested in sentiment/opinion/emotion research: http://www.cyberemotions.eu/data.html.

If you’re at an academic institution, they’ll give you access to several datasets tagged with sentiment. Each tag is the average of the ratings from three human annotators. That’s not a lot, though it is typical in the field right now. Each tagged item has a 1-5 positive strength score and a separate 1-5 negative strength score.

  • BBC News forum posts: 2,594,745 comments from selected BBC News forums and > 1,000 human-classified sentiment strengths.
  • Digg post comments: 1,646,153 comments on Digg posts (typically highlighting news or technology stories) and > 1,000 human-classified sentiment strengths.
  • MySpace (social network site) comments: six sets of systematic samples (3 for the US and 3 for the UK) of all comments exchanged between pairs of friends (about 350 pairs for each UK sample and about 3,500 pairs for each US sample) from a total of > 100,000 members and > 1,000 human-classified sentiment strengths.
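
To make the tagging scheme concrete, here is a rough sketch of how three annotators' scores might be averaged into one positive and one negative strength per comment. I don't know how the actual files are laid out, so the `ratings` list and the `average_strengths` function are made up purely for illustration.

```python
from statistics import mean

# Hypothetical ratings for one comment: one (positive, negative) pair per
# annotator, each score on the 1-5 scale. Not taken from the real corpus.
ratings = [(3, 1), (4, 2), (3, 1)]


def average_strengths(ratings):
    """Return (mean positive, mean negative) across all annotators."""
    for pos, neg in ratings:
        # Reject anything outside the 1-5 range used for both scales.
        if not (1 <= pos <= 5 and 1 <= neg <= 5):
            raise ValueError(f"score out of range: {(pos, neg)}")
    positives = [pos for pos, _ in ratings]
    negatives = [neg for _, neg in ratings]
    return mean(positives), mean(negatives)


avg_pos, avg_neg = average_strengths(ratings)
print(f"positive: {avg_pos:.2f}, negative: {avg_neg:.2f}")
```

Keeping the two scores separate, rather than collapsing them into one number, means a comment can come out as both fairly positive and fairly negative at once.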
