Modelling news article quality

On the Feasibility of Predicting News Popularity at Cold Start

 I. Arapakis, B. Barla Cambazoglu, and M. Lalmas

Abstract - Prominent news sites on the Web provide hundreds of news articles daily. The abundance of news content competing to attract online attention, coupled with the manual effort involved in article selection, necessitates the timely prediction of future popularity of these news articles. The future popularity of a news article can be estimated using signals indicating the article’s penetration in social media (e.g., number of tweets) in addition to traditional web analytics (e.g., number of page views). In practice, it is important to make such estimations as early as possible, preferably before the article is made available on the news site (i.e., at cold start). In this paper, we perform a study on cold-start news popularity prediction using a collection of 13,319 news articles obtained from Yahoo News, a major news provider. We characterise the popularity of news articles through a set of online metrics and try to predict their values across time using machine learning techniques on a large collection of features obtained from various sources. Our findings indicate that predicting news popularity at cold start is a difficult task, contrary to the findings of prior work on the same topic. Most articles’ popularity may not be accurately anticipated solely on the basis of content features, without having the early-stage popularity values.


Variation of metrics over time.

User Engagement in Online News: Under the Scope of Sentiment, Interest, Affect, and Gaze

 I. Arapakis, M. Lalmas, B. Barla Cambazoglu, M. C. Marcos, and J. M. Jose

Abstract - Online content providers, like news portals and social media platforms, constantly seek new ways to attract large shares of online attention by keeping their users engaged. A common challenge is to identify which aspects of online interaction influence user engagement the most. In this article, through an analysis of a news article collection obtained from Yahoo! News US, we demonstrate that news articles exhibit considerable variation in terms of the sentimentality and polarity of their content, depending on factors like news provider and genre. Moreover, through a laboratory study, we observe the effect of sentimentality and polarity of news and comments on a set of subjective and objective measures of engagement. In particular, we show that attention, affect, and gaze differ across news of varying interestingness. As part of our study, we also explore methods that exploit the sentiments expressed in user comments to reorder the lists of comments displayed in news pages. Our results indicate that user engagement can be predicted if we account for the sentimentality and polarity of the content, as well as other factors that drive attention and inspire human curiosity.

 DOI: 10.1002/asi.23096
 Human-computer interaction; user studies; questionnaires

Arapakis, I., Lalmas, M., Cambazoglu, B. B., Marcos, M.-C. and Jose, J. M. (2014), User engagement in online news: Under the scope of sentiment, interest, affect, and gaze. J Assn Inf Sci Tec, 65: 1988–2005.
Variation of news volume, sentimentality, and polarity with respect to the genre.

Automatically Embedding Newsworthy Links to Articles: From Implementation to Evaluation

 I. Arapakis, M. Lalmas, H. Ceylan, and P. Donmez

Abstract - News portals are a popular destination for web users. News providers are therefore interested in attaining higher visitor rates and promoting greater engagement with their content. One aspect of engagement deals with keeping users on site longer by allowing them to have enhanced click-through experiences. News portals have invested in ways to embed links within news stories but so far these links have been curated by news editors. Given the manual effort involved, the use of such links is limited to a small scale. In this article, we evaluate a system-based approach that detects newsworthy events in a news article and locates other articles related to these events. Our system does not rely on resources like Wikipedia to identify events, and it was designed to be domain independent. A rigorous evaluation, using Amazon’s Mechanical Turk, was performed to assess the system-embedded links against the manually curated ones. Our findings reveal that our system’s performance is comparable with that of professional editors, and that users find the automatically generated highlights interesting and the associated articles worthy of reading. Our evaluation also provides quantitative and qualitative insights into the curation of links, from the perspective of users and professional editors.

 DOI: 10.1002/asi.22959
 User studies; text mining; qualitative research

Arapakis, I., Lalmas, M., Ceylan, H. and Donmez, P. (2014), Automatically embedding newsworthy links to articles: From implementation to evaluation. J Assn Inf Sci Tec, 65: 129–145.
Example of news article with an automatically augmented link to a related article.

© Ioannis Arapakis 2016 with help from Bootstrap