The new system is called Reuters Tracer. It uses Twitter as a kind of global sensor that records news events as they happen. The system then uses data mining and machine learning to pick out the most relevant events, determine their topic, rank their priority, and write a headline and a summary. The news is then distributed across the company’s global news wire.
The first step in the process is to siphon the Twitter data stream. Tracer examines about 12 million tweets a day, roughly 2 percent of the total. Half of these are sampled at random; the other half come from a list of Twitter accounts curated by Reuters’s human journalists. The curated list includes the accounts of other news organizations, significant companies, influential individuals, and so on.
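
To make the sampling step concrete, here is a minimal Python sketch of how a stream could be split into a curated half and a random half. It is an illustration only: the article does not describe how Tracer implements this, and the names `Tweet`, `select_tweets`, `implied_daily_total`, and the `random_rate` parameter are all hypothetical. The arithmetic helper simply works out what the quoted figures imply: if 12 million tweets are 2 percent of the total, the total is about 600 million a day.

```python
import random
from typing import Iterable, Iterator, NamedTuple, Set, Tuple


class Tweet(NamedTuple):
    """Hypothetical minimal tweet record; real tweets carry far more fields."""
    author: str
    text: str


def implied_daily_total(examined: int, percent_of_total: int) -> int:
    """Total daily volume implied by 'examined is percent_of_total percent'."""
    return examined * 100 // percent_of_total


def select_tweets(
    stream: Iterable[Tweet],
    curated_accounts: Set[str],
    random_rate: float,
    rng: random.Random,
) -> Iterator[Tuple[str, Tweet]]:
    """Yield (source, tweet) pairs for the tweets worth examining.

    Tweets from curated accounts are always kept; every other tweet is
    kept with probability random_rate. This is an assumed scheme, not
    the one Reuters uses.
    """
    for tweet in stream:
        if tweet.author in curated_accounts:
            yield "curated", tweet
        elif rng.random() < random_rate:
            yield "random", tweet


if __name__ == "__main__":
    total = implied_daily_total(12_000_000, 2)
    print(f"Implied daily total: {total:,} tweets")

    # Made-up sample data for illustration.
    stream = [
        Tweet("some_news_org", "Breaking: ..."),
        Tweet("random_user_1", "lunch was great"),
        Tweet("random_user_2", "traffic is terrible today"),
    ]
    curated = {"some_news_org"}
    for source, tweet in select_tweets(stream, curated, 0.01, random.Random(0)):
        print(source, tweet.author)
```

Under these assumptions, the random half (about 6 million of roughly 600 million tweets) would correspond to a sampling rate of about 1 percent, which is why the usage example passes `0.01` as `random_rate`.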