Session 2: Advances in Computer Applications in Data Science
4:35 pm – 5:00 pm GMT
Ankur Sinha (Indian Institute of Management Ahmedabad) – Analysing Real Time Unstructured Data: Unlocking the Value of Business News
A very good evening to everyone in India, and good afternoon to everyone else. I’m Ankur Sinha, and I’m going to talk about analysing real-time unstructured data: unlocking the value of business news. This work has been done in collaboration with a couple of my students, Satish Kedas and R. Kumar, and with a collaborator, P. Malo, from Helsinki. Just a while back, there was a discussion about modelling the economy with the help of contracts between the various players in the economy. We asked a very similar question in the context of the stock market: what if we were able to extract all the news items and all the discussions going on among people? Would we be better off than the price itself? People often say that the price itself accumulates everything that is out there, but if you have all the discussions and all the business news with you, and you can analyse them in real time, can you extract all the information, and would you be better off than the market? Similarly for GDP: such numbers are often delayed by, say, three or six months. If you can analyse all the information out there, in unstructured as well as structured data, do you have a better real-time indicator of GDP? So the question is: why do we want to analyse unstructured data when there is already a lot of structured data out there? It could well be that the structured data contains all the information, so there is no need to look at business news and the discussions going on on various social media, and doing so might not give you any extra information that you could convert into a trade or a decision. That was the premise for the study.
Of course, some of these things have been studied before, where people only aim to extract information like sentiment from news articles. We wanted to go a step ahead and see what other kinds of information can be extracted from business news and put to use. Beyond sentiment, if your model says that a particular news item is positive, negative or neutral, can you also explain why? What exactly is the reason this prediction is being made; in essence, why is the model saying that this particular news item is positive? And if you incorporate it into some sort of trade, are you better off in any way? This presentation will be in three parts. The first part is about the utility of analysing business news. The second part is our contribution, where we have analysed almost 20 years of business news data, primarily in the context of India. The third part is the application of machine learning models, to demonstrate what kind of analysis you can get when you analyse this kind of large-volume data. The first question one would like to ask is why we want to analyse business news at all. As I just mentioned, it contains real-time information about events and developments across the world. Of course, there are traditional sources like quarterly and annual reports as well, but they carry a significant lag. To motivate this, a sample of around 150 brokers and their recommendations was taken up during the COVID period, from February 1 to June 15, 2020.
These brokers were ranked by the returns they were able to generate, and among them was one hypothetical broker that took only the sentiment data from news items into account. When it competed with the other brokers, it actually stood at rank 4 out of 150. During this period, the S&P 500 had declined by 5.61%, and this broker generated a return of 16.82% based purely on the analysis of news data, primarily by extracting sentiment. Of course, there were others; for example, Maxim Group generated the maximum return of 44%, and there were a couple of other brokers marginally above 16.82%, at 19% and 20%. So there is some reason that might encourage you to analyse business news. Now, the question is what you would like to measure from business news. One of the things people look at, of course, is sentiment. For example, consider a couple of items here. 'Adani Enterprises Q3 net profit declines 84% to 68 crore' is a negative item. Similarly, in the next item, about a 65% stock return, 'PC Jeweller shines', the moment you see the word 'shines', it conveys that this is a positive news item, and the stock is generating a significant return, around 65%.
Now, some of the other information that can be extracted amounts to past information, which might not have future tradable value. For example, a news item might simply report that the share of a company like Voltas plunged. That is past information, so the question is whether such a past news item has any future tradable value. Similarly, there could be recommendations, where a person or an expert is bullish on a particular item; does that have any future tradable value? A third kind is simply an explanation of why the market, a particular index or a particular stock is behaving in a particular manner. For instance, a news item might give a reason, or a bunch of reasons, why a particular index fell below 27,500. All of this information can be extracted from news items, so we will see what the capabilities of some AI systems are, how well this information can be extracted, and what accuracies one can achieve. At the same time, one might also like to look at future events. For example, the moment one observes a news item like 'Samsung to launch three new flagship smartphones in August', I would like to be aware of it, because this kind of future item may have an impact on the stock price, particularly on the launch date of the product. Similarly, information about deals that companies might be considering is also valuable to know. All of these are future items that can be extracted from news. The other things would be discussions of financial concepts, for example a rise in subsidies. Whenever I say a financial concept, I mean a concept like subsidy, import duty or debt.
The moment one talks about increasing or decreasing debt, it conveys some sort of sentiment. If you look at the word 'debt' itself in traditional sentiment analysis systems, the moment 'debt' is encountered, it is treated as a negative word. However, in the context of finance, almost all companies carry some debt. So the moment you observe a phrase like 'increasing debt', that is a negative phrase, and the moment you encounter a phrase like 'decreasing debt', it should be considered a positive phrase. Business news items therefore have to be analysed very differently from other kinds of text, such as movie reviews or ordinary conversations people have about various products, and from the way you extract sentiment from those conversations. That is what we are going to look at. Now, once again, are sentiments useful? Very quickly, to motivate this, a couple more studies used AI to make decisions, creating one model that uses sentiment and one that does not. This was a reinforcement learning model where, in the first case, sentiments from news items were used, and in the second, they were not. What we observed was that the Sharpe ratio of the portfolios that used sentiment was far higher than that of the portfolio that did not. The other question is how long sentiments remain useful. The diminishing value of news is very high, especially because of algorithmic traders: a news item generated within a few seconds is immediately taken into account and converted into trades.
But from the information-value point of view, what has been observed is that even if you are making a delayed decision, delaying it by a day or by 48 hours, you are still better off than not taking that news item into account at all. Two days is what people typically describe as the window in which there is some utility in acting on a news item about a particular company. Now, this is the analysis of the 20 years of data that we looked at. We created a sentiment index: the upper chart is the sentiment index, and the lower chart is the Nifty 30-day moving average. Very clearly, what we observed was that there are macro turning points visible in the sentiment index; here is the 2008 financial crisis, where we see sustained large movements in sentiment, and these are the corresponding large movements in the index itself. Now, on understanding the unstructuredness of news items: there are certain kinds of information in a news item that need to be understood by an AI model. For instance, if I am talking about ONGC and SBI, these are two companies that are part of an index. Simply analysing such a news item is not going to lead to much value unless I understand that these two companies are part of that index. So there is some implicit, background knowledge that your AI model is required to have. Similarly, if an item says 'prefer auto and pharmaceutical stocks', you need to be aware of which companies belong to the auto sector and which belong to the pharmaceutical sector.
At the same time, the names of companies can appear in various forms; for example, Larsen & Toubro can be mentioned as 'Larsen' or as 'L&T', and so on. Another problem is that if you are simply looking at the sentiment of a news item as a whole, there could be conflicting opinions: the item could be negative for one entity and positive for another, with multiple entities mentioned in the same news item carrying different sentiments that need to be extracted separately. So here we are talking about a classifier system which takes a news item as input and predicts whether the item is positive, negative or neutral. This is something people have been working on for the last decade or so, and the results we are able to obtain now are quite extraordinary, almost matching human-level performance. But we are going a step ahead: can you also say why exactly the news is positive? Can you provide that kind of explainability as well? This is what we call an explainable AI model. Going forward, the second part of the presentation covers the contributions we have made. Looking at the last 20 years of Indian news data, we have created an annotated dataset of around 10,500 business news items, annotated for sentiment, for the financial entities they contain, for the directionality of the financial concepts, and, where multiple entities are mentioned in an item, for whether the sentiment is positive, negative or neutral for each entity.
For example, take the news item 'Mahindra and Mahindra stock up 20% post strong revenue growth', where a second part says 'auto sector likely to see an uptrend'. This is talking about a sector, and it is also talking about a particular company in that sector. So our annotations record the entities that are there, in terms of the name of the company and the name of the sector, and which financial concepts are spoken about. Here the item talks about two financial concepts: one is the stock price and the other is revenue growth. The directionality is another important thing we extract, and there are three directions here: 'up' in the context of the stock, 'growth' in the context of revenue, and then an isolated directionality, 'uptrend', which the AI system also needs to understand.
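The annotation scheme described above can be sketched as a small data structure. This is only an illustration of the kinds of fields involved; the field names and labels here are my own and are not the actual schema used in the paper:

```python
from dataclasses import dataclass

@dataclass
class AnnotatedItem:
    text: str
    entities: list          # companies / sectors mentioned in the item
    concepts: list          # financial concepts, e.g. "stock price"
    directionality: list    # (concept or None, direction word) pairs
    entity_sentiment: dict  # per-entity sentiment label

# The Mahindra and Mahindra example from the talk, annotated by hand.
item = AnnotatedItem(
    text=("Mahindra and Mahindra stock up 20% post strong revenue growth; "
          "auto sector likely to see an uptrend"),
    entities=["Mahindra and Mahindra", "auto sector"],
    concepts=["stock price", "revenue growth"],
    directionality=[("stock price", "up"),
                    ("revenue growth", "strong"),
                    (None, "uptrend")],   # isolated directionality
    entity_sentiment={"Mahindra and Mahindra": "positive",
                      "auto sector": "positive"},
)
```

The key point is that one item can carry several entities, several concept–direction pairs, and a separate sentiment label for each entity.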
Now, there are various types of AI models. For the last 20 years, bag-of-words has been one of the most popular approaches, where any document or sentence is simply treated as a bunch of words, and based on the words present together, you classify the item as positive, negative, neutral, or whatever set of classes you want. If you use the bag-of-words approach to make sentiment predictions for news items, what you typically observe is that the accuracy turns out to be around 81% in terms of classifying news into positive, negative or neutral categories. Then there are n-gram models, where instead of looking at individual words you look at phrases: two consecutive words would be a bigram, three consecutive words a trigram, and so on. The moment you start taking the position of words, or the relationship between words, into account in some sense, you improve marginally, from 81% to 83%. Back in 2014, we did a piece of work on interpretable representations. There, again, we were extracting the financial entity and the financial concept mentioned in a particular news item and looking at the directionality. Essentially, we maintained a dictionary of financial concepts, like revenue, debt and things of that sort, and another dictionary of directionality words, and whenever we found a news item that contained a financial concept together with a directionality, we simply brought the two together and tried to figure out whether that helps in improving the sentiment extracted from the news item.
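A minimal sketch of this concept–directionality pairing is below. The toy dictionaries are mine and far smaller than any real lexicon; the point is only to show why 'increasing debt' and 'decreasing debt' get opposite labels even though a plain word-level lexicon would mark 'debt' as negative in both:

```python
# Toy lexicons: each concept is tagged by whether growth in it is good or bad.
CONCEPTS = {"debt": "bad_if_up", "revenue": "good_if_up",
            "profit": "good_if_up", "subsidy": "good_if_up"}
UP_WORDS = {"rise", "rises", "rising", "increase", "increasing", "up", "grows"}
DOWN_WORDS = {"fall", "falls", "falling", "decrease", "decreasing", "down",
              "declines"}

def phrase_sentiment(phrase: str) -> str:
    """Combine a financial concept with a nearby directionality word."""
    tokens = phrase.lower().split()
    concept = next((CONCEPTS[t] for t in tokens if t in CONCEPTS), None)
    if concept is None:
        return "neutral"
    if any(t in UP_WORDS for t in tokens):
        direction = "up"
    elif any(t in DOWN_WORDS for t in tokens):
        direction = "down"
    else:
        return "neutral"
    # "increasing debt" is bad news; "decreasing debt" is good news.
    if concept == "bad_if_up":
        return "negative" if direction == "up" else "positive"
    return "positive" if direction == "up" else "negative"
```

So `phrase_sentiment("increasing debt")` gives "negative" while `phrase_sentiment("decreasing debt")` gives "positive", which is exactly the behaviour a plain bag-of-words lexicon cannot produce.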
Now, this led to another marginal improvement, from 83% to 85%, the moment we started going for these kinds of interpretable representations. But the significant improvement came from word embeddings. There are two very common embeddings that people talk about: the first is Word2Vec and the second is GloVe. With these, each word is represented by a vector; whether the word is 'king', 'man' or 'woman', the name of a stock, or a financial concept like 'debt' or 'restructuring', each word is mapped to a vector, and this vector is pre-learned. Using this kind of vectorised representation of words, you can again solve the classification problem, and here you get another jump of around 3% when predicting the sentiment of news items. However, one drawback of these word embeddings is that they are still based on individual words, and there is no contextual information. For example, 'bank' as a financial institution will be treated the same way as the bank of a river; the vector is the same for both senses. That is one of the drawbacks of these word embeddings. Back in 2019, the BERT model came into the picture, and after BERT, a number of extensions like RoBERTa came out of the research community; the initial research was driven by Google, and then there was a lot of follow-up from other companies as well, which led to this family of language models. What these language models did was model the language from a massive amount of data, such that whenever you give them a sentence or a document, instead of considering it word by word, they consider the entire sentence or document and convert it into a vector.
So, for instance, if I give it a particular news item, it will convert the entire news item into a vector, and this vector is supposed to capture the relationships between the words. Now, with BERT, which contains this contextual information, the moment you solve the classification problem, you get a jump of around 4%, which takes you to an accuracy of 92%. However, BERT is still a general language-representation model. If you go for a domain-specific representation instead, where you take a large amount of financial news, a large number of financial dictionaries and other financial sources, and further train BERT on them so that it acquires the domain-specific information as well, then what we found was that the accuracy jumped from 92% to 95%. That is what we have reported in the paper. On top of that, we also report that, with the directionality of the financial concepts, it is possible for the AI models to interpret as well as explain the reasons for a particular item being classified as negative, positive or neutral. So that is where we are. And if we compare this 95% figure with our inter-annotator agreement: the roughly 10,500 items were annotated by three annotators, and the inter-annotator agreement also comes very close to 95%. In fact, what we observed was that the agreement between positive and neutral items was the lowest. It was easy for people to figure out whether an item was negative or neutral; there, the inter-annotator agreement was very high, around 97%. Between positive and negative items, the agreement was almost 99%. But when it came to classifying an item as positive versus neutral, we found that the agreement was only around 80%.
So out of 10 items, there might be agreement on 8, and there would be 2 items where one annotator puts the item in the positive category and the other puts it in the neutral category. But on average the inter-annotator agreement was 95%. Very quickly, these are the results we got from our model: RoBERTa, further trained to make it somewhat domain-specific. And very quickly, on the question of value: we ran a Granger causality test, and based on it, what we figured out was that the sentiment index we were creating had the p-values highlighted here for the various years. We analysed all the news items published from 2012 to 2017, and for every year we found that the sentiment data for that year was significant. That's all from my side. Thank you very much.
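[As an aside, the Granger causality test mentioned above can be sketched as follows. This is a minimal numpy implementation of the F-test on synthetic data, not the actual test run in the paper; in practice one would use a statistics package such as statsmodels.]

```python
import numpy as np

def granger_f(y, x, p=2):
    """F-statistic for H0: p lags of x do not help predict y.

    Fits y_t on p of its own lags (restricted model) versus p own lags
    plus p lags of x (unrestricted model) by OLS, then compares the
    residual sums of squares.
    """
    y, x = np.asarray(y, float), np.asarray(x, float)
    n = len(y) - p
    Y = y[p:]
    own = np.column_stack([y[p - k:-k] for k in range(1, p + 1)])
    cross = np.column_stack([x[p - k:-k] for k in range(1, p + 1)])
    ones = np.ones((n, 1))
    Xr = np.hstack([ones, own])           # restricted design matrix
    Xu = np.hstack([ones, own, cross])    # unrestricted design matrix
    rss = lambda X: np.sum((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2)
    rss_r, rss_u = rss(Xr), rss(Xu)
    return ((rss_r - rss_u) / p) / (rss_u / (n - Xu.shape[1]))

# Synthetic check: x leads y by one step, so x should Granger-cause y
# but not the other way around.
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.8 * x[t - 1] + 0.1 * rng.standard_normal()
f_causal = granger_f(y, x)    # large F: lags of x predict y
f_reverse = granger_f(x, y)   # small F: lags of y do not predict x
```

A large F-statistic (compared against the F-distribution's critical value) rejects the null of no Granger causality, which is the sense in which a sentiment index "leading" an index is tested.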
Thank you. Time for maybe one or two very quick questions.
There is a question in the chat, so I’ll read it out: "Very good presentation. When we consider past data, the technology we use is quite different. What happens in the future, when AI-based news is created to manipulate the market?"
Well, what you are talking about is fake news: how would our system take fake news into account? That, of course, is a challenging problem that people are looking at: could there be a system that can figure out whether a news item is real or fake? What I am talking about here is different: given a news item, whether real or fake, can I understand it, and can I figure out the sentiment it contains? But yes, that is a challenging problem, and I think it has to be taken up separately; it is altogether a different classification task, deciding whether an item was produced by an actual person or is fake. One way to handle this kind of problem is to look only at news items from reputed sources, ignoring items coming from unknown sources, and only looking at items from, say, the FT, the Economic Times and sources that you know. That is what we did: we analysed only the sources that we knew to be reliable and reputed, and did the analysis on that basis. Yes, a very relevant question.
Yeah, thank you very much for this very nice presentation. There are actually some very interesting analogies with work by people I know. You mentioned a lot about concept combinations, associations between words, and contextual information. I know of a very interesting group working in Brussels who have done some very interesting work on how concepts are combined in the human mind. They have taken a quantum-like approach, in which certain interesting inequalities are violated when you think about how words are associated, when you calculate the correlation values and so on; these are often known as contextuality inequalities. I'd be very happy to share a couple of papers; that might be quite interesting for this research.
Sudip, can I also add: as part of this discussion group, if you can make that available on our website, because the whole purpose of this exercise is that these ideas are shared. You are taking notes, right? Yes. So, Ankur, that was an excellent presentation. We did a sentiment analysis for the Bitcoin price bubble, the recent one, from November 2019 to April 2021. And what we found was, as you quite rightly say, that when we include all the posts, all the Twitter information, and in particular we followed Elon Musk's Twitter feed. As you know, he is a market manipulator: he talks up prices in which he has an interest. He first said that Tesla would accept Bitcoin as payment for cars, and then he went back on himself and said that no, for environmental reasons, he would not be supporting Bitcoin, and hey presto, Bitcoin then fell in value at that particular time. What was interesting is that this improved the Sharpe ratios, improved predictability, over and above what you get if you do not incorporate social-media sentiment. But what is interesting in your case is that you ran this huge sort of beauty contest between all these different ways of tagging what is positive, neutral or negative, right?
Absolutely. I would just like to point out one thing: we also did a Twitter analysis, and what we figured out was that there is a lot of noise in the Twitter data, so Twitter data by itself is not that reliable. I can understand why, in the context of Bitcoin, you might have got a positive result: primarily because you were following a few specific people who are market manipulators, and you know that every time Elon Musk writes a tweet, Dogecoin goes up, and things like that happen. So there is a known correlation that exists, right? You get the tweets first, and only then is the corresponding information created in the news media, saying that Elon Musk tweeted and the price went up by 5% or 7%, and so on. But in the context of business news, what we found was that Twitter actually had much less tradable value compared with reliable business news sources. It could well be because of poor modelling on our part, because there is a lot of noise coming out of Twitter; maybe we did not pick the right people, or did not account for the various ways people write the same word on Twitter. But in our results, what we observed was that by analysing business news, we were much better off than by looking at tweets.