Sentiment analysis with machine learning

by | Nov 6, 2019 | AI

When developing a new technology, it really helps if you are also a user of that technology. This has been Red Hat's approach to artificial intelligence and machine learning: develop openly on the one hand, and on the other, exchange knowledge across the organization so the same tools can be applied to interesting business problems. All the while, keep a two-way exchange to and from the open source commons.

This is the sort of left-hand/right-hand move that data scientist Oindrilla Chatterjee began making as part of a project she started during an internship and continued in a full-time role at Red Hat. Chatterjee and her team are looking at how to perform sentiment analysis with machine learning on a dataset of customer and partner surveys about a service offering.

The challenge, though, was how to do natural language sentiment analysis on a relatively small dataset. The extra effort could be worth it — it’s an interesting problem to solve and build software for, and the answers provided could be useful to Red Hat and its customers.

When starting out, the first approach Chatterjee and the team took was to stack together several open source natural language processing (NLP) tools: Stanford CoreNLP, VADER, and TextBlob. The initial runs, though, were disappointing in both performance and the ability to handle Red Hat-specific terms.

Seeking to improve results, Chatterjee's team looked at how the tools were stacked. Where they had originally started with a voting scheme, they moved on to a weighted average of the three models' outputs, a model-stacking approach. This showed some improvement.
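The weighted-average stacking idea can be sketched as follows. The three scorer functions here are simplified stand-ins for the real tools the team used (Stanford CoreNLP, VADER, TextBlob), and the weights are illustrative, not the team's actual values; in practice each scorer would call the real library and the weights would be tuned on a validation set.

```python
# Stand-in scorers: each returns a polarity in [-1, 1], as the real
# tools roughly do. These stubs only illustrate the plumbing.

def corenlp_score(text: str) -> float:
    # stand-in for a call to Stanford CoreNLP
    return 0.2 if "good" in text.lower() else -0.1

def vader_score(text: str) -> float:
    # stand-in for a call to VADER
    return 0.6 if "good" in text.lower() else -0.4

def textblob_score(text: str) -> float:
    # stand-in for a call to TextBlob
    return 0.4 if "good" in text.lower() else 0.0

# Illustrative weights; in practice these would be tuned on held-out data.
WEIGHTS = {"corenlp": 0.3, "vader": 0.5, "textblob": 0.2}

def stacked_sentiment(text: str) -> float:
    """Weighted average of the three model scores."""
    scores = {
        "corenlp": corenlp_score(text),
        "vader": vader_score(text),
        "textblob": textblob_score(text),
    }
    return sum(WEIGHTS[name] * s for name, s in scores.items())
```

Compared with simple voting, the weighted average lets a stronger model outvote two weaker ones when their disagreement is marginal.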

As part of this, they introduced a number of context-specific terms into VADER's lexicon, adding value to this lexicon-based approach by essentially retraining the model with Red Hat-specific vocabulary.
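The idea behind extending a lexicon can be illustrated with a toy scorer. The base lexicon, the domain terms, and their valence values below are all made up for illustration; VADER itself stores its lexicon as a word-to-valence mapping that can be extended in a similar way.

```python
# Toy base lexicon: word -> valence (positive means positive sentiment).
BASE_LEXICON = {"great": 3.0, "terrible": -2.5, "slow": -1.0}

# Hypothetical domain-specific terms a generic lexicon would miss.
DOMAIN_TERMS = {"rock-solid": 2.5, "regression": -2.0}

# Merging the two is the "retraining" step for a lexicon-based model.
LEXICON = {**BASE_LEXICON, **DOMAIN_TERMS}

def lexicon_score(text: str) -> float:
    """Average valence of the known words in the text; 0.0 if none match."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0
```

Without the domain terms, a sentence like "another regression" would score as neutral; with them, it registers as negative.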

As the quality of results seemed to be plateauing, the team began looking for a single service rather than gluing together three different approaches. This involved exploring deep learning techniques.

The first prototype for this new deep learning-based approach used a recurrent neural network (RNN) with long short-term memory (LSTM) units. This prototype produced better results than the previous three off-the-shelf libraries, although it required a lot of training time.
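A minimal PyTorch sketch of this kind of LSTM classifier is shown below. The vocabulary size, embedding and hidden dimensions, and three-class output are illustrative assumptions, not the team's actual configuration.

```python
import torch
import torch.nn as nn

class LSTMSentiment(nn.Module):
    """Sketch of an LSTM-based sentiment classifier (hypothetical sizes)."""

    def __init__(self, vocab_size=5000, embed_dim=64,
                 hidden_dim=128, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) tensor of integer token IDs
        embedded = self.embed(token_ids)
        _, (h_n, _) = self.lstm(embedded)  # final hidden state per sequence
        return self.fc(h_n[-1])            # logits: (batch, num_classes)

model = LSTMSentiment()
logits = model(torch.randint(0, 5000, (2, 20)))  # batch of 2 sequences
```

The sequential recurrence inside the LSTM is also the source of the long training times mentioned above: each token must be processed after the previous one.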

From there, they looked at other deep learning models and found transfer learning, a technique more commonly applied in image processing and computer vision. In transfer learning, a model developed for one task is reused to seed a model for a different task.

In this technique, you take a model or service that has been trained on one kind of data and is known to perform well on it. That model is then reused, largely without retraining, to handle the new kind of data needed. This approach saves computational time and resources.
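The core idea can be sketched numerically: a pretrained encoder is kept frozen, and only a small task-specific head is trained on the new data. Everything here (dimensions, random weights, class count) is illustrative, standing in for a real pretrained network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained encoder: its weights are fixed ("frozen"),
# so the expensive training it embodies is reused rather than repeated.
PRETRAINED_W = rng.normal(size=(100, 16))  # 100-dim input -> 16-dim features

def encode(x):
    """Frozen feature extractor: forward pass only, never retrained."""
    return np.tanh(x @ PRETRAINED_W)

# Only this small task-specific head would be trained on the new data,
# which is why transfer learning needs far less data and compute.
head_w = rng.normal(size=(16, 3))  # 16 features -> 3 sentiment classes

def predict(x):
    return np.argmax(encode(x) @ head_w, axis=-1)

labels = predict(rng.normal(size=(4, 100)))  # classify 4 example inputs
```

Training touches only `head_w`, a tiny fraction of the parameters, which is the source of the savings described above.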

The team began with a paper from Google on BERT, for Bidirectional Encoder Representations from Transformers. The BERT approach is to pretrain a language model bidirectionally on large corpora of unannotated text, including huge Wikipedia datasets. Using this approach, Chatterjee and the team added a targeted fine-tuning step on top of the same pretrained model, a step that requires far less data than the initial unsupervised pretraining.

The BERT-based approach performed better than the previous prototypes, training up to 12 times faster and at much lower computational expense. It could also make better use of GPU infrastructure, since transformers parallelize across a sequence, whereas the RNN/LSTM approach processes tokens sequentially to capture temporal dependencies in the text.

Finally, they introduced a feedback loop into the system. This begins with the machine learning service making predictions on the text and noting where performance is better for certain types of data. Human checkers then mark and correct the predictions, helping to create a human-influenced dataset that is fed back into the learning cycle.

The feedback loop should improve the model's accuracy by using humans as agents, so the model learns context-specific data better and more efficiently.
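A minimal sketch of such a human-in-the-loop cycle follows. The routing rule, threshold, and function names are assumptions for illustration, not the team's actual design: low-confidence predictions are sent to a reviewer, and the corrected examples are collected for the next retraining pass.

```python
# Predictions below this confidence are routed to a human checker
# (illustrative threshold).
REVIEW_THRESHOLD = 0.7

def feedback_loop(examples, predict, human_label):
    """Split predictions into accepted ones and human-corrected ones.

    predict(text) -> (label, confidence); human_label(text) -> label.
    """
    accepted, corrections = [], []
    for text in examples:
        label, confidence = predict(text)
        if confidence >= REVIEW_THRESHOLD:
            accepted.append((text, label))
        else:
            # a human checker marks/corrects the uncertain case
            corrections.append((text, human_label(text)))
    # corrections become part of the retraining dataset
    return accepted, corrections
```

Each cycle, the corrected examples grow the context-specific training set, which is what lets the model improve on exactly the data it handles worst.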

This human-assisted learning can be done in the context of users of the service building the model by incorporating feedback on the model’s performance back into the retraining step. The team is looking forward to getting this feedback in the near future as the system interacts with actual users.

Taken together, the overall approach was built up through research and trials, allowing the team to start from a relatively small dataset and refine both lexicon and model over several tool iterations. The result is a tool that may prove very useful for extracting more meaningful sentiment analysis from survey data.