This post is part of a series:
In the last couple of posts, we saw what machine learning is at a high level and how it works in detail, using the decision tree algorithm as an example. In this post, I am going to talk about some real-world examples and applications of machine learning.
As you will see, there are quite a few examples that I want to talk about. By presenting so many different applications, I hope that the huge variety of possible use cases for machine learning will come across.
Before I start, however, I want to say that I am not linked to or paid by any of the companies I am about to mention. I just want to show them to demonstrate that there are concrete, real-world examples where someone is using machine learning to solve a specific problem and someone else is willing to pay for or invest in that product or service.
Well-known Examples of Machine Learning
So, to start, I’m just going to rattle off some of the better-known applications of machine learning, for example:
And now, let’s go into a little more detail with the perhaps lesser-known examples.
The first one is Dango. It’s an application that predicts or suggests which emoji you might use based on the message you have written.
The training process of such a service is pretty straightforward. You need many example messages, where each text message is shown to the algorithm together with its label, namely the emojis that were used in it. Once the algorithm is trained, you can show it a new message that it has never seen before, and it will predict which emojis are most likely to represent the content of that message. It then recommends those for you to use.
On their site you can play around with the app. So, if you type for example “shut up”, it suggests the hand or the cactus. Or if you write “Michael Jordan”, it realizes that it has something to do with basketball, and it also suggests the goat emoji because he was the greatest of all time.
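To make the training idea a bit more concrete, here is a minimal sketch in Python. It is not how Dango actually works; it simply counts which words appear together with which emoji and scores a new message by those counts. The training messages, the shortcode labels such as `:basketball:`, and the function names `train` and `suggest` are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: (message, emoji label) pairs.
# Emojis are written as shortcodes to keep the example ASCII-only.
TRAINING_DATA = [
    ("great game of basketball tonight", ":basketball:"),
    ("he plays basketball like a legend", ":basketball:"),
    ("happy birthday to you", ":cake:"),
    ("birthday cake and candles", ":cake:"),
    ("i love you so much", ":heart:"),
    ("love this song", ":heart:"),
]


def train(examples):
    """Count how often each word appears together with each emoji."""
    word_counts = defaultdict(Counter)
    for message, emoji in examples:
        for word in message.lower().split():
            word_counts[word][emoji] += 1
    return word_counts


def suggest(word_counts, message, top_n=2):
    """Score each emoji by summing the counts of the words in the message."""
    scores = Counter()
    for word in message.lower().split():
        # Words never seen during training contribute nothing.
        scores.update(word_counts.get(word, {}))
    return [emoji for emoji, _ in scores.most_common(top_n)]


model = train(TRAINING_DATA)
print(suggest(model, "basketball practice was great"))  # [':basketball:']
```

A real service would use far more data and a much more capable model, but the shape is the same: labeled examples go in, and a function that maps new text to likely emojis comes out.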
The next example, Swiftkey, does something similar, but instead of predicting emojis, it predicts which word you are going to write next.
So, for example, if you write “I’ll meet you at the”, it suggests “airport”, “office” or “hotel”, and you can tap on one of them instead of typing the word. The idea is that this lets you write your messages faster.
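One very simple way to get a feeling for next-word prediction is a bigram model, which just counts which word tends to follow which. The sketch below is only an illustration of that idea and not the technique the product itself uses; the tiny corpus and the function names `train_bigrams` and `predict_next` are assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of previously written messages.
CORPUS = [
    "i'll meet you at the airport",
    "i'll meet you at the office",
    "see you at the airport",
    "we stayed at the hotel",
    "meet me at the airport",
]


def train_bigrams(sentences):
    """For every word, count which words directly follow it."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, text, top_n=3):
    """Suggest the words that most often followed the last word typed."""
    words = text.lower().split()
    if not words:
        return []
    candidates = following.get(words[-1], Counter())
    return [word for word, _ in candidates.most_common(top_n)]


model = train_bigrams(CORPUS)
# "airport" follows "the" most often in the toy corpus, so it ranks first.
print(predict_next(model, "I'll meet you at the"))
```

A bigram model only looks at the last word, so it cannot tell "at the" from "in the". Real keyboards take much more context into account, but the principle of learning from what people have typed before is the same.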
Textio is a service that uses machine learning to predict the performance of job listings that companies create to hire new employees. To do so, they analyzed millions of job postings, and the metrics they used to determine the quality of those postings were:
And now, they are able to provide a service in the form of a text editor where you can insert your job listing and they will predict how well it will perform in terms of the metrics just mentioned. To that end, they calculate an overall score which represents the percentile that this job listing falls into compared to other postings for the same role and the same location.
And because of their extensive analysis of job listings, they can also give direct and concrete feedback on how to improve your posting and therefore your score. So, for example, the editor highlights words or expressions that don’t perform well in job listings. And if you click on them, Textio will suggest better ways of saying them. That way, you can improve your job listing step by step and hopefully attract more good applicants.
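The idea of a percentile score can be sketched in a few lines of Python. This is only an assumption about what "percentile" means here, namely the share of comparable postings that score at or below yours. The raw scores in `peers` and the function name `percentile_rank` are invented for the example.

```python
def percentile_rank(score, peer_scores):
    """Return the percentage of peer postings that score at or below `score`."""
    if not peer_scores:
        raise ValueError("peer_scores must not be empty")
    at_or_below = sum(1 for s in peer_scores if s <= score)
    return 100.0 * at_or_below / len(peer_scores)


# Hypothetical raw quality scores of other postings
# for the same role and the same location.
peers = [42, 55, 61, 68, 70, 74, 79, 83, 88, 95]

print(percentile_rank(80, peers))  # 70.0
```

So a posting with a raw score of 80 would do at least as well as 70% of the comparable postings in this made-up sample.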
Another interesting application of machine learning comes from Tessian. The problem that they are addressing is data security breaches due to inadvertently misaddressed emails.
Normally, cyber security providers try to protect companies by making sure that their technology doesn’t get compromised by hackers. But the employees themselves are also a major risk factor because a huge volume of sensitive information is contained in the emails that they write.
So, Tessian analyzes the content of an email and checks it for potentially confidential information. If it finds something, it warns the sender that the email might be addressed to the wrong person. That way, Tessian helps to avoid inadvertently disclosing sensitive information.
The next application deals with the fact that today customers can directly contact companies via an increasing number of social media platforms. And as more and more people do that, companies increasingly struggle to answer all the requests. That’s why chat bots are a huge field for machine learning.
One example company is DigitalGenius. Their chat bot automatically analyzes incoming requests, for example what the priority of a message is, what its sentiment is or what it is about. Based on that, it routes the message to the department responsible for that specific request. And, most importantly, it automatically creates an appropriate response which the customer service rep only has to approve or maybe slightly personalize.
The machine learning algorithm also calculates a percentage which indicates how confident it is in the given answer, and the customer service rep can set a threshold above which DigitalGenius answers automatically.
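The threshold logic described above can be sketched like this. The class `SuggestedReply`, the function `handle` and the default threshold of 0.9 are hypothetical names and values chosen for illustration; they are not part of any DigitalGenius API.

```python
from dataclasses import dataclass


@dataclass
class SuggestedReply:
    text: str
    confidence: float  # model confidence between 0.0 and 1.0


def handle(reply, auto_send_threshold=0.9):
    """Send automatically only if the confidence is above the threshold."""
    if reply.confidence > auto_send_threshold:
        return "send_automatically"
    # Otherwise a human approves or personalizes the suggested text.
    return "ask_agent_to_approve"


print(handle(SuggestedReply("Your refund is on its way.", 0.95)))
print(handle(SuggestedReply("Could you share your order number?", 0.60)))
```

Raising the threshold means fewer answers go out without a human looking at them, while lowering it saves more time at the risk of more wrong automatic replies.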
A similar chat bot approach is used by x.ai. But instead of dealing with customer requests it deals with scheduling meetings.
So, if someone sends you a meeting request via email, you simply reply to that email and write for example “Amy@x.ai” in the cc field. Their algorithms then take care of the back and forth of scheduling the meeting based on your preferences and schedule. And once the date is agreed upon, it will be saved into your calendar.
Atomwise is a company that tries to improve the process of drug discovery and they use computer vision to do it.
To understand what they do, we need to first know a little bit about drug discovery in general. Namely, that diseases are influenced by proteins. The proteins either inhibit the disease or they enhance it. The activity of proteins, in turn, is influenced by molecules. And their relationship can be compared to a lock and a key. The protein is the lock and the molecule is the key.
So, drug design is about finding or designing the right key for a given lock so that you ultimately cure the disease. And this is very difficult to do. For every molecule which ultimately becomes a drug, millions might be physically tested and discarded as unsuitable, which results in extremely high costs to bring a drug to market. So, given these circumstances, it is really important to focus on the most promising molecules.
And that’s where machine learning comes into play. At Atomwise, they use a convolutional neural net which is shown protein-molecule pairs, and the label is whether the combination was effective or not. Once the network is trained, you can virtually design different molecules for a specific protein and see what the network predicts. So, for example, it might predict that key 1 has a probability of being effective of 8%, key 2 of 91% and key 3 of 75%. That way, Atomwise allows you to focus on the most promising molecules and therefore save a lot of time and resources.
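What "focusing on the most promising molecules" could look like in code is sketched below. The trained network is replaced by a plain dictionary holding the three example probabilities from the text, and both the function name `most_promising` and the cut-off of 0.5 are assumptions made for this illustration.

```python
# Stand-in for the network's output: the example probabilities from the text.
predictions = {"key 1": 0.08, "key 2": 0.91, "key 3": 0.75}


def most_promising(predicted, min_probability=0.5):
    """Keep candidates at or above the cut-off, best candidate first."""
    shortlisted = [
        (name, p) for name, p in predicted.items() if p >= min_probability
    ]
    return sorted(shortlisted, key=lambda item: item[1], reverse=True)


print(most_promising(predictions))  # [('key 2', 0.91), ('key 3', 0.75)]
```

Only the shortlisted candidates would then go on to the expensive physical tests.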
Alphasense is a service directed at professionals working in the financial industry. Obviously, they have to research a lot to make their investment decisions and that’s exactly where Alphasense comes into play.
It goes through the documents of many different sources and then indexes these documents. And just to get a sense for how many sources they cover, here is a list of them. It includes primary research, so documents that companies themselves publish, research documents done by brokers or documents published by regulatory institutions. And the list goes on and on.
So, clearly there is just too much material to read through. Basically, Alphasense is a search engine for documents that helps you find the critical information you need to make decisions faster.
BenevolentAI does something similar to Alphasense, only in the field of science. Here, too, there are just too many research papers published for a scientist to read them all.
So, what BenevolentAI does at a high level is to go through those papers and, based on the knowledge it builds from these sources, try to come up with new hypotheses that can then be further investigated by scientists. So, it basically is a tool for accelerating scientific innovation. And the field that they first focused on is discovering new drugs. And the process they use for that looks something like this.
First, they feed in the research papers. Then, they annotate all the words in those texts that are related to the field of pharmaceuticals and drug discovery. In the next step, they use NLP to understand the relationships between those words. And out of all those relationships they build a knowledge graph which is used to derive inferences and come up with new hypotheses.
So, for example, let’s say one research paper found that a certain protein causes a certain disease. In another paper, completely unrelated to that disease, they found that a specific molecule inhibits that specific protein. Now, you can make the inference that the molecule might treat that disease because it inhibits the protein which causes the disease.
And these are the kinds of hypotheses that the system comes up with and which scientists can then investigate further. Obviously, that’s an over-simplification, but that’s how I understood the system to work in general.
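The protein and molecule example can be written down as a tiny knowledge graph of (subject, relation, object) triples. The sketch below is a toy version of that single inference rule and says nothing about how BenevolentAI's actual system is built; the facts and the names `FACTS` and `generate_hypotheses` are made up.

```python
# Hypothetical facts, as they might be extracted from different papers.
FACTS = [
    ("protein_A", "causes", "disease_X"),     # from one paper
    ("molecule_M", "inhibits", "protein_A"),  # from an unrelated paper
    ("molecule_N", "inhibits", "protein_B"),  # no known link to a disease
]


def generate_hypotheses(facts):
    """If M inhibits P and P causes D, propose that M may treat D."""
    causes = [(s, o) for s, r, o in facts if r == "causes"]
    inhibits = [(s, o) for s, r, o in facts if r == "inhibits"]
    hypotheses = []
    for molecule, inhibited in inhibits:
        for protein, disease in causes:
            if inhibited == protein:
                hypotheses.append((molecule, "may_treat", disease))
    return hypotheses


print(generate_hypotheses(FACTS))
# [('molecule_M', 'may_treat', 'disease_X')]
```

Note that the output is only a hypothesis ("may treat"). Whether it holds up is exactly what scientists would then have to investigate.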
Clarifai is a company that works in the field of image recognition. So, their service automatically recognizes the content of images or videos and tags them accordingly.
That way, it is easier to organize and search the content, and one can also curate it more effectively and delete inappropriate content. This is particularly interesting for social media sites like Facebook or Pinterest. And the cool thing is that there is a demo on their site that you can play around with.
Affectiva does something similar to Clarifai but it focuses on facial expressions and recognizing human emotions. It has the world’s largest emotion database with more than 8 million analyzed faces.
And the cool thing is, there are also some demos that you can play around with. One demo, for example, will monitor your emotional reactions to a YouTube video. And, I think, it is clear how such data would be very interesting to advertisers or content producers to see how engaged the users are.
In another demo you can search for GIFs using your emotions. And I think this might become a real feature in messaging apps at some point. Your smartphone simply captures your emotion while you are reading a text, and then it automatically recommends GIFs that you could use as a response to that particular message.
OrCam uses computer vision to help visually impaired and blind people see the world. They built a camera that you attach to your glasses. Then, when you simply point at things, it reads the text that it sees aloud through the built-in headphone.
Additionally, you can save the faces of your friends and family so that OrCam automatically tells you who is in front of you as soon as it sees a face, without you having to do anything. Another feature is that you can also save items for it to recognize, so that you can go to the supermarket on your own.
Zebra Medical Vision
Another area that will be hugely influenced by machine learning is medical diagnosis via any sort of scan, be it X-ray, CT or MRI scans. So, you simply show the algorithm a particular scan and it will then tell you whether a particular disease or injury is present or not. One company that works in that field is Zebra Medical Vision.
Blue River Technology
Still another industry that will be hugely affected by machine learning is agriculture. And one example of how machine learning can be used comes from Blue River Technology.
They built a trailer that you pull over the field with your crops. In that trailer are cameras that are able to detect which plants are crops and which are weeds. It can then spray herbicides very precisely just on the weeds instead of spraying them all over the field. And that way, you can increase yields and reduce costs at the same time.
And now, the last example comes from another huge industry for machine learning which is cyber security. One company from that area is called Darktrace. It tries to detect anomalies in your computer network by analyzing data about user behavior. And the interesting fact is that, in contrast to the great majority of existing machine learning applications, it uses unsupervised learning techniques.
So, if you remember, in the first post of this series I briefly talked about unsupervised learning.
In the context of the iris flower data set, unsupervised learning meant that the label, which describes the species of a particular flower, is not included in the data. So, all the dots have the same color.
And, I think, here it becomes clear why Darktrace might use unsupervised learning to detect anomalies. Let’s say the dots in the right plot represent data about user behavior in your network. So, each point represents a specific activity.
Then it is pretty clear that the right cluster, where most of the data points lie, probably represents normal activities of the employees. But the data points in the left cluster, which are clearly separated from all the other points, probably represent some kind of anomaly that you might have to address. So, that’s how unsupervised learning could be used in cyber security.
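The intuition of "most points are normal, the clearly separated ones are suspicious" can be sketched without any labels at all. The code below uses a very simple rule, namely distance from the coordinate-wise median of all points, as a stand-in for a real clustering or anomaly detection algorithm. It is not Darktrace's method, and the data points, the function name `find_anomalies` and the `factor` of 3.0 are assumptions for illustration.

```python
import math
from statistics import median

# Hypothetical 2-D features describing network activities (no labels).
activities = [
    (5.0, 5.1), (5.2, 4.9), (4.8, 5.0), (5.1, 5.3), (4.9, 4.7),
    (5.3, 5.0), (5.0, 4.8), (4.7, 5.2),
    (1.0, 1.2), (0.8, 1.0),  # a small, clearly separated group
]


def find_anomalies(points, factor=3.0):
    """Flag points that lie unusually far from the bulk of the data."""
    # The coordinate-wise median is barely affected by a few outliers.
    center = (median(p[0] for p in points), median(p[1] for p in points))
    distances = [math.dist(p, center) for p in points]
    typical = median(distances)
    return [p for p, d in zip(points, distances) if d > factor * typical]


print(find_anomalies(activities))  # [(1.0, 1.2), (0.8, 1.0)]
```

Nobody told the code which activities are "normal". It only uses the structure of the data itself, which is the defining trait of unsupervised learning.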
And now to wrap up: by presenting all these different examples, my goal for this post was, on the one hand, to give you a feeling for and an understanding of what can be done with machine learning. But maybe even more importantly, by showing such a breadth of examples, ranging from the supposedly trivial, like recommending which emojis to use, to possibly lifesaving applications like medical diagnosis, I also wanted to demonstrate that the use cases and possibilities of machine learning are really only limited by our own imaginations.
And, I think, this realization raises one important question, namely: Given that machine learning can be applied to solve such a huge variety of different problems, what are some possible implications of that? And this will be the topic of the next post which is also the last one of this series.