At CompassRed, our goal is to continuously learn and use cutting-edge techniques to provide value for our clients. When I found a project where Deep Learning was appropriate, I dove right in. Here are a few tips that I learned along the way.
For those unfamiliar, Deep Learning is a form of Machine Learning built on Neural Networks. The “Deep” part refers to complex Neural Network models that can have many, many layers.
While Deep Learning can seem complex and unattainable, existing implementations do all the math for you, so you just need to focus on correctly setting up the model and the data. However, before starting to use Deep Learning, it is important to ask yourself whether it is an appropriate analysis for the question you are trying to answer.
1. Do you need Deep Learning?
It is very important to use a model that is appropriate for both the data you are using and the question you are trying to answer. In many cases, you will be fine using classical methods such as regression. Some questions to ask yourself when considering Deep Learning are the following:
How complex is your problem?
Do you have many features?
How much time do you have to implement Deep Learning?
Which is more important: a high degree of interpretability or a high degree of accuracy?
Is this a one-time analysis or will you be incorporating additional data later on?
Let’s break down the above questions and what your answers to them could mean.
Deep Learning is typically used to answer hard and complex problems. For example, you could train a model to determine the quantity to manufacture across a series of various products based on demand data. While predicting how much to manufacture for a single product can be considered a simple problem, the complexity in the above example comes from predicting a large number of different products. In this case, Deep Learning is especially helpful when the products share attributes, since the model is able to consider interactions between the products.
Another example of a complex problem is fraud detection in a financial setting: classifying customers or transactions as fraudulent. This is a difficult problem that many companies have faced for years, and one that can be modeled using Deep Learning techniques.
That doesn’t mean you can’t use Deep Learning for a simple problem, but it is often faster to implement and analyze a regression than to set up a Deep Learning model. If you only have one feature/variable, a Deep Learning model won’t be able to use what it’s good at, which is finding interactions between features to predict the target. With a single feature, plain regression will probably give you similar results.
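To make the single-feature case concrete, here is a minimal sketch of what a regression baseline might look like. The data is synthetic and purely illustrative; the point is that with one feature and a roughly linear relationship, a one-line classical model already does well.

```python
# A minimal baseline: with a single feature, ordinary linear regression
# often performs comparably to a deep model. The data here is synthetic
# and illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))            # one feature
y = 3.0 * X[:, 0] + rng.normal(0, 0.5, 200)      # roughly linear target

baseline = LinearRegression().fit(X, y)
print(f"R^2 of the simple baseline: {baseline.score(X, y):.3f}")
```

If a baseline like this already explains most of the variance, a deep model has little room to improve on it.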
Deep Learning takes time to implement. Even though existing libraries do the math for you, setting up the data, designing the model architecture, and tuning hyperparameters cannot be done in a day. So, if you are in a rush to get results, this may not be the right time to try Deep Learning.
It is also important to determine whether interpreting results matters in your project, or whether you only care about accuracy. Deep Learning models can often give great accuracy, but at the cost of being a “black box”. With some effort you may be able to decode the weights of the model, but in general, the simpler the technique, the easier it is to extract coefficient weights and actually quantify the relationships. If you need to be able to say exactly what is going on inside your model, Deep Learning might not be appropriate.
Finally, it can be useful to ask yourself whether the analysis will be continuously used and improved upon, or conducted just once. Deep Learning methods are able to adapt to new data, so if this is something that will be going into production, it makes sense to go with Deep Learning; otherwise, a simpler method could suffice.
2. Inspect and clean your data: Garbage In, Garbage Out
Once you have determined that Deep Learning is the right approach for your problem, you will begin the modeling process. There is a lot of hype surrounding Deep Learning and AI as being “magic”. Unfortunately, there’s no such thing as magic, and you cannot simply put anything into the model and get good results. You still need to remember the basics of any data science project: inspect your data and figure out what you are actually trying to model and predict. Inspecting your data will also give you an idea of how much data cleaning you still need to do. By doing this step before even starting to think about your model architecture, you save yourself time by addressing problems before they occur.
For example, if inspecting the data shows you that there was a problem in the data collection process and there are rogue values in the data, the model will most likely give you noise as output, and by inspecting the data beforehand you can cross it out as a potential point of error.
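A quick inspection pass like the one below can surface rogue values before they reach the model. The column names, the sentinel value, and the data itself are made up for illustration; the pattern of checking summary statistics, missing values, and implausible ranges is the point.

```python
# Quick inspection pass: summary statistics and missing-value counts
# surface rogue values (e.g., impossible negatives, sentinel codes)
# before they reach the model. This DataFrame is illustrative only.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "demand": [120, 95, -999, 130, np.nan, 110],   # -999 is a rogue sentinel
    "price":  [9.99, 10.49, 9.99, 11.00, 10.25, 9.75],
})

print(df.describe())       # a min of -999 immediately stands out
print(df.isna().sum())     # count missing values per column

# Flag values outside a plausible range for follow-up
rogue = df[df["demand"] < 0]
print(rogue)
```

A handful of lines like these, run before any modeling, lets you rule out data collection problems as a source of noisy output later.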
Something specific to Machine Learning is that your input must be numeric (see one-hot encoding). If your features are categorical, part of your data cleaning process must involve transforming them to numeric. If you try to feed a categorical variable into an ML model without transforming it, you may still get results, but they will be noise.
Speaking of numeric input, Machine Learning models are very picky about the scale of the values, so you will also need to normalize your data. There are different ways of normalizing data, but a good starting point is to squeeze your data between 0 and 1. Make sure to normalize each feature separately, and save the attributes of each feature so that you can later transform the output back into the original scale (e.g., if you are transforming your data into a value between 0 and 1, make sure to retain the max and min of your original data).
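The min-max scaling described above can be sketched with plain NumPy. The values are illustrative; the key detail is that each feature is scaled with its own min and max, and those are retained so the transform can be inverted later.

```python
# Min-max scaling per feature, retaining each feature's min and max so
# model output can be mapped back to the original scale. The data is
# illustrative only.
import numpy as np

X = np.array([[10.0, 200.0],
              [20.0, 400.0],
              [30.0, 600.0]])

col_min = X.min(axis=0)   # save these ...
col_max = X.max(axis=0)   # ... to invert the transform later

X_scaled = (X - col_min) / (col_max - col_min)   # each feature in [0, 1]

# Inverse transform, e.g. to read predictions on the original scale
X_restored = X_scaled * (col_max - col_min) + col_min
```

scikit-learn's `MinMaxScaler` wraps this same bookkeeping, including the inverse transform, if you prefer not to do it by hand.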
3. You will fail at first. Iterate, iterate, and iterate some more.
Even after taking the above tips about data inspection, cleaning, and transformation, your model will still likely fail during the first iteration. Don’t get discouraged by this. Because there are so many moving parts in Deep Learning, it should not be surprising that the first iteration fails to give you the results you were searching for.
Instead of quitting in frustration, take a step back and look at what you have done so far. Take another look at the data. Visualizing your data in a different way (or visualizing it in the first place, if you have not already!) is a good way to quickly see if something is going wrong.
In general, there are two categories of things that may be wrong: something is wrong with your data, or something is wrong with your model. One useful way I have found to check whether the model architecture is at fault is to generate dummy data that represents an easy problem to solve. For example, if you are trying to model a sequence, generate a very easy-to-predict sequence, such as a series of small consecutive numbers as input and the next value in the sequence as the output (e.g., inputting 1, 2, 3, 4 and outputting 5; remember to normalize). If the same model architecture you have been using works for this simple example, then you know your architecture is doing something right. If you are still getting junk results, then fixing the way your model is set up should take priority. Of course, it might just be that your model needs more layers/nodes/features to capture some relationship, but at the very least the simple-data example gives you a sense of whether you are heading in the right direction.
Deep Learning is an exciting set of techniques that can be applied to an array of problems. While I did give quite a few caveats in the beginning about whether or not you should use it, if you just want to learn more about it, then I would suggest doing what I did and diving right in!