Where to start with Machine Learning
* * *
What is Machine Learning?
It's not just stats
Just like biology isn't chemistry, machine learning is not just statistics.
It's not a solution to everything
You don't need terabytes of data, but you shouldn't implement an ML solution for everything.
It is the science of interpreting data
A collection of algorithms that learn how to interpret data from examples
No extensive stats knowledge needed
Unless you want to build your own ML algorithm
Machine Learning techniques
Learning from example
Data usually takes the following form
Example: a car has these features
Which features are more important for distinguishing between cars? In other words, which features contain the most information?
PCA (principal component analysis): a machine learning technique that takes all of the features and constructs a new, smaller set of features from them. This reduced set of data makes it easier to classify objects.
A common misconception is that PCA selects the best features. It builds new features by combining the original ones.
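To make the point about reconstructed features concrete, here is a minimal sketch in plain Python. It assumes a made-up two-feature dataset and uses power iteration to find the first principal component; the data values and the function name are illustrative, not part of the original slides. The new feature is a weighted mix of both original features rather than a selection of one of them.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return the unit direction of greatest variance and the centred rows."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_y = sum(r[1] for r in rows) / n
    centred = [(r[0] - mean_x, r[1] - mean_y) for r in rows]

    # Entries of the 2x2 covariance matrix of the centred data.
    cxx = sum(x * x for x, _ in centred) / (n - 1)
    cxy = sum(x * y for x, y in centred) / (n - 1)
    cyy = sum(y * y for _, y in centred) / (n - 1)

    # Power iteration: multiply by the covariance matrix, then renormalise.
    vx, vy = 1.0, 0.0
    for _ in range(iterations):
        wx = cxx * vx + cxy * vy
        wy = cxy * vx + cyy * vy
        norm = math.hypot(wx, wy)
        vx, vy = wx / norm, wy / norm
    return (vx, vy), centred

# Hypothetical data: (engine size in litres, top speed in units of 100 km/h).
cars = [(1.0, 1.5), (1.6, 1.8), (2.0, 2.1), (3.0, 2.4), (4.0, 2.9)]
direction, centred = first_principal_component(cars)

# Each car is now described by one number instead of two.
reduced = [x * direction[0] + y * direction[1] for x, y in centred]
print(direction)
print(reduced)
```

Both entries of `direction` are non-zero here, which shows the new feature blending engine size and top speed instead of picking one.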
Manifold learning: non-linear dimensionality reduction. Like PCA, but finds features that correlate non-linearly.
Lots of algorithms to try, but they are slower, and the results are sometimes worse than PCA.
Naive Bayes: a classification technique that assumes all features are independent of each other. Suppose my dataset had 5 male smokers, 3 female smokers, 2 male nonsmokers and 4 female nonsmokers. Given that my new datapoint is a smoker, how likely is it to be male?
Basic but fast. Often outperformed by Random Forests.
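The smoker question can be worked through with Bayes' rule. The sketch below is only an illustration, with hypothetical function names. With a single feature there is no independence assumption to exercise, so it reduces to plain conditional probability on the counts given above.

```python
from fractions import Fraction

# Counts from the example: (sex, smokes) -> number of people.
counts = {
    ("male", True): 5,
    ("female", True): 3,
    ("male", False): 2,
    ("female", False): 4,
}
total = sum(counts.values())

def class_size(sex):
    return sum(n for (s, _), n in counts.items() if s == sex)

def prior(sex):
    """P(sex)"""
    return Fraction(class_size(sex), total)

def likelihood(smokes, sex):
    """P(smokes | sex)"""
    return Fraction(counts[(sex, smokes)], class_size(sex))

def posterior(sex, smokes):
    """P(sex | smokes) via Bayes' rule."""
    evidence = sum(likelihood(smokes, s) * prior(s) for s in ("male", "female"))
    return likelihood(smokes, sex) * prior(sex) / evidence

p_male = posterior("male", True)
print(p_male)  # 5/8
```

Of the 8 smokers in the data, 5 are male, so the classifier would label a new smoker as male with probability 5/8.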
Neural network: maps a set of features to a particular output. Given our car features, is our car a "fast" type? Is it a valuable car? It acts as a set of mapping functions that all work together.
Needs training data to find the weights of each mapping function (neuron)
Deep Learning: Neural Network with lots of layers (and complicated wiring)
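As a rough sketch of what "finding the weights" means, the snippet below trains a single neuron (not a full network) with gradient descent. The car features, labels, learning rate and epoch count are all made-up assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """One neuron: weighted sum of the features squashed to the range 0..1."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def train(samples, labels, learning_rate=0.5, epochs=2000):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            # Gradient of the log loss with respect to the weighted sum.
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical rescaled car features: (horsepower, weight). Label 1 means "fast".
samples = [(0.9, 0.3), (0.8, 0.4), (0.2, 0.7), (0.3, 0.9)]
labels = [1, 1, 0, 0]
weights, bias = train(samples, labels)

print(predict(weights, bias, (0.85, 0.35)))  # powerful, light car
print(predict(weights, bias, (0.25, 0.80)))  # weak, heavy car
```

A real network stacks many such neurons in layers, but the training idea is the same: adjust the weights to reduce the error on the examples.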
Many more techniques...
Usually a combination of techniques is used.
How do computers do this?
These ML techniques are mostly minimisation problems, and computers (particularly GPUs) are very good at finding minima.
Well if you really must know...
Minimisation problems can be turned into root-finding problems.
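A small sketch of that idea: the minimum of a smooth function sits where its derivative is zero, so Newton-Raphson can be applied to the derivative. The example function f(x) = x^2 + e^(-x) and the helper name are assumptions chosen for illustration.

```python
import math

def newton_raphson(g, g_prime, x0, tolerance=1e-12, max_steps=50):
    """Find a root of g, starting from x0, using Newton-Raphson steps."""
    x = x0
    for _ in range(max_steps):
        step = g(x) / g_prime(x)
        x -= step
        if abs(step) < tolerance:
            return x
    raise RuntimeError("Newton-Raphson did not converge")

# Function to minimise: f(x) = x^2 + e^(-x)
def f(x):
    return x * x + math.exp(-x)

# Its derivative; the minimum of f is where this is zero.
def f_prime(x):
    return 2 * x - math.exp(-x)

# Second derivative, needed for the Newton step on f_prime.
def f_second(x):
    return 2 + math.exp(-x)

x_min = newton_raphson(f_prime, f_second, x0=0.0)
print(x_min, f(x_min))
```

Because the second derivative of this f is always positive, the root of the derivative is the unique minimum.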
Here's the proof that the Newton-Raphson method will converge
PCA & Manifold learning
Minimises loss of information
Neural Networks
Find the weights of the neurons such that the loss function is minimised
Outlier detection Algorithms
Minimise covariance determinant
Minimise sum of squares or distance
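One familiar case of minimising a sum of squares is fitting a straight line to data. The sketch below uses the closed-form least-squares solution on made-up points; the data and function names are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimising the sum of squared vertical errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_of_squares(xs, ys, slope, intercept):
    """The quantity being minimised."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical measurements lying roughly on a line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
print(sum_of_squares(xs, ys, slope, intercept))
```

Nudging the slope or intercept away from the fitted values can only increase the sum of squares.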
In 2018 Nvidia found a way of halving memory requirements while speeding up arithmetic. This is done by performing the calculations at half precision, then using a stored full-precision copy of the weights and a scaled loss to recover full-precision results.
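The effect can be imitated on a CPU with Python's standard library, which can round a number to IEEE 754 half precision. This is only an illustrative sketch: it uses no GPU or Nvidia API, the gradient, scale factor and weight values are made up, and Python's ordinary float stands in for "full precision".

```python
import struct

def to_half(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

tiny_gradient = 1e-8   # hypothetical gradient, too small for half precision
loss_scale = 1024.0    # hypothetical loss-scaling factor

# Stored directly in half precision, the gradient underflows to zero.
lost = to_half(tiny_gradient)

# Scaled up before rounding, then scaled back down in full precision.
kept = to_half(tiny_gradient * loss_scale) / loss_scale

# A small weight update vanishes if the weight itself is held in half precision.
weight, update = 1.0, 1e-4
half_weight = to_half(weight + update)   # update is rounded away
master_weight = weight + update          # full-precision copy keeps it

print(lost, kept)
print(half_weight, master_weight)
```

The first pair shows why the loss is scaled, and the second shows why a full-precision copy of the weights is kept alongside the half-precision calculations.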
New Nvidia Tensor Cores
Use mixed precision training technique
3X Faster Deep Learning
Than previous architecture
~50% Less memory required
672GB/s Memory Bandwidth
2x faster than previous Titans
Where to start?
Machine Learning Made Easy
Make smarter solutions
Enhance the way your tools solve problems
No complex maths needed
No need to be scared
Plenty of tools
Most are free and open-sourced