# Posts

Following on from my previous post surveying some of the concepts underpinning statistics and probability, I thought it might be interesting to apply some Monte Carlo simulations to roulette as a means of consolidating some of the topics covered.
Suppose we have a friend named Adam who wants to visit a casino in either Las Vegas or Monte Carlo this weekend. He intends to gamble but is equipped with nothing other than a burning hole in his wallet and a keen desire ‘to beat the house’.
Today I will be using R to create Monte Carlo simulations as a means of exploring some of the ideas and concepts that underpin probability and statistics. This will provide us with an opportunity to brush up on these concepts as well as learn some of the basics of R. So, without further ado, let’s dive into the Law of Large Numbers and the Gambler’s Fallacy.
Law of Large Numbers and the Gambler’s Fallacy
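As a rough illustration of both ideas, here is a minimal sketch in Python (the post itself works in R). It assumes a European single-zero wheel with 37 pockets, 18 of them red, and a bet on red; the function names `spin_reds`, `red_frequency` and `red_after_streak` are made up for this example. The share of reds should settle near 18/37 as the number of spins grows, and it should stay near 18/37 even for spins that directly follow a run of non-reds, which is what the Gambler's Fallacy gets wrong.

```python
import random

RED_POCKETS = 18  # red pockets on the wheel (assumption: standard layout)


def spin_reds(n_spins, pockets=37, seed=0):
    """Simulate n_spins and return a list of booleans (True = red).

    pockets=37 assumes a European single-zero wheel; use 38 for an
    American double-zero wheel.
    """
    rng = random.Random(seed)
    # Pockets 0..RED_POCKETS-1 stand in for the red numbers.
    return [rng.randrange(pockets) < RED_POCKETS for _ in range(n_spins)]


def red_frequency(spins):
    """Share of spins that landed on red."""
    return sum(spins) / len(spins)


def red_after_streak(spins, streak=5):
    """Share of reds among spins that directly follow `streak` non-reds in a row."""
    follow = [
        spins[i]
        for i in range(streak, len(spins))
        if not any(spins[i - streak:i])
    ]
    return sum(follow) / len(follow) if follow else float("nan")


if __name__ == "__main__":
    # Law of Large Numbers: the observed share approaches 18/37 as n grows.
    for n in (100, 10_000, 100_000):
        print(n, round(red_frequency(spin_reds(n)), 4))
    # Gambler's Fallacy: a losing streak does not make red "due".
    print("after 5 non-reds:", round(red_after_streak(spin_reds(100_000)), 4))
```

Using a fixed seed keeps the runs repeatable; changing it gives a different sequence of spins with the same long-run behaviour.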
Data cleaning is messy. It is also perpetual. If you handle raw data, chances are that it requires cleaning. You often hear that data scientists or data analysts spend up to 80% of their time cleaning data. However, for a beginner, it isn’t always clear how to go about cleaning data. What is clean? Here follow some techniques and concepts that I have learnt on the matter so far.
Today I will be looking at K-Means, one of the most common unsupervised machine learning algorithms for clustering problems. Clustering refers to a series of methods for finding subgroups or clusters within datasets. It is in this sense that clustering can be thought of as an unsupervised learning method, in that we are trying to discover structure that may or may not be previously known.
The K in K-Means refers to the number of clusters we wish to partition the data into.
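To make the role of K concrete, here is a minimal sketch in Python of the usual iterative approach (often called Lloyd's algorithm). It assumes Euclidean distance and points given as tuples, and it is not necessarily the implementation used in the post; the names `kmeans`, `assign` and `sq_dist` are made up for this example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, n_iter=100, seed=0):
    """Partition `points` into k clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the centroid to the mean of its assigned points.
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(data, k=2)
    print(centroids)
    print(labels)
```

Here K is passed as `k=2`, so the six points are split into two groups; the result can depend on the random starting centroids, which is why the seed is fixed.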
Upon learning about the Trump Twitter Archive, I immediately recognised the opportunity to subject the President’s tweets to some analysis. Without doubt President Trump, before many other politicians, understood the power of Twitter as a means of disseminating his message. His message has many characteristic features. Perhaps this is not surprising given the President’s former career as a reality television star. One phrase from the election campaign that stood out to me, an uninformed observer from outside the US, was ‘Crooked Hillary’.
Dom Atkinson
