Next, it’s time for R to shine. At this point it is worth noting that, as far as I can tell from my professional background, R is still the way to go, and this will likely remain the case. Why? R is well established, extensive in scope (machine learning, linear algebra, ggplot, etc.), can be used for simple scripts, and is already in place at many companies such as Facebook, Google, Twitter, Microsoft and Uber, just to name the best known. Impressive, right?
Fun fact, IBM joined the R Consortium Group (remember I mentioned IBM as a competitor with SPSS earlier?)!
In my experience, R is taught at many universities, too (I had the pleasure twice). What can I say, R is great!
R syntax is easy to read and also quite fun to learn. I appreciate its similarities to other languages and its accurate error messages when it comes to debugging. As R has been supported since its early years, for development you will probably want to use Atom or (beloved) RStudio. One thing worth mentioning: in my experience, R is far less forgiving when it comes to inaccurate code, and forgetting a comma might cause an hour of head scratching. The learning curve is quite steep.
The results, just to compare with the others. Note that in R, like Julia but unlike Python, indexing starts at 1.
53 53 0.5291503
77 77 0.5291503
51 51 0.5385165
59 59 0.6480741
87 87 0.6633250
Prediction: 1
At last, let’s have a look at Python 🐍.
What I love about Python are its list comprehensions, powerful iteration methods and neat data structures. Although it is great fun to have such a variety of tools, it might make it quite a bit harder to get started and to understand. IDEs for Python are numerous; I would go for PyCharm (which is free for standard purposes, that is, not for notebooks and web development) or Atom (for the third time).
Our kNN algorithm is quite compact (maybe that is because Python is the language I have been learning the longest ...). Still, sorting using a lambda and list comprehensions may not be everyone’s best friend. I have not used a DataFrame in this case, as I did not want to import 🐼 here.
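To give an idea of what "compact" means here, the following is a minimal sketch of a kNN built from a list comprehension and a lambda-based sort. It is not necessarily the exact code behind the results; the function name `knn`, its parameters and the toy data are assumptions made purely for illustration.

```python
import math
from collections import Counter

def knn(train, labels, query, k=5):
    """Return the k nearest [index, distance] pairs and the majority label."""
    # Euclidean distance from the query to every training row (list comprehension)
    distances = [[i, math.dist(row, query)] for i, row in enumerate(train)]
    # Sort by the distance component, using a lambda as the sort key
    neighbours = sorted(distances, key=lambda pair: pair[1])[:k]
    # Majority vote among the labels of the k nearest rows
    votes = Counter(labels[i] for i, _ in neighbours)
    return neighbours, votes.most_common(1)[0][0]

# Hypothetical toy data, NOT the data set used in this article
train = [[1.0, 1.0], [1.0, 2.0], [5.0, 5.0], [6.0, 5.0], [2.0, 1.0]]
labels = [0, 0, 1, 1, 0]

neighbours, prediction = knn(train, labels, query=[1.5, 1.5], k=3)
print(neighbours)
print("Prediction:", prediction)
```

The plain list of `[index, distance]` pairs keeps everything in the standard library (`math.dist` needs Python 3.8 or newer), which is the trade-off for not importing a DataFrame library.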
The results — just to compare with the others:
[[52, 0.5291502622129179], [76, 0.5291502622129183], [50, 0.5385164807134504], [58, 0.6480740698407865], [86, 0.6633249580710798], [77, 0.7615773105863908]]
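At first glance the indices differ from the R output, but that is only the 0-based versus 1-based indexing mentioned earlier. A quick sketch to check this, using the first five indices from the two result listings (the variable names are mine):

```python
# Neighbour indices as printed by the Python version (0-based)
python_indices = [52, 76, 50, 58, 86]
# Row numbers as printed by the R version (1-based)
r_rows = [53, 77, 51, 59, 87]

# Shifting the 0-based indices by one should give the 1-based row numbers
shifted = [i + 1 for i in python_indices]
print(shifted == r_rows)
```

So both implementations pick the same neighbours, just numbered differently.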
Julia: Honestly, I really enjoyed using Julia; however, it sometimes feels more complicated than necessary. In my example I arbitrarily used a dictionary (which R does not offer in the same form and which is more convenient in Python) as well as a DataFrame to manipulate data. On the other hand, working with most data structures is quite fun and indeed seems very fast. Julia might be a modern “one language fits all” language. This could indeed be the beginning of a love story!
R is just rock solid. It provides a huge number of packages, is very well established and is fun to code with. Due to its maturity, there is a lot of documentation and many examples out there. Going with R simply can’t be a mistake if you’re into data.
Python is great. It is not the best performer, but it is still fast, easy to read and learn, and comes with a variety of libraries. The main advantage, in my opinion, is that Python can be used for so many different purposes. Python is often so intuitive that you will wonder how easy things can be.
Comparing Python and Julia (the other general-purpose language), a shift towards Julia one day is not unlikely, especially for more than just scientific purposes, but time will tell.
As always, see you next time! Stay safe, stay at home.