Today, I am opening a new section on my blog, and this time it is about analytics. As you may know, I've been working in research on IIoT and analytics for the last years, but up to now my blog has only shown my hobbyist projects. I want to change the focus of my website a little bit and add something about data analytics, machine learning, Docker technology etc. Everything that I will publishing in this section is not new. There are many tutorials and great YouTube videos that explain these topics too, but I am going to focus on building an end-to-end data science project using some of the projects that I made using MicroPython and Android Things as sensors, using Docker and the Google Cloud Platform.
This post is a short guide for using Docker as an environment for data science. Docker allows you to almost say goodbye to OS troubles and welcome to portability!
|Fig. 1: Docker|
One of the most time consuming and annoying part that you have when you try to share your work with your co-worker is to install and setup all libraries and packages of your environment, ie. if you've developed some cool algorithm or pipeline to process data and you want that your co-worker does the evaluation process, he or she needs to install all the libraries that you've locally (or in an environment) installed. You can also add as a problem, that your co-worker needs an specific version of a library that is not compatible with your code because it is an old version. Or, he/she is working on Windows and you are on Linux and you cannot find the right libraries to install on Windows. Then, your new friend should be Docker! and as I said before, say goodbye to OS troubles and welcome to portability!