September 28th, 2020 – the end of a month, the beginning of a new challenge: Today I start the Data Science Bootcamp at Spiced Academy in Berlin. For the next 12 weeks, I’ll be learning about machine learning, NLP, data infrastructure, and software engineering, working on several small projects, and presenting my final big project on December 21st.
I feel it’s going to be an exciting (yet stressful) couple of months, so I decided to start a series of blog posts in which I share my weekly progress: topics I learn, projects I work on, people I meet, and the overall bootcamp experience! It’s a good way to keep myself accountable and track my development, with highs and lows. And most importantly, I would like this bootcamp diary to be helpful, inspiring, or motivational for my readers 🙂
Week 1 (28.09.-02.10.)
- Topic: Visual Data Analysis
- Lessons: bash, git, pandas, seaborn, descriptive stats
- Dataset: Gapminder
- Project: Create an animated scatter plot in Hans Rosling style.
- Code: GitHub
I started the first day of the bootcamp in the comfort of my home! Due to the COVID-19 restrictions, the other 11 students and I were split into two groups, alternating daily between online and on-site attendance. It’s not the best experience, but so far it has gone pretty well. The first day started at 9:30 sharp with a short welcome and information presentation from our project manager and teachers.
We then introduced ourselves, and I was surprised at how diverse our Stochastic Sage cohort was, including people with backgrounds in Physics, Math, Finance, and Social Sciences, from coding newbies to experts. Another surprising fact was the even split between Linux, Mac, and Windows users! I’m curious whether this will change by the end of the course (come to the Linux side)…
Soon afterwards, the actual coding (and bugs) started – right from the basics. One day we spent several hours writing a FizzBuzz function. On another day, two hours went into just cloning two repos and creating a branch. Another class covered how to use the command line. And there were several sessions on setting up Python and Jupyter Notebooks.
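For anyone who hasn’t met FizzBuzz yet: in its classic form, you count upwards and replace multiples of 3 with "Fizz", multiples of 5 with "Buzz", and multiples of both with "FizzBuzz". The exact rules we used in class may have differed, so take this as a minimal sketch of the classic version rather than the course solution:

```python
def fizzbuzz(n):
    """Return the FizzBuzz word for n, or n itself as a string."""
    # Check the combined case first, otherwise 15 would be caught by "Fizz".
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


# Print the sequence for the numbers 1 to 15.
for i in range(1, 16):
    print(fizzbuzz(i))
```

Returning a string instead of printing inside the function keeps it easy to test, which is probably half the point of the exercise.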
Then we dived into descriptive statistics and data visualization with pandas and seaborn. Our challenge for the week was to create an animated scatterplot with matplotlib/seaborn and imageio, depicting the relationship between life expectancy and fertility rate of the world’s countries from 1960 to 2015, combining data from several Gapminder datasets. Here’s the result:
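For anyone curious how such an animation comes together, here is a minimal sketch of the idea. It is not the project code: it swaps imageio for matplotlib’s own `FuncAnimation` and `PillowWriter` to keep the dependencies small, and it uses made-up random numbers instead of the Gapminder data. The function names `make_synthetic_data` and `animate_scatter` are hypothetical helpers invented for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation, PillowWriter


def make_synthetic_data(years, n_countries=20, seed=0):
    """Return {year: (fertility, life_expectancy)} filled with MADE-UP values."""
    rng = np.random.default_rng(seed)
    fertility = rng.uniform(4.0, 7.5, n_countries)
    life_exp = rng.uniform(35.0, 65.0, n_countries)
    data = {}
    for year in years:
        data[year] = (fertility.copy(), life_exp.copy())
        # Illustrative drift only: fewer children, longer lives over time.
        fertility = np.clip(fertility - rng.uniform(0.0, 0.15, n_countries), 1.0, None)
        life_exp = np.clip(life_exp + rng.uniform(0.0, 0.8, n_countries), None, 85.0)
    return data


def animate_scatter(data, out_path, fps=5):
    """Write a GIF with one scatter-plot frame per year and return its path."""
    years = sorted(data)
    fig, ax = plt.subplots(figsize=(6, 4))
    # Fixed axes so the points move, not the frame.
    ax.set_xlim(0, 9)
    ax.set_ylim(30, 90)
    ax.set_xlabel("Fertility rate (children per woman)")
    ax.set_ylabel("Life expectancy (years)")
    x0, y0 = data[years[0]]
    points = ax.scatter(x0, y0, alpha=0.7)
    title = ax.set_title(str(years[0]))

    def update(frame_index):
        # Move the existing points instead of redrawing the whole figure.
        year = years[frame_index]
        x, y = data[year]
        points.set_offsets(np.column_stack([x, y]))
        title.set_text(str(year))
        return points, title

    anim = FuncAnimation(fig, update, frames=len(years), blit=False)
    anim.save(out_path, writer=PillowWriter(fps=fps))
    plt.close(fig)
    return out_path
```

Calling `animate_scatter(make_synthetic_data(range(1960, 2016)), "scatter.gif")` writes a GIF with one frame per year. With real data, the dictionary would be built from the merged Gapminder tables instead of random numbers.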
Friday Lightning Talk
Each week, we get a main dataset and several tasks to apply the concepts learned throughout the week. On Fridays, we each give a 5-minute talk about something from our weekly challenge project: a particular finding, a chart, a new library, (un)solved bugs, or anything else that is worth sharing and helpful for others too. This Friday, I chose to play hacker and talk about five new terminal commands for checking the installed Python libraries, their versions and dependencies.
| Purpose | Command |
|---|---|
| List all installed libraries | pip list |
| List only outdated libraries | pip list -o (or --outdated) |
| List only the latest / up-to-date libraries | pip list -u (or --uptodate) |
| Show all information about a library | pip show <package-name> |
| List all libraries installed in a specific environment | conda list -n <environment-name> |
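Similar information is also available from inside Python through the standard library’s `importlib.metadata` module (Python 3.8+). The sketch below is only a rough counterpart to `pip list` and `pip show`, not a replacement for them; `list_installed` and `show` are hypothetical helper names chosen for this example.

```python
from importlib import metadata


def list_installed():
    """Return sorted (name, version) pairs, similar in spirit to `pip list`."""
    pairs = {
        (dist.metadata.get("Name"), dist.version)
        for dist in metadata.distributions()
        if dist.metadata.get("Name")  # skip entries without a usable name
    }
    return sorted(pairs, key=lambda pair: pair[0].lower())


def show(package_name):
    """Return a few of the fields that `pip show <package-name>` prints."""
    try:
        info = metadata.metadata(package_name)
    except metadata.PackageNotFoundError:
        return None  # the package is not installed in this environment
    return {
        "name": info.get("Name"),
        "version": info.get("Version"),
        "summary": info.get("Summary"),
        # requires() returns None when no dependencies are declared.
        "requires": metadata.requires(package_name) or [],
    }


# Print the first few installed libraries in a pip-like format.
for name, version in list_installed()[:5]:
    print(f"{name}=={version}")
```

This only sees the environment of the interpreter that runs it, so inside an activated conda environment it lists that environment’s packages.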