Week 6 (02.11.-06.11.)
Topic: Databases
Lessons: Postgres (from Python), Data Modelling, (Advanced) SQL Queries, Joins & Foreign Keys, (Cloud) Databases, Cloud Computing
Project: Build a dashboard on top of a Postgres database that runs in the AWS cloud
Dataset: Northwind
Code: GitHub
This was another week packed with new information and experiences! In only five days, I managed to... Continue Reading →
Week 5/12 #DataScienceBootcamp
Week 5 (26.10.-30.10.)
Topic: Time Series Analysis
Lessons: Decomposing Time Series, Naive Forecasts, Evaluating Forecasts, Linear Autoregression, Namespaces, Plotting on Maps, Generators, ARIMA Models, Python Modules
Project: Forecast the temperature in Berlin Mitte and plot climate data on a map
Dataset: ecad.eu
Code: GitHub
I found this week's project quite challenging, because I haven't worked with time series and... Continue Reading →
Week 4/12 #DataScienceBootcamp
Week 4 (19.10.-23.10.)
Topic: Text Classification
Lessons: Web Scraping, Regular Expressions, HTML Parsing, Language Models, Class Imbalance, Bag-of-Words, Naive Bayes, Python Functions, Command Line Interface
Dataset: Self web-scraped song lyrics
Project: Predict the artist from song lyrics
Code: GitHub
I was super excited about this week, because it was about language models and first steps into NLP, my favorite... Continue Reading →
Week 3/12 #DataScienceBootcamp
Week 3 (12.10.-16.10.)
Topic: Machine Learning - Regression
Lessons: Linear Regression, Feature Expansion, Gradient Descent, Regularization, Feature Selection, Hyperparameter Optimization, Gradient Boosting, Debugging
Dataset: Capital Bikeshare
Project: Predict demand for bicycle rentals at any given hour, based on time- and weather-related features (and submit the predictions to the Kaggle competition)
Code: GitHub
I really enjoyed this week's project for... Continue Reading →
Week 2/12 #DataScienceBootcamp
Week 2 (05.10.-09.10.)
Topic: Machine Learning - Classification
Lessons: ML workflow, Logistic Regression, Feature Engineering, Decision Trees, Random Forests, Evaluating Classifiers, Cross-Validation, Loss Function
Dataset: Titanic
Project: Predict passenger survival on the Titanic and submit the model to the Kaggle competition
In the second week of the bootcamp, we started with Machine Learning (ML). If you think about... Continue Reading →
Week 1/12 #DataScienceBootcamp
September 28th, 2020 - the end of a month, the beginning of a new challenge: Today I start the Data Science Bootcamp at Spiced Academy in Berlin. In the next 12 weeks, I'll learn machine learning, data infrastructure, and software engineering, work on different small projects, and present my final big project on December 21st.... Continue Reading →
5 eye-opening books about ethics in Artificial Intelligence
Artificial Intelligence is undoubtedly an exciting field. We are making machines think like humans, mimic our actions, and solve problems more efficiently than us, and this at an unprecedented level and speed. But beyond the hype, I can't help but wonder: does more efficiently mean better? Are some of our actions worthy of being mimicked?... Continue Reading →
4 tips for a good data science portfolio
It is a truth universally acknowledged that a data scientist in possession of a good portfolio must be in want of a job. A curated selection of your projects is the best way to showcase your work, interests, and thinking to potential employers. From my experience and discussions with colleagues, I found four aspects that... Continue Reading →
Book review: “Invisible Women” by Caroline Criado Perez
"Most of recorded human history is one big data gap." In her book Invisible Women, Caroline Criado Perez illustrates with dozens of real-world examples and scientific studies how women are being discriminated against in all areas of everyday life: from housing and public transport planning, to car and smartphone design and, maybe most obviously, healthcare... Continue Reading →
WAD Live Week – on the pitfalls of Deep Learning
We Are Developers is the world’s largest conference for software development. It’s a great opportunity for software engineers to connect with tech companies and learn about interesting advances in technology. This year, it was supposed to take place in Berlin, but due to COVID-19, it was moved online for a Live Week (25-29 May) of... Continue Reading →