<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Heike Terhechte — Engineering Leadership, Software, Data &amp; AI</title><description>Heike Terhechte is Director of Engineering at MOIA GmbH and writes about engineering leadership, software, data, and AI.</description><link>https://datadojo.dev/</link><item><title>Keeping up with AI Tools</title><link>https://datadojo.dev/2026/06/15/keeping-up-with-ai-tools/</link><guid isPermaLink="true">https://datadojo.dev/2026/06/15/keeping-up-with-ai-tools/</guid><description>As a developer, I used to know how to keep up with technology. AI broke that.</description><pubDate>Mon, 15 Jun 2026 00:00:00 GMT</pubDate><category>AI</category><category>AI agents</category><category>Engineering Management</category><category>Developer Tools</category><category>Workflows</category></item><item><title>Prompts Create Pictures. Systems Create Books.</title><link>https://datadojo.dev/2026/05/25/prompts-create-pictures-systems-create-books/</link><guid isPermaLink="true">https://datadojo.dev/2026/05/25/prompts-create-pictures-systems-create-books/</guid><description>What creating a personalized children&apos;s book taught me about visual consistency, shared state, reference material, and treating creative AI projects like software systems.</description><pubDate>Mon, 25 May 2026 00:00:00 GMT</pubDate><category>AI</category><category>AI agents</category><category>Image Generation</category><category>Codex</category><category>Creative Workflows</category></item><item><title>Psychological Safety Does Not Mean Work Should Feel Comfortable</title><link>https://datadojo.dev/2025/08/25/psychological-safety-does-not-mean-work-should-feel-comfortable/</link><guid isPermaLink="true">https://datadojo.dev/2025/08/25/psychological-safety-does-not-mean-work-should-feel-comfortable/</guid><description>Psychological safety makes difficult conversations possible; it does not remove disagreement, accountability, or discomfort from 
work.</description><pubDate>Mon, 25 Aug 2025 00:00:00 GMT</pubDate><category>Psychological Safety</category><category>Leadership</category><category>Trust</category><category>Engineering Management</category><category>Team Culture</category></item><item><title>What I Learned Redesigning My Chocolate Database Webapp with AI</title><link>https://datadojo.dev/2025/08/11/what-i-learned-redesigning-my-chocolate-database-webapp-with-ai/</link><guid isPermaLink="true">https://datadojo.dev/2025/08/11/what-i-learned-redesigning-my-chocolate-database-webapp-with-ai/</guid><description>In summer 2020 I had a small Python web app around one of my favorite datasets: 300 chocolates that my husband and I had tried over the years, with reviews, ratings, and a machine learning model that predicts how much we may like an unknown chocolate.
Technically, it was a SQLite database with a search interface that was a simple keyword match.
It worked. It was also really ugly. Because I am not a frontend developer, I had happily used Streamlit. That was a good choice for getting something working quickly, but it also meant the interface looked like what it was: a table with some hacky filters on the side.

And yes, I am a little embarrassed to show this screenshot to you.
With the help of aider around 2023 and later Codex, the app grew the usual side-project way: more features, such as additional information about each chocolate and an admin interface. However, the design was still terrible: emojis and lots of boxes.
I tried to fix it with AI design prompts. “Make this more modern.” “Improve the UX.” “Use better spacing.” I also tried design-focused coding assistants and design skills. The result was still an inconsistent design with lots of boxes. In my chocolate search result list, every field from the SQLite table got its own box.

The useful shift came when I stopped asking AI to decorate the old interface and started using Google Stitch to rethink what the database was for.</description><pubDate>Mon, 11 Aug 2025 00:00:00 GMT</pubDate><category>AI</category><category>Google Stitch</category><category>Playwright</category><category>FastAPI</category><category>UX</category></item><item><title>How to Create Effective Development Goals: Lessons Learned</title><link>https://datadojo.dev/2024/03/24/helping-people-reach-their-development-goals/</link><guid isPermaLink="true">https://datadojo.dev/2024/03/24/helping-people-reach-their-development-goals/</guid><description>What I learned about supporting people in choosing meaningful development goals and making steady progress toward them.</description><pubDate>Sun, 24 Mar 2024 00:00:00 GMT</pubDate><category>Engineering Management</category><category>People Development</category><category>Leadership</category><category>Career Growth</category></item><item><title>Mastering Differences and Pitfalls when Switching SQL Databases: PostgreSQL vs. MySQL vs. SQLITE vs. Hive vs. Presto (AWS Athena)</title><link>https://datadojo.dev/2022/12/09/mastering-differences-and-pitfalls-when-switching-sql-databases-postgresql-vs.-mysql-vs.-sqlite-vs.-hive-vs.-presto-aws-athena/</link><guid isPermaLink="true">https://datadojo.dev/2022/12/09/mastering-differences-and-pitfalls-when-switching-sql-databases-postgresql-vs.-mysql-vs.-sqlite-vs.-hive-vs.-presto-aws-athena/</guid><description>Transitioning to another SQL database? This blog post is for you. Shifting from one SQL dialect to another can be a journey full of surprises. While the basic syntax (SELECT FROM WHERE) is similar, there are important differences, that will make your queries slow, fast, fail or worse: fail silently!
In this blog post I’ll guide you through the intricate pathways of databases I have come across during my work as a data scientist: Postgres, MySQL, SQLite, Hive and Presto (AWS Athena). We’ll start with a brief introduction into the databases and some differences. Then we jump into three pitfalls you have to be aware of.</description><pubDate>Fri, 09 Dec 2022 00:00:00 GMT</pubDate><category>MySQL</category><category>PostgreSQL</category><category>Hive</category><category>Athena</category><category>Presto</category><category>SQL</category></item><item><title>From localhost to a web server -  How to host your Streamlit App on Heroku (for free)</title><link>https://datadojo.dev/2021/09/05/from-localhost-to-a-web-server-how-to-host-your-streamlit-app-on-heroku-for-free/</link><guid isPermaLink="true">https://datadojo.dev/2021/09/05/from-localhost-to-a-web-server-how-to-host-your-streamlit-app-on-heroku-for-free/</guid><description>You have built a great streamlit app. So far, you only ran it locally on your computer on localhost:8501. Now you would like to share your app with others, but wonder how. This blogpost introduces you to one option: Heroku. Heroku is a platform as a service that allows you to deploy your apps (not just streamlit apps, but also jvm apps, ruby apps etc.). This post will guide you through the deployment of a streamlit app on Heroku.</description><pubDate>Sun, 05 Sep 2021 00:00:00 GMT</pubDate><category>app</category><category>hosting</category><category>deployment</category></item><item><title>Biases in learning to rank models and three approaches to deal with them</title><link>https://datadojo.dev/2021/04/29/biases-in-learning-to-rank-models-and-three-approaches-to-deal-with-them/</link><guid isPermaLink="true">https://datadojo.dev/2021/04/29/biases-in-learning-to-rank-models-and-three-approaches-to-deal-with-them/</guid><description>Search engines rely on models, which rank the matching results for a given user query. 
These models optimize the order of items. They learn how to rank items in a result list, hence the name Learning-to-Rank (LTR) models.</description><pubDate>Thu, 29 Apr 2021 00:00:00 GMT</pubDate><category>ltr</category><category>bias</category></item><item><title>Avro and avro schemas - how they work and why they are useful</title><link>https://datadojo.dev/2021/02/23/avro-and-avro-schemas-how-they-work-and-why-they-are-useful/</link><guid isPermaLink="true">https://datadojo.dev/2021/02/23/avro-and-avro-schemas-how-they-work-and-why-they-are-useful/</guid><description>You have Kafka up and running as your message broker and you may wonder: In which format should I send my data around? Maybe the string format comes to mind. Why not just put all fields into a long string and separate them with a comma?</description><pubDate>Tue, 23 Feb 2021 00:00:00 GMT</pubDate><category>avro</category><category>kafka</category><category>message brokers</category></item><item><title>A Gentle Intro to the Basic Architecture of Message Brokers: RabbitMQ vs. Kafka</title><link>https://datadojo.dev/2021/01/26/a-gentle-intro-to-the-basic-architecture-of-message-brokers-rabbitmq-vs.-kafka/</link><guid isPermaLink="true">https://datadojo.dev/2021/01/26/a-gentle-intro-to-the-basic-architecture-of-message-brokers-rabbitmq-vs.-kafka/</guid><description>In this blogpost you will get a basic understanding of message brokers.
We will look at two very popular message brokers, Kafka and RabbitMQ, and learn how they handle messages.</description><pubDate>Tue, 26 Jan 2021 00:00:00 GMT</pubDate><category>kafka</category><category>message brokers</category><category>amqp</category><category>rabbitmq</category></item><item><title>Intro into APIs and how to access public REST APIs with `curl`</title><link>https://datadojo.dev/2020/11/26/intro-into-apis-and-how-to-access-public-rest-apis-with-curl/</link><guid isPermaLink="true">https://datadojo.dev/2020/11/26/intro-into-apis-and-how-to-access-public-rest-apis-with-curl/</guid><description>This post will teach you the intuition behind REST APIs and how you can use them to get interesting datasets for your data projects. First, we will look at the four components of a request. In the second part of this blogpost, we will go through one example and access the CoinGecko API via curl.</description><pubDate>Thu, 26 Nov 2020 00:00:00 GMT</pubDate><category>curl</category><category>api</category><category>rest</category></item><item><title>Pointwise, Pairwise and Listwise Learning to Rank Models - Three Approaches to Optimize Relative Ordering</title><link>https://datadojo.dev/2020/10/15/pointwise-pairswise-and-listwise-learning-to-rank-models-three-approaches-to-optimize-relative-ordering/</link><guid isPermaLink="true">https://datadojo.dev/2020/10/15/pointwise-pairswise-and-listwise-learning-to-rank-models-three-approaches-to-optimize-relative-ordering/</guid><description>In many scenarios, such as a Google search or a product recommendation in an online shop, we have tons of data and limited space to display it. We cannot show all the products of an online shop to the user as a possible next best offer. Neither would a user want to scroll through all the pages indexed by a search engine to find the most relevant page that matches their search keywords. The most relevant content should be on top.
Learning to rank (LTR) models are supervised machine learning models that attempt to optimize the order of items. This blogpost introduces three approaches to optimize ranks.</description><pubDate>Thu, 15 Oct 2020 00:00:00 GMT</pubDate><category>ltr</category><category>search</category></item><item><title>AI-Machine-Learning-Buzzword-Bingo</title><link>https://datadojo.dev/2020/09/10/ai-machine-learning-buzzword-bingo/</link><guid isPermaLink="true">https://datadojo.dev/2020/09/10/ai-machine-learning-buzzword-bingo/</guid><description>I was recently invited to join a panel discussion among developers to dispel the myth of the typical BS Buzzword Bingo around machine learning and AI. In this blog post, I will share some buzzwords we talked about with a little description and links. Ooops, I already used some buzzwords. So let’s start.
AI (Artificial Intelligence) is the magic potion to fix all problems of all companies and will make us unemployed in the future.</description><pubDate>Thu, 10 Sep 2020 00:00:00 GMT</pubDate><category>deep learning</category><category>nlp</category><category>machine learning</category></item><item><title>The Intuition of Word Embeddings: How you Teach A Computer to Understand Text</title><link>https://datadojo.dev/2020/08/31/the-intuition-of-word-embeddings-how-you-teach-a-computer-to-understand-text/</link><guid isPermaLink="true">https://datadojo.dev/2020/08/31/the-intuition-of-word-embeddings-how-you-teach-a-computer-to-understand-text/</guid><description>Humans intuitively understand the meaning of words: Which words are similar, opposites or related to each other? But our machine learning models do not have this intuition. Word embeddings are numeric vectors that represent text. These vectors are learned through neural networks. The objective when creating these embedding vectors is to capture as much “meaning” as possible: Related words should be closer together than unrelated words. Also, they should be able to preserve mathematical relationships between words such as</description><pubDate>Mon, 31 Aug 2020 00:00:00 GMT</pubDate><category>Word Embeddings</category><category>One-Hot-Encoding</category><category>Word2Vec</category><category>BERT</category></item><item><title>Jupyter Notebooks: Boost your productivity with Extensions and Magic Commands</title><link>https://datadojo.dev/2020/07/12/jupyter-notebooks-boost-your-productivity-with-extensions-and-magic-commands/</link><guid isPermaLink="true">https://datadojo.dev/2020/07/12/jupyter-notebooks-boost-your-productivity-with-extensions-and-magic-commands/</guid><description>In this blogpost I will share some tips for working with Jupyter Notebooks. Those tips greatly improved my productivity when working with Jupyter Notebooks and I wish someone had told me earlier.
The two main topics of this post are extensions and magic commands.</description><pubDate>Sun, 12 Jul 2020 00:00:00 GMT</pubDate><category>python</category><category>jupyter</category><category>extension</category><category>magic commands</category></item><item><title>Mastering ElasticSearch Queries If You Have Only Worked With SQL Before</title><link>https://datadojo.dev/2020/06/27/mastering-elasticsearch-queries-if-you-have-only-worked-with-sql-before/</link><guid isPermaLink="true">https://datadojo.dev/2020/06/27/mastering-elasticsearch-queries-if-you-have-only-worked-with-sql-before/</guid><description>Elasticsearch is often the storage engine of choice for storing and querying full text data. But writing an ElasticSearch query is pretty different compared to querying a relational database in SQL. In this blogpost, you will learn some basics you need to understand before working with ElasticSearch. In the second part, you learn how to write queries in ElasticSearch.</description><pubDate>Sat, 27 Jun 2020 00:00:00 GMT</pubDate><category>ElasticSearch</category><category>DSL</category><category>Fulltext Search</category><category>Inverted Index</category><category>Scoring</category></item><item><title>How the Inverted Index and Scoring Work in ElasticSearch</title><link>https://datadojo.dev/2020/06/24/how-the-inverted-index-and-scoring-work-in-elasticsearch/</link><guid isPermaLink="true">https://datadojo.dev/2020/06/24/how-the-inverted-index-and-scoring-work-in-elasticsearch/</guid><description>In ElasticSearch querying fulltext fields is among the least resource intensive tasks and your query results are ordered putting the most relevant results on top. 
But how does this work?</description><pubDate>Wed, 24 Jun 2020 00:00:00 GMT</pubDate><category>ElasticSearch</category><category>Fulltext Search</category><category>Inverted Index</category><category>Scoring</category><category>Elastic</category><category>Forward Indexing</category><category>Term Frequency</category><category>Document Frequency</category></item><item><title>Working with Complex Datatypes in Hive</title><link>https://datadojo.dev/2020/06/07/working-with-complex-datatypes-in-hive/</link><guid isPermaLink="true">https://datadojo.dev/2020/06/07/working-with-complex-datatypes-in-hive/</guid><description>The basic idea of complex datatypes is to store multiple values in a single column. So if you are working with a Hive database and you query a column, but then you notice “This value I need is trapped in a column among other values…” you just came across a complex a.k.a. nested datatype.
There are three types: arrays, maps and structs. First, you have to understand which types are present.</description><pubDate>Sun, 07 Jun 2020 00:00:00 GMT</pubDate><category>Hive</category><category>SQL</category><category>complex</category><category>struct</category><category>array</category><category>map</category></item><item><title>Plotting with Seaborn</title><link>https://datadojo.dev/2019/09/10/plotting-with-seaborn/</link><guid isPermaLink="true">https://datadojo.dev/2019/09/10/plotting-with-seaborn/</guid><description>Seaborn is a Python library for creating plots. It is based on matplotlib and provides a high-level interface for drawing statistical graphics.
Seaborn integrates nicely with pandas: It operates on DataFrames and arrays and does aggregations and semantic mapping automatically, which makes it a quick, convenient option for data visualization in your data projects. Once you understand the basic concepts, you can create plots really easily without using Stack Overflow too much.</description><pubDate>Tue, 10 Sep 2019 00:00:00 GMT</pubDate><category>Python</category><category>Plot</category><category>Seaborn</category></item><item><title>Mastering Data Preparation with Pandas: Subsetting, Filtering and Joining DataFrames</title><link>https://datadojo.dev/2019/08/18/mastering-data-preparation-with-pandas-subsetting-filtering-and-joining-dataframes/</link><guid isPermaLink="true">https://datadojo.dev/2019/08/18/mastering-data-preparation-with-pandas-subsetting-filtering-and-joining-dataframes/</guid><description>When I started working with pandas I noticed that there were so many ways to subset, filter and join data with pandas. But I was lacking a systematic overview. How do the different approaches differ and when to use which?
In this blogpost we’ll look at different ways for subsetting, filtering and combining DataFrames.
Subsetting Data: Selecting subsets of rows and columns by labels and positions.</description><pubDate>Sun, 18 Aug 2019 00:00:00 GMT</pubDate><category>python</category><category>data preparation</category><category>pandas</category><category>boolean mask</category></item><item><title>Everything You Need to Know to Use Git for Version Control</title><link>https://datadojo.dev/2018/12/17/everything-you-need-to-know-to-use-git-for-version-control/</link><guid isPermaLink="true">https://datadojo.dev/2018/12/17/everything-you-need-to-know-to-use-git-for-version-control/</guid><description>So many people have recommended Git as a version control system to me. I had a look at it, but I was pretty overwhelmed. Since I did not have a technical background, everything seemed so complex! Many tutorials had me copy and paste code without giving me a deeper understanding of what I was doing and why. This copy-pasting felt like success at first, but when I tried working with Git, I could not.</description><pubDate>Mon, 17 Dec 2018 00:00:00 GMT</pubDate></item><item><title>Automatically changing the R working directory on Mac OS to source file location</title><link>https://datadojo.dev/2018/12/07/changing-working-directories/</link><guid isPermaLink="true">https://datadojo.dev/2018/12/07/changing-working-directories/</guid><description>This post is about how to change your R working directory. You might be wondering:
Why would I want to do that? You need this as soon as your script interacts with folders on your computer. For example for imports or exports of data or figures. So probably almost always. Let’s say you have a script that creates plots and saves them in the folder “Plots”, which is located in your source file directory.</description><pubDate>Fri, 07 Dec 2018 00:00:00 GMT</pubDate><category>R</category><category>Reproducible Research</category><category>MacOS</category><category>working directory</category></item><item><title>Formatting tables in R Markdown with kableExtra</title><link>https://datadojo.dev/2018/11/05/formatting-tables-in-r-markdown-with-kableextra/</link><guid isPermaLink="true">https://datadojo.dev/2018/11/05/formatting-tables-in-r-markdown-with-kableextra/</guid><description>In this post, I will show you some of my best practises for formatting tables in R Markdown. We will cover
How to generally format tables (font, size, color…)
 How to create tables with conditional formatting (e.g. coloring values &lt; 0 red)
  The basics: the R package kableExtra. kableExtra is an awesome package that allows you to format and style your tables. It works similarly to ggplot2: You create a base table and then add formatting layers with the pipe operator %&gt;%.</description><pubDate>Mon, 05 Nov 2018 00:00:00 GMT</pubDate><category>R</category><category>R Markdown</category><category>Reporting</category><category>Analytics</category><category>kableExtra</category><category>Tables</category><category>Knitr</category></item><item><title>R Markdown for Novices: All you need to know to get started</title><link>https://datadojo.dev/2018/11/03/r-markdown-for-novices-all-you-need-to-know-to-get-started/</link><guid isPermaLink="true">https://datadojo.dev/2018/11/03/r-markdown-for-novices-all-you-need-to-know-to-get-started/</guid><description>I wrote this blogpost for anyone who has never worked with R Markdown. After you read this post, you will
understand why R Markdown may be useful for your daily work as a student, researcher, analyst or data scientist.
 understand the basic structure of an R Markdown document and how you can get started.
  I strongly encourage everybody working with R to use R Markdown. I promise, it will make your life so much easier.</description><pubDate>Sat, 03 Nov 2018 00:00:00 GMT</pubDate><category>R Markdown</category><category>R</category><category>knitr</category><category>Markdown</category><category>YAML</category><category>Documentation</category><category>Reproducible Research</category></item></channel></rss>