Data into results

Covid-19 dashboard with Power Query / Power Pivot / Excel

Sébastien Derivaux — Tue, 10 Mar 2020 18:15:14 +0000

Like everyone, I want to monitor the spread of Covid-19. There is a nice dashboard from Johns Hopkins University, but it didn’t fit my needs. It displays all the numbers but doesn’t give a clear picture of how the situation is developing. Dashboard from Dong, Du and Gardner (screen capture) Luckily for me, I love swimming in data so what better usage of a...

Source

Data science: software engineering or business intelligence?

Sébastien Derivaux — Tue, 18 Feb 2020 15:43:51 +0000

Kind of a rant about Data Science shifting towards technology and losing sight of business value, with ideas for thinking outside of engineering. There is currently a trend in Data Science to converge toward software engineering. I’ve seen it recently in a Towards Data Science podcast with Adam Waksman from Foursquare (very interesting podcast). As he said...

Source

Web Analytics Data Warehouse : Google Analytics and Search

Sébastien Derivaux — Wed, 22 Jan 2020 11:13:39 +0000

In this tutorial, we will see how to create a dashboard for content marketing that relies on SEO (Search Engine Optimization). It will leverage and blend data from Google Analytics (which tracks traffic on your website) and Google Search Console (which shows how your website performs on Google Search). The result will be this Google Sheet dashboard (you can follow the link or see the capture below) which is...

Source

Data Brewery – A modern ETL for agile data warehouse

Sébastien Derivaux — Wed, 25 Dec 2019 07:25:00 +0000

Today I turn 37, and it feels like a great day to launch my first product, Data Brewery. It’s an open-source ETL for data warehouses with an emphasis on agility and productivity. The current consensus is to use Pentaho Data Integration (PDI). At the time of writing, the open-source version is downloaded 25,000 times per month. I have used it for 10 years. I remember when I arrived at Blizzard...

Source

PowerBI and PostgreSQL : SSL, Let’s encrypt and Gateway

Sébastien Derivaux — Sun, 15 Dec 2019 15:42:41 +0000

While Microsoft have great tools for data science (Excel being one of the best front ends lately with PowerBI powers), they are also great at shooting themselves in the foot. I mean, connecting to the universally acclaimed open source PostgreSQL database (check PostgreSQL for Data Science Pro and cons) should work right out of the box? Just like any other BI tool, right? You bet…

Source

JupyterLab for complex Python and Scala Spark projects

Sébastien Derivaux — Sun, 06 Oct 2019 16:17:47 +0000

JupyterLab is an awesome piece of technology for prototyping and self-documenting research. But can you use it for projects that have a big codebase? The notebook workflow was a big improvement for data scientists around the globe. The ability to directly see the result of each step, instead of running the same program over and over, was a huge productivity boost. Moreover, the self-documenting...

Source

MySQL/MariaDB for Data Science : Pro and cons

Sébastien Derivaux — Thu, 22 Aug 2019 13:30:05 +0000

After analyzing the pros and cons of PostgreSQL for data science, I thought it was fair to do the same for MySQL and its fork MariaDB. According to Coursera, you can manage Big Data with MySQL. Is MySQL or MariaDB suited for the analytics workload of data science? Can it be used as a data warehouse? The last time I looked into MySQL was ten years ago. What has changed since then? Let’s dive in.

Source

How to escape the data monkey trap to leverage analytics

Sébastien Derivaux — Tue, 16 Jul 2019 09:36:08 +0000

Despite all the talk about big data and data science, it seems that at many companies we are still in the era of the data monkey. Hours are wasted copy-pasting cells from tools into Excel and building manual reports. Few insights, if any, are generated. “Most of my time is spent churning out a bunch of fairly standard reports on a monthly/quarterly basis.

Source

Supercharged Excel for startup analytics with PowerBI

Sébastien Derivaux — Fri, 10 May 2019 12:00:10 +0000

Excel seems to be the most hated tool I have ever encountered. That’s a shame because, if you look past this bad reputation, it’s one of the best tools you can have in your belt for analytics. In my experience, Excel is great for two purposes: high-level reporting and data analysis. High-level reporting is getting the pulse of your business. You can see more details in the article Startup...

Source

Master SQL query for SaaS startups analytics

Sébastien Derivaux — Sat, 09 Mar 2019 10:11:35 +0000

We saw earlier the master table needed for SaaS analytics in order to get subscribers, MRR (monthly recurring revenue) and ARPPU (average revenue per paying user) per acquisition channel and persona. From this base, and with the help of some SQL craft, we can get more insight into what is happening. At present, we only know whether the MRR is increasing, stable or decreasing. For instance in the chart below...

Source