Data Infrastructure Layers 1: Control
Design options for how users interact with your data platform.
Nov 14, 2020
New Post: Data Infrastructure Layers 1: Control
This post continues my exploration of how to determine what software tools will allow you to support a wide range of data use cases, with the first of a three-part series on design decisions.
The quick-read version:
If you think of the Use Case Categories and Stages as forming a grid like we saw in my last post, there are three "layers" you need to decide on for each box:
Control - How the user will tell the computer what they want to do.
Compute - How the computer will do that work.
Storage - How the results and intermediates will be stored.
For the Control layer, the main decision is between a graphical interface and a text/code interface.
Graphical interfaces are easier to learn, and some users generally prefer them.
Many technical users prefer a text interface and are more productive with one.
Each of these two categories contains a number of more specific options, which the post explores.
For Further Consideration
What kinds of interfaces do you, your team and your stakeholders currently use?
Which ones are more efficient to use? Which ones are more frustrating?
Can you bucket your stakeholders and users by what type of interface they prefer?
Do those divisions fall along use cases and/or stages?
The Control layer is at the center of the no-code (or low-code) movement, which tries to turn activities that traditionally required a text-based interface into completely graphical ones. This includes data science and data pipeline development, as well as simple apps. Here are some articles that explore the pros and cons of these interface options:
GUI-fying the Machine Learning Workflow argues that a graphical interface (GUI) is the best way to do data science.
Data Scientist? Programmer? Are They Mutually Exclusive? summarizes a presentation arguing the opposite - that coding is the only proper way to do data science.
The Low-Code/No-Code Movement explores why this trend has had a growing business impact.
My next two posts will explore the Compute and Storage layers,
followed by my attempt to demystify some aspects of data governance,
then a series of case studies exploring design options for specific use cases.