Data Infrastructure Layers 2: Compute

Design options for how you manage compute workloads.


This post is the second in a three-part series exploring the high-level design decisions you need to make for each stage in each use case category. The Compute layer does the work that the user manages in the Control stage.

The quick-read version:

  • Often the Compute layer will be split across multiple options, with the compute that directly supports the Control and Storage layers kept separate from the more intensive workloads.

  • Local compute happens on the user's computer and is the simplest option, but limits you to what that one computer can handle.

  • Remote compute uses additional computers on a network. Two options - remote sessions or remote services - offer different ways of managing the additional complexity (the sketch after this list contrasts running a task locally with delegating it to a remote service).

  • Software as a Service (SaaS) and Serverless architecture allow you to delegate some of the complexity to external services.
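
To make the local/remote distinction concrete, here is a minimal Python sketch contrasting the same summary task run on the user's machine and delegated to a remote service. The endpoint URL and its JSON contract are hypothetical, purely for illustration; a real service would define its own API and authentication.

```python
# Sketch: the same small aggregation task, computed locally vs. delegated
# to a (hypothetical) remote compute service over HTTP.
import statistics

import requests  # third-party HTTP client: pip install requests


def summarise_locally(values: list[float]) -> dict:
    """Local compute: runs on the user's machine, nothing else to operate,
    but limited to that machine's CPU and memory."""
    return {"count": len(values), "mean": statistics.mean(values)}


def summarise_remotely(values: list[float]) -> dict:
    """Remote service: the work is delegated over the network, which adds
    capacity but also latency, authentication and failure modes to manage."""
    response = requests.post(
        "https://compute.example.com/summarise",  # hypothetical endpoint
        json={"values": values},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    print(summarise_locally(data))
    # summarise_remotely(data) would need the (hypothetical) service running.
```

Note that the two functions share a signature: a well-designed Control layer can often swap one compute option for the other without the user having to change how they work.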

For Further Consideration

  • What software that uses remote compute do you, your team and your users rely on?

  • Which of these applications rely on remote sessions, and which use remote services?

  • How do these design decisions affect how usable they are? How would a different choice change their effectiveness?

Further Reading

Serverless architecture and Platform as a Service (PaaS) are two trends that attempt to mitigate some of the complexity of moving from local to remote compute.
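
As a rough illustration of that delegation, below is a minimal sketch of the same kind of summary task packaged as a serverless function, following the AWS Lambda Python handler convention (event, context). The event payload shape is an assumption made for this example: the platform fixes the handler signature and supplies and scales the compute that runs it, while you decide the payload and the function body.

```python
# Sketch of a serverless-style handler: you write the function body; the
# platform provisions, runs and scales the compute on demand.
import json
import statistics


def lambda_handler(event, context):
    """Entry point invoked by the platform for each request/event."""
    values = event.get("values", [])  # assumed payload shape for this example
    body = {
        "count": len(values),
        "mean": statistics.mean(values) if values else None,
    }
    return {"statusCode": 200, "body": json.dumps(body)}
```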

Up Next

  • My next post will complete the three-part design decision series with the Storage layer,

  • Followed by a post that attempts to demystify some aspects of data governance,

  • Then a series of case studies exploring the design options for specific use cases.