Restaurant Barnfield’s recipe for the perfect Data Spaghetti 

Do you hate your data? Want to make sure that no one, not even future you, can extract meaningful insights from it? Maybe you want to bankrupt yourself with extortionate storage and processing fees? Or perhaps simply to provide your teams with the worst job satisfaction you possibly can? 

Well, good news! Step into Restaurant Barnfield, an exceptionally bespoke pop-up eatery. Why not try our signature dish: a heaped plate of Data Spaghetti.  

Restaurant Barnfield offers you a how-to guide for making your data unusable, inaccessible, unaffordable, and quite simply infuriating. Our chefs lovingly pair this Data Spaghetti architecture with an implementation side dish: Hand-Rolled Garlic Bread or Cool Tech Salad. 

A playful, illustrated menu titled "Restaurant Barnfield" featuring a fictional dish called "Data Spaghetti." The design includes cartoon-style spaghetti noodles, smiling meatballs, vegetables, and a cheerful empanada. The set menu lists three items: "Perplexing Spaghetti" (described as mysterious and complex, filled with dread), "The Sauce of Obfuscation" (surprisingly thick and infamously opaque), and "A Choice of Side" (hand-rolled garlic bread or cool tech salad). The visual uses a bright and whimsical style with orange spaghetti strands woven throughout the layout.

Considerations for making the perfect Data Spaghetti 

We’ll start by considering the key properties of PASTA in data architecture, meaning we’ll need to design a system to minimise Performance, Access, Scalability, Trustworthiness, and Automation. 

Performance: haute cuisine takes time and money 

This isn’t McDonald’s. Good food takes time, and Data Spaghetti is no different. We give our guests the chance to sit and enjoy their WINE (Warehouse Ingestion with No ETL), while we slow-cook large batch processes that run for hours, maximising time-to-insight. One tip for effective slow-cooking is to avoid highly parallelised tools like Spark and Flink, the data pressure cookers. These will only speed things up by allowing you to arrange your workflows as efficiently as possible, which sounds like all work and no play really. 

As any health inspector will tell you, your choice for how to store your spaghetti conveniently and safely is important too. If you choose carefully, you can spend a lot more money for a much narrower access bottleneck by storing large datasets on un-backed-up network drives rather than in blob storage like S3 or GCS. 

On the topic of blob storage: you can also burn a whole pile of extra money with practically no extra effort by setting up poor archiving and retention strategies. Generally, data can be stored on a spectrum from: 

  •  “Hot” storage: low access fees, high storage cost, to 
  •  “Cold” storage: high access fees, low storage cost.  

To maximise cost, store the data you access the most often in a cold archive, and the data you rarely access nice and warm in hot storage, like a petrol station pasty. 
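For diners who like to check the bill, here is a minimal sketch of that arithmetic. Every price below is a made-up number in hypothetical "credits", not any real provider's rate, and the `Tier` class and `monthly_cost` function are illustrative names of our own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    storage_per_tb_month: int  # hypothetical credits per TB stored per month
    access_per_tb_read: int    # hypothetical credits per TB read back


# Assumed prices, chosen only to show the shape of the trade-off.
HOT = Tier("hot", storage_per_tb_month=20, access_per_tb_read=0)
COLD = Tier("cold", storage_per_tb_month=2, access_per_tb_read=50)


def monthly_cost(tier: Tier, size_tb: int, full_reads_per_month: int) -> int:
    """Storage cost plus the fee for reading the whole dataset N times."""
    storage = size_tb * tier.storage_per_tb_month
    access = size_tb * tier.access_per_tb_read * full_reads_per_month
    return storage + access


# Two 1 TB datasets: one read 30 times a month, one never read.
sensible = monthly_cost(HOT, 1, 30) + monthly_cost(COLD, 1, 0)
spaghetti = monthly_cost(COLD, 1, 30) + monthly_cost(HOT, 1, 0)

print(f"sensible kitchen: {sensible} credits")
print(f"Restaurant Barnfield: {spaghetti} credits")
```

Under these assumed prices, swapping the two datasets between tiers is all it takes to multiply the bill many times over, with no extra effort from the chef.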

Two cartoon-style pasties with expressive faces, representing hot and cold states. The orange pasty on the left is sweating with visible heat lines and droplets, smiling with wide eyes and flushed cheeks. The blue pasty on the right appears chilled, with shivering lines and icicle-like marks, also smiling but with a cool demeanor. The background features warm and cool color accents behind each pasty to emphasize temperature contrast.

Access: use exclusivity to foster mystique and intrigue 

As with any top-end restaurant, we want to add as many barriers to access as possible, so our guests know they’re getting a premium experience.  

We’ll store little meatballs of data across multiple different systems, separating spheres with new access controls for each system, then scatter them all over the Data Spaghetti to lower discoverability. 

Now, consider how difficult it can be to: 

  • Discover the data exists 
  • Find out who owns it 
  • Confirm you have the right version 
  • Confirm that the data is clean 
  • Get access 

Faced with all this, many users will conclude that it’s easier to regenerate a dataset than to get access to an existing one, causing a proliferation of data silos. 

A playful cartoon graphic of four bowls of spaghetti with an increasing number of meatballs. The meatballs have eyes and smiling faces, with action marks representing the motion of being dropped into the bowl.

Scalability: scales are for fish and we don’t serve seafood 

Our charm comes from being a small, independent, family-run business. We want to avoid scaling like a restaurant chain, lest we lose our zing. 

Much like a well-aged wine is preferred by the connoisseur, we’ll stick to traditional databases over modern lakehouse architectures built on table formats like Apache Iceberg or Delta Lake. Traditional databases delight the palate by coupling storage growth with compute, limiting the ability to scale effectively with volume. A fine vintage! 

Another excellent pairing would be to forgo the database entirely and email files around. Ideally, we’d recommend the fortified Excel file “Jan Sales Figures 24 v2 Final 1.1 Feb Matt edit.xls”, so that the number of copies scales easily with each person you email or Slack it to.  

A playful cartoon graphic showing a web of spaghetti with a series of customised meatballs showing a number of different images, including Mary Berry, Gordon Ramsay, Jamie Oliver, Nadiya Hussain, a spreadsheet, screwed up paper, an alert icon, a computer chip, a data cloud visualisation, and code.

Trustworthiness: if you really trusted me you wouldn’t need proof 

Food critics and health inspectors are always bad for business. What diners don’t know can’t hurt them. We’ll start by minimising observability. 

Firstly, we’re not going to track data lineage. Data lineage offerings are common among cloud platforms, allowing you to see the source of all your data and the processing steps it’s gone through. But all this extra information just clutters our core concept. Does anyone really need allergens on the menu anyway? 
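For the curious, this is roughly the kind of allergen label we are refusing to print. It is a toy sketch, not how any cloud platform's lineage offering works: the `LineageLog` class and the dataset and step names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LineageLog:
    # Maps each dataset name to (processing step, input dataset names).
    produced_by: dict = field(default_factory=dict)

    def record(self, output: str, step: str, inputs: list) -> None:
        """Note which step and which inputs produced a dataset."""
        self.produced_by[output] = (step, tuple(inputs))

    def upstream(self, dataset: str) -> list:
        """Every dataset this one was derived from, directly or indirectly."""
        seen = []
        stack = [dataset]
        while stack:
            current = stack.pop()
            _, inputs = self.produced_by.get(current, (None, ()))
            for source in inputs:
                if source not in seen:
                    seen.append(source)
                    stack.append(source)
        return seen


log = LineageLog()
log.record("clean_orders", step="dedupe", inputs=["raw_orders"])
log.record("sales_report", step="join", inputs=["clean_orders", "menu"])

# Lists everything the report was cooked from.
print(sorted(log.upstream("sales_report")))
```

Even this much would let a diner trace a suspicious sales report back to its raw ingredients, which is exactly why it stays off the menu.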

We can also take this opportunity to disable audit logs. Audit logs would provide a clear trail of who did what in the event of any enquiry, and it’s important to us to protect our staff from potential legal action. I’m sure they were just trying their best. 

Automation: you can’t take the labour out of a labour of love 

Diners come to us for a unique experience, so we’d never betray them with mass-produced garbage. Fortunately, there are lots of areas where we can introduce manual steps!  

The customer experience starts with orchestration, so we inject a personal touch by only taking reservations over the phone and writing them down in a single paper diary, which Rachel sometimes takes home with her when she forgets to leave it by the landline. In the same vein, we advise manually triggered ETLs. Better yet, you could have a series of ETLs that depend on each other, each needing to be manually triggered after the previous is complete. 

If you can’t stomach such beautiful orchestration, a subtle way you can add a little spice is to sequence pipelines using time-based schedules. This introduces the risk that on any given day an ETL could run out of order and break the pipeline. Just make sure to push back on proper orchestration solutions like Apache Airflow, which are as cold and efficient as online booking apps. 
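To see what we are pushing back on, here is a small sketch of the difference between ordering by dependency and ordering by the clock. It uses Python's standard-library `graphlib` rather than Airflow itself, and the task names, start times and durations are all invented for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "extract_orders": set(),
    "extract_menu": set(),
    "clean_orders": {"extract_orders"},
    "join_sales": {"clean_orders", "extract_menu"},
    "publish_report": {"join_sales"},
}


def run_order(deps: dict) -> list:
    """An execution order in which every task follows all of its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def schedule_breaks(start: dict, duration: dict, deps: dict) -> list:
    """(task, dependency) pairs where a task starts before its dependency ends."""
    return [
        (task, dep)
        for task, needed in deps.items()
        for dep in needed
        if start[task] < start[dep] + duration[dep]
    ]


# Clock-based schedule, in minutes after midnight (assumed values).
START = {"extract_orders": 60, "extract_menu": 60, "clean_orders": 120,
         "join_sales": 180, "publish_report": 240}
USUAL = {task: 30 for task in START}
SLOW_DAY = {**USUAL, "extract_orders": 90}  # one extract overruns today

print(run_order(DEPENDENCIES))
print(schedule_breaks(START, USUAL, DEPENDENCIES))
print(schedule_breaks(START, SLOW_DAY, DEPENDENCIES))
```

On a usual day the clock-based schedule happens to work. On the slow day, `clean_orders` starts while `extract_orders` is still running, which is the out-of-order spice described above. A dependency-aware orchestrator would simply wait.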

Testing and deployments are also excellent opportunities to introduce manual steps. Manual steps make testing and deployment more frustrating, which tempts teams to skip testing and reduces the frequency of releases. The result is more production bugs and more helpful downtime, so you can have a nice nap in between courses.  

A choice of side dish to elevate your plans 

We’ve prepared the main architectural dish, but what about implementation? There are two great choices here: a) Make absolutely everything in house; or b) Mix together every tool you skim-read about in the last two years of digest emails from Medium. 

A playful cartoon graphic of pasta sides showing two pieces of garlic bread with eyes and a smile, accompanied by three baguettes with wavy lines illustrating steam.

Hand-Rolled Garlic Bread 

A good chef loves a take on the classics. Why not show off your creativity by re-inventing the wheel? Make everything yourself from fresh ingredients to yield a sumptuous combination of high upfront cost and high maintenance burden. 

A playful cartoon depiction showing a plate of salad with leaves, tomatoes and carrots alongside icons for data engineering.

Cool Tech Salad 

In the mood for Ameri-Afro-Eurasian fusion? Buckle up! We’re going to glue together 5000 different new and exciting specialised tools. 

This is a complex flavour that develops over time. In fact, Cool Tech Salad will probably cause disappointingly few remarkable sensations at first implementation. But the real fun begins 6-12 months down the line, when you can expect problems like: 

  • High specificity often means low flexibility 
  • High onboarding burden because new engineers need to learn many tools before they can be effective 
  • Lack of support or online resources around new tools too immature to have a community 
  • Dependency hell 

A digestif on side orders 

The important thing here is that you should only pick one of the Hand-Rolled Garlic Bread or the Cool Tech Salad, and you should immerse your senses in it. Attempting a bit of one with a bit of the other might end up with you using the correct tools where they’re a good fit, and hand-rolling only the unique part of your use case, leading to a horrific productivity boost. 

Takeaways 

Alas, our service now draws to a close, and we can no longer seat new diners. But fear not! We offer takeaways:

  • Waste money with outdated compute tooling and incorrect data temperature 
  • Hide your data like meatballs in spaghetti  
  • Keep data personal by emailing Excel files 
  • Avoid accountability with no auditing or lineage tracking 
  • Manual steps show that you really care 
  • Either hand-roll everything yourself, or apply every tool you’ve ever heard of and hope for the best 

Does Restaurant Barnfield sound all too familiar?

Our popular Data Engineering Kickstart program helps people avoid this kitchen nightmare. Across a 4-6 week period, you’ll get an end-to-end slice of great data tooling, integrated with analytics dashboards and Excel, giving you the core ingredients and base sauce to easily rustle up high-quality data that everyone wants to consume.
