4 PM on a Friday, I’m still reviewing an analytics dashboard from a vendor, manually moving files to cloud storage, running three different long SQL scripts, and, begrudgingly, pasting the output into an Excel file to calculate rough numbers. It’s not sustainable, it’s imprecise, but launch day was earlier this week, and I have not created a Python processing pipeline yet, so I have to do all of this manually. We simply cannot wait any longer to understand how the product is doing. Earlier this morning, I received an email from my Product Manager, with my direct manager cc’ed, asking if I could provide an update. After a few exchanges, I was pulled into a meeting. The message was clear: we need the numbers because senior leadership needs to report to their higher-ups.
No, this is not reactionary. This time, I was there from the beginning. The product is a flagship feature, so there was an appetite from leadership for more robust reporting. I had already worked with data engineering to get things set up, but the full-stack software team moves fast, so the data side is behind. My manager reassured the PM that everything should be fine, so while we’re waiting for something more robust, I can do some manual pulls first for reporting purposes.
That is mostly correct.
You know that sound in your car that nobody else can hear? The mechanics said it’s fine. You continue to drive and it continues to bother you. Most of the time, it’s actually fine. But sometimes, it’s the transmission.
Two weeks before the launch, there was some pushback from other teams, so the product team made a slight design change to address it. Devs are fast; they could pull it off once everything got the green light. But I did not know about the change until three days before the launch. Can we change that on the data side? Yes. Within three days? No! Are you kidding me? And I can’t just tell the product side to slow down. The launch had already been pushed back once before.
So yeah, that is mostly correct. Even after data engineering finishes their work, the data does not reflect the design change.
Let me give you some background. I joined the project at its very inception eighteen months ago, the third person on the team. It was a rare chance to be with the team on the analytics side while the product was built from the ground up. I felt as if we were part of a cult. We were all co-conspirators. We all believed strongly in the purpose of the thing we built: it would bring real benefits to our clients. Heck, we would use the tool ourselves when we became eligible for it. I was not simply an analyst: I was part owner of the product. User interviews? I was there. Tactical product moves? I listened in and gave my opinions. I know this product.
The combination of client types, eligibility rules, and product stages means understanding product performance can’t rely on web analytics and ad-hoc queries alone. Multiple sources have to be combined. Something more robust is needed. That’s where data engineering comes in. Months before the launch, I worked with them to lay the foundation for product analytics. We walked through user flows, examined data sources and structures, and devised a plan of attack. All was good. But data engineering’s rigor came at the cost of speed. It would be months before something concrete could be put into production.
And then that design change happened.
So here I am, at 4 PM on a Friday afternoon, trying to patch things in an Excel spreadsheet.
Fast-forward a few months, it’s done. The stop-gap system that I built with Python and spreadsheets is replaced by production-grade structures that data engineering set up. The robustness provides reliability and consistency that my duct-taped system cannot match. I know the metrics don’t quite capture all the design changes. But should I dig that up, get through another intake request for engineering, and get it prioritized in their backlog? Should I let that delay the go-live of the data table for another two months, or one month if I’m optimistic? No, I cannot. I have to make that call because I am the conduit between the product and the data side. Nobody else bears that weight. It’s me.
And I choose to move forward with production. That thing that keeps gnawing at the back of my mind? I think I can live with it.
Then comes the news I heard through the grapevine: management wants to move me to another initiative. They want someone with my experience to handle that portion. Someone more junior can inherit the data system I helped build for the current product. My colleague will have a chance to learn, and I will have a chance to be exposed to something new. Makes sense. We all need to grow in our careers. You can’t stay in one place and expect a diverse experience.
So we work through the transition together. He gets to know the product team, I walk him through the flow documentation, we pair on how to query the data. We even design the new dashboards together so he’s not merely an observer of the system, but a builder too. But I realize there’s something that can never be transferred. That slight design change has wedged its way between the product and the data side, and the gap only widens with each subsequent iteration. The call I made between efficiency and fidelity never made its way to a documentation page. When I’m gone, the knowledge becomes invisible.
A few months after my departure, I see in my old team chat thread a flurry of congratulations to the tech lead and a couple of engineers. The tech lead has been promoted and will move to another team, and the other engineers are getting new assignments too. And to my surprise, I learn in an analytics staff meeting that even the analyst who replaced me is being moved to another initiative. I’m happy for them; everybody is moving up in their careers. But I can’t help feeling nervous about the product. How will the transition go with so many changes to the original crew?
Nine months after the transition, as I’m crafting some queries at my cubicle on the opposite wing from where I used to sit, a tall man walks across the hallway and introduces himself. It’s the new tech lead of my previous team. After some conversation, I learn that more features have been added to the product based on user feedback. He then broaches the subject of data. It turns out that an upstream data pipeline has been working incorrectly for months. The pipeline had been maintained by another team that’s since rotated off, and his team inherited it without the expertise to maintain it. He is trying to assess its impact on analytics.
The design wedge and the compromise I made are hanging over me. I started that gap. And with every little change the team encounters, the gap widens.
I share with him how the data engineering side pulls data from the product, how that maps to the dashboards and reports downstream. It’s not easy. How do you compress two years of work into a single conversation in a hallway? Only the most important stuff gets distilled. The design gap and the compromise do not.
And he walks back to the other wing of the building.
Data products come with institutional authority, and that’s the whole point: they’re designed to be trusted. But that authority poses the most danger because the data sometimes does not fully reflect reality. The engineering rigor preserves the code, not the meaning. The meaning lives in the minds and organic interactions of the teams [1]. When they move on, it dissipates. So when you see gaps in the data products, you are seeing the void left behind by the people who are gone. A data product might look like solid rock, but the organic reality of how teams work drives wedges into it, invisible from the surface.
When you pull from a data product today, do you know who the builder was? Do you know what they knew that didn’t make it into the table?
[1] This is a different failure mode from the one I wrote about in Drilling Through Sediment, where analysts inherit fragments from ad hoc work. Here, the infrastructure persists — it’s the meaning behind it that doesn’t.
