-
Data contracts. Building a universal data access proxy API
Data contracts usually require us to build a universal data access proxy API around them for users to consume. An API that uses proper data contracts, negotiated with different teams, acts as a unified gateway that provides the necessities. It allows access to data sources such as databases, REST APIs, GraphQL endpoints, or file systems. One API to rule them all. Steps to build data contracts for a proxy API: you could adopt a similar flow for creating such access points, or even make a template in Jira so you know where to get the proper data and how to acquire it… or maybe expose the library and just approve proper-looking merge requests……
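To make the gateway idea a bit more concrete, here is a minimal sketch in Python. It is not the implementation from the post: `DataContract`, `DataProxy`, the `orders` source, and its fields are all hypothetical names made up for illustration. The assumption is that a contract is just an agreed list of fields and types, and that every source is registered behind one `read` call.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


@dataclass
class DataContract:
    """Hypothetical contract: the fields a team agreed to expose, with types."""
    name: str
    fields: Dict[str, type]

    def validate(self, row: Row) -> Row:
        for field_name, expected in self.fields.items():
            if field_name not in row:
                raise ValueError(f"{self.name}: missing field '{field_name}'")
            if not isinstance(row[field_name], expected):
                raise TypeError(
                    f"{self.name}: field '{field_name}' is not {expected.__name__}"
                )
        # Expose only the negotiated fields, nothing else from the source.
        return {key: row[key] for key in self.fields}


class DataProxy:
    """Hypothetical gateway: one entry point in front of many sources."""

    def __init__(self) -> None:
        self._sources: Dict[str, Tuple[DataContract, Callable[[], Iterable[Row]]]] = {}

    def register(self, contract: DataContract, fetch: Callable[[], Iterable[Row]]) -> None:
        # `fetch` could wrap a database query, a REST call, or a file read.
        self._sources[contract.name] = (contract, fetch)

    def read(self, name: str) -> List[Row]:
        if name not in self._sources:
            raise KeyError(f"no contract registered for '{name}'")
        contract, fetch = self._sources[name]
        return [contract.validate(row) for row in fetch()]


# Usage with an in-memory stand-in for a real source (assumed data).
orders_contract = DataContract("orders", {"order_id": int, "amount": float})
proxy = DataProxy()
proxy.register(
    orders_contract,
    lambda: [{"order_id": 1, "amount": 9.5, "internal_note": "do not expose"}],
)
rows = proxy.read("orders")
```

The consumer only ever calls `proxy.read(...)`; anything the contract does not list, like `internal_note` here, never leaves the gateway.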
-
97 Things Every Data Engineer Should Know – book review
The 97 Things Every Data Engineer Should Know review will be a positive one. This style of book is currently my favorite. Might get another one from the series 🙂 It is bits and pieces of knowledge you can digest easily. The teachings, scattered across multiple disciplines, are about principles. The book is technology-agnostic, meaning you can use the rules, laws, principles, and methodologies it presents with any framework or system. It reads like a set of design principles. Great read, would recommend. 97 lessons to pick from. The book is all about best practices, system design, queues, asynchronous communication, and much more. You can easily read it day by day when you "meditate"…
-
How to check if your email and password were leaked
Why you need strong passwords and should check whether they were breached. How to check if your email and password were leaked? Just check security blogs or institutions like https://databreach.com/. Google the title of this blog post. Fix your breached credentials or you will be sorry! Passwords are your first line of defense. Multi-factor authentication (MFA), such as SMS, email verification, or an authenticator app, should also be something you do! How to make a good password? There are many different ways to make it happen. I would recommend considering: This is a simple algorithm you can use to remember all of those passwords and every one…
-
The Golden Byte. Most valuable data
In data engineering, every byte has a cost, but not all bytes are created equal (read Animal Farm by George Orwell). We collect terabytes of data in the form of logs, metrics, cookies, text, pictures, and transactions. Yet only a small portion of this information is truly crucial and drives business outcomes. That fraction is what we can call the Golden Byte, the single most valuable unit of data that fuels strategic insight and decision-making. Data tiers architecture. The Golden Byte embodies the essence of a gold layer in modern data architecture: raw, curated, aggregated, and business-ready information. It is the outcome…
-
Popular LLMs training data, what do they use?
Popular LLMs' training data seems to be universal and generic. This is why such models are so popular: they more or less know an answer to everything. But how do they arrive at those answers? What is the source? Where do they get the data from? Let's search the web the old-fashioned way and find out. Popular LLMs training data types. The training data for these models comes from all around the world. We humans are the ones who provide it. It is our work that is pushed into a model. LLM training data reflects carefully curated, huge datasets designed to provide high…
-
Know your data. Cost per byte vs value per byte
Cost per byte vs. value per byte: rethinking data efficiency. We are living in an era where nothing gets erased (just archived). Let us dwell on the cost per byte vs. the value per byte of such data. Every byte you store, move, or process has a cost. We focus on cost saving. Data engineering isn't about hoarding everything; it's a calculated risk that depends on understanding whether those bytes are worth storing. Pro hint: do not fall into the trap of 'let us grab everything and think about it later'. It does make sense until you figure out what is what, but then remember to delete it? Oh…
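A rough way to put numbers on that comparison is sketched below. Everything in it is an assumption for illustration: the `Dataset` class, the dataset names, and the monthly cost and value figures are made up, and in practice the value estimate is the hard part.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical record of what a dataset costs and what it brings in."""
    name: str
    size_bytes: int
    monthly_cost: float   # assumed: storage + transfer + processing
    monthly_value: float  # assumed: estimated business value

    @property
    def cost_per_byte(self) -> float:
        return self.monthly_cost / self.size_bytes

    @property
    def value_per_byte(self) -> float:
        return self.monthly_value / self.size_bytes

    def worth_keeping(self) -> bool:
        # Keep the data only if each byte earns at least what it costs.
        return self.value_per_byte >= self.cost_per_byte


# Made-up numbers: a large pile of debug logs vs. a small orders table.
debug_logs = Dataset("debug_logs", 5 * 10**12, monthly_cost=120.0, monthly_value=10.0)
orders = Dataset("orders", 20 * 10**9, monthly_cost=5.0, monthly_value=5000.0)

for dataset in (debug_logs, orders):
    verdict = "keep" if dataset.worth_keeping() else "review or delete"
    print(f"{dataset.name}: {verdict}")
```

With these assumed figures the small table clearly pays for itself, while the logs cost more per byte than they return, which is the kind of data the 'grab everything' trap tends to accumulate.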











