Data contracts. Building a universal data access proxy API


Data contracts usually require us to build a universal data access proxy API around them for users to consume. An API that uses proper data contracts, negotiated with the different teams, acts as a unified gateway providing the necessities. It allows access to data sources such as databases, REST APIs, GraphQL endpoints or file systems.

One API to rule them all

Steps to build data contracts for a proxy API

You could try to adopt a similar flow for creating such access points, and even make a template in Jira so you know where to get the proper data and how to acquire it… or maybe expose the library and just approve proper-looking merge requests…

  • Define contracts – use your own template, OpenAPI or a similar standard. Specify paths and request/response schemas for the endpoints.
  • Choose a language and framework – Python's FastAPI? A statically typed or an interpreted language? Which framework?
  • Proof of concept – build an example of your own universal data access proxy API.
  • Modules – show a working module that fetches data for some contracts.
  • Add features! – library methods, caching, logging, error handling, or documentation of certain behaviours.
  • Give it to the users.
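
The "modules" and "add features" steps can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not a finished framework: the names `DataProxy`, `ContractNotFound`, `order_price` and `fetch_order_price` are hypothetical, and the fetch function returns hard-coded data where a real module would query a database or another API.

```python
import logging
import time
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger("proxy")


class ContractNotFound(KeyError):
    """Raised when no module is registered for the requested contract."""


class DataProxy:
    """Hypothetical proxy: maps contract names to fetch modules, with a TTL cache."""

    def __init__(self, cache_ttl_seconds: float = 60.0) -> None:
        self._modules: Dict[str, Callable[..., Any]] = {}
        self._cache: Dict[Tuple[Any, ...], Tuple[float, Any]] = {}
        self._ttl = cache_ttl_seconds

    def module(self, contract: str) -> Callable:
        """Decorator that registers a fetch function for a contract."""
        def decorator(fetch: Callable[..., Any]) -> Callable[..., Any]:
            self._modules[contract] = fetch
            return fetch
        return decorator

    def get(self, contract: str, **params: Any) -> Any:
        if contract not in self._modules:
            raise ContractNotFound(contract)
        # Parameter values must be hashable to serve as part of the cache key.
        key = (contract, tuple(sorted(params.items())))
        now = time.monotonic()
        cached = self._cache.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            logger.debug("cache hit for %s", contract)
            return cached[1]
        logger.info("fetching %s with %s", contract, params)
        value = self._modules[contract](**params)
        self._cache[key] = (now, value)
        return value


proxy = DataProxy()
calls = []  # records real fetches so the caching effect is visible


@proxy.module("order_price")
def fetch_order_price(order_id: str) -> dict:
    # Assumption: a real module would call the owning team's data source here.
    calls.append(order_id)
    return {"orderId": order_id, "price": 29.99, "currency": "USD"}
```

Calling `proxy.get("order_price", order_id="ORD-12345")` twice within the TTL runs the fetch function only once; the second call is served from the cache. Asking for a contract nobody registered raises `ContractNotFound`, which is the point where a real proxy would return a proper error to the user.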

The biggest issue with a data contracts proxy API

Currently everyone is praising giving access to raw data, the whole lake. Let people figure out what they need and why. Let them be creative. The issue? Most people with access have no say in when, or whether, the data will change. Breaking changes can be common.

You can still give access to raw data for the time being. Once a sensible, final minimum viable product emerges, its owners will have to create a proper contract and a module for themselves. Otherwise they should be aware that their solution might break at any time.
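
To make "breaking change" concrete, here is a small Python sketch that compares two versions of a contract's field list. It assumes a deliberately simplified, hypothetical representation (a dict of field name to type name) and treats only removed fields and changed types as breaking; real contract tooling would cover more cases.

```python
from typing import Dict, List


def breaking_changes(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """List changes that would break consumers relying on the old contract."""
    problems: List[str] = []
    for field, old_type in old.items():
        if field not in new:
            problems.append(f"removed field: {field}")
        elif new[field] != old_type:
            problems.append(f"type changed: {field} ({old_type} -> {new[field]})")
    # Fields that exist only in `new` are additions and are not reported.
    return problems


v1 = {"orderId": "string", "price": "number", "currency": "string"}
v2 = {"orderId": "string", "price": "string", "discount": "number"}

report = breaking_changes(v1, v2)
# report == ["type changed: price (number -> string)", "removed field: currency"]
```

With a contract in place, a check like this can run before the change ships. With raw lake access, consumers usually find out only after their queries fail.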

Code example

openapi: 3.0.0
info:
  title: Proxy API - Order Price
  version: 1.0.0
paths:
  /orders/{orderId}/price:
    get:
      summary: Get order price by ID
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
            example: "ORD-12345"
      responses:
        '200':
          description: Order price retrieved
          content:
            application/json:
              schema:
                type: object
                properties:
                  orderId:
                    type: string
                    example: "ORD-12345"
                  price:
                    type: number
                    format: float
                    example: 29.99
                  currency:
                    type: string
                    example: "USD"
                example:
                  orderId: "ORD-12345"
                  price: 29.99
                  currency: "USD"
        '404':
          description: Order not found
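
A module that honours this contract could look roughly like the Python sketch below. It is framework-agnostic on purpose: `get_order_price` and the in-memory `_PRICES` store are hypothetical stand-ins, and the function simply returns the status code and body that the contract describes for the `200` and `404` cases.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory stand-in for the real data source behind the proxy.
_PRICES: Dict[str, Tuple[float, str]] = {"ORD-12345": (29.99, "USD")}


def get_order_price(order_id: str) -> Tuple[int, Optional[dict]]:
    """Return (status_code, body) for GET /orders/{orderId}/price."""
    record = _PRICES.get(order_id)
    if record is None:
        # The contract defines no response body for 404.
        return 404, None
    price, currency = record
    return 200, {"orderId": order_id, "price": price, "currency": currency}
```

`get_order_price("ORD-12345")` returns status `200` with the same body as the contract's example, and any unknown ID returns `404`. Wiring this into FastAPI or another framework is a separate step that the contract itself does not dictate.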

Easy testing thanks to a single source of truth

Given that the universal data access proxy API definitions reside in a single repository, it is really easy to test them. Additionally, giving interested parties access to the code could make the work easier: you would be maintaining the framework rather than providing the whole of it.
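
Because contracts and modules live side by side, one loop can check every module against its contract. The Python sketch below assumes hypothetical `CONTRACTS` and `MODULES` registries kept in that single repository, and it checks only field presence and Python types, which is far less than a full schema validator would do.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registries kept in the single repository.
CONTRACTS: Dict[str, Dict[str, type]] = {
    "order_price": {"orderId": str, "price": float, "currency": str},
}
MODULES: Dict[str, Callable[[], Dict[str, Any]]] = {
    "order_price": lambda: {"orderId": "ORD-12345", "price": 29.99, "currency": "USD"},
}


def check_contracts(
    contracts: Dict[str, Dict[str, type]],
    modules: Dict[str, Callable[[], Dict[str, Any]]],
) -> List[str]:
    """Run every module once and report where its payload violates its contract."""
    failures: List[str] = []
    for name, fields in contracts.items():
        module = modules.get(name)
        if module is None:
            failures.append(f"{name}: no module registered")
            continue
        payload = module()
        for field, expected in fields.items():
            if field not in payload:
                failures.append(f"{name}: missing field {field}")
            elif not isinstance(payload[field], expected):
                failures.append(f"{name}: {field} is not {expected.__name__}")
    return failures
```

An empty result from `check_contracts(CONTRACTS, MODULES)` means every module still matches its contract, so the same call can run in CI on each merge request.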

Why should you provide tools, not solutions?

Let's leave it with a quote from Churchill, 1941:

“Give us the tools, and we will finish the job.”

… or Neil Gershenfeld's:
“Give ordinary people the right tools, and they will design and build the most extraordinary things.”

Aspect | Pros | Cons
Flexibility | Users use tools, fostering innovation and ownership. | Requires user experience; novices might struggle without guidance.
Scalability | Enables self-service, reducing dependency on providers. | Learning curve delays for complex scenarios.
Long-term value | Builds skills and reusable capabilities. | Risk of suboptimal results without best practices.
Customization | Allows tailoring to edge cases, unlike rigid solutions. | Higher upfront development cost for versatile tools.
Empowerment | Encourages a problem-solving mindset and autonomy. | Overhead for quick fixes.

Summary tl;dr

A universal data access proxy is a very solid solution. Looking at these scenarios, we could imagine a perfect world where it works: a single point of entry, spawning new lambdas or a simple AppSync function, providing access to a cloud or sharded lake. How you will push it through the business and sell it to the people who will approve or disapprove such an approach is a totally different subject. Academically the approach feels sound, and common sense can see the gain in the long run.

Piotr Kowalski