How to write your own connector to a REST API in Spark on Databricks

Hubert Dudek
5 min read · Jan 13, 2025


Access for free via a friend’s link https://databrickster.medium.com/how-to-write-own-connector-to-rest-api-in-spark-databricks-42321a5021dd?sk=89c9b3cf0c1d008d54c9d0bce73ac769

Imagine all a data engineer or analyst needs to do to read from a REST API is:
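A sketch of that end state, assuming the connector registers the short format name `restapi` (an illustrative choice, defined later in the article):

```python
def read_posts(spark):
    """The one-liner a data engineer would run in a Databricks notebook,
    once the custom "restapi" data source has been registered."""
    return (
        spark.read.format("restapi")
        .option("url", "https://jsonplaceholder.typicode.com/posts")
        .load()
    )
```

One `format(...)` plus an `option(...)` for the endpoint, and Spark returns a ready-to-query DataFrame.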

No direct `requests` calls and no manual JSON parsing, just Spark in a Databricks notebook. That's the power of a custom Spark Data Source. In this article, we'll explore how to build such a connector in Python using Spark's newer Python Data Source API.

We’ll demonstrate using the JSONPlaceholder public API, which provides fake JSON data.

The Big Picture

In many organizations, analysts and data scientists often need to grab data from internal or external REST APIs. Typically, they end up writing boilerplate code:
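That boilerplate typically looks something like this (a minimal sketch using the standard library; the helper name `fetch_json` is ours):

```python
import json
import urllib.request

def fetch_json(url, timeout=10):
    """Hand-rolled boilerplate: call the API and parse the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

# In a notebook this is usually followed by a manual conversion to Spark:
# posts = fetch_json("https://jsonplaceholder.typicode.com/posts")
# df = spark.createDataFrame(posts)
```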

This approach gets repeated across multiple notebooks and pipelines, leading to duplication, inconsistency, and lots of maintenance headaches. By creating one custom connector, you:

  1. Hide the complexity of making HTTP requests.
