# Delta Sharing

Delta Sharing lets Amigo provide governed, read-only access to selected tables so you can consume data directly within your analytics platform or warehouse using an open sharing protocol.

{% hint style="info" %}
New to Delta Sharing?

Delta Sharing is an open protocol for secure data exchange across platforms and clouds. Providers expose read-only tables via a simple REST endpoint; recipients use a small profile file (credentials + endpoint) to query those tables directly from tools like pandas or Apache Spark - without copying data.

Learn more:

* [Overview](https://delta.io/sharing/)
* [Protocol spec](https://github.com/delta-io/delta-sharing/blob/main/PROTOCOL.md)
* [Quickstart](https://github.com/delta-io/delta-sharing#quick-start)
* [Python connector](https://github.com/delta-io/delta-sharing#python-connector)
* [Spark connector](https://github.com/delta-io/delta-sharing#apache-spark-connector)
* [Delta Sharing documentation](https://github.com/delta-io/delta-sharing#documentation)
  {% endhint %}

## Key Concepts

{% columns %}
{% column %}

* Share - The top-level container your organization is granted access to; it exposes one or more schemas.
* Schema - Logical grouping of tables within a share.
* Table - Read-only dataset you can query from supported tools.
  {% endcolumn %}

{% column %}

* Profile file - JSON with endpoint and credentials used by clients.
* Open protocol - REST-based access to underlying Parquet/Delta data.
  {% endcolumn %}
  {% endcolumns %}

## Access & Provisioning

* Ask your Amigo representative (via Slack) to provision a Delta Share for your organization and specify which tables you need.
* Beta: Delta Sharing is in active development; schemas and operational policies may evolve.
* You will receive either:
  * A share profile file (containing the share endpoint and credentials), or
  * An endpoint URL plus recipient credentials, along with the list of shared schemas/tables.
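A share profile is a small JSON document. A minimal example looks like the following; field names follow the Delta Sharing protocol, and every value here is a placeholder for the one Amigo provides:

```json
{
  "shareCredentialsVersion": 1,
  "endpoint": "https://sharing.example.com/delta-sharing/",
  "bearerToken": "<recipient-token>",
  "expirationTime": "2030-01-01T00:00:00Z"
}
```

Save it with a `.share` extension and treat it like any other credential: the bearer token grants read access to every table in the share.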

## How You Use It

* Most platforms support Delta Sharing natively or via an open-source connector.
* Import the provided share profile or configure the endpoint and credentials per your platform’s instructions.
* Browse the shared schemas/tables and query them like native read-only tables in your environment.

{% hint style="info" %}
Example workflows

* Connect your warehouse or lakehouse to read shared tables for BI dashboards.
* Consume curated datasets for ML feature engineering or offline evaluation.
* Join Amigo-shared tables with your internal datasets without copying data.
  {% endhint %}
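As a sketch of the last workflow: once a shared table has been loaded into a pandas DataFrame (for example via `delta_sharing.load_as_pandas`), joining it with internal data is an ordinary merge. The column names and values below are hypothetical:

```python
import pandas as pd

# Stand-in for a shared table loaded via delta_sharing.load_as_pandas(...)
shared = pd.DataFrame({"user_id": [1, 2, 3], "score": [0.9, 0.4, 0.7]})

# A hypothetical internal dataset
internal = pd.DataFrame({"user_id": [1, 2], "segment": ["a", "b"]})

# Left join keeps every shared row, attaching internal attributes where present
joined = shared.merge(internal, on="user_id", how="left")
print(joined)
```

Because the shared table arrives as a regular DataFrame, nothing is copied into your warehouse ahead of time; the join happens locally at query time.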

## Supported Clients

* Python: `delta-sharing` library for pandas and PySpark.
* Apache Spark: Delta Sharing Spark connector (SQL, Python, Scala, Java, R).
* BI/ETL: Many tools integrate via the open protocol or vendor connectors.

## Quick Start

{% hint style="info" %}
Trying it quickly?

Use the public demo profile file to explore example datasets: [Open datasets profile](https://databricks-datasets-oregon.s3-us-west-2.amazonaws.com/delta-sharing/share/open-datasets.share)
{% endhint %}

{% tabs %}
{% tab title="Python (pandas)" %}

```bash
pip install delta-sharing
```

```python
import delta_sharing

profile_file = "/path/to/profile.share"  # e.g., ./open-datasets.share
table_url = f"{profile_file}#my_share.my_schema.my_table"

pdf = delta_sharing.load_as_pandas(table_url)
print(pdf.head())
```

{% endtab %}

{% tab title="PySpark" %}
Option A - Python connector in PySpark (requires the `delta-sharing-spark` connector installed on the cluster):

```python
import delta_sharing

profile_file = "/path/to/profile.share"
table_url = f"{profile_file}#my_share.my_schema.my_table"

df = delta_sharing.load_as_spark(table_url)
df.createOrReplaceTempView("shared_table")
spark.sql("SELECT COUNT(*) FROM shared_table").show()
```

Option B - Spark reader directly:

```bash
pyspark --packages io.delta:delta-sharing-spark_2.12:3.1.0
```

```python
profile_file = "/path/to/profile.share"
table_url = f"{profile_file}#my_share.my_schema.my_table"
df = spark.read.format("deltaSharing").load(table_url)
df.show()
```

{% endtab %}

{% tab title="Spark (Scala)" %}

```bash
spark-shell --packages io.delta:delta-sharing-spark_2.12:3.1.0
```

```scala
val profileFile = "/path/to/profile.share"
val tableUrl = s"$profileFile#my_share.my_schema.my_table"
val df = spark.read.format("deltaSharing").load(tableUrl)
df.selectExpr("count(*) as rows").show()
```

{% endtab %}
{% endtabs %}

{% hint style="info" %}
Table URL format: `<profile_file>#<share>.<schema>.<table>`. The profile file can reside locally or in cloud storage (e.g., `s3a://...`). See connector docs for supported path schemes.
{% endhint %}
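The URL format above can be captured in a small helper; this is plain string assembly, not part of any connector API:

```python
def table_url(profile_file: str, share: str, schema: str, table: str) -> str:
    # Delta Sharing table URL: <profile_file>#<share>.<schema>.<table>
    return f"{profile_file}#{share}.{schema}.{table}"

def parse_table_url(url: str) -> tuple[str, str, str, str]:
    # Split on the first '#', then on dots; assumes names contain no dots
    profile_file, _, fqn = url.partition("#")
    share, schema, table = fqn.split(".")
    return profile_file, share, schema, table

url = table_url("/path/to/profile.share", "my_share", "my_schema", "my_table")
print(url)  # /path/to/profile.share#my_share.my_schema.my_table
```

The same string works for `load_as_pandas`, `load_as_spark`, and the `deltaSharing` Spark reader shown above.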

## Governance & Operations

* Scope: Shares are read-only and limited to the schemas/tables you request.
* Security: Credentials or recipient tokens can be rotated; IP allowlisting is available upon request.
* Observability: Access and query activity are logged by Amigo for security and auditing.
