These prebuilt wheel files can be used to install our Python packages as of a specific commit.
Built at 2025-10-29T02:51:50.151404+00:00.
```json
{
  "timestamp": "2025-10-29T02:51:50.151404+00:00",
  "branch": "feat-make-pyspark-optional",
  "commit": {
    "hash": "208965cec2c13211c2f39595cc19e88acc917f1b",
    "message": "feat(ingestion): Make PySpark default for s3/gcs/abs with -slim variants\n\nFlips the implementation to maintain backward compatibility while providing\nlightweight installation options. S3, GCS, and ABS sources now include PySpark\nby default, with new -slim variants for PySpark-less installations.\n\n**Changes:**\n\n1. **Setup.py - Default PySpark inclusion:**\n - `s3`, `gcs`, `abs` extras now include `data-lake-profiling` by default\n - New `s3-slim`, `gcs-slim`, `abs-slim` extras without PySpark\n - Ensures existing users have no breaking changes\n - Naming aligns with Docker image conventions (slim/full)\n\n2. **Config validation removed:**\n - Removed PySpark dependency validation from S3/ABS config\n - Profiling failures now occur at runtime (not config time)\n - Maintains pre-PR behavior for backward compatibility\n\n3. **Documentation updated:**\n - Updated PYSPARK.md to reflect new installation approach\n - Standard installation: pip install 'acryl-datahub[s3]' (with PySpark)\n - Lightweight installation: pip install 'acryl-datahub[s3-slim]' (no PySpark)\n - Added migration path note for future DataHub 2.0\n - Explained benefits for DataHub Actions with -slim variants\n\n4. **Tests updated:**\n - Removed tests expecting validation failures without PySpark\n - Added tests confirming config accepts profiling without validation\n - All tests pass with new behavior\n\n**Rationale:**\n\nThis approach provides:\n- **Backward compatibility**: Existing users see no changes\n- **Migration path**: Users can opt into -slim variants now\n- **Future flexibility**: DataHub 2.0 can flip defaults to -slim\n- **No breaking changes**: Maintains pre-PR functionality\n- **Naming consistency**: Aligns with Docker slim/full convention\n\n**Installation examples:**\n\n\\`\\`\\`bash\npip install 'acryl-datahub[s3]'\npip install 'acryl-datahub[gcs]'\npip install 'acryl-datahub[abs]'\n\npip install 'acryl-datahub[s3-slim]'\npip install 'acryl-datahub[gcs-slim]'\npip install 'acryl-datahub[abs-slim]'\n\\`\\`\\`"
  },
  "pr": {
    "number": 15123,
    "title": "feat(ingestion): Make PySpark optional for S3, ABS, and Unity Catalog sources",
    "url": "https://github.com/datahub-project/datahub/pull/15123"
  }
}
```
Current base URL: unknown
| Package | Size | Install command |
|---|---|---|
| acryl-datahub | 2.414 MB | `uv pip install 'acryl-datahub @ <base-url>/artifacts/wheels/acryl_datahub-0.0.0.dev1-py3-none-any.whl'` |
| acryl-datahub-actions | 0.101 MB | `uv pip install 'acryl-datahub-actions @ <base-url>/artifacts/wheels/acryl_datahub_actions-0.0.0.dev1-py3-none-any.whl'` |
| acryl-datahub-airflow-plugin | 0.039 MB | `uv pip install 'acryl-datahub-airflow-plugin @ <base-url>/artifacts/wheels/acryl_datahub_airflow_plugin-0.0.0.dev1-py3-none-any.whl'` |
| acryl-datahub-dagster-plugin | 0.019 MB | `uv pip install 'acryl-datahub-dagster-plugin @ <base-url>/artifacts/wheels/acryl_datahub_dagster_plugin-0.0.0.dev1-py3-none-any.whl'` |
| acryl-datahub-gx-plugin | 0.010 MB | `uv pip install 'acryl-datahub-gx-plugin @ <base-url>/artifacts/wheels/acryl_datahub_gx_plugin-0.0.0.dev1-py3-none-any.whl'` |
| prefect-datahub | 0.011 MB | `uv pip install 'prefect-datahub @ <base-url>/artifacts/wheels/prefect_datahub-0.0.0.dev1-py3-none-any.whl'` |
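The commit on this branch adds `-slim` extras (for example `s3-slim`) alongside the default PySpark-enabled extras. As a sketch, extras can be combined with the direct wheel URLs above; the `BASE_URL` value below is a placeholder assumption, and the install commands are printed rather than executed:

```shell
# Assumption: BASE_URL is the host serving this artifact directory
# (shown as <base-url> in the table above).
BASE_URL="https://example.com"

# Direct-URL spec for the core package wheel.
WHEEL_URL="${BASE_URL}/artifacts/wheels/acryl_datahub-0.0.0.dev1-py3-none-any.whl"

# Extras attach to the package name before the '@' URL spec:
# 's3' pulls in PySpark-backed profiling, 's3-slim' skips PySpark.
echo "uv pip install 'acryl-datahub[s3] @ ${WHEEL_URL}'"
echo "uv pip install 'acryl-datahub[s3-slim] @ ${WHEEL_URL}'"
```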