Ploosh
Documentation
Ploosh can be executed over Spark (in Databricks, Microsoft Fabric, or locally) using Spark connectors, by calling it from Python code.
Examples
Microsoft Fabric
Cell 1 : Install the Ploosh package from the PyPI package manager
%pip install ploosh
Cell 2 : Mount the lakehouse to access the case and connection files
mount_point = "/plooshconfig"
workspace_name = "ploosh"
lakehouse_name = "data"

# Mount the lakehouse and resolve its local path
if mssparkutils.fs.mount(f"abfss://{workspace_name}@onelake.dfs.fabric.microsoft.com/{lakehouse_name}.Lakehouse/", mount_point):
    ploosh_config_path = mssparkutils.fs.getMountPath(mount_point)
Cell 3 : Execute the Ploosh framework
from ploosh import execute_cases

connections_file_path = f"{ploosh_config_path}/Files/connections.yaml"
cases_folder_path = f"{ploosh_config_path}/Files/cases"

execute_cases(cases=cases_folder_path, connections=connections_file_path, spark_session=spark)
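Cell 4 (optional) : Once the run is complete, the lakehouse can be unmounted again. A minimal cleanup sketch, reusing the mount_point variable from Cell 2:
# Unmount the lakehouse now that the case and connection files are no longer needed
mssparkutils.fs.unmount(mount_point)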
Databricks
Cell 1 : Install the Ploosh package from the PyPI package manager
%pip install ploosh
Cell 2 : Restart Python to make the package available
dbutils.library.restartPython()
Cell 3 : Execute the Ploosh framework
from ploosh import execute_cases

root_folder = "/Workspace/Shared"
execute_cases(cases=f"{root_folder}/cases", path_output=f"{root_folder}/output", spark_session=spark)
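The path_output argument tells Ploosh where to write the test results. To confirm that files were produced, you can list the folder afterwards; this sketch assumes the workspace filesystem is accessible from Python, as it is on recent Databricks runtimes:
import os

# List the result files written by the run (the folder is created during execution)
print(os.listdir(f"{root_folder}/output"))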
Local
Step 1 : Install the Ploosh package from the PyPI package manager
pip install ploosh
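Running locally also requires a Spark environment. If pyspark is not already present in your environment (it may not be pulled in automatically), install it alongside Ploosh:
pip install pyspark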
Step 2 : Initialize the Spark session
from pyspark.sql import SparkSession

# Create (or reuse) a local Spark session for Ploosh to run on
spark = SparkSession.builder.appName("Ploosh").getOrCreate()
Step 3 : Execute the Ploosh framework
from ploosh import execute_cases

execute_cases(cases="testcases", connections="connections.yml", spark_session=spark)
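When the run has finished, the local Spark session can be shut down to release its resources:
# Stop the local Spark session once the tests are done
spark.stop()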