daft.DataFrame.write_parquet

DataFrame.write_parquet(root_dir: str, compression: str = 'snappy', partition_cols: list[Union[daft.expressions.Expression, str]] | None = None) → daft.dataframe.dataframe.DataFrame

Writes the DataFrame as Parquet files, returning a new DataFrame containing the paths of the files that were written.

Files will be written to <root_dir>/* with randomly generated UUIDs as the file names.

Currently, one Parquet file is generated per partition. If partition_cols is used, the number of files can be as high as the number of partitions multiplied by the number of distinct values of the partition column.

Note

This call is blocking and will execute the DataFrame when called.

Parameters
  • root_dir (str) – root file path to write parquet files to.

  • compression (str, optional) – compression algorithm. Defaults to “snappy”.

  • partition_cols (Optional[List[ColumnInputType]], optional) – How to subpartition each partition further. Currently only supports Column Expressions without any calls. Defaults to None.

Returns

The filenames that were written out as strings.

Return type

DataFrame