This connector is used to read CSV files using Spark.

⚠️ A Spark connector can only be used with another Spark connector. It cannot be combined with a non-Spark connector.

See the Spark documentation for more information.

Connection configuration

This connector does not require a connection.

Configuration

Test case configuration

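| Name        | Mandatory | Default | Description                                                 |
|-------------|-----------|---------|-------------------------------------------------------------|
| path        | yes       |         | Path to the CSV file(s)                                     |
| delimiter   | no        | ,       | Column delimiter                                            |
| header      | no        | true    | Use the first row as header                                 |
| inferSchema | no        | False   | Infer the input schema automatically from the data          |
| multiline   | no        | False   | Parse one record, which may span multiple lines, per file   |
| quote       | no        | '"'     | Character used to denote the start and end of a quoted item |
| encoding    | no        | "UTF-8" | Character encoding of the CSV file(s)                       |
| lineSep     | no        | "\n"    | Line separator                                              |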

Example

Example CSV Spark:
  source:
    type: csv_spark
    path: data/employees/*.csv
    multiline: False
    inferSchema: False
    encoding: "UTF-8"
  expected:
    type: sql_spark
    query: |
      select *
      from employees
      where hire_date < "2000-01-01"
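
As a further illustration, the sketch below overrides some of the optional parameters listed in the configuration table, for a semicolon-delimited file with no header row, single-quoted values, and a non-default encoding. The test case name and the path are hypothetical, and only the source block is shown; an expected block would follow it as in the example above.

```yaml
Example semicolon-delimited CSV:      # hypothetical test case name
  source:
    type: csv_spark
    path: data/products/*.csv         # hypothetical path
    delimiter: ";"                    # columns separated by semicolons
    header: False                     # first row is data, not column names
    quote: "'"                        # values are wrapped in single quotes
    encoding: "ISO-8859-1"            # read the files as Latin-1 instead of UTF-8
```

Parameters that are omitted, such as inferSchema and multiline here, keep the defaults given in the table.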