The Multi Files source node is used to upload or connect to multiple files of the same type. It supports Text, JSON, XML, and Excel files, which can either be uploaded locally or read from a shared folder on a network drive.
The given files must all have the same structure (columns) and the same file type (e.g. all Text, all JSON, or all XML). All files are then combined into a single table in Pyramid.
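The requirement that every file share the same columns can be illustrated outside Pyramid. The following is a minimal Python sketch of the idea only, not Pyramid's implementation; the function name combine_tables and the sample file names are hypothetical.

```python
import csv
import io


def combine_tables(files):
    """Combine CSV texts that share one header into a single table.

    `files` maps a file name to its text content. Returns (header, rows).
    """
    header = None
    rows = []
    for name, text in files.items():
        reader = csv.reader(io.StringIO(text))
        file_header = next(reader)  # first row holds the column names
        if header is None:
            header = file_header
        elif file_header != header:
            # A file with a different structure cannot be combined.
            raise ValueError(
                f"{name} has columns {file_header}, expected {header}"
            )
        rows.extend(reader)  # append the remaining data rows
    return header, rows


# Hypothetical sample data: two files with identical columns.
files = {
    "sales_2023.csv": "region,amount\nEast,100\nWest,250\n",
    "sales_2024.csv": "region,amount\nEast,120\n",
}
header, rows = combine_tables(files)
print(header)
print(rows)
```

In this sketch, a file whose header differs from the first file's header raises an error instead of being merged, which mirrors the same-structure requirement described above.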
Note: this data source is not available in the Community Edition.
Connect to Multi File Source
To upload or connect to multiple files, add the Multi Files node and go to its Properties panel. Select the file type, then choose whether to upload the files or connect to a shared folder.
Upload the required files from their folder locations. To create tables, select the required files from the File Uploads list and click 'Apply'. The resulting tables will appear in the 'Tables' window; select the required tables and click 'Add Tables' to add them to the data flow.
To connect to a shared folder, provide the properties:
- Shared Folder Path: provide the shared folder path.
- Expression: enable this option to create a dynamic PQL expression for the shared folder path.
- Tree Mode: enable to include subfolders within the given folder.
- Files to include: enter the extension(s) of the files to include. Each extension must be preceded by an asterisk, and multiple extensions must be delimited by a comma followed by a space. For example: *.txt, *.csv. Each file must have the same structure (columns) and file type (e.g. all Text, all JSON, or all XML).
- Click here to learn more about connecting to a shared file.
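To make the effect of the Files to include and Tree Mode properties concrete, here is a rough Python sketch of how a comma-separated pattern list and a subfolder switch could select files. It is an illustration under assumptions, not Pyramid's own logic; select_files and the temporary folder layout are hypothetical.

```python
import fnmatch
import os
import tempfile


def select_files(folder, files_to_include, tree_mode=False):
    """Return sorted paths under `folder` whose names match any pattern."""
    # "*.txt, *.csv" -> ["*.txt", "*.csv"]
    patterns = [p.strip() for p in files_to_include.split(",") if p.strip()]
    matches = []
    for current, subfolders, names in os.walk(folder):
        if not tree_mode:
            subfolders.clear()  # do not descend into subfolders
        for name in names:
            if any(fnmatch.fnmatch(name, p) for p in patterns):
                matches.append(os.path.join(current, name))
    return sorted(matches)


# Hypothetical folder layout, created in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "archive"))
    for rel in ("a.csv", "b.txt", "c.json", os.path.join("archive", "d.csv")):
        open(os.path.join(root, rel), "w").close()

    flat = select_files(root, "*.txt, *.csv")
    tree = select_files(root, "*.txt, *.csv", tree_mode=True)
    print([os.path.relpath(p, root) for p in flat])
    print([os.path.relpath(p, root) for p in tree])
```

With tree_mode left off, only files directly inside the folder are considered; with it on, matching files in subfolders are picked up as well.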
Configure the Multi File Source
After uploading or connecting to the files, you'll need to configure them according to the file type. Follow the links below to review the configuration for each file type:
- Click here to learn how to configure Text file sources.
- Click here to learn how to configure JSON file sources.
- Click here to learn how to configure XML file sources.
- Click here to learn how to configure Excel file sources.
Add Source Name as Column
Enable this option (green highlight below) to add a column to the table listing the source file for each row.
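Conceptually, the source name column records which file each row came from. The short Python sketch below shows that idea only; the column label "Source Name" and the function add_source_name are assumptions made for illustration and may not match what Pyramid generates.

```python
def add_source_name(rows_by_file, column="Source Name"):
    """Tag each row (a dict) with the name of the file it came from."""
    tagged = []
    for file_name, rows in rows_by_file.items():
        for row in rows:
            # Copy the row and add the originating file name.
            tagged.append({**row, column: file_name})
    return tagged


# Hypothetical rows already read from two files.
rows_by_file = {
    "east.csv": [{"region": "East", "amount": "100"}],
    "west.csv": [{"region": "West", "amount": "250"}],
}
for row in add_source_name(rows_by_file):
    print(row)
```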
Select the table(s) from the Tables window and add them to the data flow:
The table will be added to the data flow and connected to the source node:
Expand the Description window to add a description or notes to the node. The description is visible only from the Properties panel of the node, and does not produce any outputs. This is a useful way to document the ETL pipeline for yourself and other users.
In this example, the Multi File source node is used to connect to multiple CSV files in a shared folder:
After the folder path is provided, the files must be configured. As Pyramid is connecting to Text files, the encoding and reading method must be supplied, as well as the value delimiter and text delimiter. Next, the first row in the files is specified as the column names, and the source name column is added:
Finally, the table is connected to the data flow using the Add Table function from the Tables window:
When the table is previewed, we see the column headers and the source column:
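For readers unfamiliar with the Text file settings used in this example, the following Python sketch shows roughly what encoding, value delimiter, text delimiter, and first-row-as-column-names mean when a file is read. It uses Python's standard csv module as a stand-in and makes no claim about how Pyramid parses files. The function read_text_file and the sample data are hypothetical, and the reading method setting is not modeled here.

```python
import csv
import io


def read_text_file(raw, encoding="utf-8", value_delimiter=",", text_delimiter='"'):
    """Decode raw bytes and parse them, using the first row as column names."""
    text = raw.decode(encoding)
    reader = csv.DictReader(
        io.StringIO(text, newline=""),
        delimiter=value_delimiter,  # separates values within a row
        quotechar=text_delimiter,   # wraps values that contain the delimiter
    )
    return list(reader)


# Hypothetical file content: semicolon-delimited, Latin-1 encoded.
raw = 'city;note\n"Malmö";"mild; wet"\n'.encode("latin-1")
for row in read_text_file(raw, encoding="latin-1", value_delimiter=";"):
    print(row)
```

The text delimiter matters when a value itself contains the value delimiter, as with the quoted note in the sample row; choosing the wrong encoding would garble non-ASCII characters such as the "ö".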