Pyramid allows you to upload or connect to XML files and convert the data into a table. Because XML is a tree structure, you will need to determine how to define the columns in the table.
The XML file source supports three connection methods: uploading a local file, pointing to a shared file on a network drive, or providing a URL.
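To see why the columns have to be chosen, it can help to list the leaf paths of a small XML tree, since each one is a candidate column. The following is a minimal sketch using Python's standard library; it is not Pyramid's implementation, and the sample document and the `leaf_paths` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE = """<catalog>
  <book>
    <title>Dune</title>
    <author><name>Frank Herbert</name></author>
  </book>
</catalog>"""


def leaf_paths(element, prefix=""):
    """Yield (path, text) for every element that has no child elements."""
    path = f"{prefix}/{element.tag}"
    children = list(element)
    if not children:
        yield path, (element.text or "").strip()
    for child in children:
        yield from leaf_paths(child, path)


root = ET.fromstring(SAMPLE)
paths = list(leaf_paths(root))
for path, text in paths:
    print(path, "=", text)
```

The two leaves sit at different depths of the tree, which is why settings such as the starting path and the column naming scheme matter when the tree is turned into a table.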
Provide an XML File Source
Start by adding the XML source node and opening its Properties panel. Connect the required file by uploading it, entering the path to a shared file, or pasting a URL.
Upload the file by dragging it from its folder location onto the upload widget. Alternatively, click Upload File to open your file explorer and find and select the file from its folder location.
Enter the file path of the shared file (including the file name and extension) in the File Path field (green highlight below). Enable 'Expression' (yellow highlight) to provide the file path as a dynamic PQL expression, created in the PQL Editor.
- Click here to learn more about connecting to a shared file.
To connect to an XML file via its URL, paste the URL in the Set URL field. Enable 'Expression' to provide the URL as a dynamic PQL expression, created in the PQL Editor.
Select the appropriate authentication type:
- None: select if no authentication is required
- Basic Authentication: select if the URL requires a username and password
- Custom Header: select if the URL requires a custom authorization header
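For readers unfamiliar with these options, the sketch below shows what each authentication type typically means at the HTTP level. It uses Python's standard library and is an illustration only, not Pyramid's implementation; the `build_request` helper, the example URL, and the `X-Api-Key` header name are hypothetical.

```python
import base64
import urllib.request


def build_request(url, auth_type="none", username=None, password=None, header=None):
    """Return a Request carrying the headers for the chosen authentication type."""
    request = urllib.request.Request(url)
    if auth_type == "basic":
        # Basic authentication sends "username:password", Base64-encoded.
        token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    elif auth_type == "custom":
        # A custom header is a single name/value pair agreed with the server.
        name, value = header
        request.add_header(name, value)
    elif auth_type != "none":
        raise ValueError(f"unknown auth_type: {auth_type}")
    return request


# Hypothetical URL and credentials; no network call is made here.
URL = "https://example.com/data.xml"
plain = build_request(URL)
basic = build_request(URL, "basic", username="user", password="pass")
custom = build_request(URL, "custom", header=("X-Api-Key", "secret"))
print(basic.get_header("Authorization"))
```

Passing any of these requests to `urllib.request.urlopen` would fetch the file. Basic authentication only encodes the credentials and does not encrypt them, so it is normally paired with an HTTPS URL.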
Once you've uploaded or connected to the file, you'll need to configure the properties to determine the structure of the table.
- Encoding: encoding is set to UTF-8 by default. If your XML file source has a different encoding, select it from the drop-down.
- Start reading at path: set the element tag level from which to start reading the file.
- Without Root Element: determine whether or not to include the root element as a column.
- Root element as column: include the root element as a column.
- Element tag as column: include the element tag as a column.
- XPath as column names: use the XPath as the column names.
- Lists of elements: use the drop-down selection to determine how the elements will appear.
- Append values: all elements are listed in a single column. The values in each row are comma-delimited.
- Create new columns: organizes elements into separate columns.
- Max depth to extract columns: the maximum number of elements to extract to column(s).
- Change Source: upload or connect to a different XML file.
- Update Table: update the resulting table with any changes made to the XML properties. This option appears only after the table has been added to the data flow.
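The difference between the two 'Lists of elements' options can be sketched in Python. This is a rough illustration under stated assumptions, not Pyramid's implementation: the sample document, the helper functions, and the `item_1`, `item_2` column naming are hypothetical, and the product's actual column names may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with a repeated <item> element.
SAMPLE = """<orders>
  <order>
    <id>1</id>
    <item>pen</item>
    <item>ink</item>
  </order>
  <order>
    <id>2</id>
    <item>paper</item>
  </order>
</orders>"""


def repeated_tags(records):
    """Return the tags that occur more than once inside any single record."""
    tags = set()
    for record in records:
        seen = set()
        for child in record:
            if child.tag in seen:
                tags.add(child.tag)
            seen.add(child.tag)
    return tags


def flatten(record, list_tags, mode):
    """Convert one record element into a dict mapping column name to value."""
    grouped = {}
    for child in record:
        grouped.setdefault(child.tag, []).append((child.text or "").strip())
    row = {}
    for tag, values in grouped.items():
        if tag not in list_tags:
            row[tag] = values[0]
        elif mode == "append":
            # Similar in spirit to 'Append values': one comma-delimited column.
            row[tag] = ",".join(values)
        else:
            # Similar in spirit to 'Create new columns': one column per value.
            for index, value in enumerate(values, start=1):
                row[f"{tag}_{index}"] = value
    return row


root = ET.fromstring(SAMPLE)
records = root.findall("order")  # loosely comparable to choosing a start path
lists = repeated_tags(records)
appended = [flatten(r, lists, "append") for r in records]
expanded = [flatten(r, lists, "columns") for r in records]
print(appended)
print(expanded)
```

Appending keeps the table narrow but requires splitting the values later, while creating new columns keeps each value separate at the cost of empty cells in rows that have fewer elements.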
The XML file will be converted into a single table; select the table and click the 'Add Table' button.
The table will be added to the data flow and connected to the source node:
Expand the Description window to add a description or notes to the node. The description is visible only from the Properties panel of the node, and does not produce any outputs. This is a useful way to document the ETL pipeline for yourself and other users.
In this example, the user connected to a URL; the URL was pasted in the Set URL field, and then the username and password were provided:
After clicking OK, the properties for the XML file need to be set:
After setting the XML properties, the table is added to the data flow by clicking Add Table from the Tables window: