The process for connecting to IMDB as a source is the same as that for connecting to relational databases.
- Click here to learn how to connect to IMDB as a target.
Connect to Pyramid IMDB as a Source
Add the In Memory node to the canvas and go to its Properties panel, where you need to connect to the source (red highlight below) and determine which tables to add to the data flow (green highlight). You can also add notes in the Description window (blue highlight).
Select the required server (green highlight below). If the server doesn't appear in the list, try clicking refresh (blue arrow below). Admins can also configure additional servers by clicking the Add Server button (orange arrow).
Next, select the required database (red highlight). If the database isn't listed, try refreshing the list (purple arrow).
Choose an existing semantic model from the relevant drop-down if needed (white arrow below).
To enable direct querying, select 'Direct Query Datasource', and then proceed directly to Data Modeling. However, if you want to create a flow diagram and apply data cleansing, or simply don't want to allow direct querying of the model, leave this option disabled.
Go to the Tables window (image below) to choose which tables to copy into the new data model. Table selection is relevant for both direct querying and data ingestion. Click the refresh button (green arrow below) to ensure the list of tables is up to date, and use the 'Filter Table List' field to search for tables.
Add Tables to the Data Flow
Once you've selected the required tables, you'll need to add them to the data flow (unless you've enabled direct query).
If you want to apply data cleansing, manipulation, or machine learning to the model, copy the selected tables by clicking the 'Add Tables' button (yellow highlight). Each selected table will be copied to an individual table node, to which you can connect a range of functions and formulations.
If you don't intend to apply any data cleansing, you can copy the tables using the 'Add as Multi-Select' button (blue highlight). This option copies all selected tables to a single node, using the multi-select function. The resulting node must then be connected directly to the target.
If you have enabled direct query, the 'Add' buttons will be disabled, as no nodes can be connected to a source designated for direct query.
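The three cases above can be summarized in a short Python sketch. This is only an illustrative model of the rules described in this topic, not a Pyramid API: the `Node` class, the `plan_nodes` function, and the field names are hypothetical and exist purely for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a node created in the data flow."""
    kind: str                # "table" or "multi_select"
    tables: List[str]        # tables carried by this node
    allowed_downstream: str  # "any" or "target_only"


def plan_nodes(selected: List[str],
               direct_query: bool = False,
               multi_select: bool = False) -> List[Node]:
    """Return the nodes that would result from the chosen 'Add' option."""
    if direct_query:
        # The 'Add' buttons are disabled: no nodes can be connected
        # to a source designated for direct query.
        return []
    if multi_select:
        # 'Add as Multi-Select': one node holding every selected table,
        # which must be connected directly to the target.
        return [Node("multi_select", list(selected), "target_only")]
    # 'Add Tables': one table node per selected table, each of which
    # can be connected to functions and formulations.
    return [Node("table", [name], "any") for name in selected]
```

For instance, calling `plan_nodes(["Orders", "Customers"])` (placeholder table names) models 'Add Tables' and yields two table nodes, while passing `multi_select=True` yields a single node restricted to a target connection.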
Another way to add tables from the source to the data flow is via the Select functions, using the single-select Table or multi-select Tables nodes. You can then input the column(s) for each select operation. Another option is to use the Query node to copy a data set from the source using an SQL or SOQL expression.
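As a rough illustration of the kind of SQL expression a Query node might be given, the sketch below runs an aggregate query against a throwaway SQLite database using Python's standard `sqlite3` module. SQLite is used here only as a stand-in so the example is runnable; the `sales` table, its columns, and its rows are invented for the example and do not come from any Pyramid database.

```python
import sqlite3

# Throwaway database standing in for the source (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 120.0), ("APAC", 80.0), ("EMEA", 30.0)],
)

# The SQL expression itself: copy a summarized data set rather than a whole table.
query = (
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY region"
)
rows = conn.execute(query).fetchall()
conn.close()
```

The point of the sketch is the shape of the expression: a query can return a filtered or aggregated data set, whereas the Table and Tables select nodes copy tables and their chosen columns.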
You can add text to the Description window, which is useful for documenting the ETL pipeline.
In this example, the user connected to an IMDB server that was configured in their system as 'In-Memory A', and connected to a database on that server called 'SpreadSheetDemo Intl' (green arrow below).
Five tables from the given database were then selected and copied to the data flow using the 'Add Tables' function, connecting each chosen table to the datasource via a separate node (blue arrow).
The user can now connect any required functions or machine learning to the tables.
Here, the IMDB node was connected to the in-memory server called 'In-Memory A', and to a database on that server called 'SpreadSheetDemo Intl' (green arrow below).
Five tables were selected from the given database and copied to the data flow using the 'Add as Multi-Select' function, connecting all of the chosen tables to the datasource via a single multi-select node (blue arrow). Only a target node may be connected to the multi-select Tables node.
Here, the IMDB node was connected to the in-memory server called 'In-Memory A', and to a database on that server called 'SpreadSheetDemo Intl' (green arrow below). Direct query was enabled (red highlight), disabling the Add functions in the Tables window (orange highlight). The user then selected five tables from the given database, so that only these tables can be queried.
The user cannot connect any functions or nodes to the source node because direct query is enabled.