Data Flows for Data Transformation

Review, Cleanse, and Transform Your Data

Data preparation is a critical step when aggregating information for data analysis, reporting and collaboration.

In many cases Datasets don’t contain all desired information when originally pulled from their source. Or, the data within the Dataset may not be formatted in the desired fashion. For example, each record in a Dataset may contain two text fields but not the concatenation of the two. Or, a Dataset could contain two numeric fields but not the product of the two.

Flow Step Add Field  

Informer’s unique Data Flows data transformation feature makes it easy for you to review, cleanse, and prepare your data as it comes into the system. With Flows, you can intercept Dataset records as they are being indexed (e.g. a database, network document, or API call result) and augment the records with additional data, metadata, or logic, such as the concatenated text and multiplied numeric fields example above. As a result, you can perform data analysis and use Informer Dashboards and Ad-hoc Reports to track and analyze key performance metrics based on one version of the truth.

Data Flows enable you to easily transform data however you choose, for example:

  • Join data from disparate sources and append incomplete entries
  • Scrub duplicates and normalize inconsistent fields
  • Author custom scripts to transform data into exactly how you want it to appear.

Data Flows can be much more than simple data transformation expressions for data analysis. For example, consider a Dataset which contains a street address for each record. You could apply a flow to these records which:

  • Authenticates against the Google Maps API
  • Returns latitude and longitude values
  • Adds those coordinates as a new field in your Dataset.

You can then add a heat map as a visual to the Dataset or use Informer Discover to create a heat map for you.

Another common usage is two-pass evaluation. You may have a numeric field and want to calculate its percentage of the total of all records in the Dataset. This requires retrieving the grand total by first processing all records, then passing through your records again to calculate the per record percent of total. You can do this using a Data Flow.

Data Transformation Using Standard Data Flows

The Data Flows in Table 1 are provided as stock functionality.

Table 1: Stock Flows

Flow Name Description
Calculated Field Create a new Field
Percent of Total Calculates the percent of the overall total of a selected field.
Fields from another Datasource Connects to another Datasource and retrieves one or more fields from that Datasource. The two Datasources do not need to be of the same database type, but there must be a common value in both Datasources that joins the two together (e.g. Customer ID, or Product ID, or Student ID, etc.).
Fields from another Dataset Connects to another Dataset and retrieves one or more fields from that Dataset. There must be a common value in both Datasets that joins the two together (e.g. Customer ID, or Product ID, or Student ID, etc.).
Templated Field Uses static text and placeholders for other fields to create a new field. Example: using HTML, create a link to start an email when the email address is clicked.
Concatenate Merges two or more fields into one
Coordinates from Latitude & Longitude Coordinates and latitude and longitude pair into a location value to be used in geo mapping
Counter Adds a row counter to the output
Percentage of Total Creates a column that is the row’s percentage of another field’s sum
Time Between Create a new Field containing the difference in selected units between two Time Fields
Zip2geo Create a Field containing longitude and latitude of a provided Zip Code. Useful for generating Map Visuals.
Field Settings Allows you to adjust the datatype for fields in the Dataset.
Find & Replace Finds a string value within a selected field and replace it with a given value.
Flush Process all records in the Dataset prior to continuing to next Flow step.
Merge Duplicates Remove duplicates of selected Fields from Dataset. Other associated fields are listed with multiple values per row.
Normalize Splits fields with multiple values per row into separate rows, repeating the single values with which they are associated.
Remove Field Remove a Field from display in Dataset
Power Script Execute logic based on custom script execution

Data Flow steps include several options for you to choose from: Add Field, Transform, Remove, and Advanced:

• Add Field – Enables you to create new Fields that provide stock functionalities for one or more selected Fields from the Dataset or Fields from a different Dataset.
• Transform – Enables you to configure the Fields that you’ve added by adjusting datatypes, replacing Field values, removing duplicates, and splitting Fields with multiple values.

Data Transformation Using Custom Data Flows

You may have the need for a type of Data Flow that is uniquely specific to your organization or industry. Or, you may have a commonly used calculation and want to stop authoring the same script calculation repeatedly. In this scenario, you would simply author an Informer Plugin to register your desired functionality.

Let’s say you manage the Order Processing Department and your users commonly work with result sets that contain an ‘Order Quantity’ Field, a ‘Unit Price’ Field, and a ‘Zip Code’ field. Your processors then calculate an order sub-total from the Order Quantity and Unit Price Fields, a tax amount based on Zip Code, a Shipping & Handling Fee, an estimated delivery date based on Order Quantity and Zip Code, and a total amount from the whole lot.

In this scenario you have a choice:

  • rely on your processors to calculate each of these correctly for every order that comes through the system
  • or, you could author an Informer Plugin as part of your Flow that prompts for the required input fields from the existing results and amends each record with the appropriate values, all before the user retrieves the Dataset.

Want to learn more about Informer?