Data Flows For Data Transformation

Review, Cleanse, and Transform Your Data

In many cases, Datasets don’t contain all of the desired information when they are first pulled from their source, or the data within the Dataset isn’t formatted the way you need it. For example, each record in a Dataset may contain two text fields but not the concatenation of the two, or two numeric fields but not the product of the two.

Informer’s unique Data Flows data transformation feature makes it easy for you to review, cleanse, and prepare your data as it comes into the system. With Flows, you can intercept Dataset records as they are indexed from their source (e.g., a database, network document, or API call result) and augment the records with additional data, metadata, or logic, such as the concatenated text and multiplied numeric fields from the example above. As a result, you can perform data analysis and use Informer Dashboards and Ad-hoc Reports to track and analyze key performance metrics based on a single version of the truth.
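
Conceptually, a Flow step is just a function applied to every record on its way into the Dataset. The sketch below illustrates that idea for the concatenation and multiplication example above; the record shape and field names are invented for illustration, and this is plain TypeScript, not Informer’s scripting API.

```typescript
// A rough sketch (not Informer's actual scripting API) of the kind of
// per-record work a Flow step performs: each incoming record is augmented
// with a concatenated text field and a multiplied numeric field.
interface OrderRecord {
  firstName: string;
  lastName: string;
  quantity: number;
  unitPrice: number;
  fullName?: string;      // added by the Flow: concatenation of two text fields
  extendedPrice?: number; // added by the Flow: product of two numeric fields
}

function augmentRecord(record: OrderRecord): OrderRecord {
  return {
    ...record,
    fullName: `${record.firstName} ${record.lastName}`,
    extendedPrice: record.quantity * record.unitPrice,
  };
}

// Applied to every record as it is indexed into the Dataset.
const incomingRecords: OrderRecord[] = [
  { firstName: "Ada", lastName: "Lovelace", quantity: 3, unitPrice: 19.99 },
];
const augmented = incomingRecords.map(augmentRecord);
```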

Data Flows enable you to easily transform data however you choose, for example:

  • Join data from disparate sources and append incomplete entries
  • Scrub duplicates and normalize inconsistent fields (sketched below)
  • Author custom scripts to transform data into exactly the form you want
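
For instance, the cleansing bullet above might look something like the following sketch. The record shape, the alias table, and the de-duplication key are illustrative assumptions, not Informer’s built-in behavior.

```typescript
// Illustrative only: de-duplicate records by a key field and normalize an
// inconsistently formatted field, the kind of cleansing a Flow step can do.
interface CustomerRecord {
  customerId: string;
  state: string; // e.g. "ca", "CA", or "California"
}

const stateAliases: Record<string, string> = { california: "CA", ca: "CA" };

function normalizeState(value: string): string {
  const key = value.trim().toLowerCase();
  return stateAliases[key] ?? value.trim().toUpperCase();
}

function dedupeByCustomerId(records: CustomerRecord[]): CustomerRecord[] {
  const seen = new Map<string, CustomerRecord>();
  for (const record of records) {
    // Keep the first occurrence of each customerId, with a normalized state.
    if (!seen.has(record.customerId)) {
      seen.set(record.customerId, { ...record, state: normalizeState(record.state) });
    }
  }
  return [...seen.values()];
}
```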

Data Flows can be much more than simple data transformation expressions for data analysis. For example, consider a Dataset that contains a street address for each record. You could apply a Flow to these records that:

  • Authenticates against the Google Maps API
  • Returns latitude and longitude values
  • Adds those coordinates as a new field in your Dataset

You can then add a heat map visual to the Dataset, or let Informer Discover create one for you.
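
A rough sketch of that enrichment step is below. The record shape and the addCoordinates helper are hypothetical; the Geocoding endpoint and response shape are Google’s documented ones, but quota, billing, and key handling are up to you, and inside Informer this logic would run as part of a Flow step rather than as standalone code.

```typescript
// Sketch of a geocoding enrichment step. The record shape and field names are
// hypothetical; keep the API key out of source control in real use.
interface AddressRecord {
  street: string;
  latitude?: number;
  longitude?: number;
}

async function addCoordinates(record: AddressRecord, apiKey: string): Promise<AddressRecord> {
  const url =
    "https://maps.googleapis.com/maps/api/geocode/json" +
    `?address=${encodeURIComponent(record.street)}&key=${apiKey}`;
  const response = await fetch(url);
  const body = await response.json();

  // Take the first match, if any, and append its coordinates as new fields.
  const location = body.results?.[0]?.geometry?.location;
  if (location) {
    return { ...record, latitude: location.lat, longitude: location.lng };
  }
  return record; // leave the record unchanged if geocoding fails
}
```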

Another common usage is two-pass evaluation. You may have a numeric field and want to calculate each record’s percentage of the total across all records in the Dataset. This requires a first pass through all records to retrieve the grand total, then a second pass to calculate each record’s percent of that total. You can do this using a Data Flow.
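
Here is a minimal sketch of the two-pass idea, with an invented amount field standing in for your numeric Field; it is an illustration of the technique, not Informer’s implementation.

```typescript
// Two-pass evaluation: the first pass computes the grand total, the second
// pass revisits each record to compute its share of that total.
interface SalesRecord {
  amount: number;
  percentOfTotal?: number;
}

function addPercentOfTotal(records: SalesRecord[]): SalesRecord[] {
  // Pass 1: accumulate the grand total across all records.
  const grandTotal = records.reduce((sum, r) => sum + r.amount, 0);

  // Pass 2: revisit each record and compute its percentage of the total.
  return records.map((r) => ({
    ...r,
    percentOfTotal: grandTotal === 0 ? 0 : (r.amount / grandTotal) * 100,
  }));
}

// Example: amounts of 40 and 60 become 40% and 60% of the total.
const withPercents = addPercentOfTotal([{ amount: 40 }, { amount: 60 }]);
```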

Data Transformation Using Standard Data Flows

Data Flow steps include several options to choose from: Add Field, Transform, Remove, and Advanced. For example:

• Add Field – Enables you to create new Fields that apply stock functionality to one or more selected Fields from the Dataset, or to Fields from a different Dataset.
• Transform – Enables you to configure the Fields you’ve added by adjusting datatypes, replacing Field values, removing duplicates, and splitting Fields with multiple values.
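
These steps are configured point-and-click in Informer rather than in code, but as a rough illustration of the logic behind a Transform step, the sketch below replaces coded values and splits a multi-valued field into separate rows; every name in it is invented.

```typescript
// Illustrative equivalent of a Transform step: replace coded values and split
// a field that holds multiple values so each value becomes its own row.
interface ContactRecord {
  name: string;
  status: string;       // e.g. "A" / "I" codes to be replaced with labels
  phoneNumbers: string; // e.g. "555-0100;555-0101" – a multi-valued field
}

const statusLabels: Record<string, string> = { A: "Active", I: "Inactive" };

function transformContacts(records: ContactRecord[]) {
  return records.flatMap((record) =>
    record.phoneNumbers.split(";").map((phone) => ({
      name: record.name,
      status: statusLabels[record.status] ?? record.status, // replace coded values
      phoneNumber: phone.trim(),                            // one row per phone number
    }))
  );
}
```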

Data Transformation Using Custom Data Flows

You may need a type of Data Flow that is specific to your organization or industry, or you may have a commonly used calculation and want to stop authoring the same script over and over. In either scenario, you can simply author an Informer Plugin to register the desired functionality.

Let’s say you manage the Order Processing Department and your users commonly work with result sets that contain an ‘Order Quantity’ Field, a ‘Unit Price’ Field, and a ‘Zip Code’ Field. Your processors then calculate an order sub-total from the Order Quantity and Unit Price Fields, a tax amount based on Zip Code, a Shipping & Handling fee, an estimated delivery date based on Order Quantity and Zip Code, and a total amount from all of the above.

In this scenario you have a choice. You can:

  • rely on your processors to calculate each of these values correctly for every order that comes through the system, or
  • author an Informer Plugin as part of your Flow that prompts for the required input Fields from the existing results and amends each record with the appropriate values, all before the user ever retrieves the Dataset.
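
As a very rough sketch of what the second option’s calculations could look like: the tax table, shipping rule, and delivery estimate below are invented placeholders, and this is not Informer’s Plugin API, only the kind of derivation a Plugin could register so every record is amended consistently.

```typescript
// Hypothetical order-enrichment calculations. All rates, fees, and lead times
// are invented placeholders, not real business rules.
interface OrderInput {
  orderQuantity: number;
  unitPrice: number;
  zipCode: string;
}

const taxRateByZipPrefix: Record<string, number> = { "90": 0.095, "10": 0.08875 };

function enrichOrder(order: OrderInput, orderedOn: Date = new Date()) {
  const subTotal = order.orderQuantity * order.unitPrice;

  // Tax looked up by ZIP prefix; fall back to a default rate when unknown.
  const taxRate = taxRateByZipPrefix[order.zipCode.slice(0, 2)] ?? 0.06;
  const taxAmount = subTotal * taxRate;

  // A flat fee plus a per-unit surcharge stands in for the S&H calculation.
  const shippingAndHandling = 5.0 + 0.5 * order.orderQuantity;

  // Larger orders take longer; ZIP could shift this further in a real rule.
  const leadTimeDays = order.orderQuantity > 100 ? 7 : 3;
  const estimatedDelivery = new Date(orderedOn.getTime() + leadTimeDays * 24 * 60 * 60 * 1000);

  return {
    ...order,
    subTotal,
    taxAmount,
    shippingAndHandling,
    estimatedDelivery,
    totalAmount: subTotal + taxAmount + shippingAndHandling,
  };
}
```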
