Data Flows

 

Data Flows for Data Transformation

Review, Cleanse, and Transform Your Data

Data preparation is a critical step when aggregating information for data analysis, reporting and collaboration.

In many cases Datasets don’t contain all desired information when originally pulled from their source. Or, the data within the Dataset may not be formatted in the desired fashion. For example, each record in a Dataset may contain two text fields but not the concatenation of the two. Or, a Dataset could contain two numeric fields but not the product of the two.

 

Dataflows

Informer’s unique Data Flows data transformation feature makes it easy for you to review, cleanse, and prepare your data as it comes into the system. With Flows, you can intercept Dataset records as they are being indexed (e.g. a database, network document, or API call result) and augment the records with additional data, metadata, or logic, such as the concatenated text and multiplied numeric fields example above. As a result, you can perform data analysis and use Informer Dashboards and Ad-hoc Reports to track and analyze key performance metrics based on one version of the truth.

Data Flows enable you to easily transform data however you choose, for example:

  • Join data from disparate sources and append incomplete entries
  • Scrub duplicates and normalize inconsistent fields
  • Author custom scripts to transform data into exactly how you want it to appear.

Data Flows can be much more than simple data transformation expressions for data analysis. For example, consider a Dataset which contains a street address for each record. You could apply a flow to these records which:

  • Authenticates against the Google Maps API
  • Returns latitude and longitude values
  • Adds those coordinates as a new field in your Dataset.

You can then add a heat map as a visual to the Dataset or use Informer Discover to create a heat map for you.

Another common usage is two-pass evaluation. You may have a numeric field and want to calculate its percentage of the total of all records in the Dataset. This requires retrieving the grand total by first processing all records, then passing through your records again to calculate the per record percent of total. You can do this using a Data Flow.

Data Transformation Using Standard Data Flows

The Data Flows in Table 1 are provided as stock functionality.

Table 1: Stock Flows

Flow NameDescription
Calculated FieldCreate a new Field
Percent of TotalCalculates the percent of the overall total of a selected field.
Fields from another DatasourceConnects to another Datasource and retrieves one or more fields from that Datasource. The two Datasources do not need to be of the same database type, but there must be a common value in both Datasources that joins the two together (e.g. Customer ID, or Product ID, or Student ID, etc.).
Fields from another DatasetConnects to another Dataset and retrieves one or more fields from that Dataset. There must be a common value in both Datasets that joins the two together (e.g. Customer ID, or Product ID, or Student ID, etc.).
Templated FieldUses static text and placeholders for other fields to create a new field. Example: using HTML, create a link to start an email when the email address is clicked.
ConcatenateMerges two or more fields into one
Coordinates from Latitude & LongitudeCoordinates and latitude and longitude pair into a location value to be used in geo mapping
CounterAdds a row counter to the output
Percentage of TotalCreates a column that is the row’s percentage of another field’s sum
Time BetweenCreate a new Field containing the difference in selected units between two Time Fields
Zip2geoCreate a Field containing longitude and latitude of a provided Zip Code. Useful for generating Map Visuals.
Field SettingsAllows you to adjust the datatype for fields in the Dataset.
Find & ReplaceFinds a string value within a selected field and replace it with a given value.
FlushProcess all records in the Dataset prior to continuing to next Flow step.
Merge DuplicatesRemove duplicates of selected Fields from Dataset. Other associated fields are listed with multiple values per row.
NormalizeSplits fields with multiple values per row into separate rows, repeating the single values with which they are associated.
Remove FieldRemove a Field from display in Dataset
Power ScriptExecute logic based on custom script execution

Data Flow steps include several options for you to choose from: Add Field, Transform, Remove, and Advanced:

• Add Field – Enables you to create new Fields that provide stock functionalities for one or more selected Fields from the Dataset or Fields from a different Dataset.
• Transform – Enables you to configure the Fields that you’ve added by adjusting datatypes, replacing Field values, removing duplicates, and splitting Fields with multiple values.

Data Transformation Using Custom Data Flows

You may have the need for a type of Data Flow that is uniquely specific to your organization or industry. Or, you may have a commonly used calculation and want to stop authoring the same script calculation repeatedly. In this scenario, you would simply author an Informer Plugin to register your desired functionality.

Let’s say you manage the Order Processing Department and your users commonly work with result sets that contain an ‘Order Quantity’ Field, a ‘Unit Price’ Field, and a ‘Zip Code’ field. Your processors then calculate an order sub-total from the Order Quantity and Unit Price Fields, a tax amount based on Zip Code, a Shipping & Handling Fee, an estimated delivery date based on Order Quantity and Zip Code, and a total amount from the whole lot.

In this scenario you have a choice:

  • rely on your processors to calculate each of these correctly for every order that comes through the system
  • or, you could author an Informer Plugin as part of your Flow that prompts for the required input fields from the existing results and amends each record with the appropriate values, all before the user retrieves the Dataset.

Want to learn more about Informer?