shape2
shape2
shape_round
shape_round
shape_round

Oracle Analytics Cloud - Data Preparation Feature

Table of Contents

    Overview

    Oracle Analytics Cloud is a scalable, secure public cloud service that provides a full set of capabilities to explore and perform collaborative analytics.

    With Oracle Analytics Cloud you also get flexible service management capabilities, including fast setup, easy scaling and patching, and automated lifecycle management.

    Importance of Data Preparation in Analytic Implementations

    Data preparation is typically the most time-consuming part of the life cycle of data analytics. Data preparation is a prerequisite step which involves cleaning, transforming, and consolidating data. It’s a crucial step that can consist of connecting multiple data sources, cleaning noisy data, and merging the data into digestible content.

    Commonly Used Tools and Techniques for Data Preparation

    There are a lot of choices available in the market to assist with data preparation. Some common tools are Trifacta, Alteryx, Datawatch, and Paxata which offer features such as visual profiling & transformations, building & scheduling pipelines, data mashups & blending, and self-service discovery. The primary objective of these tools are to:

    • Improve the efficiency and productivity of analyses
    • Develop new data pipelines
    • Enable collaboration between business

    Features of Oracle Analytics Cloud- Data Preparation

    OAC is built on a high-performance platform with flexible data storage and provides a complete set of tools for deriving and sharing data insights, one of these being data preparation. With data preparation, analysts can ingest, profile, and cleanse data using a variety of algorithms.They can also transform and enrich underlying data to generate meaningful analyses.

    Below are the sample steps to Transform and Enrich Data in Oracle Analytics Cloud using a flat file as a data source:

    Create a project and data source.


    a. Sign in OACS, create a project, then click on “Data Set.”


    Create project create data sheet

    b. In Create Data Set, click Drop data file here or Click to browse, select the accountinfo_sales.xlsx file, and then click Open.

    Image-3

    c. In Add Data Set, select the id column. In Properties, click Measure in the treat As row, and then select Attribute

    Image-4

    d. Select the Sales column. In Properties, click Number Format. In the Number Format row, click Number, and then select Currency. --Add

    Image-5

    Clean up the data using the Groups feature.

    Let’s create a group to consolidate data gender) in the data set.

    a. InResults, select the gender column, right click, and then select Group

    Image-6

    b. In Group, enter Gender_Fix in the Name field.

    c. Enter Female in the field that contains Group 1. In the list of values for the Gender column, click the rows with female and f, and then click >Add Group.

    d. Enter Male in the field that contains Group 2. In the list of values for the Gender column, click Add all, and then click Add Step.

    Image-7

    e. In Results, select the gender, click Options , and then select Delete.

    Image-8

    Transforming Email Addresses

    Let's change the mixed-case email addresses to lowercase

    a. In Results, select the email column, Right click and then select Lowercase

    Image-9

    Extract Data from a Column

    Now lets extract the area code from phone numbers that use the North American Numbering Plan.

    a. In Results, select the phone column.

    b. In the Recommendation list, click Extract area code from phone. 

    Image-10

    Obfuscate Sensitive Customer Data

    In order to comply with security policies for sensitive data, obfuscate secure data in a column.

    a. In Results, select the column.

    b.In the Recommendations list, click Obfuscate First 12 Digits of ccnumber.

    Image-11

    See below for the obfuscated data column.

    Image-12

    Enrich Data with long and lat Geographic Coordinate

    a. On the Home page, click Data in the search bar, enter accountinfo and the click Search.

    Image-13

    b. In accountinfo project, click the Actions menu , and select Open.

    c. In Results, select the zip column, click Enrich zip with Lat (latitude).

    d. In Recommendations, click Enrich zip with Lon (longitude).

    Image-14

    Apply the Preparation Script

    The preparation script contains the pending changes that are applied when the script is run.

    a. Click Apply Script.

    b. When the blue dot next to Results, the Apply Script progress indicator disappears and the script completes the transformations.

    Image-15

    Oracle Analytics automatically saves the changes in the data set.

    a. Click Data in the search bar, enter accountinfo_sales, and then click Search.

    b. On the Home page, select the accountinfo_sales data set, right click and then select Inspect. In the accountinfo_sales data set, click Data Elements to review the updated data set columns.

    Image-16 Image-17

    Conclusion

    Thus we see how data preparation ensures accuracy in the data which in turn leads to accurate and meaningful insights. Without data preparation, it is highly possible that insights will be off due to junk data and may be overlooked by an calibration issue or an easily fixed discrepancy between datasets.

    References:

    To learn more about Oracle Analytics Cloud visit: Oracle Analytics Cloud with Analytics Cloud Enablement or schedule a meeting with an Astute Oracle Analytics Cloud team member.

    Jyoti Shah

    Jyoti Shah is a senior OCI Architect and PeopleSoft Architect/DBA at Astute Business Solutions. She has experience managing large PeopleSoft environments and performing Peoplesoft upgrades.