MuleSoft DataWeave - A Powerful Language For Data Transformations


In today's enterprise infrastructure, system and application integration is increasingly a mission-critical concern. There are a number of Enterprise Service Bus (ESB) products on the market. These products help remove basic dependencies between applications by eliminating the need for one application to know the other's location, but connectivity is not the only issue. In reality, most systems do not speak the same language, so data transformation is one of the most important topics in the integration space.

Data transformation is the process of converting data or information from one format to another, usually from the format of a source system into the required format of a new destination system. MuleSoft has developed "DataWeave" - a new language and module for querying and transforming data. DataWeave is a full-featured and fully native framework for querying and transforming data on Anypoint Platform. Fully integrated with the graphical interface of Anypoint Studio and DataSense, DataWeave makes even the most complex data integration simple.

The DataWeave language supports a variety of transformations, from simple one-to-one mappings to more elaborate mappings including normalization, grouping, joins, deduplication, pivoting and filtering. It supports XML, JSON, CSV, Java and EDI out of the box, and works as a powerful template engine that transforms data to and from formats such as XML, CSV, JSON, POJOs and Maps. Its integration with Anypoint Studio makes on-ramp and continued development easy, and its integration with DataSense allows payload-aware development with auto-completion, auto-scaffolding of transforms, and live previews.
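To make a few of these transformation types concrete, here is a minimal sketch of filtering, grouping and deduplication. It assumes DataWeave 1.0 syntax and a hypothetical JSON input: an array of employee objects with `active` and `department` fields. Neither the input shape nor the field names come from the original example.

```dataweave
%dw 1.0
%output application/json
---
{
  // Filtering, then grouping: keep active employees and group them by department
  byDepartment: (payload filter $.active) groupBy $.department,
  // Deduplication: the distinct department names found in the input
  departments: (payload map $.department) distinctBy $
}
```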

Let's learn a little more about the basics of this elegant and lightweight expression language. DataWeave files are divided into two main sections: 1) the Header, which defines directives (optional), and 2) the Body, which describes the output structure. The two sections are separated by a delimiter, which is not required if no header is present. The delimiter consists of three dashes: "---".
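As an illustration of this layout, a very small script might look like the sketch below. It assumes DataWeave 1.0 syntax and a hypothetical input payload with a `name` field.

```dataweave
%dw 1.0
%output application/json
---
{
  // Everything above the three dashes is the header; this object is the body
  greeting: "Hello, " ++ payload.name
}
```

The two lines above the separator are the header directives; the object below it is the body that describes the output.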

Header

The DataWeave header contains the directives, which define high-level information about your transformation. The header is a sequence of lines, each containing one directive. Through directives you can define:
  • DataWeave version
  • Input types and sources
  • Output type
  • Namespaces to import into your transform
  • Constants that can be referenced throughout the body
  • Functions that can be called throughout the body 
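A header that uses each of the directive kinds listed above could look roughly like this. The sketch assumes DataWeave 1.0 syntax; the namespace URI, the constant `defaultCountry`, the function `fullName` and the input field names are all hypothetical and chosen only for illustration.

```dataweave
%dw 1.0
%input payload application/json
%output application/xml
%namespace emp http://example.com/employee
%var defaultCountry = "US"
%function fullName(first, last) first ++ " " ++ last
---
emp#employee: {
  // The function and constant declared in the header are used in the body
  name: fullName(payload.firstName, payload.lastName),
  country: payload.country default defaultCountry
}
```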

Body

The body contains the expression that generates the output structure. Regardless of the types of the input and output, the data model for the output is always described in the standard DataWeave language, and it is this model that the transform executes. The data model of the produced output can consist of three different types of data:
  • Objects: represented as a collection of key-value pairs
  • Arrays: represented as a sequence of comma-separated values
  • Simple literals
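The three kinds of data can be seen together in a short body like the one below. This is a sketch in DataWeave 1.0 syntax with made-up values; it takes no input.

```dataweave
%dw 1.0
%output application/json
---
{
  // Simple literals: string, number, boolean
  name: "Jane Doe",
  age: 34,
  active: true,
  // Array: a sequence of comma-separated values
  skills: ["Java", "Mule", "DataWeave"],
  // Object: a collection of key-value pairs
  address: {
    city: "Austin",
    zip: "73301"
  }
}
```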
Let’s take a look at a simple example to understand more about DataWeave. In this example, we will first transform JSON input data to a Java object, and then transform that Java object to XML.

Step 1. Import JSON Schema
Employee JSON Schema:

Step 2. Import XML Schema
Employee XML Schema:

Step 3. Create Java Classes
Employee.java

Contact.java

Address.java

Email.java

Step 4. Write Mule Flow in Anypoint Studio
MuleFlow
Step 5. Write JSON to Java Transformation
DW Transformation (JSON to Java)
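
A JSON-to-Java transformation of this kind might look like the following sketch. It assumes DataWeave 1.0 syntax. The `com.example` package, the field names (`firstName`, `lastName`, `contact`, `emails`, `value`) and the shape of the JSON input are hypothetical; only the class names Employee, Contact and Email come from Step 3.

```dataweave
%dw 1.0
%output application/java
---
{
  // Hypothetical field names; adjust them to the actual Employee schema
  firstName: payload.firstName,
  lastName: payload.lastName,
  contact: {
    // Each JSON email entry becomes an Email instance
    emails: payload.emails map ({
      value: $.value
    } as :object {class: "com.example.Email"})
  } as :object {class: "com.example.Contact"}
} as :object {class: "com.example.Employee"}
```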

Step 6. Write Java to XML Transformation
DW Transformation (Java to XML)
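
A Java-to-XML transformation of this kind could be sketched as follows, again assuming DataWeave 1.0 syntax. The element names and the properties read from the Java payload (`firstName`, `lastName`, `contact.emails`, `value`) are hypothetical.

```dataweave
%dw 1.0
%output application/xml
---
// XML output needs a single root element, here <employee>
employee: {
  firstName: payload.firstName,
  lastName: payload.lastName,
  emails: {
    // The parentheses expand the mapped list into repeated <email> elements
    (payload.contact.emails map {
      email: $.value
    })
  }
}
```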

Step 7. Input and Output
Input Data (JSON Format)

Output Data (XML Format)

Advantages

  • Mapping and transforming with DataWeave eliminates error-prone custom code
  • Rules, lookups, and editing capabilities enable advanced transformations
  • DataSense™ discovers endpoint metadata for intelligent design
  • Delivers both batch and real-time event-driven data integration capabilities
  • Supports XML, JSON, CSV, POJOs, Excel, and more
