Microsoft Azure Data Lake Storage connector example¶
Given below is a sample scenario that demonstrates how to work with file system, directory, and file operations using the WSO2 Microsoft Azure Data Lake Storage Connector.
What you'll build¶
This example demonstrates how to use the Microsoft Azure Data Lake Storage connector to:
- Create a file system in a Microsoft Azure Storage account.
  Example: Create a file system called `customer-behavior-analytics` to store transaction logs, web activity data, and processed analytics.
- Create a directory.
  Example: Create a directory called `raw-data/transactions/` to store transaction logs.
- Upload a file.
  Example: Upload `transactions_2025_01.json`, containing customer transaction data (product IDs, timestamps, etc.), into the `raw-data/transactions/` directory.
- Download a file.
  Example: Download the `transactions_2025_01.json` file containing customer transaction data into the local environment.
- Retrieve the metadata of a specific file.
  Example: Retrieve metadata for the `transactions_2025_01.json` file stored in the `raw-data/transactions/` directory.
- Remove the created file system.
  Example: Remove the `customer-behavior-analytics` file system from the Azure Data Lake Storage account.
For more information about these operations, please refer to the Microsoft Azure Data Lake Storage connector reference guide.
Note: Before invoking the API, you need to create a Storage Account in Microsoft Azure. See the Azure Storage Configuration documentation for more information.
Set up the integration project¶
Follow the steps in the create integration project guide to set up the Integration Project.
Create the integration logic¶
- Click `+` on the Extension panel APIs to create the REST API. Specify the API name as `MSAzureDataLakeStorageTestAPI` and the API context as `/azure`. Then, delete the default `/resource` endpoint.
- First, we will create the `/createfilesystem` resource. This API resource will retrieve the file system name from the incoming HTTP POST request and create a file system in Microsoft Azure Data Lake Storage. Click `+ Resource` and fill in the details. Use a URL template called `/createfilesystem` and the POST HTTP method.
- Click the created resource. Next, click the `+` arrow below the Start node to open the side panel. Select Connections and click Add new Connection. Search for `msazuredatalakestorage` and click it.
- Create a connection. You can use the Access key, SAS Token, or OAuth2 method for authentication.
  Note: You can either define the Account Access key or Client Credentials for authentication. For more information, please refer to the Initialize the connector guide.
- After the connection is successfully created, select the created connection in Connections. In the drop-down menu, click CreateFileSystem.
- Next, configure the following parameters in the properties window, then click the Submit button.
  - File System Name - payload.filesystemName
  - Metadata (Optional) - payload.metadata
  - Response Variable - msazuredatalakestorage_createFileSystem_1
  - Overwrite Message Body - true

  Note: Tick Overwrite Message Body if you want to replace the message body in the message context with the response of the operation (this removes the payload from the above variable).
- Click `+` below the `CreateFileSystem` node to add the Respond Mediator to send back the response from creating the file system.
- Follow the same steps to create the next API resource, `/createdirectory`. This API resource will create a directory based on the incoming HTTP POST request.
- Next, add the `CreateDirectory` operation from the Connections tab using the created connection. In the properties view, provide the following expressions for the below properties:
  - File System Name - payload.filesystemName
  - Target Path - payload.targetPath (e.g., `raw-data/transactions`; no need to add the file system name)
  - Response Variable - msazuredatalakestorage_createDirectory_1
  - Overwrite Message Body - true
- Add the Respond Mediator to send back the response from the `CreateDirectory` operation.
- Follow the same steps to create the next API resource, `/uploadfile`. This API resource will retrieve information about the file from the incoming HTTP POST request.
- Next, add the `UploadFile` operation from the Connections tab using the created connection. In the properties view, provide the following expressions for the below properties:
  - File System Name - payload.filesystemName
  - Target Path - payload.targetPath (e.g., `raw-data/transactions/transactions_2025_01.json`; no need to add the file system name)
  - Input Type - Local File
  - Metadata - payload.source
  - Local File Path - payload.localPath
  - Response Variable - msazuredatalakestorage_uploadFile_1
  - Overwrite Message Body - true
  Note: The example `transactions_2025_01.json` file is as follows:

  ```json
  {
    "transactions": [
      {
        "transaction_id": "TXN1001",
        "date": "2025-01-05",
        "amount": 150.75,
        "currency": "USD",
        "status": "Completed",
        "payment_method": "Credit Card",
        "customer": {
          "customer_id": "CUST001",
          "name": "John Doe",
          "email": "[email protected]"
        }
      },
      {
        "transaction_id": "TXN1002",
        "date": "2025-01-12",
        "amount": 299.99,
        "currency": "EUR",
        "status": "Pending",
        "payment_method": "PayPal",
        "customer": {
          "customer_id": "CUST002",
          "name": "Jane Smith",
          "email": "[email protected]"
        }
      },
      {
        "transaction_id": "TXN1003",
        "date": "2025-01-20",
        "amount": 500.00,
        "currency": "GBP",
        "status": "Failed",
        "payment_method": "Bank Transfer",
        "customer": {
          "customer_id": "CUST003",
          "name": "Alice Johnson",
          "email": "[email protected]"
        }
      }
    ]
  }
  ```
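Before uploading, it can be handy to sanity-check the sample file locally. The short Python sketch below (an illustration only, not part of the integration) parses content with the same structure as the sample file above, abridged to two transactions, and summarizes it:

```python
import json

# Same shape as the sample transactions_2025_01.json above, abridged to two
# entries; in practice you would json.load() the actual file from disk.
sample = """
{"transactions": [
  {"transaction_id": "TXN1001", "date": "2025-01-05", "amount": 150.75,
   "currency": "USD", "status": "Completed", "payment_method": "Credit Card",
   "customer": {"customer_id": "CUST001", "name": "John Doe", "email": "[email protected]"}},
  {"transaction_id": "TXN1002", "date": "2025-01-12", "amount": 299.99,
   "currency": "EUR", "status": "Pending", "payment_method": "PayPal",
   "customer": {"customer_id": "CUST002", "name": "Jane Smith", "email": "[email protected]"}}
]}
"""

def summarize(raw: str) -> dict:
    """Count transactions per status and total the amounts."""
    data = json.loads(raw)
    by_status = {}
    total = 0.0
    for txn in data["transactions"]:
        by_status[txn["status"]] = by_status.get(txn["status"], 0) + 1
        total += txn["amount"]
    return {
        "count": len(data["transactions"]),
        "by_status": by_status,
        "total": round(total, 2),
    }

print(summarize(sample))
```

A quick check like this catches malformed JSON before it ever reaches the `/uploadfile` resource.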
- Add the Respond Mediator to send back the response from the `UploadFile` operation.
- Follow the same steps to create the next API resource, `/downloadfile`. This API resource will download a file based on the incoming HTTP POST request.
- Next, add the `DownloadFile` operation from the Connections tab using the created connection. In the properties view, provide the following expressions for the below properties:
  - File System Name - payload.filesystemName
  - Target File - payload.targetFile (e.g., `raw-data/transactions/transactions_2025_01.json`; no need to add the file system name)
  - Download Location - payload.downloadLocation (e.g., `local-path/transactions_2025_01.json`)
  - Response Variable - msazuredatalakestorage_downloadFile_1
  - Overwrite Message Body - true
- Add the Respond Mediator to send back the response from the `DownloadFile` operation.
- Follow the same steps to create the next API resource, `/readmetadata`. This API resource will retrieve the metadata of a file from the incoming HTTP POST request.
- Next, add the `GetMetadata` operation from the Connections tab using the created connection. In the properties view, provide the following expressions for the below properties:
  - File System Name - payload.filesystemName
  - Target Path - payload.targetPath (e.g., `raw-data/transactions/transactions_2025_01.json`; no need to add the file system name)
  - Response Variable - msazuredatalakestorage_getMetadata_1
  - Overwrite Message Body - true
- Add the Respond Mediator to send back the response from the `GetMetadata` operation.
- Follow the same steps to create the next API resource, `/deletefilesystem`. This API resource will delete the file system based on the incoming HTTP POST request.
- Next, add the `DeleteFileSystem` operation from the Connections tab using the created connection. In the properties view, provide the following expressions for the below properties:
  - File System Name - payload.filesystemName
  - Response Variable - msazuredatalakestorage_deleteFileSystem_1
  - Overwrite Message Body - true
- Finally, add the Respond Mediator to send back the response from the `DeleteFileSystem` operation.
- You can find the complete API XML configuration below. You can go to the source view (by clicking the `</>` icon on the top right corner) and copy-paste the following configuration.
<?xml version="1.0" encoding="UTF-8"?>
<api context="/azure" name="MSAzureDataLakeStorageTestAPI" xmlns="http://ws.apache.org/ns/synapse">
<resource methods="POST" uri-template="/createfilesystem">
<inSequence>
<msazuredatalakestorage.createFileSystem configKey="con1">
<fileSystemName>{${payload.filesystemName}}</fileSystemName>
<timeout></timeout>
<metadata>[["source","${payload.metadata}"],]</metadata>
<accessType>NONE</accessType>
<responseVariable>msazuredatalakestorage_createFileSystem_1</responseVariable>
<overwriteBody>true</overwriteBody>
</msazuredatalakestorage.createFileSystem>
<respond description="create-file-system"/>
</inSequence>
<faultSequence>
</faultSequence>
</resource>
<resource methods="POST" uri-template="/createdirectory">
<inSequence>
<msazuredatalakestorage.createDirectory configKey="con1">
<fileSystemName>{${payload.filesystemName}}</fileSystemName>
<directoryName>{${payload.targetPath}}</directoryName>
<responseVariable>msazuredatalakestorage_createDirectory_1</responseVariable>
<overwriteBody>true</overwriteBody>
<metadata>[]</metadata>
<timeout></timeout>
<contentLanguage></contentLanguage>
<contentType></contentType>
<contentEncoding></contentEncoding>
<contentDisposition></contentDisposition>
<cacheControl></cacheControl>
<permissions></permissions>
<umask></umask>
<sourceLeaseId></sourceLeaseId>
<group></group>
<owner></owner>
</msazuredatalakestorage.createDirectory>
<respond description="create-directory"/>
</inSequence>
<faultSequence>
</faultSequence>
</resource>
<resource methods="POST" uri-template="/uploadfile">
<inSequence>
<msazuredatalakestorage.uploadFile configKey="con1">
<fileSystemName>{${payload.filesystemName}}</fileSystemName>
<filePathToUpload>{${payload.targetPath}}</filePathToUpload>
<localFilePath>{${payload.localPath}}</localFilePath>
<metadata>[["source","${payload.source}"],]</metadata>
<timeout></timeout>
<responseVariable>msazuredatalakestorage_uploadFile_1</responseVariable>
<overwriteBody>true</overwriteBody>
<contentLanguage></contentLanguage>
<contentType></contentType>
<contentEncoding></contentEncoding>
<contentDisposition></contentDisposition>
<cacheControl></cacheControl>
<blockSize></blockSize>
<maxSingleUploadSize></maxSingleUploadSize>
<maxConcurrency></maxConcurrency>
<inputType>Local File</inputType>
</msazuredatalakestorage.uploadFile>
<respond description="upload-file"/>
</inSequence>
<faultSequence>
</faultSequence>
</resource>
<resource methods="POST" uri-template="/downloadfile">
<inSequence>
<msazuredatalakestorage.downloadFile configKey="con1">
<fileSystemName>{${payload.filesystemName}}</fileSystemName>
<filePathToDownload>{${payload.targetFile}}</filePathToDownload>
<downloadLocation>{${payload.downloadLocation}}</downloadLocation>
<responseVariable>msazuredatalakestorage_downloadFile_1</responseVariable>
<overwriteBody>true</overwriteBody>
<timeout></timeout>
<leaseId></leaseId>
<ifUnmodifiedSince></ifUnmodifiedSince>
<ifMatch></ifMatch>
<ifModifiedSince></ifModifiedSince>
<ifNoneMatch></ifNoneMatch>
<blockSize></blockSize>
<maxConcurrency></maxConcurrency>
<offset></offset>
<count></count>
<maxRetryRequests></maxRetryRequests>
<rangeGetContentMd5>false</rangeGetContentMd5>
</msazuredatalakestorage.downloadFile>
<respond description="download-file"/>
</inSequence>
<faultSequence>
</faultSequence>
</resource>
<resource methods="POST" uri-template="/readmetadata">
<inSequence>
<msazuredatalakestorage.getMetadata configKey="con1">
<fileSystemName>{${payload.filesystemName}}</fileSystemName>
<filePath>{${payload.targetPath}}</filePath>
<responseVariable>msazuredatalakestorage_getMetadata_1</responseVariable>
<overwriteBody>true</overwriteBody>
</msazuredatalakestorage.getMetadata>
<respond description="read-metadata"/>
</inSequence>
<faultSequence>
</faultSequence>
</resource>
<resource methods="POST" uri-template="/deletefilesystem">
<inSequence>
<msazuredatalakestorage.deleteFileSystem configKey="con1">
<fileSystemName>{${payload.filesystemName}}</fileSystemName>
<responseVariable>msazuredatalakestorage_deleteFileSystem_1</responseVariable>
<overwriteBody>true</overwriteBody>
<timeout></timeout>
<leaseId></leaseId>
<ifUnmodifiedSince></ifUnmodifiedSince>
<ifMatch></ifMatch>
<ifModifiedSince></ifModifiedSince>
<ifNoneMatch></ifNoneMatch>
</msazuredatalakestorage.deleteFileSystem>
<respond description="delete-filesystem"/>
</inSequence>
<faultSequence>
</faultSequence>
</resource>
</api>
Build and run the artifacts¶
Now that you have developed an integration using the Micro Integrator for VS Code extension, it's time to deploy the integration to the Micro Integrator server runtime.
Click the Build and Run icon located in the top right corner of VS Code.
Refer to the Build and Run guide.
Get the project¶
You can download the ZIP file and extract the contents to get the project code.
Tip
You may need to update the credentials and make other such changes before deploying and running this project.
Test¶
Invoke the API as shown below using the built-in Try It functionality in the MI for VS Code extension or the curl command.

For the built-in Try It functionality, select the `MSAzureDataLakeStorageTestAPI` API and click the **Try it** button. Then, select the required API resource and add a sample payload. Click the **Execute** button to invoke the API.
- Creating a new file system in Microsoft Azure Data Lake Storage for storing customer behavior data.

  **Sample Payload**

  ```json
  {
    "filesystemName": "customer-behavior-analytics",
    "metadata": "customers"
  }
  ```

  **Sample Curl request**

  ```bash
  curl -v -X POST -d '{"filesystemName": "customer-behavior-analytics", "metadata": "customers"}' "http://localhost:8290/azure/createfilesystem" -H "Content-Type: application/json"
  ```

  **Expected Response**

  ```json
  {
    "status": true,
    "message": "Successfully created the filesystem"
  }
  ```
- Create a directory.

  **Sample Payload**

  ```json
  {
    "filesystemName": "customer-behavior-analytics",
    "targetPath": "raw-data/transactions"
  }
  ```

  **Sample Curl request**

  ```bash
  curl -v -X POST 'http://localhost:8290/azure/createdirectory' --header 'Content-Type: application/json' -d '{"filesystemName": "customer-behavior-analytics", "targetPath": "raw-data/transactions"}'
  ```

  **Expected Response**

  ```json
  {
    "status": true,
    "message": "Successfully created the directory."
  }
  ```
- Upload a file.

  Note: The `localPath` should be the absolute path of the file on your local machine.

  **Sample Payload**

  ```json
  {
    "filesystemName": "customer-behavior-analytics",
    "targetPath": "raw-data/transactions/transactions_2025_01.json",
    "localPath": "/path_to_file/transactions_2025_01.json",
    "source": "customers"
  }
  ```

  **Sample Curl request**

  ```bash
  curl -v -X POST 'http://localhost:8290/azure/uploadfile' --header 'Content-Type: application/json' -d '{"filesystemName": "customer-behavior-analytics", "targetPath": "raw-data/transactions/transactions_2025_01.json", "localPath": "/path_to_file/transactions_2025_01.json", "source": "customers"}'
  ```

  **Expected Response**

  ```json
  {
    "status": true,
    "message": "Successfully uploaded the file"
  }
  ```
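Since `localPath` must be absolute, a relative file name can be expanded before the payload is built. The Python sketch below is an illustration only (not part of the integration); the file system and path names are the sample values used throughout this example:

```python
import os

# The upload payload requires an absolute localPath; expand a relative
# file name before building the request body.
rel = "transactions_2025_01.json"
abs_path = os.path.abspath(rel)

payload = {
    "filesystemName": "customer-behavior-analytics",
    "targetPath": "raw-data/transactions/transactions_2025_01.json",
    "localPath": abs_path,
    "source": "customers",
}
print(payload["localPath"])
```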
- Download a file.

  Note: The `downloadLocation` should be the absolute path of the download destination on your local machine.

  **Sample Payload**

  ```json
  {
    "filesystemName": "customer-behavior-analytics",
    "targetFile": "raw-data/transactions/transactions_2025_01.json",
    "downloadLocation": "/path_to_download/transactions_2025_01.json"
  }
  ```

  **Sample Curl request**

  ```bash
  curl -v -X POST 'http://localhost:8290/azure/downloadfile' --header 'Content-Type: application/json' -d '{"filesystemName": "customer-behavior-analytics", "targetFile": "raw-data/transactions/transactions_2025_01.json", "downloadLocation": "/path_to_download/transactions_2025_01.json"}'
  ```

  **Expected Response**

  ```json
  {
    "status": true,
    "message": "Successfully downloaded the file"
  }
  ```
- Retrieve the metadata of a specific file.

  **Sample Payload**

  ```json
  {
    "filesystemName": "customer-behavior-analytics",
    "targetPath": "raw-data/transactions/transactions_2025_01.json"
  }
  ```

  **Sample Curl request**

  ```bash
  curl -v -X POST 'http://localhost:8290/azure/readmetadata' --header 'Content-Type: application/json' -d '{"filesystemName": "customer-behavior-analytics", "targetPath": "raw-data/transactions/transactions_2025_01.json"}'
  ```

  **Expected Response**

  ```json
  {
    "status": true,
    "result": {
      "source": "customers"
    }
  }
  ```
- Remove the created file system.

  **Sample Payload**

  ```json
  {
    "filesystemName": "customer-behavior-analytics"
  }
  ```

  **Sample Curl request**

  ```bash
  curl -v -X POST -d '{"filesystemName": "customer-behavior-analytics"}' "http://localhost:8290/azure/deletefilesystem" -H "Content-Type: application/json"
  ```

  **Expected Response**

  ```json
  {
    "status": true,
    "message": "Successfully deleted the file system"
  }
  ```
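As an alternative to issuing the curl commands one by one, the whole scenario can be scripted. The sketch below is a hypothetical Python smoke test, assuming the API is deployed and reachable at `http://localhost:8290/azure` and that the placeholder file paths are replaced with real ones on your machine; the payloads mirror the sample payloads above.

```python
import json
import urllib.request

BASE = "http://localhost:8290/azure"

def scenario_steps(local_path: str, download_path: str) -> list:
    """Return (resource, payload) pairs in the order the scenario runs:
    create, populate, inspect, then clean up."""
    fs = "customer-behavior-analytics"
    target = "raw-data/transactions/transactions_2025_01.json"
    return [
        ("createfilesystem", {"filesystemName": fs, "metadata": "customers"}),
        ("createdirectory", {"filesystemName": fs, "targetPath": "raw-data/transactions"}),
        ("uploadfile", {"filesystemName": fs, "targetPath": target,
                        "localPath": local_path, "source": "customers"}),
        ("downloadfile", {"filesystemName": fs, "targetFile": target,
                          "downloadLocation": download_path}),
        ("readmetadata", {"filesystemName": fs, "targetPath": target}),
        ("deletefilesystem", {"filesystemName": fs}),
    ]

def invoke(resource: str, payload: dict) -> dict:
    """POST a JSON payload to one API resource and return the parsed response."""
    req = urllib.request.Request(
        f"{BASE}/{resource}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# To run against a live Micro Integrator instance, uncomment and adjust paths:
# for resource, payload in scenario_steps("/path_to_file/transactions_2025_01.json",
#                                         "/path_to_download/transactions_2025_01.json"):
#     print(resource, invoke(resource, payload))
```

Each call should return a `{"status": true, ...}` body like the expected responses shown above; anything else indicates a failed step.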
What's next¶
- You can deploy and run your project on Docker or Kubernetes. See the instructions in Running the Micro Integrator on Containers.