Creating Materialized Views with Dedicated Clusters in NQL

Overview

To improve data processing performance, the Narrative Data Collaboration Platform supports dedicated clusters for executing materialized view jobs. This feature provides faster job execution, better resource utilization, and cost-effective data management, particularly for jobs that require repeated or high-volume queries.

What Are Dedicated and Shared Clusters?

The platform offers two types of clusters for executing materialized view jobs: shared clusters and dedicated clusters. Understanding the difference between these cluster types can help you choose the best option for your specific needs.

Shared Clusters: Shared clusters are general-purpose environments that handle multiple types of jobs simultaneously, including materialized views, forecasts, and other data operations.

  • Execution: When a job is submitted to a shared cluster, it enters a common queue. This can lead to longer wait times, especially if the queue is busy with large or complex jobs.
  • Use Case: Shared clusters are ideal for users who do not require immediate job execution.

Dedicated Clusters: Dedicated clusters are specialized environments created exclusively for executing materialized view jobs. Each job runs on its own cluster, isolated from other job types.

  • Execution: When a materialized view job is submitted to a dedicated cluster, a new cluster is spun up specifically for that job. This ensures the job does not compete with others for resources, resulting in faster execution.
  • Use Case: Dedicated clusters are perfect for users who need quick turnaround times for their materialized view jobs or have high-volume queries that require isolated processing.

Key Benefits of Dedicated Clusters

  1. Faster Job Execution: By isolating materialized view jobs from other types of jobs, dedicated clusters reduce delays and ensure that jobs are processed quickly.
  2. Dynamic Scaling: Dedicated clusters automatically scale up or down based on the number and size of the jobs queued. This dynamic scaling optimizes resource use and helps manage costs effectively.
  3. Cost-Efficiency: With dedicated clusters, you pay only for the resources your jobs use, without the additional expense of private executors.

How to Use Dedicated Clusters

To run a materialized view job on a dedicated cluster, set the cluster type in the "execution_cluster" field when you submit the job. By default, jobs are submitted to a shared cluster; choose a dedicated cluster when you need faster, isolated execution.

Example API Call:

{
  "nql": "CREATE MATERIALIZED VIEW \"dedicated_cluster_job\" EXPIRE = 'P1D' TAGS = ('_nio_interactive') AS SELECT company_data.\"1234\".\"_rosetta_stone\".\"hashed_email\" FROM company_data.\"1234\" WHERE (company_data.\"1234\".\"BRAND\" = 'Nike') LIMIT 1000 ROWS",
  "data_plane_id": "aaaa-bbbb-ccccc-ddddddddd",
  "execution_cluster": {
    "type": "dedicated"
  }
}

Set "type" to "shared" or "dedicated" based on your needs. The NQL statement is written on a single line because JSON strings cannot contain raw line breaks, and JSON does not allow inline comments.
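The request body above can also be assembled programmatically, which avoids hand-escaping the quotes in the NQL statement. The following is a minimal sketch in Python, not an official client: the helper name build_materialized_view_request and the VALID_CLUSTER_TYPES set are hypothetical, and only the three fields shown in the example ("nql", "data_plane_id", "execution_cluster") are assumed. The endpoint URL and authentication are not covered in this article, so the sketch stops at producing the JSON body.

```python
import json

# Assumption: these are the only two values accepted for "type",
# as described in this article.
VALID_CLUSTER_TYPES = {"shared", "dedicated"}


def build_materialized_view_request(nql, data_plane_id, cluster_type="shared"):
    """Build the request body as a dict (hypothetical helper).

    Defaults to "shared" to mirror the platform default described above.
    """
    if cluster_type not in VALID_CLUSTER_TYPES:
        raise ValueError(
            f"cluster_type must be one of {sorted(VALID_CLUSTER_TYPES)}"
        )
    return {
        "nql": nql,
        "data_plane_id": data_plane_id,
        "execution_cluster": {"type": cluster_type},
    }


# The same NQL statement as in the example, written without JSON escaping.
NQL = (
    'CREATE MATERIALIZED VIEW "dedicated_cluster_job" EXPIRE = \'P1D\' '
    "TAGS = ('_nio_interactive') AS "
    'SELECT company_data."1234"."_rosetta_stone"."hashed_email" '
    'FROM company_data."1234" '
    "WHERE (company_data.\"1234\".\"BRAND\" = 'Nike') LIMIT 1000 ROWS"
)

payload = build_materialized_view_request(
    NQL,
    data_plane_id="aaaa-bbbb-ccccc-ddddddddd",  # placeholder ID from the example
    cluster_type="dedicated",
)

# json.dumps handles the escaping of the double quotes inside the NQL string.
print(json.dumps(payload, indent=2))
```

Validating the cluster type before serializing catches a mistyped value locally, and letting json.dumps do the escaping keeps the NQL statement readable in source.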

Conclusion

Dedicated materialized view clusters provide a powerful solution for optimizing data processing on the Narrative Data Collaboration Platform. By choosing between shared and dedicated clusters, users can tailor their data processing environments to meet their specific needs, ensuring faster execution, better resource management, and cost-efficiency. Whether you need quick turnaround times or are looking for a more budget-friendly option, the platform's flexible cluster options can help you achieve your data processing goals.

For more information on setting up and using dedicated clusters, please refer to our API Documentation.
