Policies to manage Data in Microsoft Fabric's Synapse Real-Time Analytics

Manpreet Singh

Synapse Real-Time Analytics in Microsoft Fabric is a fully managed analytics service optimized for streaming and time-series data. It provides real-time analytics on large volumes of data streaming from applications, websites, IoT devices, and more. It handles structured, semi-structured, and unstructured data, making it a versatile choice for a wide range of data analytics needs.


Key features of Synapse Real-Time Analytics include:

  • Real-time analytics: Analyze data in real-time as it’s ingested.

  • Scalability: Scale up or down based on your needs, and pay only for what you use.

  • Integration: Seamlessly integrate with various data sources and services within the Microsoft ecosystem.

  • Security: Benefit from Microsoft’s robust security measures to protect your data.


Benefits of Data Management

Effective data management is crucial in today’s data-driven world. It ensures that high-quality data is available to the right people at the right time. Here’s why it’s important:

  • Improved decision-making: With accurate, up-to-date data, businesses can make informed decisions that drive growth and profitability.

  • Efficiency and productivity: Proper data management eliminates redundancies and streamlines operations, saving time and resources.

  • Compliance: Effective data management helps businesses comply with regulations and standards, avoiding penalties and reputational damage.

  • Security: Good data management practices protect sensitive data from breaches and cyber threats.


In the context of Synapse Real-Time Analytics, effective data management allows businesses to fully leverage the power of real-time analytics, leading to actionable insights and a competitive edge.


Managing Data in Synapse Real-Time Analytics (Microsoft Fabric)

The following policies and features help you manage your data in Microsoft Fabric:


1. Use Data Retention Policy

In Synapse Real-Time Analytics in Microsoft Fabric, the data retention policy is a mechanism that automatically removes data from tables or materialized views after a certain period. This policy is useful for managing data that continuously flows into a table and whose relevance is age-based.


For example, it can be used for a table that holds diagnostic events that might become uninteresting after a certain period.


Here’s how you can set up a data retention policy:

STEP 1: Navigate to your KQL database in your Synapse Real-Time Analytics environment.


STEP 2: Select "Manage > Data policies".

STEP 3: Under Retention, you can either select the toggle to set the retention period to Unlimited, or enter a specific period. By default, your data is stored for 3650 days.

STEP 4: Select Done to save your changes.
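
The same setting can also be applied with a management command. The following is a minimal sketch, assuming a hypothetical table named DiagnosticEvents and a 30-day period chosen purely for illustration. Run each command separately in the query window.

```kusto
// Assumption: DiagnosticEvents is a hypothetical table name; 30d is an example period.
// Data older than 30 days becomes eligible for automatic removal.
.alter-merge table DiagnosticEvents policy retention softdelete = 30d

// Inspect the retention policy currently applied to the table.
.show table DiagnosticEvents policy retention
```

A policy set on a table takes precedence over the database-level one, so short-lived data such as diagnostics can expire sooner than everything else.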


Benefits of Data Retention Policy in Synapse Real-Time Analytics in Microsoft Fabric

  • Efficient Management: Automates data removal, saving storage space and keeping your environment clean.

  • Relevance: Removes data with limited shelf life (e.g., old diagnostic events).

  • Compliance: Helps meet data protection regulations by deleting data after a set time.


Limitations of Data Retention Policy in Synapse Real-Time Analytics in Microsoft Fabric

While the data retention policy offers several benefits, it’s important to be aware of potential limitations. These can include:

  • Data Loss: Risk of losing valuable data if the retention period is too short.

  • Inflexibility: May not allow different retention periods for various data types.

  • Performance: May impact transactional workloads due to data retention overhead.

  • Insights Delay: Data insights might be delayed due to batch-wise data retrieval.

  • Data Management: Requires additional management of data formats and storage layer for analytics.


2. Use Caching Policy

The caching policy in Synapse Real-Time Analytics in Microsoft Fabric is a feature that allows you to choose which data should be cached and kept in local SSD storage. This policy is particularly beneficial for improving query performance.


Here’s how you can set up a caching policy:

Under Caching, you can either select the toggle to set the caching period to Unlimited, or enter a specific period. By default, your data is cached for 3650 days.

The caching policy is a powerful tool for managing your data in Synapse Real-Time Analytics. It lets you optimize query performance by keeping frequently accessed data in faster storage. Note, however, that while caching can improve query performance, it can also increase storage costs.
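
As a rough illustration, a hot-cache window can also be set per table with a management command. This sketch assumes a hypothetical table named DiagnosticEvents and an example window of 7 days. Run each command separately.

```kusto
// Assumption: DiagnosticEvents is a hypothetical table name; 7d is an example window.
// Keep the most recent 7 days of data in the hot cache (local SSD).
.alter table DiagnosticEvents policy caching hot = 7d

// Inspect the caching policy currently applied to the table.
.show table DiagnosticEvents policy caching
```

Data older than the hot window is not deleted. It stays queryable from slower storage until the retention policy removes it, so the two policies are usually tuned together.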


Benefits of Caching Policy in Synapse Real-Time Analytics in Microsoft Fabric

  • Improved Performance: Cache policy stores frequently accessed data, reducing query latency and improving overall performance for subsequent queries.

  • Reduced Costs: By minimizing data retrieval from the source tables, the cache policy can potentially lower data processing costs.


Limitations of Caching Policy in Synapse Real-Time Analytics in Microsoft Fabric

  • Data Consistency: There might be a delay in updating the cache with the latest data from the source tables. This could lead to inconsistencies where the cache holds outdated information.

  • Cache Invalidation: Managing cache invalidation to ensure the cache reflects recent changes in the source tables can be complex.

  • Storage Overhead: The cached data requires additional storage space, which can incur costs.

  • Limited Applicability: Cache policy might not be beneficial for all queries, especially those dealing with constantly changing data.


3. Use One Logical Copy

The “One Logical Copy” policy in Synapse Real-Time Analytics in Microsoft Fabric is a feature that allows data to be available to Microsoft OneLake and exposed to other Fabric experiences. This policy is designed to maintain a single copy of data that is managed once and paid for once.


The “One Logical Copy” policy ensures that the data in your KQL database is available to all Microsoft Fabric experiences. This means that the data can be accessed and used across different services within the Microsoft Fabric platform.


Since the data is managed once and paid for once, this policy can lead to significant cost savings. You’re not charged multiple times for storing the same data in different services.


By maintaining a single logical copy of data, this policy helps ensure data consistency across different services. Any changes made to the data in one service are reflected in all other services that use the data.


While the “One Logical Copy” policy offers several benefits, it’s important to be aware of potential limitations. These can include:

  1. Data Synchronization: Depending on the system, there might be a delay in synchronizing the data across all services. This could potentially lead to situations where some services are working with outdated data.

  2. Data Management Complexity: Managing a single logical copy of data across multiple services can be complex. It requires a good understanding of data management and the specific requirements of each service.

  3. Potential for Data Inconsistency: If not properly managed, maintaining a single logical copy of data could potentially lead to data inconsistency issues. For example, if an error occurs during data synchronization, some services might end up with incorrect data.


4. Use Table Update Policy

The table update policy in Synapse Real-Time Analytics in Microsoft Fabric automatically appends data to a target table whenever new data is ingested into a source table. When ingestion into the source table triggers the policy, the policy's query runs over the newly ingested data, and the results of that query are appended to the target table.

The target table can have a different schema, retention policy, and other policies from the source table.


For example, a high-rate trace source table can contain data formatted as a free-text column. The target table can include specific trace lines, with a well-structured schema generated from a transformation of the source table’s free-text data using the parse operator.
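
A transformation of this kind might look like the sketch below. The table names, column names, and the "level=... component=... msg=..." line layout are all hypothetical, chosen only to show the parse operator turning free text into typed columns. Run each command separately.

```kusto
// Assumption: every name and the trace line layout below are made up for illustration.
// Source table: one free-text column per trace line.
.create table RawTraces (Timestamp: datetime, Message: string)

// Target table: the structured schema produced by the transformation.
.create table ParsedTraces (Timestamp: datetime, Level: string, Component: string, Text: string)

// Stored function that parses the free text into the target schema.
.create-or-alter function ParseTraces() {
    RawTraces
    | parse Message with "level=" Level:string " component=" Component:string " msg=" Text:string
    | project Timestamp, Level, Component, Text
}
```

Wrapping the query in a stored function keeps the update policy definition short and lets you test the transformation on its own before attaching it to a table.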


Here’s how you can set up a table update policy:

STEP 1: Select "New > Table update policy". The .alter update policy command is automatically populated in the Explore your data window.

STEP 2: Enter the parameters of your table update policy, and then select Run.
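
With the parameters filled in, the command might look like the following sketch. RawTraces, ParsedTraces, and ParseTraces() are hypothetical names, and the function is assumed to already exist and to return rows matching the target table's schema.

```kusto
// Assumption: RawTraces (source), ParsedTraces (target), and ParseTraces() are hypothetical.
// Each ingestion into RawTraces runs ParseTraces() and appends the result to ParsedTraces.
.alter table ParsedTraces policy update
@'[{"IsEnabled": true, "Source": "RawTraces", "Query": "ParseTraces()", "IsTransactional": false, "PropagateIngestionProperties": false}]'
```

Note that the policy is defined on the target table and names the source table inside the policy object, not the other way around.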



Benefits of Table Update Policy in Synapse Real-Time Analytics in Microsoft Fabric

  • Syncs Data: Keeps target tables up-to-date with source tables when new data is added.

  • Flexible Management: Allows different schemas, retention, and other policies for target tables.

  • Saves Time: Automates target table updates, reducing manual work.


Limitations of Table Update Policy in Synapse Real-Time Analytics in Microsoft Fabric

  • Data Consistency Delays: Potential for lag in synchronizing between source and target tables.

  • Complexity: Requires understanding of data management and table needs to use effectively.

  • Performance Impact: May impact transactional workloads due to data synchronization overhead.

  • Insights Delay: Data insights might be delayed due to batch-wise data retrieval.

  • Data Management: Requires additional management of data formats and storage layer for analytics.


5. Use Materialized View

A materialized view in Synapse Real-Time Analytics in Microsoft Fabric is an aggregation query over a source table or another materialized view. It represents a single summarize statement.

There are two types of materialized views:

  • Empty Materialized View: Includes only data ingested after the view is created. Creation is quick, and the view is available for queries immediately.

  • Materialized View with Existing Source Table Data: Also covers records already in the source table. Creation might take longer, depending on the source table size.
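
The two types differ only in how the view is created. The sketch below assumes a hypothetical DeviceReadings table with Timestamp, DeviceId, and Temperature columns. The view names are likewise made up. Run each command separately.

```kusto
// Assumption: DeviceReadings and its columns are hypothetical.
// Empty view: materializes only records ingested after creation.
.create materialized-view LatestDeviceReading on table DeviceReadings
{
    DeviceReadings
    | summarize arg_max(Timestamp, *) by DeviceId
}

// View over existing data: backfill=true also processes records already in the table.
// Backfill runs in the background, so the command must be issued as async.
.create async materialized-view with (backfill=true) DailyDeviceAverage on table DeviceReadings
{
    DeviceReadings
    | summarize AvgTemperature = avg(Temperature) by DeviceId, bin(Timestamp, 1d)
}
```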


Here’s how you can create a materialized view:

STEP 1: Select +New > Materialized view. The materialized view command is populated in the Explore your data window.


STEP 2: Enter the table name and query statement of your materialized view instead of the placeholder text, and then select Run.


Materialized views appear under Materialized views in the Explorer pane. For more information, see .create materialized-view.


Benefits of Materialized View in Synapse Real-Time Analytics in Microsoft Fabric

  • Faster Queries: Pre-computed data speeds up complex queries with aggregations and joins.

  • Always Fresh: Results reflect the latest data, even if the view wasn't recently materialized.

  • Reduced Execution Time: Queries run much faster, saving time and resources.

  • Low Maintenance: Easy to use for performance gains without modifying original queries.
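
The freshness behavior listed above depends on how the view is queried. As a sketch, assuming a hypothetical view named LatestDeviceReading with a DeviceId column, the two queries below show the difference. Run each one separately.

```kusto
// Assumption: LatestDeviceReading and DeviceId are hypothetical names.
// Querying the view by name combines the materialized part with records
// that have not been materialized yet, so results reflect the latest data.
LatestDeviceReading
| where DeviceId == "sensor-01"

// The materialized_view() function reads only the materialized part:
// typically faster, but it may lag behind the most recent ingestion.
materialized_view("LatestDeviceReading")
| where DeviceId == "sensor-01"
```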


Limitations of Materialized View in Synapse Real-Time Analytics in Microsoft Fabric

  • Resource Consumption: Materialization requires memory and CPU, which have limits.

  • Data Staleness: Removed data from the source table won't be reflected in the view.

  • Storage Cost: Disabled materialized views still incur storage charges.


Conclusion

Synapse Real-Time Analytics offers built-in data management policies (retention, caching, and so on) and powerful management commands for automation and advanced tasks. Using these tools keeps your data clean and secure and unlocks the full potential of Synapse Real-Time Analytics for real-time insights.
