Tag Archives: Data

Hive SQL vs Regular SQL: A Practical Guide for Data Engineers and Analysts

In the world of data processing, SQL is the lingua franca—but not all SQLs are created equal. If you’ve worked with big data tools like Apache Hive, you’ve probably noticed that Hive SQL isn’t exactly the same as traditional SQL used in relational databases like MySQL, PostgreSQL, or SQL Server.

Continue reading →

How a WHERE Clause Can Break Your LEFT JOIN — and How to Fix It

TL;DR:
If you’re using a LEFT JOIN but filtering on the right table in the WHERE clause, you might unintentionally turn it into an INNER JOIN. Here’s why that happens, how to fix it, and how different databases handle it.

Continue reading →

Understanding Partitioning in Databases: What p_date Means and Why It Matters

As data grows larger and more complex, optimizing for performance and scalability becomes essential. Partitioning is one of the most powerful strategies for managing big datasets efficiently. If you’ve come across a column like p_date in an SQL query, it often signals the use of table partitioning. But what does that mean, and how is it different from traditional databases?

Continue reading →

Mastering SQL LEAD and LAG Functions: Advanced Data Analysis

SQL window functions like LEAD and LAG are indispensable tools for business intelligence and data analysis, enabling professionals to uncover trends, track changes, and derive actionable insights. In this guide, we’ll break down how these functions work and demonstrate their real-world applications in SQL-driven analytics.

Continue reading →

Advanced SQL: Windows Function Cheat Sheet

SQL window functions are one of the most powerful tools for performing calculations across a defined range of rows while maintaining access to individual row-level details. Unlike aggregate functions that collapse rows into a single value, window functions allow you to perform calculations over a subset of rows (a window) while preserving the original row structure.

Continue reading →

From OLTP to OLAP to HTAP and What’s More?

Data processing systems have come a long way, transforming to meet the ever-growing needs of modern businesses. In this blog, we’ll dive into three major systems that have shaped the landscape: OLTP (Online Transaction Processing), OLAP (Online Analytical Processing), and the cutting-edge HTAP (Hybrid Transaction/Analytical Processing).

Continue reading →

The Ultimate Guide to SQL Functions for Data Analysts

If you work with data, SQL functions are your toolkit for unlocking powerful insights. Whether you’re crafting detailed reports or uncovering business-changing analytics, SQL is the backbone for data doers in every industry. This guide explores 27 SQL functions that every data analyst, data engineer, or analytics enthusiast should know. These functions are grouped into five categories for easier understanding and practical use.

Continue reading →

How to Convert Degree Minute Second (DMS) to Latitude and Longitude in Tableau

Geographic data is often stored in a Degree-Minute-Second (DMS) format, making it a bit tricky to convert into the standard decimal degrees (DD) required for Tableau’s mapping functions. In this post, I’ll walk you through how to transform DMS coordinates to decimal degrees directly in Tableau, step-by-step, so you can plot your geographic data without headaches.

Continue reading →

How to add Current DateTime as New Column to Fabric Lakehouse using Notebook PySpark

In this post, I’ll explain how to add the current date and time to an existing PySpark DataFrame in a Fabric Notebook. This is particularly helpful when inserting data into a Fabric Lakehouse table, as it allows you to track when each record was added. I’ll walk through an example using a DataFrame to first load some sample data and show how to append a new column with the current timestamp to capture the insertion time.

Continue reading →

Slowly Changing Dimensions(SCD) with Microsoft Fabric — What are they and How to Implement

Data warehousing and business intelligence are crucial for modern enterprises, enabling them to make informed decisions based on historical data. One key aspect of data warehousing is handling changing data over time, especially when tracking historical changes in dimensions. This is where Slowly Changing Dimensions (SCD) come into play. In this blog post, we’ll delve into what SCDs are, the different types of SCDs, and how to implement them effectively.

Continue reading →

Daily BI Talks

Business Intelligence Chats and Tips for Data Professionals!

Tag Archives: Data

Hive SQL vs Regular SQL: A Practical Guide for Data Engineers and Analysts

How a WHERE Clause Can Break Your LEFT JOIN — and How to Fix It

Understanding Partitioning in Databases: What p_date Means and Why It Matters

Mastering SQL LEAD and LAG Functions: Advanced Data Analysis

Advanced SQL: Windows Function Cheat Sheet

From OLTP to OLAP to HTAP and What’s More?

The Ultimate Guide to SQL Functions for Data Analysts

How to Convert Degree Minute Second (DMS) to Latitude and Longitude in Tableau

How to add Current DateTime as New Column to Fabric Lakehouse using Notebook PySpark

Slowly Changing Dimensions(SCD) with Microsoft Fabric — What are they and How to Implement