If you’re learning data analysis with Python, mastering the pandas DataFrame is essential. A DataFrame is a powerful, table-like data structure that lets you load, explore, clean, and analyze data quickly.
In this beginner-friendly guide, we’ll walk through the most common Python DataFrame functions you’ll use in day-to-day data analysis. Whether you’re working with CSV files, Excel sheets, or SQL query results, these functions will help you move from raw data to valuable insights faster.
1. Creating a DataFrame
import pandas as pd
# From dictionary
df = pd.DataFrame({
"city":["LA", "SF", "NYC", "MIA"],
"price":[12.5,13.0,9.9,15.2],
"date":["2025-08-01","2025-08-02","2025-08-03","2025-08-10"]
})
# From CSV (this reassigns df; assumes sales.csv exists and has a "date" column)
df = pd.read_csv("sales.csv", parse_dates=["date"])
2. Inspecting the Data
df.head() # First 5 rows
df.tail() # Last 5 rows
df.shape # (rows, columns)
df.info() # column types & non-null counts
df.describe() # numeric summary
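Both head() and tail() return five rows by default and accept a row count. A minimal sketch, using a small made-up frame so it runs on its own:

```python
import pandas as pd

# Illustrative data only, mirroring the example frame from section 1
df = pd.DataFrame({
    "city": ["LA", "SF", "NYC", "MIA"],
    "price": [12.5, 13.0, 9.9, 15.2],
})

first_two = df.head(2)     # first 2 rows instead of the default 5
last_one = df.tail(1)      # last row only
n_rows, n_cols = df.shape  # shape is an attribute, not a method
```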
3. Selecting Rows and Columns
df["price"] # single column
df[["city","price"]] # multiple columns
df[df["price"]>12] # filter rows
df.query("city == 'LA'") # cleaner filtering
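Filters can be combined with & (and) and | (or), as long as each condition sits in its own parentheses. The sketch below, on made-up data, also shows .loc, which picks rows and columns in one step:

```python
import pandas as pd

# Illustrative data only
df = pd.DataFrame({
    "city": ["LA", "SF", "NYC", "MIA"],
    "price": [12.5, 13.0, 9.9, 15.2],
})

# Two conditions: parentheses are required around each one
mid_priced = df[(df["price"] > 12) & (df["price"] < 15)]

# Row filter plus column selection in a single call
expensive_cities = df.loc[df["price"] > 12, ["city"]]
```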
4. Sorting and Indexing
df.sort_values("price", ascending=False)
df.set_index("city")
df.reset_index()
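Sorting and index methods return a new DataFrame and leave the original untouched, so the result has to be assigned to be kept. A short sketch on made-up data:

```python
import pandas as pd

# Illustrative data only
df = pd.DataFrame({
    "city": ["LA", "SF", "NYC", "MIA"],
    "price": [12.5, 13.0, 9.9, 15.2],
})

sorted_df = df.sort_values("price", ascending=False)  # df itself is unchanged
indexed = df.set_index("city")                        # city becomes the row label
sf_price = indexed.loc["SF", "price"]                 # label-based lookup
restored = indexed.reset_index()                      # city is a column again
```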
5. Adding and Modifying Columns
df["price_with_tax"] = df["price"] * 1.09
df.rename(columns={"price_with_tax": "taxed_price"}, inplace=True)
6. Handling Missing Values
df.isna().sum()
df.fillna(0) # returns a new DataFrame; assign it to keep the result
df.dropna(subset=["price"]) # drops rows where price is missing
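To see what each of these calls does, here is a small sketch with one missing price. The data is made up, and filling with 0 is only one possible choice; a mean or median may suit your data better.

```python
import pandas as pd

# Illustrative data only: the SF price is missing
df = pd.DataFrame({
    "city": ["LA", "SF", "NYC"],
    "price": [12.5, None, 9.9],
})

missing_per_column = df.isna().sum()   # count of missing values per column
filled = df.fillna({"price": 0})       # fill only the price column
dropped = df.dropna(subset=["price"])  # remove rows with no price
```

Neither fillna nor dropna changes df here; both return new DataFrames.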
7. Removing Duplicates
df.drop_duplicates()
df.drop_duplicates(subset=["city","date"])
8. Grouping and Aggregating
df.groupby("city")["price"].mean()
df.groupby("city").agg(
avg_price=("price","mean"),
count=("price","size")
)
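The agg call above uses named aggregation: each keyword becomes an output column, built from a (column, function) pair. A worked sketch on made-up data:

```python
import pandas as pd

# Illustrative data only: two LA rows and one SF row
df = pd.DataFrame({
    "city": ["LA", "LA", "SF"],
    "price": [10.0, 14.0, 13.0],
})

summary = df.groupby("city").agg(
    avg_price=("price", "mean"),  # mean price per city
    count=("price", "size"),      # number of rows per city
)
```

The result is indexed by city, so summary.loc["LA"] returns the LA row.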
9. Reshaping Data
# Pivot table
pd.pivot_table(df, values="price", index="city", columns="date", aggfunc="mean", fill_value=0)
# Melt
pd.melt(df, id_vars="city", var_name="metric", value_name="value")
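Pivoting turns long data into a wide grid, and melting reverses it. The sketch below runs both directions on made-up data. The column names passed to melt are chosen here for illustration.

```python
import pandas as pd

# Illustrative long-format data: one row per city and date
df = pd.DataFrame({
    "city": ["LA", "LA", "SF"],
    "date": ["2025-08-01", "2025-08-02", "2025-08-01"],
    "price": [12.5, 13.0, 9.9],
})

# Long to wide: one row per city, one column per date
wide = pd.pivot_table(df, values="price", index="city",
                      columns="date", aggfunc="mean", fill_value=0)

# Wide to long: bring city back as a column, then unpivot the date columns
long_again = wide.reset_index().melt(id_vars="city", var_name="date",
                                     value_name="price")
```

Note that the round trip yields four rows rather than three, because fill_value=0 created an entry for the missing SF and 2025-08-02 combination.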
10. Changing Data Types
df["price"] = pd.to_numeric(df["price"], errors="coerce")
df["date"] = pd.to_datetime(df["date"])
df["city"] = df["city"].astype("category")
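With errors="coerce", values that cannot be parsed become NaN instead of raising an error. A small sketch on made-up values shows the effect, along with what datetime and category types give you:

```python
import pandas as pd

# Illustrative raw values as they might arrive from a messy file
raw_prices = pd.Series(["12.5", "n/a", "9.9"])
prices = pd.to_numeric(raw_prices, errors="coerce")  # "n/a" becomes NaN

dates = pd.to_datetime(pd.Series(["2025-08-01", "2025-08-10"]))
days_between = (dates.iloc[1] - dates.iloc[0]).days  # date arithmetic now works

cities = pd.Series(["LA", "SF", "LA"]).astype("category")
unique_labels = list(cities.cat.categories)          # distinct labels stored once
```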
11. Working with Strings
df["city"] = df["city"].str.strip().str.upper()
df[df["city"].str.contains("LA")]
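One thing to watch with str.contains is that it matches substrings, so "LA" also matches "DALLAS". A sketch on made-up values, comparing it with an exact match:

```python
import pandas as pd

# Illustrative values with stray spaces and mixed case
cities = pd.Series([" la ", "Dallas", "sf"])

cleaned = cities.str.strip().str.upper()  # "LA", "DALLAS", "SF"
substring_match = cleaned.str.contains("LA", regex=False)  # matches DALLAS too
exact_match = cleaned == "LA"                              # matches only LA
```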
12. Combining DataFrames
stores = pd.DataFrame({"city":["LA","SF"],"region":["West","West"]})
df.merge(stores, on="city", how="left")
# df1 and df2 stand for any two DataFrames with the same columns
pd.concat([df1,df2], axis=0, ignore_index=True)
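A left merge keeps every row from the left DataFrame and fills in NaN where the right side has no match, while concat stacks rows. A sketch on made-up data, where more_sales is a hypothetical second batch of rows:

```python
import pandas as pd

# Illustrative data only
df = pd.DataFrame({"city": ["LA", "SF", "NYC"], "price": [12.5, 13.0, 9.9]})
stores = pd.DataFrame({"city": ["LA", "SF"], "region": ["West", "West"]})

# Left merge: NYC has no match in stores, so its region is NaN
merged = df.merge(stores, on="city", how="left")

# Stack a second batch of rows and renumber the index from 0
more_sales = pd.DataFrame({"city": ["MIA"], "price": [15.2]})
stacked = pd.concat([df, more_sales], axis=0, ignore_index=True)
```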
13. Counting and Frequencies
df["city"].value_counts()
df["city"].nunique()
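value_counts can also return proportions instead of raw counts via normalize=True. A short sketch on made-up values:

```python
import pandas as pd

# Illustrative values only
cities = pd.Series(["LA", "LA", "SF", "NYC"])

counts = cities.value_counts()                # rows per city, most frequent first
shares = cities.value_counts(normalize=True)  # same, as fractions that sum to 1
n_unique = cities.nunique()                   # number of distinct cities
```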
14. Saving Your Data
df.to_csv("clean_data.csv", index=False)
df.to_parquet("clean_data.parquet") # requires pyarrow or fastparquet to be installed
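CSV files do not store column types, so dates come back as plain text unless you parse them again on load. A sketch of the round trip, writing to a temporary folder so it leaves no files behind (the data is made up):

```python
import os
import tempfile

import pandas as pd

# Illustrative data with a real datetime column
df = pd.DataFrame({
    "city": ["LA", "SF"],
    "date": pd.to_datetime(["2025-08-01", "2025-08-02"]),
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "clean_data.csv")
    df.to_csv(path, index=False)

    plain = pd.read_csv(path)                         # date is read as text
    parsed = pd.read_csv(path, parse_dates=["date"])  # date is datetime again

plain_is_datetime = pd.api.types.is_datetime64_any_dtype(plain["date"])
parsed_is_datetime = pd.api.types.is_datetime64_any_dtype(parsed["date"])
```

Parquet keeps column types, which is one reason to prefer it when the file will be read back into pandas.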
Conclusion
Learning these common Python DataFrame functions is the first step toward becoming confident in data analysis with pandas. Once you can load, inspect, filter, and summarize your data, you’ll be able to tackle more advanced analytics tasks like feature engineering, joining multiple datasets, and building dashboards.
Practice these functions with your own datasets, and you’ll quickly see how much faster and easier your analysis becomes. With this foundation, you’re ready to explore more advanced pandas capabilities, but the basics here will always be part of your toolkit.