
Python Pandas - Cheat Sheet

1. Installation/Import

Install Pandas
pip install pandas
Import Pandas
import pandas as pd

2. Reading in Data

Read in a CSV
df = pd.read_csv('data.csv')
Read in XLS/XLSX
df = pd.read_excel('data.xlsx', sheet_name='Sheet1')
Read in HTML table(s)
df = pd.read_html('data.html')[0]
Read in a SQL table
import sqlite3
con = sqlite3.connect("database.db")
df = pd.read_sql_query("SELECT * FROM table_name", con)
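
The readers above can be tried without any files on disk. The following is a minimal sketch, assuming an in-memory CSV string and a throwaway in-memory SQLite database; the table name `animals` and its columns are hypothetical and exist only for illustration.

```python
import io
import sqlite3

import pandas as pd

# CSV: read_csv accepts any file-like object, so a string buffer works too.
csv_text = "Animal,Age\nDog,3\nCat,4\nBird,5\n"
csv_df = pd.read_csv(io.StringIO(csv_text))

# SQL: build a small in-memory database (hypothetical table "animals").
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE animals (animal TEXT, age INTEGER)")
con.executemany(
    "INSERT INTO animals VALUES (?, ?)",
    [("Dog", 3), ("Cat", 4), ("Bird", 5)],
)
con.commit()

# read_sql_query works with a plain sqlite3 connection.
sql_df = pd.read_sql_query("SELECT animal, age FROM animals", con)
con.close()

print(csv_df)
print(sql_df)
```

Note that `pd.read_sql_table` expects a SQLAlchemy connectable rather than a raw `sqlite3` connection, which is why the sketch uses `pd.read_sql_query`.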

3. DataFrame Basics

Create a new DataFrame from a dictionary
df = pd.DataFrame({'Animal': ['Dog', 'Cat', 'Bird'], 'Age': [3, 4, 5]})
Create a new DataFrame from a list of lists
df = pd.DataFrame([['Dog', 3], ['Cat', 4], ['Bird', 5]], columns=['Animal', 'Age'])
Access the column names
df.columns
Access the values as a NumPy array
df.to_numpy()
Access a column
df['Animal']
Access multiple columns
df[['Animal', 'Age']]
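
As a rough illustration of what the accessors in this section return, the sketch below rebuilds the same example DataFrame and stores each result under a variable name chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Animal': ['Dog', 'Cat', 'Bird'], 'Age': [3, 4, 5]})

column_names = list(df.columns)   # ['Animal', 'Age']
values = df.to_numpy()            # 2-D NumPy array with shape (3, 2)
animal = df['Animal']             # single brackets return a Series
subset = df[['Animal', 'Age']]    # double brackets return a DataFrame

print(type(animal).__name__, type(subset).__name__)
```

The single versus double bracket distinction matters in practice: a Series and a one-column DataFrame behave differently in many operations.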

4. Data Wrangling

Select rows with a certain value
df[df['Animal'] == 'Dog']
Select rows with multiple conditions
df[(df['Animal'] == 'Dog') & (df['Age'] > 3)]
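
A short sketch of how these boolean filters behave on the example data. Each condition needs its own parentheses, and `&` / `|` are used instead of `and` / `or`. The `isin` call is an extra that the list above does not cover.

```python
import pandas as pd

df = pd.DataFrame({'Animal': ['Dog', 'Cat', 'Bird'], 'Age': [3, 4, 5]})

# Single condition: one matching row.
dogs = df[df['Animal'] == 'Dog']

# Both conditions must hold; the only dog is 3, so nothing matches.
old_dogs = df[(df['Animal'] == 'Dog') & (df['Age'] > 3)]

# Either condition may hold: keeps Dog (by name) and Bird (by age).
dog_or_older = df[(df['Animal'] == 'Dog') | (df['Age'] > 4)]

# Membership test against several values.
dogs_and_cats = df[df['Animal'].isin(['Dog', 'Cat'])]

print(len(dogs), len(old_dogs), len(dog_or_older), len(dogs_and_cats))
```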
Replace a value
df['Animal'] = df['Animal'].replace('Bird', 'Parrot')
Rename a column
df.rename(columns={'Animal': 'Species'}, inplace=True)
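Sort values (returns a new DataFrame)
df.sort_values(by='Age')
Drop a row (returns a new DataFrame)
df.drop(index=0)
Drop a column (returns a new DataFrame)
df.drop(columns='Age')
Drop duplicates (returns a new DataFrame)
df.drop_duplicates()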
Concatenate two DataFrames
df1 = pd.DataFrame({'Animal': ['Dog', 'Cat', 'Bird'], 'Age': [3, 4, 5]})
df2 = pd.DataFrame({'Animal': ['Dog', 'Cat', 'Fish'], 'Age': [3, 4, 5]})
df3 = pd.concat([df1, df2])
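
By default `pd.concat` keeps each input's original index, so the combined frame has repeated labels. The sketch below, using the same two example frames, shows one way to renumber the rows with `ignore_index=True` and then drop the rows that appear in both inputs; the variable names are chosen here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'Animal': ['Dog', 'Cat', 'Bird'], 'Age': [3, 4, 5]})
df2 = pd.DataFrame({'Animal': ['Dog', 'Cat', 'Fish'], 'Age': [3, 4, 5]})

# Default: index labels repeat (0, 1, 2, 0, 1, 2).
combined = pd.concat([df1, df2])

# ignore_index=True renumbers the rows 0..5.
renumbered = pd.concat([df1, df2], ignore_index=True)

# Dog/3 and Cat/4 occur in both frames; only the first copy of each is kept.
unique_rows = renumbered.drop_duplicates()

print(unique_rows)
```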

5. Resources

Python

Pandas