Data Analysis · Lesson 33 of 56

Pandas

Source: 10-Data Analysis With Python/10.2-pandas.ipynb

Start here — no coding background needed

What you will learn

Tables with rows and columns — like spreadsheets in code.

In simple words

A pandas `DataFrame` is a table that you can filter, sort, and analyze.

Pandas lets you do spreadsheet-style work in code, a common need in data jobs. Beginners: read the concepts first, then run the small examples.

Easy example — try this first

Run this example first, then change the values and press Run again.

Python

Runs in your browser via Pyodide — no server. First run may take a few seconds.

Reference notes (from full bootcamp)

Optional — deeper detail for when you are ready

Pandas: DataFrame and Series

Pandas is a powerful data manipulation library in Python, widely used for data analysis and data cleaning. It provides two primary data structures: Series and DataFrame. A Series is a one-dimensional array-like object, while a DataFrame is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns).
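
To make the description above concrete, here is a minimal sketch that builds one Series and one DataFrame and checks their dimensions. The variable names (`ages`, `people`) and the extra `Senior` column are illustrative assumptions, not part of the notebook.

```python
import pandas as pd

# A Series: one labeled, one-dimensional column of values.
ages = pd.Series([25, 30, 45], name="Age")

# A DataFrame: a two-dimensional table whose columns can hold different types.
people = pd.DataFrame({
    "Name": ["Anshul", "John", "Jack"],             # text column
    "Age": [25, 30, 45],                            # integer column
    "City": ["Bangalore", "New York", "Florida"],   # text column
})

print(ages.ndim)     # number of dimensions of the Series
print(people.ndim)   # number of dimensions of the DataFrame
print(people.shape)  # (rows, columns)

# "Size-mutable" means columns can be added after creation.
# "Senior" is a hypothetical column used only for this illustration.
people["Senior"] = people["Age"] > 40
print(people.shape)
```

Mixing text and number columns in one table is what "heterogeneous" refers to in the paragraph above.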

Reference example
Python
Output
Expected (from notebook):
Series 
 0    1
1    2
2    3
3    4
4    5
dtype: int64
<class 'pandas.core.series.Series'>
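
The notebook cell behind this output is not shown here. A sketch that would produce a Series like this one, assuming the data is a plain Python list (the names `data` and `series` are assumptions):

```python
import pandas as pd

# Build a Series from a list; pandas assigns the default index 0, 1, 2, ...
data = [1, 2, 3, 4, 5]
series = pd.Series(data)

print("Series \n", series)
print(type(series))
```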

Reference example
Python
Output
Expected (from notebook):
a    1
b    2
c    3
dtype: int64
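
Output with the labels `a`, `b`, `c` is consistent with building a Series from a dictionary, where the keys become the index. A minimal sketch under that assumption (variable names are illustrative):

```python
import pandas as pd

# Dictionary keys become index labels; values become the data.
data = {"a": 1, "b": 2, "c": 3}
series_dict = pd.Series(data)

print(series_dict)
print(series_dict["b"])  # look up a value by its label
```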

Reference example
Python
Output
Expected (from notebook):
a    10
b    20
c    30
dtype: int64
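
A Series like this can also be built by passing the labels separately through the `index` argument of `pd.Series`. This is a sketch of that approach; the notebook may have built it differently.

```python
import pandas as pd

# Values and labels supplied separately; they must have the same length.
data = [10, 20, 30]
labels = ["a", "b", "c"]
series_labeled = pd.Series(data, index=labels)

print(series_labeled)
```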

Reference example
Python
Output
Expected (from notebook):
    Name  Age       City
0  Anshul   25  Bangalore
1   John   30   New York
2   Jack   45    Florida
<class 'pandas.core.frame.DataFrame'>
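
One common way to get a table like this is a dictionary of lists, where each key is a column name. The sketch below assumes that approach; the values are taken from the output above.

```python
import pandas as pd

# Each key is a column name; each list holds that column's values.
data = {
    "Name": ["Anshul", "John", "Jack"],
    "Age": [25, 30, 45],
    "City": ["Bangalore", "New York", "Florida"],
}
df = pd.DataFrame(data)

print(df)
print(type(df))
```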

Reference example
Python
Output
Expected (from notebook):
    Name  Age       City
0  Anshul   32  Bangalore
1   John   34  Bangalore
2  Bappy   32  Bangalore
3   JAck   32  Bangalore
<class 'pandas.core.frame.DataFrame'>
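
A DataFrame can also be built from a list of dictionaries, one dictionary per row. This sketch assumes that is how the table above was made; the values, including the spelling `JAck`, are copied from the output.

```python
import pandas as pd

# Each dictionary is one row; the keys become the column names.
rows = [
    {"Name": "Anshul", "Age": 32, "City": "Bangalore"},
    {"Name": "John", "Age": 34, "City": "Bangalore"},
    {"Name": "Bappy", "Age": 32, "City": "Bangalore"},
    {"Name": "JAck", "Age": 32, "City": "Bangalore"},  # spelled as in the output
]
df_rows = pd.DataFrame(rows)

print(df_rows)
print(type(df_rows))
```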

Example Python
Python
import pandas as pd
df = pd.read_csv('sales_data.csv')  # load the sales data into a DataFrame
df.head(5)  # show the first five rows

Browser practice only — full example needs Python on your computer (files, Flask, threads, etc.).

Reference example
Python
Output
Expected (from notebook):
     Transaction ID        Date Product Category  \
235           10236  2024-08-23  Home Appliances   
236           10237  2024-08-24         Clothing   
237           10238  2024-08-25            Books   
238           10239  2024-08-26  Beauty Products   
239           10240  2024-08-27           Sports   

                                        Product Name  Units Sold  Unit Price  \
235  Nespresso Vertuo Next Coffee and Espresso Maker           1      159.99   
236                        Nike Air Force 1 Sneakers           3       90.00   
237           The Handmaid's Tale by Margaret Atwood           3       10.99   
238             Sunday Riley Luna Sleeping Night Oil           1       55.00   
239                       Yeti Rambler 20 oz Tumbler           2       29.99   

     Total Revenue         Region Payment Method  
235         159.99         Europe         PayPal  
236         270.00           Asia     Debit Card  
237          32.97  North America    Credit Card  
238          55.00         Europe         PayPal  
239          59.98           Asia    Credit Card  
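
The rows shown above carry index labels 235 to 239, so they look like the last five rows of the file, which is what `df.tail(5)` returns, rather than the first five from `df.head(5)`. Because `sales_data.csv` is not available in the browser, the sketch below reads the same five rows from an in-memory string instead of a file. The `csv_text` and `sales` names are assumptions.

```python
import io
import pandas as pd

# Stand-in for sales_data.csv: the five rows shown in the output above.
csv_text = """Transaction ID,Date,Product Category,Product Name,Units Sold,Unit Price,Total Revenue,Region,Payment Method
10236,2024-08-23,Home Appliances,Nespresso Vertuo Next Coffee and Espresso Maker,1,159.99,159.99,Europe,PayPal
10237,2024-08-24,Clothing,Nike Air Force 1 Sneakers,3,90.00,270.00,Asia,Debit Card
10238,2024-08-25,Books,The Handmaid's Tale by Margaret Atwood,3,10.99,32.97,North America,Credit Card
10239,2024-08-26,Beauty Products,Sunday Riley Luna Sleeping Night Oil,1,55.00,55.00,Europe,PayPal
10240,2024-08-27,Sports,Yeti Rambler 20 oz Tumbler,2,29.99,59.98,Asia,Credit Card
"""

# read_csv accepts a file path or any file-like object.
sales = pd.read_csv(io.StringIO(csv_text))

print(sales.head(2))  # first two rows
print(sales.tail(2))  # last two rows
```

Note that this small stand-in gets a fresh index starting at 0, unlike the labels 235 to 239 in the full file.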

Reference example
Python
Output
Expected (from notebook):
    Name  Age       City
0  Anshul   25  Bangalore
1   John   30   New York
2   Jack   45    Florida

Reference example
Python
Output
Expected (from notebook):
0    Anshul
1     John
2     Jack
Name: Name, dtype: object
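
Selecting a single column with square brackets returns a Series, which matches the `Name: Name` footer in the output above. A minimal sketch, assuming the same three-row table:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anshul", "John", "Jack"],
    "Age": [25, 30, 45],
    "City": ["Bangalore", "New York", "Florida"],
})

# Square brackets with one column name return that column as a Series.
names = df["Name"]

print(names)
print(type(names))
```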

Reference example
Python
Output
Expected (from notebook):
Name        Anshul
Age            25
City    Bangalore
Name: 0, dtype: object

Reference example
Python
Output
Expected (from notebook):
Name        Anshul
Age            25
City    Bangalore
Name: 0, dtype: object
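
The two identical outputs above are what you would see from `df.loc[0]` (select by label) and `df.iloc[0]` (select by position) on a table with the default index, where label and position happen to coincide. The sketch below shows both, plus a relabeled copy where the difference becomes visible. The `by_name` table is an assumption added for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anshul", "John", "Jack"],
    "Age": [25, 30, 45],
    "City": ["Bangalore", "New York", "Florida"],
})

print(df.loc[0])   # row whose index LABEL is 0
print(df.iloc[0])  # row at POSITION 0

# With names as the index, labels and positions no longer coincide.
by_name = df.set_index("Name")
print(by_name.loc["John"])  # select by label
print(by_name.iloc[1])      # select by position (second row)
```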

Reference example
Python
Output
Expected (from notebook):
    Name  Age       City
0  Anshul   25  Bangalore
1   John   30   New York
2   Jack   45    Florida

Reference example
Python
Output
Expected (from notebook):
45

Reference example
Python
Output
Expected (from notebook):
'Jack'

Reference example
Python
Output
Expected (from notebook):
'Florida'
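
Single values such as `45`, `'Jack'`, and `'Florida'` can be read with `df.at` (row label and column name) or `df.iat` (row and column positions). The exact calls in the notebook are not shown, so the ones below are an assumption that happens to give the same three values.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anshul", "John", "Jack"],
    "Age": [25, 30, 45],
    "City": ["Bangalore", "New York", "Florida"],
})

age = df.at[2, "Age"]    # row label 2, column "Age"
name = df.at[2, "Name"]  # row label 2, column "Name"
city = df.iat[2, 2]      # row position 2, column position 2

print(age, name, city)
```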

Reference example
Python
Output
Expected (from notebook):
    Name  Age       City
0  Anshul   25  Bangalore
1   John   30   New York
2   Jack   45    Florida

Reference example
Python
Output
Expected (from notebook):
    Name  Age       City  Salary
0  Anshul   25  Bangalore   50000
1   John   30   New York   60000
2   Jack   45    Florida   70000
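
A new column can be added by assigning a list to a new column name. A minimal sketch that would give the `Salary` column shown above, assuming the same starting table:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anshul", "John", "Jack"],
    "Age": [25, 30, 45],
    "City": ["Bangalore", "New York", "Florida"],
})

# Assigning to a new column name adds the column; the list needs one value per row.
df["Salary"] = [50000, 60000, 70000]

print(df)
```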

Reference example
Python
Output
Expected (from notebook):
    Name  Age       City
0  Anshul   25  Bangalore
1   John   30   New York
2   Jack   45    Florida
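
The `Salary` column is gone in this output, which is consistent with a column drop. The sketch below assumes `DataFrame.drop` was used. By default `drop` returns a new table and leaves the original alone; `inplace=True` changes the original instead.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anshul", "John", "Jack"],
    "Age": [25, 30, 45],
    "City": ["Bangalore", "New York", "Florida"],
    "Salary": [50000, 60000, 70000],
})

# axis=1 means "drop a column"; without inplace, df itself is unchanged.
without_salary = df.drop("Salary", axis=1)
still_has_salary = "Salary" in df.columns

# inplace=True modifies df directly and returns None.
df.drop("Salary", axis=1, inplace=True)

print(df)
```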

Reference example
Python
Output
Expected (from notebook):
    Name  Age       City
0  Anshul   26  Bangalore
1   John   31   New York
2   Jack   46    Florida
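
Each age above is one higher than before, which suggests arithmetic applied to the whole column at once. A sketch under that assumption:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anshul", "John", "Jack"],
    "Age": [25, 30, 45],
    "City": ["Bangalore", "New York", "Florida"],
})

# Arithmetic on a column applies to every row; no loop is needed.
df["Age"] = df["Age"] + 1

print(df)
```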

Reference example
Python
Output
Expected (from notebook):
   Name  Age      City
1  John   31  New York
2  Jack   46   Florida
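
Here the row labeled 0 is missing and the remaining rows keep their labels 1 and 2. That matches dropping a row by its index label, sketched below with `DataFrame.drop` (an assumption about the hidden cell).

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anshul", "John", "Jack"],
    "Age": [26, 31, 46],
    "City": ["Bangalore", "New York", "Florida"],
})

# The default axis is rows, so this removes the row with index label 0.
df.drop(0, inplace=True)

print(df)
print(list(df.index))  # remaining labels are not renumbered
```

If you want the labels renumbered from 0 afterwards, `df.reset_index(drop=True)` returns a copy with a fresh index.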

Example Python
Python
df = pd.read_csv('sales_data.csv')  # reload the sales data
df.head(5)  # show the first five rows

Browser practice only — full example needs Python on your computer (files, Flask, threads, etc.).

Reference example
Python
Output
Expected (from notebook):
Data types:
 Transaction ID        int64
Date                 object
Product Category     object
Product Name         object
Units Sold            int64
Unit Price          float64
Total Revenue       float64
Region               object
Payment Method       object
dtype: object
Statistical summary:
        Transaction ID  Units Sold   Unit Price  Total Revenue
count       240.00000  240.000000   240.000000     240.000000
mean      10120.50000    2.158333   236.395583     335.699375
std          69.42622    1.322454   429.446695     485.804469
min       10001.00000    1.000000     6.500000       6.500000
25%       10060.75000    1.000000    29.500000      62.965000
50%       10120.50000    2.000000    89.990000     179.970000
75%       10180.25000    3.000000   249.990000     399.225000
max       10240.00000   10.000000  3899.990000    3899.990000
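
Output of this shape typically comes from `df.dtypes` and `df.describe()`. Since the CSV file is not available in the browser, the sketch below uses a small stand-in table with five of the rows shown earlier in this lesson, so its numbers will differ from the 240-row summary above. The `sales` and `summary` names are assumptions.

```python
import pandas as pd

# Small stand-in for the sales data (a subset of the columns).
sales = pd.DataFrame({
    "Product Category": ["Home Appliances", "Clothing", "Books",
                         "Beauty Products", "Sports"],
    "Units Sold": [1, 3, 3, 1, 2],
    "Unit Price": [159.99, 90.00, 10.99, 55.00, 29.99],
    "Total Revenue": [159.99, 270.00, 32.97, 55.00, 59.98],
})

# dtypes lists the data type of each column.
print("Data types:\n", sales.dtypes)

# describe() summarizes the numeric columns: count, mean, std, min, quartiles, max.
summary = sales.describe()
print("Statistical summary:\n", summary)
```

Text columns such as `Product Category` are left out of the summary by default, which is why the notebook output lists only the numeric columns.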

Reference example
Python
Output
Expected (from notebook):
       Transaction ID  Units Sold   Unit Price  Total Revenue
count       240.00000  240.000000   240.000000     240.000000
mean      10120.50000    2.158333   236.395583     335.699375
std          69.42622    1.322454   429.446695     485.804469
min       10001.00000    1.000000     6.500000       6.500000
25%       10060.75000    1.000000    29.500000      62.965000
50%       10120.50000    2.000000    89.990000     179.970000
75%       10180.25000    3.000000   249.990000     399.225000
max       10240.00000   10.000000  3899.990000    3899.990000

Practice test — try yourself

Write your code and press Check. If the answer is wrong, the correct code is shown so you can copy and run it.

You learned "Pandas". Use print() to show: Done: Pandas

Hint: Use one print() with the exact text.

Python