Data Science - Python DataFrame

Create a DataFrame with Pandas

A data frame is a structured, table-like representation of data, organized in rows and columns.

Let's define a data frame with 3 columns and 5 rows with fictional numbers:

Example

import pandas as pd
d = {
    'col1': [1, 2, 3, 4, 7],
    'col2': [4, 5, 6, 9, 5],
    'col3': [7, 8, 12, 1, 11],
}
df = pd.DataFrame(data=d)

print(df)

Example Explained

  • Import the Pandas library as pd
  • Define the data, with its column names and row values, in a dictionary named d
  • Create a data frame using the function pd.DataFrame()
  • The data frame contains 3 columns and 5 rows
  • Print the data frame output with the print() function

We write pd. in front of DataFrame() to tell Python that we want to call the DataFrame() function from the Pandas library.

Be aware of the capital D and F in DataFrame!
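
To see why the capitalization matters, here is a minimal sketch. The lowercase spelling is deliberately wrong, and the small dictionary and the variable name lowercase_worked are illustrative only:

```python
import pandas as pd

d = {'col1': [1, 2, 3], 'col2': [4, 5, 6]}

# Correct spelling: capital D and capital F.
df = pd.DataFrame(data=d)
print(df)

# Wrong spelling: pandas has no attribute named "dataframe",
# so Python raises an AttributeError instead of building a data frame.
try:
    pd.dataframe(data=d)
    lowercase_worked = True
except AttributeError as err:
    lowercase_worked = False
    print("AttributeError:", err)
```

Python names are case-sensitive, so only the exact spelling DataFrame works.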

Interpreting the Output

This is the output:

   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6    12
3     4     9     1
4     7     5    11

We see that "col1", "col2" and "col3" are the names of the columns.

Do not be confused by the numbers 0-4 running down the left side. They are the row index and tell us the position of each row.

In Python, the numbering of rows starts with zero.
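
The column names and row positions can also be read directly from the data frame. The following is a small sketch using the lesson's data; the variable names column_names, row_labels and first_row are illustrative, not part of the original example:

```python
import pandas as pd

d = {
    'col1': [1, 2, 3, 4, 7],
    'col2': [4, 5, 6, 9, 5],
    'col3': [7, 8, 12, 1, 11],
}
df = pd.DataFrame(data=d)

# The column names, as a plain Python list.
column_names = list(df.columns)

# The row labels; by default pandas numbers the rows starting at 0.
row_labels = list(df.index)

# The row at position 0, which is the first row.
first_row = df.iloc[0]

print(column_names)
print(row_labels)
print(first_row['col1'])
```

Because numbering starts at zero, the last of the 5 rows sits at position 4.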

Now, we can use Python to count the columns and rows.

We can use df.shape[1] to find the number of columns:

Example

import pandas as pd

df = pd.DataFrame({
    "city": ["Vancouver", "Calgary", "Toronto"],
    "visits": [1200, 860, 2100],
    "signups": [156, 95, 252],
})

# df.shape is a (rows, columns) tuple, so index 1 is the number of columns
count_column = df.shape[1]
print(count_column)

We can use df.shape[0] to find the number of rows:

Example

import pandas as pd

df = pd.DataFrame({
    "city": ["Vancouver", "Calgary", "Toronto"],
    "visits": [1200, 860, 2100],
    "signups": [156, 95, 252],
})

# df.shape is a (rows, columns) tuple, so index 0 is the number of rows
count_row = df.shape[0]
print(count_row)
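
Since shape holds both counts, they can be read in one step. This sketch applies the idea to the 5-row, 3-column data frame from the start of the lesson, where the two numbers differ; the names n_rows and n_cols are illustrative:

```python
import pandas as pd

d = {
    'col1': [1, 2, 3, 4, 7],
    'col2': [4, 5, 6, 9, 5],
    'col3': [7, 8, 12, 1, 11],
}
df = pd.DataFrame(data=d)

# shape is a tuple: (number of rows, number of columns).
n_rows, n_cols = df.shape
print(df.shape)
print("rows:", n_rows)
print("columns:", n_cols)

# len() gives the same counts in a different way.
print(len(df) == n_rows)
print(len(df.columns) == n_cols)
```

Rows always come first in the tuple, which is why index 0 is the row count and index 1 is the column count.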

Why Can We Not Just Count the Rows and Columns Ourselves?

If we work with larger data sets with many columns and rows, counting them by hand is tedious and error-prone. Reading the counts from the data frame's shape attribute gives the correct numbers, no matter how large the data set is.
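
As a rough illustration, the sketch below builds a larger data frame from made-up numbers. The column names m1 to m12 and the sizes are hypothetical and chosen only for this example:

```python
import pandas as pd

# Hypothetical data: 12 columns named m1..m12, each holding 1000 numbers.
data = {f"m{i}": list(range(1000)) for i in range(1, 13)}
big = pd.DataFrame(data)

# Counting 1000 rows by eye is impractical; shape reports both counts at once.
n_rows, n_cols = big.shape
print(f"{n_rows} rows, {n_cols} columns")
```

The same two lines work unchanged whether the data frame has 5 rows or 5 million.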
