Pandas is a widely used data analysis and manipulation library for Python. It provides numerous functions and methods that expedite the data analysis and preprocessing steps.

Due to its popularity, there are lots of articles and tutorials about Pandas. This one will be one of them, but it focuses heavily on the practical side. I will do examples on a customer churn dataset that is available on Kaggle. The examples will cover almost all the functions and methods you are likely to use in a typical data analysis process.

Let's start by reading the csv file into a pandas dataframe.

    import numpy as np
    import pandas as pd

    df = pd.read_csv("/content/churn.csv")
    df.shape
    (10000,14)

    df.columns
    Index(, dtype='object')

1. The drop function is used to drop columns and rows. We pass the labels of the rows or columns to be dropped. The axis parameter is set to 1 to drop columns and to 0 to drop rows. The inplace parameter is set to True to save the changes in the dataframe itself. We dropped 4 columns, so the number of columns was reduced from 14 to 10.

2. We can read only some of the columns from the csv file. The list of columns is passed to the usecols parameter while reading. If you know the column names beforehand, this is better than dropping columns later on.

    df_spec = pd.read_csv("/content/churn.csv", usecols=)
    df_spec.head()

3. The read_csv function also allows reading only part of the file in terms of rows. The first option is to read the first n rows. Using the nrows parameter, we created a dataframe that contains the first 5000 rows of the csv file. We can also select rows from the end of the file by using the skiprows parameter. skiprows=5000 means that the first 5000 lines of the csv file are skipped while reading.

4. Sample. After creating a dataframe, we may want to draw a small sample to work with. We can use either the n parameter or the frac parameter to determine the sample size.

n: The number of rows in the sample
frac: The ratio of the sample size to the whole dataframe size

    df_sample = df.sample(n=1000)
    df_sample.shape
    (1000,10)

    df_sample2 = df.sample(frac=0.1)
    df_sample2.shape
    (1000,10)

5. The isna function determines the missing values in a dataframe. By using isna together with the sum function, we can see the number of missing values in each column.
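
The steps described above (drop, usecols, nrows and skiprows, sample, isna) can be tried without the Kaggle file. The following is a minimal sketch on a small synthetic CSV built in memory. The column names, the row count of 100, and the `read` helper are assumptions made for illustration and are not taken from the churn dataset.

```python
import io

import numpy as np
import pandas as pd

# Build a small synthetic CSV in memory. The column names are hypothetical
# stand-ins, not the real columns of the churn dataset.
rng = np.random.default_rng(0)
n_rows = 100
raw = pd.DataFrame({
    "RowNumber": np.arange(1, n_rows + 1),
    "CustomerId": np.arange(1000, 1000 + n_rows),
    "Age": rng.integers(18, 80, size=n_rows),
    "Balance": rng.uniform(0, 1e5, size=n_rows).round(2),
    "Exited": rng.integers(0, 2, size=n_rows),
})
raw.loc[[3, 7], "Balance"] = np.nan  # inject two missing values
csv_text = raw.to_csv(index=False)


def read(**kwargs):
    """Hypothetical helper: read the in-memory CSV with the given options."""
    return pd.read_csv(io.StringIO(csv_text), **kwargs)


df = read()

# 1. Drop columns: axis=1 targets columns, inplace=True modifies df itself.
df.drop(["RowNumber", "CustomerId"], axis=1, inplace=True)

# 2. Read only selected columns.
df_spec = read(usecols=["Age", "Balance", "Exited"])

# 3. Read part of the rows.
df_head = read(nrows=50)               # first 50 data rows
df_tail = read(skiprows=range(1, 51))  # skip data rows 1-50, keep the header

# 4. Sample by absolute size (n) or by fraction (frac).
df_sample = df.sample(n=10, random_state=42)
df_sample2 = df.sample(frac=0.1, random_state=42)

# 5. Count missing values per column.
missing = df.isna().sum()

print(df.shape, df_spec.shape, df_head.shape, df_tail.shape)
print(df_sample.shape, df_sample2.shape)
print(missing)
```

One design choice worth noting: an integer passed to skiprows counts lines from the top of the file, including the header line, so the sketch passes a range starting at 1 to keep the header row while skipping data rows.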