Data Science using Python

New batch starting from 2nd January (Timing 4:00 – 5:00 PM)

Topics to be covered

The course is divided into two parts:

A) Data Analysis and Visualization

B) Predictive Modeling using scikit-learn

A) Data Analysis and Visualization

1: NumPy: Working with N-dimensional arrays

  • Overview
  • Creating N-dimensional arrays
  • Why do we need arrays?
  • Numeric operations using NumPy
  • Indexing and slicing
  • Some Mathematical functions
  • Generating random arrays
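
As a taste of the topics above, here is a minimal sketch of array creation, element-wise operations, slicing, a couple of mathematical functions, and random arrays. The variable names and the seed value are illustrative assumptions, not course material.

```python
import numpy as np

# Create a 2-D array (3 rows, 4 columns) holding the integers 0..11.
a = np.arange(12).reshape(3, 4)

# Numeric operations are element-wise, so no explicit Python loop is needed.
doubled = a * 2

# Indexing and slicing: the second row, and the last two columns of every row.
second_row = a[1]
last_two_cols = a[:, -2:]

# Some mathematical functions: per-column mean and element-wise square root.
col_means = a.mean(axis=0)
roots = np.sqrt(a)

# Random array from a seeded generator (the seed 0 is an arbitrary choice).
rng = np.random.default_rng(0)
noise = rng.random((2, 3))  # uniform samples in [0, 1)
```

Seeding the generator keeps the random values the same from run to run, which makes exercises easier to check.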

2: Pandas: Data analysis and Manipulation

  • Pandas Overview
  • Data Structures
    • Series
    • DataFrame
  • Series and DataFrame operations
  • Missing Data
  • Categorical Data
  • Working on DateTime data
  • Reading data from different file formats
  • Merging and Grouping Data
  • Many other data operations using Pandas
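
The kind of workflow this module covers might look like the sketch below. The sales records, column names, and the mean-fill strategy are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records; one amount is missing on purpose.
sales = pd.DataFrame({
    "date": ["2020-01-02", "2020-01-02", "2020-01-03", "2020-01-03"],
    "city": ["Delhi", "Mumbai", "Delhi", "Mumbai"],
    "amount": [100.0, np.nan, 150.0, 200.0],
})

# Working on DateTime data: parse the strings into datetime64 values.
sales["date"] = pd.to_datetime(sales["date"])

# Missing data: fill the gap with the column mean (one of several options).
sales["amount"] = sales["amount"].fillna(sales["amount"].mean())

# Merging: attach a region to each row from a second, hypothetical table.
regions = pd.DataFrame({"city": ["Delhi", "Mumbai"], "region": ["North", "West"]})
merged = sales.merge(regions, on="city", how="left")

# Grouping: total amount per region.
totals = merged.groupby("region")["amount"].sum()

# Categorical data: store the repeated city labels as a category dtype.
merged["city"] = merged["city"].astype("category")
```

Filling with the mean is only one way to handle missing values; dropping the rows or filling with a constant are equally valid depending on the data.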

3: Matplotlib / Seaborn: Data visualization

  • Overview
  • Scatter plot, line plot, bar plot
  • Histogram
  • Axis labels, ticks, and title (xlabel, ylabel, xticks, yticks, title)
  • Marker style, type, and size
  • Figure and Subplot
  • Saving a Figure
  • Heatmap and box plot

4: Text analysis using NLTK

  • What is NLP?
  • NLP libraries
  • NLP Applications
  • Cleaning text data
  • Tokenization
  • Stop-word removal
  • Stemming and Lemmatization
  • Part-of-speech (POS) tagging

B) Predictive Modeling using scikit-learn

1: scikit-learn

  • Regression
    • Introduction
    • Simple Linear Regression
    • Multiple Linear Regression
    • Polynomial Regression
    • Evaluating the performance of a linear regression model
    • Overfitting and underfitting
    • Regularization
  • Logistic Regression
    • Logistic Regression theory
    • Implementing logistic regression with scikit-learn
    • Logistic Regression Parameters
    • MNIST digit dataset with Logistic Regression
    • Predictive modeling on adult income dataset
  • Naive Bayes Classification
    • Theory of the Naive Bayes algorithm
    • Feature extraction
      • CountVectorizer
      • TF-IDF
    • Email Spam filtering
    • Sentiment analysis
  • Decision Tree and Random Forest
    • The theory behind the decision tree
    • Implementing a decision tree with scikit-learn
    • Decision tree parameters
    • Combining multiple decision trees via Random forest
    • How random forest works
  • Model Evaluation and Parameter Tuning
    • Cross-validation via K-Fold
    • Tuning hyperparameters via grid search
    • Confusion matrix
    • Recall and Precision
    • ROC and AUC
  • Clustering and Dimension Reduction
    • K-means Clustering
    • Elbow method
    • Principal component analysis (PCA)
    • PCA step by step
    • Implementing PCA with scikit-learn
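
To illustrate how several of the topics above fit together, the sketch below tunes a logistic regression model with K-fold cross-validation and grid search, then builds a confusion matrix on held-out data. The choice of the Iris dataset, the candidate values for `C`, and the split sizes are assumptions made for this example, not part of the course material.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the data for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scale the features, then fit logistic regression.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Grid search over the regularization strength C (candidate values are arbitrary).
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}

# 5-fold stratified cross-validation on the training split only.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy")
search.fit(X_train, y_train)

# Evaluate the best model found on the held-out test split.
y_pred = search.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
test_accuracy = search.score(X_test, y_test)

print("Best C:", search.best_params_["clf__C"])
print("Confusion matrix:\n", cm)
print("Test accuracy:", test_accuracy)
```

Placing the scaler inside the pipeline means it is refit on each training fold, so no information from the validation fold leaks into preprocessing.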

Target Audience

The course can be taken by:

Students: Students pursuing professional graduate or post-graduate courses in computer science or information technology.

Teachers/Faculty: All computer science and engineering teachers and faculty members.

Professionals: IT professionals who wish to acquire new skills or improve their existing ones.

Test & Evaluation

1. During the program, participants must complete all assignments given to them for better learning.

2. At the end of the program, a final assessment will be conducted.


1. All successful participants will be provided with a certificate of completion.

2. Participants who do not complete the course or leave it midway will not be awarded a certificate.

  • Time-saving and cost-effective
  • Training by industry experts (corporate trainers with 10+ years of experience in the field)
  • Hands-on practical exposure for better understanding
  • Adds solid value to your professional career
  • Weekend doubt-clearing sessions

For inquiries, call: 9910043510

Online Live Training Program 2020