The GATE DA syllabus is the foundation on which every preparation decision should be built. Understanding which topics are included, how much each section typically contributes to the final paper, and which areas demand the deepest investment of time is what separates structured preparation from scattered effort.
GATE DA covers an intentionally broad and interdisciplinary range of subjects — from classical mathematics and statistics to modern machine learning and artificial intelligence. This combination is what makes the paper both distinctive and challenging. Candidates who approach it with the same depth they would bring to a pure mathematics or pure computer science exam tend to perform well. Candidates who treat it superficially across all sections tend to fall short.
This page breaks down every section of the GATE DA syllabus with topic-level detail, historical weightage insights, recommended books, and preparation sequencing guidance.
For the complete exam overview, visit the GATE DA Complete Guide. Track every topic as it is completed using the GATE DA Syllabus Tracker on AspirantMitraa, which provides a visual progress map across all sections.
GATE DA Syllabus Structure
The GATE DA paper is divided into two parts:
| Part | Marks |
| --- | --- |
| General Aptitude (GA) | 15 marks |
| Data Science and AI Core | 85 marks |
| Total | 100 marks |
The core 85 marks are distributed across seven subject sections. The number of questions and exact marks per section vary slightly year to year, but the proportional distribution has been relatively consistent.
Section 1: General Aptitude (15 Marks)
General Aptitude is common across all GATE papers. It consists of 10 questions — five 1-mark questions and five 2-mark questions.
Verbal Ability
- English grammar: subject-verb agreement, tense, prepositions, articles
- Sentence completion and fill in the blanks
- Verbal analogies and word groups
- Reading comprehension: inference, tone, main idea
- Critical reasoning: argument evaluation, logical deductions
Numerical Ability
- Basic arithmetic: percentages, ratios, proportions, averages
- Profit and loss, simple and compound interest
- Time and work, time and distance
- Data interpretation: bar graphs, pie charts, line graphs, tables
- Logical and analytical puzzles
- Number series and pattern recognition
Preparation note: GA is often approached as an afterthought by technical candidates, which is a strategic mistake. Scoring 13 to 15 out of 15 in GA is achievable with 3 to 4 weeks of focused practice and can make a material difference in the final score. Solve all GA sections from GATE DA PYQs and supplement with a standard aptitude book.
Recommended Book: A Modern Approach to Verbal and Non-Verbal Reasoning by R.S. Aggarwal (for practice); GATE official previous year GA questions for calibration.
Section 2: Probability and Statistics (Approx. 13 to 17 Marks)
Probability and Statistics is one of the two most heavily weighted technical sections in GATE DA, alongside Machine Learning, and the section that most distinguishes this paper from GATE CS. A candidate who excels here has a significant advantage.
Counting and Probability Fundamentals
- Counting principles: multiplication rule, addition rule, permutations, combinations
- Sample spaces and events
- Probability axioms and basic theorems
- Classical, empirical, and axiomatic probability
Conditional Probability and Independence
- Conditional probability definition and computation
- Multiplication rule for dependent events
- Independence of events and pair-wise independence
- Total probability theorem
- Bayes theorem and posterior probability
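As a small worked illustration of the total probability theorem and Bayes theorem listed above, the sketch below computes a posterior for a hypothetical diagnostic-test question. The function name and the numbers (prevalence, sensitivity, false-positive rate) are assumptions chosen for illustration, not values from any GATE paper.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) using Bayes theorem.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Total probability theorem: P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    # Bayes theorem: P(H|E) = P(E|H)P(H) / P(E)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false positives.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 4))
```

With these assumed numbers the posterior is only about 16%, because the prior is small. This base-rate effect is a recurring theme in conditional probability questions.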
Random Variables
- Discrete random variables: PMF, CDF
- Continuous random variables: PDF, CDF
- Expectation (mean), variance, standard deviation
- Functions of random variables
- Joint distributions: marginal and conditional distributions
- Covariance and correlation coefficient
Probability Distributions
Discrete distributions:
- Bernoulli and Binomial
- Geometric
- Poisson
Continuous distributions:
- Uniform
- Normal (Gaussian) — properties, standard normal, Z-scores
- Exponential
- Gamma (conceptual)
Descriptive Statistics
- Measures of central tendency: mean, median, mode
- Measures of dispersion: range, variance, standard deviation, IQR
- Skewness and kurtosis (conceptual)
- Percentiles and quartiles
- Box plots and their interpretation
Statistical Inference
- Point estimation: unbiasedness, consistency
- Confidence intervals: for mean (known and unknown variance)
- Hypothesis testing framework: null and alternative hypotheses, Type I and Type II errors
- z-test, t-test, paired t-test
- Chi-square test for goodness of fit and independence
- p-values and significance levels
Regression Analysis
- Simple linear regression: model, least squares estimation
- Multiple linear regression: matrix formulation, interpretation
- Coefficient of determination (R²)
- Residual analysis
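The least squares estimates and R² listed above can be sketched in a few lines of plain Python. This is a minimal illustration on a made-up five-point dataset; the function names `fit_line` and `r_squared` are hypothetical helpers, not part of any library.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for the model y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # slope = S_xy / S_xx, intercept = y_mean - slope * x_mean
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
r2 = r_squared(xs, ys, a, b)
print(a, b, r2)
```

For this dataset the fitted line is approximately y = 2.2 + 0.6x with R² of about 0.6, so the line explains roughly 60% of the variance in y.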
Recommended Books:
- Introduction to Probability by Dimitri Bertsekas and John Tsitsiklis
- Probability and Statistics for Engineers and Scientists by Walpole, Myers, Myers, and Ye
- Statistics by Freedman, Pisani, and Purves (for intuitive understanding)
Section 3: Linear Algebra (Approx. 10 to 14 Marks)
Linear Algebra is the mathematical language of machine learning, making it doubly important in GATE DA — it is tested both directly in this section and implicitly in the Machine Learning section.
Vectors and Vector Spaces
- Vectors in Rⁿ: addition, scalar multiplication, linear combinations
- Vector spaces and subspaces: definition, span, basis
- Linear independence and dependence
- Dimension of a vector space and rank of a linear map
- Inner products and orthogonality
- Gram-Schmidt orthogonalization (conceptual)
Matrices
- Matrix types: square, symmetric, skew-symmetric, orthogonal, diagonal, identity
- Matrix operations: addition, multiplication, transpose, inverse
- Determinants: properties, cofactor expansion
- Rank of a matrix: row rank, column rank equivalence
- Null space (kernel) and column space
- Trace of a matrix
Systems of Linear Equations
- Existence and uniqueness of solutions
- Gaussian elimination and back substitution
- Row echelon form and reduced row echelon form
- Homogeneous and non-homogeneous systems
- Overdetermined and underdetermined systems
Eigenvalues and Eigenvectors
- Definition and computation
- Characteristic polynomial
- Properties: trace = sum of eigenvalues, determinant = product of eigenvalues
- Eigendecomposition
- Symmetric matrices: real eigenvalues, orthogonal eigenvectors
- Applications: diagonalization, PCA foundations
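The trace and determinant properties listed above can be checked by hand for a 2×2 matrix, where the characteristic polynomial is λ² − (trace)λ + det = 0. The sketch below is a minimal illustration; `eigenvalues_2x2` is a hypothetical helper and the matrix is an arbitrary symmetric example.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial
    lambda^2 - trace*lambda + det = 0, solved with the quadratic formula."""
    trace = a + d
    det = a * d - b * c
    # cmath.sqrt handles a negative discriminant (complex eigenvalues).
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


# Arbitrary symmetric example: [[2, 1], [1, 2]]; eigenvalues are 3 and 1.
lam1, lam2 = eigenvalues_2x2(2, 1, 1, 2)
print(lam1, lam2)            # returned as complex numbers with zero imaginary part
print(lam1 + lam2)           # equals the trace, 4
print(lam1 * lam2)           # equals the determinant, 3
```

Because the example matrix is symmetric, both eigenvalues are real, which matches the symmetric-matrix property in the list.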
Matrix Factorizations
- LU decomposition (conceptual and applications)
- Singular Value Decomposition (SVD): definition, geometric interpretation
- SVD in dimensionality reduction and PCA
- Positive definite and positive semi-definite matrices
Recommended Books:
- Introduction to Linear Algebra by Gilbert Strang (essential)
- Linear Algebra and Its Applications by David Lay
Section 4: Calculus and Optimization (Approx. 8 to 12 Marks)
Calculus and optimization underpin how machine learning models are trained. Gradient descent is fundamentally a calculus operation. Understanding it mathematically, not just conceptually, is important for GATE DA.
Single-variable Calculus
- Limits: definition, L'Hopital's rule
- Continuity and differentiability
- Derivatives: rules (product, quotient, chain)
- Higher-order derivatives
- Maxima and minima: first and second derivative tests
- Mean Value Theorem and Rolle's Theorem
- Definite and indefinite integrals: standard forms, substitution, integration by parts
Multivariable Calculus
- Partial derivatives and their interpretation
- Gradient vector: definition and geometric meaning
- Directional derivatives
- Chain rule for multivariable functions
- Jacobian matrix (conceptual)
- Maxima, minima, and saddle points in 2D
- Lagrange multipliers for constrained optimization
Optimization
- Unconstrained optimization: gradient-based methods
- Gradient Descent: batch, stochastic (SGD), mini-batch
- Learning rate and convergence
- Convex functions and convex sets
- Convex optimization: why it matters in ML
- Newton's method (conceptual)
- Constrained optimization: KKT conditions (introduction)
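To make the gradient descent update concrete, here is a minimal sketch of batch gradient descent on a hypothetical convex function f(x, y) = (x − 3)² + 2(y + 1)², whose minimum is at (3, −1). The function, the learning rate, and the step count are all assumptions chosen for illustration.

```python
def grad(x, y):
    """Gradient of f(x, y) = (x - 3)**2 + 2 * (y + 1)**2."""
    return 2 * (x - 3), 4 * (y + 1)


def gradient_descent(x, y, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient from the starting point (x, y)."""
    for _ in range(steps):
        gx, gy = grad(x, y)
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y


x_min, y_min = gradient_descent(0.0, 0.0)
print(x_min, y_min)   # approaches the minimizer (3, -1)
```

With this learning rate each step shrinks the error in x by a factor of 0.8 and the error in y by a factor of 0.6, which is why the iterates converge. A learning rate of 0.5 or more would make the y update overshoot without converging, which illustrates the link between learning rate and convergence.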
Recommended Books:
- Calculus: Early Transcendentals by James Stewart
- Mathematics for Machine Learning by Deisenroth, Faisal, and Ong (Chapters 5 and 6 specifically)
Section 5: Programming, Data Structures and Algorithms (Approx. 10 to 14 Marks)
This section reflects GATE DA's requirement that data scientists be grounded in computational thinking. The emphasis is on Python programming, fundamental data structures, and algorithm design and analysis.
Programming in Python
- Data types: integers, floats, strings, lists, tuples, sets, dictionaries
- Control flow: if-else, loops (for, while), comprehensions
- Functions: definition, arguments, lambda functions, recursion
- Object-oriented programming: classes, objects, inheritance (basic)
- File handling and exception handling (basic concepts)
- Libraries: NumPy arrays (basic operations), Pandas DataFrames (basic)
Programming in C (Basic)
- Variables, data types, operators
- Control structures: loops and conditionals
- Arrays and pointers (basic)
- Functions and recursion
Data Structures
- Arrays and strings: operations, two-pointer techniques
- Stacks and queues: array and linked-list implementations
- Linked lists: singly, doubly, operations
- Trees: binary tree, binary search tree (BST), AVL tree (introduction), heap
- Graphs: adjacency matrix, adjacency list, BFS, DFS
- Hash tables: hash functions, collision resolution
Algorithm Analysis
- Time complexity and space complexity
- Big-O, Omega, Theta notation
- Recurrence relations: substitution, Master Theorem
- Best, average, and worst case analysis
Sorting and Searching
- Sorting: Bubble, Selection, Insertion, Merge Sort, Quick Sort, Heap Sort
- Sorting algorithm comparison: stability, time and space complexity
- Searching: Linear search, Binary search
- Binary search variations (first/last occurrence, rotated array)
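The first-occurrence variation mentioned above differs from plain binary search in one step: on a match, the search records the index and keeps looking to the left. A minimal sketch, assuming a sorted Python list (the function name is illustrative):

```python
def first_occurrence(arr, target):
    """Index of the first occurrence of target in sorted arr, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    result = -1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            result = mid       # remember this match...
            hi = mid - 1       # ...and keep searching the left half
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return result


print(first_occurrence([1, 2, 2, 2, 3, 5], 2))
```

The loop still halves the search range on every iteration, so the running time stays O(log n) even when the array contains many duplicates.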
Graph Algorithms
- BFS and DFS: applications (connected components, cycle detection)
- Shortest path algorithms: Dijkstra, Bellman-Ford
- Minimum Spanning Tree: Prim's and Kruskal's
- Topological sorting
Dynamic Programming
- Memoization and tabulation
- Classic problems: Fibonacci, 0-1 Knapsack, LCS, LIS
- Recursion to DP transformation
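The difference between memoization and tabulation is easiest to see on one problem solved both ways. The sketch below computes the length of the longest common subsequence (LCS) top-down with a cache and bottom-up with a table; the function names are illustrative.

```python
from functools import lru_cache


def lcs_memo(a, b):
    """Top-down LCS length: plain recursion plus a cache (memoization)."""
    @lru_cache(maxsize=None)
    def solve(i, j):
        if i == len(a) or j == len(b):
            return 0
        if a[i] == b[j]:
            return 1 + solve(i + 1, j + 1)
        return max(solve(i + 1, j), solve(i, j + 1))

    return solve(0, 0)


def lcs_table(a, b):
    """Bottom-up LCS length: fill a (len(a)+1) x (len(b)+1) table (tabulation)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


print(lcs_memo("ABCBDAB", "BDCABA"), lcs_table("ABCBDAB", "BDCABA"))
```

Both versions evaluate each (i, j) subproblem at most once, giving O(mn) time. The memoized version keeps the shape of the recursive definition, while the table version avoids recursion depth limits.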
Recommended Books:
- Data Structures and Algorithm Analysis in C by Mark Allen Weiss
- Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein (CLRS) — selected chapters
Section 6: Database Management and Warehousing (Approx. 5 to 8 Marks)
Database Management
- Entity-Relationship (ER) model: entities, attributes, relationships, cardinality
- Relational model: tables, keys (primary, foreign, candidate, super), constraints
- Relational algebra: select, project, join, union, difference, intersection
- SQL: DDL (CREATE, DROP, ALTER), DML (INSERT, UPDATE, DELETE), DQL (SELECT)
- SQL joins: INNER, LEFT, RIGHT, FULL OUTER
- Subqueries, aggregate functions (COUNT, SUM, AVG, MIN, MAX), GROUP BY, HAVING
- Views and indexing (basic)
- Functional dependencies and normalization: 1NF, 2NF, 3NF, BCNF
- Transaction management: ACID properties
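Several of the SQL items above (INNER JOIN, aggregate functions, GROUP BY, HAVING) can be practised together using Python's built-in `sqlite3` module. The schema and rows below are hypothetical, invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    CREATE TABLE score (student_id INTEGER REFERENCES student(id), marks INTEGER);
    INSERT INTO student VALUES (1, 'Asha', 'CS'), (2, 'Ravi', 'CS'), (3, 'Meera', 'EE');
    INSERT INTO score VALUES (1, 80), (1, 90), (2, 60), (3, 70);
""")

# Average marks per department, keeping only departments whose average exceeds 72.
rows = conn.execute("""
    SELECT s.dept, AVG(sc.marks) AS avg_marks
    FROM student AS s
    INNER JOIN score AS sc ON sc.student_id = s.id
    GROUP BY s.dept
    HAVING AVG(sc.marks) > 72
    ORDER BY s.dept
""").fetchall()
print(rows)
conn.close()
```

Note the order of evaluation: WHERE (absent here) filters rows before grouping, while HAVING filters groups after aggregation. In this data only the CS department passes the HAVING condition.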
Data Warehousing
- Data warehouse concepts: subject-oriented, integrated, time-variant, non-volatile
- OLAP vs OLTP
- Data marts
- Star schema and Snowflake schema
- Dimensions and facts
- ETL process: Extract, Transform, Load
- Aggregation and drill-down operations
Recommended Book: Database System Concepts by Silberschatz, Korth, and Sudarshan
Section 7: Machine Learning (Approx. 14 to 20 Marks)
Machine Learning is the highest-weightage and most dynamic section of GATE DA. It rewards candidates who understand not just what algorithms do but why they work mathematically.
Supervised Learning
Regression:
- Simple and multiple linear regression
- Polynomial regression
- Ridge (L2) and Lasso (L1) regression: motivation, effect on coefficients
- Interpretation of regression coefficients
Classification:
- Logistic regression: sigmoid function, log-loss, decision boundary
- Support Vector Machine (SVM): maximum margin classifier, support vectors, kernel trick
- Decision Trees: information gain, Gini impurity, splitting criteria, pruning
- Random Forests: bagging, feature importance, out-of-bag error
- Boosting: AdaBoost, Gradient Boosting, XGBoost (conceptual)
- K-Nearest Neighbors (KNN): distance metrics, classification and regression
- Naive Bayes: Gaussian, Bernoulli, Multinomial variants
Unsupervised Learning
- K-means clustering: algorithm, initialization (K-means++), convergence, choosing K
- Hierarchical clustering: agglomerative, dendrograms, linkage criteria
- DBSCAN (conceptual)
- Principal Component Analysis (PCA): variance maximization, eigenvalue decomposition, explained variance ratio
- Autoencoders (conceptual)
Model Evaluation and Selection
- Train-test split and cross-validation: k-fold, stratified k-fold, LOOCV
- Bias-variance tradeoff: mathematical decomposition
- Overfitting and underfitting: diagnosis and remedies
- Confusion matrix: TP, FP, TN, FN
- Metrics: Accuracy, Precision, Recall, F1 score, ROC curve, AUC
- Precision-Recall tradeoff
- Hyperparameter tuning: grid search, random search
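The metrics listed above all follow directly from the four confusion matrix counts. A minimal sketch, using hypothetical counts for a binary classifier (the function name is illustrative):

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)          # of predicted positives, how many are correct
    recall = tp / (tp + fn)             # of actual positives, how many are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1


# Hypothetical counts for illustration only.
acc, prec, rec, f1 = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(acc, prec, rec, f1)
```

For these counts accuracy is 0.85, precision is 0.80, recall is about 0.889, and F1 is about 0.842. The sketch assumes at least one predicted positive and one actual positive; otherwise precision or recall is undefined.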
Neural Networks
- Perceptron: McCulloch-Pitts model, learning rule
- Multi-layer Perceptron (MLP): architecture, forward pass
- Activation functions: Sigmoid, Tanh, ReLU, Leaky ReLU, Softmax
- Loss functions: MSE, Cross-Entropy
- Backpropagation: chain rule application, gradient computation
- Optimization: SGD, Momentum, Adam (conceptual)
- Regularization: L1/L2 weight decay, Dropout, Early stopping
Deep Learning (Conceptual Understanding)
- Convolutional Neural Networks (CNNs): convolution, pooling, feature maps
- Recurrent Neural Networks (RNNs): sequential data, vanishing gradient problem
- LSTM and GRU: gating mechanisms (conceptual)
- Batch Normalization (conceptual)
Recommended Books:
- Pattern Recognition and Machine Learning by Christopher Bishop (primary reference for theory)
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron (practical intuition)
- The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman (advanced reference)
- Mathematics for Machine Learning by Deisenroth, Faisal, and Ong (connects math and ML)
Section 8: Artificial Intelligence (Approx. 5 to 8 Marks)
Search Algorithms
- Uninformed search: BFS, DFS, Iterative Deepening DFS
- Informed search: Greedy Best-First Search, A* algorithm
- Heuristic functions: admissibility, consistency
- Local search: Hill Climbing, Simulated Annealing (conceptual)
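As an illustration of informed search, the sketch below runs A* on a small grid with unit step costs and the Manhattan distance as the heuristic. The grid, the function name, and the choice of returning only the path length are assumptions made to keep the example short.

```python
import heapq


def a_star(grid, start, goal):
    """Length of the shortest 4-connected path on a grid of 0 (free) / 1 (wall),
    or None if the goal is unreachable."""
    def h(cell):
        # Manhattan distance never overestimates on a unit-cost 4-connected grid,
        # so it is admissible (and also consistent).
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    best_g = {start: 0}                      # cheapest known cost to each cell
    frontier = [(h(start), 0, start)]        # entries are (f = g + h, g, cell)
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue                         # stale queue entry, skip it
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None


grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
print(a_star(grid, (0, 0), (2, 0)))
```

Setting the heuristic to zero everywhere would turn this into uniform-cost search (equivalent to BFS on unit costs), which is a useful way to relate the informed and uninformed algorithms in the list.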
Knowledge Representation
- Propositional logic: syntax, semantics, connectives, truth tables
- First-order logic: predicates, quantifiers, sentences
- Inference: modus ponens, resolution
- Knowledge bases and inference engines
Probabilistic Reasoning
- Bayesian networks: structure, conditional independence
- Probabilistic inference in Bayesian networks
- Markov models: basic concepts
Planning
- State space representation
- STRIPS-based planning (conceptual)
- Partial-order planning (conceptual)
Recommended Book: Artificial Intelligence: A Modern Approach by Russell and Norvig (Chapters 3, 4, 6, 13, 14)
Section-wise Weightage Summary
| Section | Approximate Marks |
| --- | --- |
| General Aptitude | 15 |
| Probability and Statistics | 13 to 17 |
| Machine Learning | 14 to 20 |
| Programming, DSA | 10 to 14 |
| Linear Algebra | 10 to 14 |
| Calculus and Optimization | 8 to 12 |
| Databases and Warehousing | 5 to 8 |
| Artificial Intelligence | 5 to 8 |
Preparation Priority Guide
Tier 1 — Maximum Time Investment
Probability and Statistics, Machine Learning, Linear Algebra
These three sections together account for 37 to 51 marks. A candidate who masters these has a near-certain path to a competitive score. None of these can be treated as secondary subjects.
Tier 2 — Strong Coverage Required
Programming and DSA, Calculus and Optimization, General Aptitude
These sections contribute 33 to 41 marks combined and are also required for qualifying. DSA questions are often more straightforward than in GATE CS, and GA can be scored highly with relatively less effort.
Tier 3 — Thorough but Efficient Coverage
Databases and Warehousing, Artificial Intelligence
These contribute around 10 to 16 marks combined. They should be covered completely but do not require the same depth as Tier 1 topics. For AI, a thorough reading of the relevant chapters from Russell and Norvig is usually sufficient.
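The tier totals quoted above follow from the approximate per-section ranges in the weightage summary. A quick check, with the ranges copied from that table (the variable names are illustrative):

```python
# Approximate (low, high) marks per section, copied from the weightage summary.
ranges = {
    "General Aptitude": (15, 15),
    "Probability and Statistics": (13, 17),
    "Machine Learning": (14, 20),
    "Programming, DSA": (10, 14),
    "Linear Algebra": (10, 14),
    "Calculus and Optimization": (8, 12),
    "Databases and Warehousing": (5, 8),
    "Artificial Intelligence": (5, 8),
}

tiers = {
    "Tier 1": ["Probability and Statistics", "Machine Learning", "Linear Algebra"],
    "Tier 2": ["Programming, DSA", "Calculus and Optimization", "General Aptitude"],
    "Tier 3": ["Databases and Warehousing", "Artificial Intelligence"],
}


def tier_range(sections):
    """Sum the low ends and the high ends of the given sections."""
    low = sum(ranges[s][0] for s in sections)
    high = sum(ranges[s][1] for s in sections)
    return low, high


totals = {name: tier_range(sections) for name, sections in tiers.items()}
for name, (low, high) in totals.items():
    print(f"{name}: {low} to {high} marks")
```

The low ends across all sections sum to 80 and the high ends to 108, so the ranges bracket the 100-mark paper; in any given year the sections cannot all sit at the same end of their range.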
Using the Syllabus Tracker
Systematic tracking of topic completion prevents a common problem: the feeling of "I think I've covered everything" just before the exam, when several topics are actually untouched.
The GATE DA Syllabus Tracker on AspirantMitraa lists every topic across all eight sections. Candidates can mark each topic as:
- Not started
- In progress
- Completed (reading only)
- Completed with PYQ practice
Using the fourth status — completed with PYQ practice — is the most valuable workflow. A topic is only truly prepared when the candidate can solve actual exam-style questions from it, not just when reading is done.
Pair the tracker with the GATE DA PYQ Master Bank to access topic-wise questions immediately after completing each topic.
How to Use PYQs with the Syllabus
After completing any topic from the syllabus, immediately solve all GATE DA PYQs tagged to that topic. Given that GATE DA only has PYQs from 2024, 2025, and 2026, every available question is highly valuable.
For overlapping topics (Programming, DSA, Linear Algebra, Probability), GATE CS PYQs provide additional practice material. As the GATE DA Complete Guide notes, the topic overlap in these subjects is substantial.
Frequently Asked Questions about GATE DA Syllabus
Q. Has the GATE DA syllabus changed since the paper was introduced?
Minor updates have been made in the Machine Learning and AI sections since 2023. The core mathematical sections (Probability, Linear Algebra, Calculus) have remained stable. Always verify the current year's syllabus from the official GATE notification.
Q. Is deep learning heavily tested in GATE DA?
Deep learning is tested at a conceptual and mathematical level rather than as implementation knowledge. Neural Networks, backpropagation, and basic CNN/RNN concepts appear regularly. Very advanced deep learning topics like transformers and attention mechanisms are not typically tested.
Q. Which is the hardest section in GATE DA?
This varies by background. For candidates from pure CS backgrounds, Probability and Statistics is usually the hardest. For candidates from Mathematics or Statistics backgrounds, the DSA and AI sections may require more effort.
Q. Are Python coding questions part of GATE DA?
GATE DA tests Python concepts, data structure usage, and algorithm logic but not actual code execution or debugging. Questions are typically multiple choice or NAT based on what a code snippet outputs or what the time complexity of an algorithm is.
Q. Is GATE DA syllabus the same as GATE CS syllabus?
No. GATE DA and GATE CS share some overlap in Programming, DSA, and Databases. However, GATE DA replaces large CS-specific sections (TOC, Compiler Design, Computer Organization, Computer Networks) with Probability and Statistics, Machine Learning, Calculus, and AI. They are fundamentally different papers.
More questions are answered on the GATE DA FAQ page.
Summary
The GATE DA syllabus spans eight sections covering General Aptitude, Probability and Statistics, Linear Algebra, Calculus and Optimization, Programming and DSA, Database Management, Machine Learning, and Artificial Intelligence. Machine Learning and Probability and Statistics are the highest-weightage sections and deserve the most preparation time.
Use the GATE DA Syllabus Tracker to track completion, practice with year-wise PYQs from 2024, 2025, and 2026, and take structured mock tests through the GATE DA Test Series for exam-level practice.