To succeed as a Data Scientist, one must possess the appropriate skills and qualities and develop relevant expertise. The first step towards becoming a Data Scientist is therefore to understand what a Data Scientist does every day. Roughly 40% of a Data Scientist's time goes into data-related work: understanding the problem and the business case, understanding, transforming and visualizing the data, doing exploratory analysis, examining null values and imputing them through suitable rules and logic. Another 40% is spent going through the many available algorithms, reviewing the logical and mathematical basis of the relevant ones, choosing the algorithm that fits the problem at hand, and then adopting, diagnosing and improving the selected algorithm and model to reach the best possible solution. The remaining 20% is spent on coding related to modelling.

- Introduction to RStudio, mathematical and logical operators in R, data types and data structures, simple operations and programs, matrix operations
- Data frames, string operations, factors, handling categorical data, lists and list operations
- Loops and conditional statements, the switch function and the break statement
- Apply functions: apply, sapply, lapply, tapply, mapply
- Statistical problem solving in R
- Visualizations in R
- Hands-on data manipulations: cleaning, sub-setting, sampling, data transformations and allied data operations
- Hands-on: Modelling with linear regression (continuous dependent variable (CDV)), logistic regression (discrete dependent variable (DDV)), SVM (DDV and CDV), decision trees (DDV and CDV), random forests (DDV and CDV), Naïve Bayes and clustering
- Evaluation and improvement of learning algorithms: train/validation/test concepts, model selection, diagnosing bias and variance, regularization and bias/variance, learning curves, error analysis and the trade-off between precision and recall, cross-validation concepts
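The precision and recall trade-off named in the last item can be made concrete with a small calculation. The sketch below is only an illustration, written in Python rather than R because the theory is the same on either platform; the function name `precision_recall` and the label lists are made up for this example and are not course material.

```python
def precision_recall(actual, predicted, positive=1):
    """Return (precision, recall) for one class treated as 'positive'."""
    pairs = list(zip(actual, predicted))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    # False negatives: predicted negative but actually positive.
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels, invented purely for illustration.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Lowering the decision threshold of a classifier usually raises recall at the cost of precision, which is why the two are evaluated together.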

- Concept of statistics, population, sample, parameter and statistic, examples of the use of statistics, data sources, representation of data, types of statistical analyses, sampling methods, types of variables, measures of central tendency, statistical estimation (point and interval), covariance, coefficient of correlation, formulae
- Permutations and combinations, probability concepts, types of probabilities, collectively exhaustive event sets, joint probability, Bayes' theorem, probability distribution for a discrete random variable, probabilistic view of variance and covariance
- Distributions: Bernoulli trial, binomial distribution, Poisson distribution, hypergeometric distribution, Student's t-distribution, chi-square distribution, F-distribution, normal distribution, explanation of how population parameters are derived from samples and the central limit theorem, Z score
- Hypothesis testing, single-parameter and two-parameter testing, one-sided and two-sided testing, p-value, tests and test statistics and the logic behind them, problems on hypothesis testing, diagnostic tests: goodness of fit, t-test, F-test and chi-square test, contingency tables, degrees of freedom, analysis of variance
- Regression and allied concepts, data transformation, linear and matrix algebra concepts
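As an illustration of how the Z score, p-value and two-sided testing topics above fit together, here is a minimal sketch of a one-sample z-test. It is written in Python rather than R, assumes the population standard deviation is known, and uses an invented sample; the function name `one_sample_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    n = len(sample)
    # Z score of the sample mean under the null hypothesis.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up measurements, used only to exercise the function.
sample = [52, 48, 55, 51, 49, 53, 50, 54]
z, p = one_sample_z_test(sample, mu0=50, sigma=2)
print(f"z={z:.3f} p={p:.4f}")
```

A small p-value (conventionally below 0.05) would lead to rejecting the null hypothesis that the population mean equals `mu0`.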

- Supervised, Unsupervised and Reinforcement Learning, geometry (lines, curves and 3D spaces) and visualisation of algebraic concepts
- Regression as a concept, simple one variable regression line, coefficients of the line, assumptions of linear regression, Gradient descent algorithm, cost function to find 'beta' values and concept, local and global minima, concept of learning rate
- Matrix representation of the problem, gradient descent for multiple features, use of feature scaling techniques in gradient descent, types of feature scaling, finding coefficients analytically, the normal equation and matrix non-invertibility
- Logistic regression model, matrix representation, general sigmoid function and graphical representation, decision boundary (linear and non-linear), metrics for logistic regression (accuracy, sensitivity, specificity and related concepts), receiver operating characteristic (ROC) curve, use of the ROC curve to find the optimum decision boundary, convexity and non-convexity of a group of points
- Optimization objective from logistic regression to support vector machines, large margin classifier, concepts behind large margin classification, kernels (concept, types and graphical explanations), using SVM
- Decision trees and random forests: concept, diagrammatic representation, random forest as a voting committee of decision trees, parameter meaning and explanation
- Naive Bayes: Venn diagrams, Naive Bayes algorithm, application and problems, Naive Bayes learning, Bayesian inference, Retail basket analysis; Concept of boosting and bagging
- Unsupervised learning methods/Clustering: K-means algorithm, optimization objective, graphical representation, random initialization, choosing number of clusters
- Association rule mining, K-nearest neighbours algorithm.
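The gradient descent topics in the list above (cost function, 'beta' values, learning rate) can be sketched for a one-variable regression line as follows. This is a minimal illustration in Python rather than R, not the course's own implementation; the function name, the default learning rate and the toy data are assumptions chosen so that the loop converges.

```python
def gradient_descent(xs, ys, learning_rate=0.05, iterations=5000):
    """Fit y = b0 + b1 * x by minimising the mean squared error cost."""
    b0, b1 = 0.0, 0.0  # starting guess for the 'beta' values
    n = len(xs)
    for _ in range(iterations):
        # Prediction error for every training point.
        errors = [b0 + b1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost J = sum(error^2) / (2n).
        grad_b0 = sum(errors) / n
        grad_b1 = sum(e * x for e, x in zip(errors, xs)) / n
        # Step against the gradient; the learning rate sets the step size.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


# Toy data lying exactly on the line y = 1 + 2x (invented for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

b0, b1 = gradient_descent(xs, ys)
print(f"intercept={b0:.4f} slope={b1:.4f}")
```

Too large a learning rate makes the updates overshoot and diverge, while too small a rate makes convergence slow. Because the squared-error cost of linear regression is convex, it has a single global minimum.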

- Text processing: term-document matrix, TF-IDF, word cloud, recommendation systems.
- Sentiment analysis: linear classifier, predicting sentiments, positive words, negative words, vocabulary building, scoring, training and evaluating a classifier.
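To give a feel for the TF-IDF item above, here is a minimal sketch in Python rather than R. TF-IDF has several variants; this one assumes term frequency is the count divided by document length and inverse document frequency is the natural log of (number of documents / documents containing the term). The function name `tf_idf` and the three sample documents are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document (non-empty documents assumed)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # Term frequency times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up review snippets, used only as sample input.
docs = ["good product good price", "bad product", "good service"]
scores = tf_idf(docs)
for doc, weight in zip(docs, scores):
    print(doc, "->", {t: round(w, 3) for t, w in weight.items()})
```

Terms that occur in few documents, such as "bad" or "price" here, receive higher weights than terms shared across documents, which is what makes them useful features for a sentiment classifier.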

- Our trainers' support for any problems you face in real-time work stays with you even after the course is completed.
- We do not believe in adding bullet points to the course content just for the sake of it. We go deep into whatever we teach.
- Our course is aligned with the above insight and focuses on building the participants' fundamentals.
- Our focus is not just on transferring information about which functions to use for specific problems, as many full-time, part-time and online courses in the market do. Our approach helps participants get started as genuine Data Scientists and prepares them for a journey of self-growth through the knowledge and wisdom passed on by practicing Data Scientists.
- Our course curriculum has a solid foundation in Statistics and Machine Learning and encourages problem solving in R programming. Participants of the LIPSINDIA Data Science course can expect to be ready for the data science industry without being dependent on any specific software platform.
- Our course has been designed keeping in view that the preferred platforms for implementing data science change, while the basic algorithms and philosophies do not change but evolve.

Date | Time (IST) | City | Location | Price (GST included)
---|---|---|---|---
2018-05-29 | 07:00 PM - 09:00 PM | Pune | 47/2, Sankla Arcade, First Floor (opposite BSNL Telephone Exchange), Nal Stop, Karve Road, Pune, Maharashtra 411004 | 27000

Date | Time (IST) | City | Location | Price (GST included)
---|---|---|---|---
2018-04-21 | 10:00 AM - 01:00 PM | Pune | 214, 2nd Floor, B Building, G-O SQUARE, Mankar Chowk, Wakad, Pune, Maharashtra 411057 | 27000
2018-05-26 | 02:30 PM - 05:30 PM | Pune | 214, 2nd Floor, B Building, G-O SQUARE, Mankar Chowk, Wakad, Pune, Maharashtra 411057 | 27000

Data Scientist has been called the sexiest job of the 21st century by Harvard Business Review. From 2000 to 2010, there was a boom in the internet-based market. As the internet became more accessible to the masses through various gadgets, more and more data was generated. Now is the time to analyse this data and introduce efficiencies into existing business processes. This era is therefore experiencing a boom in the job market for people who can handle and understand data and draw out insights that improve the business. To take advantage of this phase of the technology boom, one must undergo a good course on Data Science.

Data Science, like other groundbreaking technologies, is a philosophy. It rests on a substantial logical and mathematical foundation. To use machine learning and related techniques effectively, one must know the underlying theory. Platforms such as R, Python and MATLAB apply the same theory and provide similar functions for implementing data science solutions and analyses. Thus, if one knows the underlying theory and fundamentals, moving from one platform to another mostly requires learning the changes in syntax.

One must be prepared to learn many things very quickly. Data Science as a sector is moving fast, and one needs to keep learning to stay up to date. One must have an open mind towards mathematics and statistics and be ready to put in the effort required to understand the crucial concepts in order to benefit from the growth in this sector.

Data Science concepts have been deployed on many platforms. Depending on the complexity of the task at hand, one may use tools ranging from MS Excel and Tableau to R, Python, C++ and Java. In practice, some tasks are better executed in MS Excel than in R, and some are better executed in Python than in, say, MS Excel. Sound basic knowledge of data manipulation, transformation and analysis is what one needs in order to use the many available data science platforms in an integrated fashion. Generally speaking, knowing R programming or Python is enough to get started.

Participants of the LIPS Data Science program can expect to be equipped with the theory of the algorithms commonly used in industry, along with data manipulation, logic, statistics and mathematics, and to gain appropriate hands-on experience in handling the day-to-day work of a typical data scientist.

The job roles one might get after any course depend on many things, including the participant's background, previous experience, the effort put into learning what is taught in the course, and communication skills. As a generalisation, an average-performing participant of this course may expect roles matching profiles such as, but not limited to, Junior Data Scientist, Data Scientist, Business Analyst, Business Intelligence Officer, Senior Business Analyst, Data Science Programmer and Data Science Developer.

Succeeding in this program and securing the desired job role requires gaining substantial knowledge and wisdom along with appropriate hands-on experience. Participants should expect the program to be of average to high rigor.

The course duration is three months.
