Certificate in Data Science
- The course is based on a series of lectures and practical exercises.
- Participants will practice writing Python code to perform data science tasks.
- Participants will gain hands-on coding experience and receive handouts, practical exercises, and work samples that they can apply in their organizations.
- As part of the course, participants will complete a data science project from start to finish.
Participants will be able to:
- Clean, reshape, reformat, and describe data.
- Visualize data for desktop and web-based interactions.
- Identify and remove outliers.
- Make predictions using machine learning techniques.
- “Scrape” the internet to generate data sources.
- Visualize and analyze spatial and network data.
This course is designed for professionals who want to use data and predictive analytics to improve their decision-making. The intended audience includes technical professionals such as database managers, system administrators, business analysts, business intelligence specialists, geographic information system specialists, and web developers. Recommended prior knowledge includes data analysis in Excel and the basic concepts of correlation, probability, and statistics. Participants should have experience working with data stored in traditional relational database systems, and experience with an object-oriented programming language is preferred.
Training Program Content:
Fundamentals of data organization:
- Working with data filters and options.
- Handling missing values.
- Removing duplicate records.
- Performing correlations and transformations on data.
- Grouping data.
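As an illustration of the data organization topics above (handling missing values, removing duplicates, and grouping), here is a minimal sketch using only the Python standard library. The outline does not name a data-handling library, so the sample records, column names, and helper functions below are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records; None marks a missing value.
records = [
    {"region": "north", "sales": 120.0},
    {"region": "south", "sales": None},
    {"region": "north", "sales": 80.0},
    {"region": "south", "sales": 200.0},
    {"region": "north", "sales": 120.0},  # exact duplicate of the first row
]


def drop_duplicates(rows):
    """Keep the first occurrence of each identical row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing(rows, column):
    """Replace missing values in `column` with the mean of the observed values."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill_value = mean(observed)
    return [
        {**row, column: fill_value if row[column] is None else row[column]}
        for row in rows
    ]


def group_mean(rows, key, column):
    """Average `column` within each group defined by `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return {name: mean(values) for name, values in groups.items()}


cleaned = fill_missing(drop_duplicates(records), "sales")
summary = group_mean(cleaned, "region", "sales")
print(len(cleaned))      # 4 rows remain after removing the duplicate
print(summary["north"])  # 100.0
```

Filling with the mean is only one possible choice; dropping incomplete rows or using the median may be more appropriate depending on the data.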
Fundamentals of data visualization:
- Creating line, bar, and pie charts.
- Defining plot elements.
- Formatting and labeling plots.
- Plotting time series.
- Creating statistical plots.
Mathematics and Statistics Basics:
- Performing basic calculations.
- Using statistical methods to summarize data.
- Formulating summaries for categorical variables.
- Measuring relationships between variables.
- Transforming distributions.
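The summary and relationship topics above can be sketched with the standard library's `statistics` module. The data values and the `pearson` helper are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median, stdev

# Hypothetical numeric data: study hours and exam scores.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 70.0, 72.0]

# Hypothetical categorical data.
grades = ["pass", "pass", "fail", "pass", "fail"]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Summarize a numeric variable.
numeric_summary = {
    "mean": mean(scores),
    "median": median(scores),
    "stdev": stdev(scores),  # sample standard deviation
}

# Summarize a categorical variable by counting each level.
category_counts = Counter(grades)

# Measure the linear relationship between the two numeric variables.
r = pearson(hours, scores)

print(numeric_summary["mean"])   # 62.0
print(category_counts["pass"])   # 3
print(round(r, 3))
```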
Data Dimensionality Reduction:
- Machine learning.
- Factor analysis in Python.
- Dimensionality reduction with PCA.
Outlier Detection and Removal:
- Applying outlier analysis.
- Implementing multivariate analysis.
- Applying linear regression.
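One common univariate approach to outlier detection is the interquartile range (IQR) rule. The outline does not say which method the course uses, so the sketch below is an assumption chosen for illustration, with hypothetical data and a hypothetical `iqr_bounds` helper.

```python
from statistics import quantiles

# Hypothetical measurements with one suspicious value.
values = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 95.0, 10.0, 13.0, 12.0]


def iqr_bounds(data, k=1.5):
    """Return (low, high) fences placed k * IQR beyond the first and third quartiles."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


low, high = iqr_bounds(values)

# Points outside the fences are flagged; the rest are kept.
outliers = [v for v in values if v < low or v > high]
kept = [v for v in values if low <= v <= high]

print(outliers)  # [95.0]
```

The multiplier `k=1.5` is the conventional default; a larger value flags fewer points.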
Segmentation and Analysis within Data:
- K-Means clustering algorithm.
- Hierarchical clustering.
- Classification using supervised methods.
Fundamentals of network analysis:
- Plotting and editing graphs.
- Creating graphical representations.
- Applying directed network analysis in social network simulation.
- Quantitative description of graphs.
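A quantitative description of a directed graph usually starts with node degrees and density. The sketch below computes both with plain dictionaries; the edge list is hypothetical, and the outline does not name a graph library, so none is assumed here.

```python
# Hypothetical "follows" relationships in a small social network:
# each pair (source, target) is one directed edge.
edges = [
    ("ana", "ben"),
    ("ana", "cy"),
    ("ben", "cy"),
    ("cy", "ana"),
    ("dee", "ana"),
]

# Collect every node that appears as a source or a target.
nodes = sorted({node for edge in edges for node in edge})

# Count outgoing and incoming edges per node.
out_degree = {node: 0 for node in nodes}
in_degree = {node: 0 for node in nodes}
for source, target in edges:
    out_degree[source] += 1
    in_degree[target] += 1

# Density of a directed graph without self-loops:
# actual edges divided by the n * (n - 1) possible edges.
density = len(edges) / (len(nodes) * (len(nodes) - 1))

print(in_degree["ana"])   # 2
print(out_degree["ana"])  # 2
```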
Fundamentals of Algorithmic Learning:
- Linear regression.
- Logistic regression.
- Classification with Naive Bayes.
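To make the linear regression topic concrete, here is a minimal sketch of ordinary least squares for a single predictor, written with the standard library only. The data points and the `fit_line` and `predict` helpers are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Hypothetical training data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(xs, ys)
forecast = predict(5.0, slope, intercept)

print(round(slope, 2), round(intercept, 2))  # 1.94 1.15
```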
Interactive and Collaborative Data Visualization:
- Creating basic plots with Plotly.
- Creating statistical plots with Plotly.
- Creating maps with Plotly.
Web Scraping with Beautiful Soup:
- Extracting information from websites.
- Understanding database objects.
- Data analysis.
- Cleaning scraped web content.
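The extraction step can be illustrated without any third-party dependency. Note that the sketch below uses the standard library's `html.parser` as a stand-in for Beautiful Soup, not Beautiful Soup itself; the HTML snippet and the `LinkCollector` class are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs.
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            # Clean the text by collapsing runs of whitespace.
            text = " ".join("".join(self._text_parts).split())
            self.links.append((text, self._current_href))
            self._current_href = None


# Hypothetical page fragment; a real scraper would download this first.
html = """
<ul>
  <li><a href="/courses/python">  Python   basics </a></li>
  <li><a href="/courses/stats">Statistics</a></li>
  <li><a>No link here</a></li>
</ul>
"""

parser = LinkCollector()
parser.feed(html)
parser.close()
links = parser.links

print(links)
# [('Python basics', '/courses/python'), ('Statistics', '/courses/stats')]
```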