Data+AI Lab knowledge base
This website introduces data science and AI, drawing on the collective experience of the researchers in the Data+AI Lab. The scope is very broad: from a gentle introduction to the basics of data science to a deep dive into the fundamentals of explainable AI. A lot of great educational material already exists for most of the topics covered here, so we provide links for further reading wherever we can.
First and foremost, we intend this to be a practical site for students and researchers who are eager to learn more about data science or AI, or who are looking for ways to use these technologies in their projects.
The topics covered in this knowledge base are structured as follows:
Foundations
├── Mathematics & Statistics
│ ├── Linear algebra
│ ├── Probability theory
│ └── Statistical methods
├── Programming Fundamentals
│ ├── Python
│ ├── R
│ └── Version control
└── Computing infrastructure
  ├── Local vs cloud
  ├── GPU vs CPU computing
  └── Cloud computing
Machine Learning
├── Supervised learning
│ ├── Regression
│ ├── Classification
│ └── Time-series analysis
├── Unsupervised learning
│ ├── Clustering
│ ├── Dimensionality reduction
│ └── Anomaly detection
├── Self-supervised learning
├── Deep learning
│ ├── Neural networks
│ ├── Convolutional neural networks
│ ├── Recurrent neural networks and LSTMs
│ └── Transformers
└── Reinforcement learning
  ├── Q-Learning
  ├── Policy gradient methods
  └── Reinforcement learning applications
Data Engineering
├── Data acquisition
│ ├── Web scraping
│ ├── APIs and data integration
│ └── IoT sensor data
├── Data storage
│ ├── Databases
│ ├── Vector databases
│ ├── Object storage
│ └── Data repositories
├── Data processing
│ ├── ETL pipelines
│ ├── Data cleaning
│ ├── Data validation
│ └── Feature engineering
└── Data governance
  ├── Data quality
  ├── Metadata management
  └── Data privacy and compliance
Generative AI
├── Diffusion models
│ ├── Image generation
│ ├── Video generation
│ └── Audio generation
├── Large language models
│ ├── Architecture and training
│ ├── Model adaptation
│ ├── Prompt engineering
│ └── RAG & knowledge integration
├── Multimodal AI
│ ├── Interpreting images with a language model
│ └── Text to audio models
└── Generative AI ethics
  ├── Bias and fairness
  ├── Copyright and ownership
  ├── Misuse potential
  └── Governance frameworks
MLOps
├── Model development
│ ├── Model evaluation
│ ├── Hyperparameter optimization
│ └── Experiment tracking
├── Model deployment
│ ├── Model serving
│ ├── Edge deployment
│ └── Containerization
└── CI/CD for machine learning
  ├── Automated testing
  └── Deployment automation
Explainable AI
├── Interpretability methods
│ ├── Feature importance
│ ├── SHAP values
│ ├── LIME
│ └── Attention visualization
├── Fairness and bias
│ ├── Bias detection
│ ├── Mitigation strategies
│ └── Fairness
├── Model transparency
│ ├── Decision trees
│ ├── Counterfactual explanation
│ ├── Natural language explanation
│ └── Information retrieval
└── Regulatory compliance
  ├── GDPR & Right to Explanation
  ├── AI Governance
  ├── Documentation requirements
  └── Auditing frameworks
Applications
├── Research areas
│ ├── Natural language processing
│ ├── Computer vision
│ ├── Robotics
│ └── Scientific discovery
├── Ethics & Society
│ ├── AI safety
│ ├── Environmental impact
│ ├── Labor market effects
│ └── AI policy and regulation
└── Emerging trends
  ├── AI agents
  └── Foundation models