In recent years, large-scale machine learning models, including large language models (LLMs) and vision transformers, have significantly advanced the fields of natural language processing and computer vision. This seminar will provide an in-depth exploration of the challenges and strategies involved in training these massive models, which require enormous computational resources, data, and expertise.
Key topics include:
- Scaling Up Neural Architectures: Understanding the evolution from traditional neural networks to large-scale transformer models such as GPT, Llama, and Mistral.
- Data Requirements and Preprocessing: Techniques for managing vast amounts of training data, addressing biases, and ensuring quality.
- Training Infrastructure: Insights into the hardware and software requirements for training, including the use of distributed computing, GPUs, TPUs, and large-scale parallelization strategies.
- Optimization Techniques: Advanced methods for optimizing training, such as gradient accumulation, mixed precision training, and addressing issues like vanishing gradients and overfitting.
- Fine-Tuning and Transfer Learning: How to efficiently adapt large models for specific tasks with limited data and resources.
- Cost Management: Detailed analysis of the costs associated with training large-scale models, including computational, storage, and maintenance expenses, and strategies to optimize budgets.
- Business Model Development: Exploring how large-scale ML models can drive value creation, including use cases, market analysis, monetization strategies, and integrating AI into existing business models to generate sustainable revenue.
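To give a flavor of the optimization topics above: gradient accumulation rests on a simple identity, namely that the full-batch gradient equals a weighted sum of micro-batch gradients, which lets a model train with a large effective batch size on limited memory. A minimal sketch in plain Python, using a hypothetical toy model (y = w·x with squared error) purely for illustration:

```python
# Gradient accumulation sketch: the gradient over a full batch equals the
# size-weighted sum of gradients over its micro-batches.
# Toy model: y = w * x, loss L = mean((w*x - y)^2) over a batch.

def grad(w, batch):
    # dL/dw averaged over the batch
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 5.0), (4.0, 9.0)]
w = 0.5

# Full-batch gradient in a single pass.
full = grad(w, data)

# Same quantity accumulated over two micro-batches of size 2,
# each weighted by its share of the full batch.
micro_batches = [data[:2], data[2:]]
accumulated = sum(grad(w, mb) * (len(mb) / len(data)) for mb in micro_batches)

print(abs(full - accumulated) < 1e-12)  # the two gradients match
```

In practice, frameworks apply the same idea by summing scaled gradients over several forward/backward passes before a single optimizer step.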
This seminar is aimed at computer science students majoring in data science, AI, and machine learning who want to scale their models to meet the growing demands of real-world applications. Each student will conduct a thorough study of a specific model by reviewing its documentation, research papers, and performance metrics. They will present their findings in the seminar presentation and write a short report. Participants will leave with a solid understanding of current best practices and future directions in training large-scale machine learning models.
Organisation: The kick-off date will be announced soon. Information about the seminar, important dates, and further details will be available in Moodle: https://moodle.uni-saarland.de/course/index.php?categoryid=448
Requirements: Computer science master's students who have successfully completed the “Data Science” lecture. Familiarity with neural networks, basic knowledge of deep learning, and an understanding of machine learning workflows are expected.
Places: 6