cohort_structure

This repository contains detailed information about the structure of the AI Saturdays Lagos cohort.

Stars: 75

The Machine Learning (ML) Flipped Cohort is a 12-week structured program designed for beginners to gain foundational to intermediate ML knowledge. Participants consume pre-recorded content during the week and engage in weekly community discussions. The program covers topics such as Python, data science foundations, databases, math for ML, text processing, linear regression, non-linear modeling, deep learning basics, and more. Participants work on capstone projects and are assessed through Google Forms. Certification requires minimum attendance, assessment scores, and participation in the final project. The cohort provides a supportive learning environment with mentorship and community interaction.

README:

AI Saturdays Lagos C9 - Flipped

The Machine Learning (ML) Flipped Cohort is a structured, community-driven, 12-week Data Science and Machine Learning cohort designed for beginners. The goal is to equip participants with foundational to intermediate ML knowledge using a flipped classroom model: learners independently consume pre-recorded content during the week, then attend a weekly community call to discuss, explore, and ask questions about what they’ve learned.

We follow a flipped classroom model where:

Participants watch curated pre-recorded lectures and complete labs during the week.
Every Saturday, participants attend a community call to engage with one of the organizers.

What to Expect

Each week you will:

Be assigned selected videos (from a curated playlist of lectures and labs)
Receive supporting materials like Jupyter notebooks, slides, and assessments
Join a live Zoom session on Saturdays to engage with instructors and peers
Interact daily on Discord for Q&A, collaboration, and accountability

By the end of the cohort, you will:

Participate in capstone projects and present your solution to demonstrate real-world understanding
Earn a certificate if all conditions are met (see below)

Who is This For?

This cohort is ideal for:

Students and recent graduates exploring data science or ML
Career switchers with programming experience aiming to enter ML roles
Self-learners seeking structure, mentorship, and a community
You!

Prerequisite: Basic Python knowledge is expected

We’ll provide beginner-friendly Python resources during Week 1 for anyone needing a refresher.

Program Duration

The cohort will run for 10–12 weeks, broken down into:

10 weeks of structured learning
Capstone projects

Important Dates

Cohort Start Date: July 26, 2025
Cohort End Date: October 18, 2025
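
As a quick sanity check on the dates above, the span between them can be computed with Python's standard library. This is only an illustrative sketch; the variable names are made up for the example and are not part of the cohort materials.

```python
from datetime import date

# Dates copied from the "Important Dates" list above.
cohort_start = date(2025, 7, 26)
cohort_end = date(2025, 10, 18)

# Subtracting two dates yields a timedelta; .days gives the whole-day span.
span_days = (cohort_end - cohort_start).days
span_weeks = span_days // 7

# date.weekday() counts Monday as 0, so Saturday is 5.
starts_on_saturday = cohort_start.weekday() == 5
ends_on_saturday = cohort_end.weekday() == 5

print(span_days, span_weeks)                 # 84 12
print(starts_on_saturday, ends_on_saturday)  # True True
```

The span is exactly 12 weeks, and both dates fall on a Saturday, which matches the Saturday community calls.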

Tools & Platforms

Tool	Purpose	Link
GitHub	All materials, assignments, and resources	Cohort Repository
Gmail Group	Announcements & Notifications	AI6 Lagos Group
Zoom	Weekly community sessions & project demos	Link shared weekly
Discord	Daily interaction, Q&A, accountability & support	Join Discord
YouTube	Pre-recorded lectures & community session recordings	Pre-recorded Lectures & Lab, C9 - Weekly Community Sessions

Weekly Format

Each week will follow this schedule:

Sundays: An email listing the week's videos, labs, notebooks, and slides is sent to participants
Saturdays: Complete and submit assessments by 9:00 AM WAT
Saturdays: Attend a 2-hour community discussion via Zoom (10:00 AM to 12:00 PM WAT)

There will be an onboarding session on July 26th, at 10:00 AM WAT.

Cohort Overview

Week	Dates	Topics	Lectures	Labs	Assessment	Suggested Weekly Schedule
0	Jul 26	Onboarding & Kickoff	-	-	-	-
1	Jul 27 – Aug 2	Python & Numerical Computing	☘️Python Refresher: Lecture Video, Lecture Notebook ☘️Numerical Computing with Python and NumPy: Lecture Video, Lecture Notebook	-	Link	Mon: Python Refresher Lecture Wed: NumPy Lecture
2	Aug 3 – Aug 9	Data Science Foundations	☘️Introduction to Data Science: Lecture Video, Lecture Slides ☘️Data Collection and Scraping: Lecture Video, Lecture Slides	🍒Introduction to Git and GitHub: Lab Video, Lab Slides 🍒Data Collection and Scraping: Lab Video, Lab Notebook	Link	Mon: Intro to DS Lecture Tue: Intro + Git/GitHub Lab Wed: Data Collection Lecture Thur: Data Collection Lab
3	Aug 10 – Aug 16	Databases, SQL & Exploratory Data Analysis	☘️Relational Data: Lecture Video, Lecture Slides ☘️ Visualization and Data Exploration: Lecture Video, Lecture Slides	🍒Relational Data and SQL: Lab Video, Lab Notebook 🍒Data Exploration and Visualization: Lab Video, Lab Notebook	Link	Mon: Relational data Lecture Tue: Relational data Lab Wed: Data Exploration Lecture Thur: Data Exploration Lab
4	Aug 17 – Aug 23	Math for ML	☘️Linear Algebra: Lecture Video, Lecture Notebook, Lecture Slides	-	TBD	Mon: Linear Algebra Lecture Wed: Linear Algebra Notebook
5	Aug 24 – Aug 30	Text Processing	☘️ Free Text and Natural Language Processing: Lecture Video, Lecture Slides	🍒Text Processing: Lab Video, Lab Notebook		Mon: Free Text & NLP Lecture Wed: Text Processing Lab
			Project Checkpoint
6	Aug 31 – Sep 6	Linear Regression & Classification Models	☘️Introduction to Machine Learning & Linear Regression: Lecture Video, Lecture Slides ☘️Linear Classification: Lecture Video, Lecture Slides	🍒Linear Regression and Classification: Lab Video, Lab Notebook	TBD	Mon: Introduction to ML Lecture Wed: Linear Classification Lecture Thur: Linear Regression & Classification Lab
7	Sep 7 – Sep 13	Non-Linear Modeling & Interpretable ML	☘️Nonlinear Modeling, Cross-Validation: Lecture Video, Lecture Slides ☘️Decision Trees, Interpretable Models: Lecture Video, Lecture Slides	🍒Nonlinear Modeling: Lab Video, Lab Notebook	TBD	Mon: Nonlinear Modeling Lecture Tue: Nonlinear Modeling Lab Wed: Decision Trees Lecture
8	Sep 14 – Sep 20	Probabilistic Models	☘️Basics of Probability: Lecture Video, Lecture Slides ☘️Maximum Likelihood Estimation, Naive Bayes: Lecture Video, Lecture Slides	-	TBD	Mon: Basics of Probability Lecture Wed: MLE, Naive Bayes Lecture
9	Sep 21 – Sep 27	Unsupervised Learning & Recommendation Systems	☘️Unsupervised Learning: Lecture Video, Lecture Slides ☘️Recommendation Systems: Lecture Video, Lecture Slides	🍒Unsupervised Learning: Lab Video, Lab Notebook 🍒Recommendation Systems: Lab Notebook	TBD	Mon: Unsupervised Learning Lecture Tue: Unsupervised Learning Lab Wed: Recommendation Systems Lecture Thur: Recommendation Systems Lab
10	Sep 28 – Oct 4	Deep Learning Basics	☘️Introduction to Deep Learning: Lecture Video, Lecture Slides	🍒Neural Networks: Lab Video, Lab Notebook	TBD	Mon: Deep Learning Lecture Wed: Neural Network Lab
			Capstone Project Submission
12	Oct 18	Project Presentations	-	-	-	-

Assessments

Submitted via Google Forms
Deadline: 1 hour before the community call on Saturdays
Reviewed live during the discussion
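
The deadline rule above can be sketched as a small date calculation. This is a minimal illustration, assuming the community call starts at 10:00 AM WAT as in the weekly format and that the Week 1 call falls on Saturday, Aug 2, 2025; the function name `assessment_deadline` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# West Africa Time is UTC+1 and does not observe daylight saving.
WAT = timezone(timedelta(hours=1), name="WAT")


def assessment_deadline(call_start: datetime) -> datetime:
    """Return the submission deadline: one hour before the community call."""
    return call_start - timedelta(hours=1)


# Assumed example: the Week 1 call on Saturday, Aug 2, 2025 at 10:00 AM WAT.
call = datetime(2025, 8, 2, 10, 0, tzinfo=WAT)
deadline = assessment_deadline(call)

print(deadline.isoformat())  # 2025-08-02T09:00:00+01:00
```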

Assessment Grading

TBD

Certification Requirements

To receive a Certificate of Completion:

60% minimum attendance at community calls (tracked via Google Forms)
40% average assessment score
100% participation in the final project (submission required)
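
The three conditions above combine as a simple all-or-nothing check. The sketch below is only an illustration: the function name and parameters are hypothetical, the thresholds are assumed to be inclusive, and the way attendance and assessment scores are actually computed is not specified here (assessment grading is still TBD).

```python
def is_eligible_for_certificate(
    attendance_rate: float,
    average_score: float,
    submitted_final_project: bool,
) -> bool:
    """Check the three certification conditions.

    attendance_rate and average_score are fractions between 0.0 and 1.0.
    Thresholds are treated as inclusive, which is an assumption.
    """
    return (
        attendance_rate >= 0.60
        and average_score >= 0.40
        and submitted_final_project
    )


# Attended 7 of 10 calls, averaged 55%, submitted the project.
print(is_eligible_for_certificate(7 / 10, 0.55, True))   # True

# High scores cannot make up for low attendance.
print(is_eligible_for_certificate(0.50, 0.90, True))     # False

# The final project submission is mandatory.
print(is_eligible_for_certificate(0.90, 0.90, False))    # False
```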

Capstone Projects

TBD

Additional Learning Resources

You are encouraged to explore the following:

🙏 Acknowledgements

This cohort is built on the foundation laid by the incredible work from Cohort 8 (C8) — its lectures, labs, and community contributions. We are deeply grateful to the selfless volunteers who made it all possible: class instructors, lab facilitators, mentors, and countless others who gave their time and expertise.

Our community is fortunate to be supported by such a generous, talented, and inspiring group of individuals. Thank you for your continued impact.

✨ C8 Instructors (Alphabetical Order)

Afolabi Animashaun
Akintayo Jabar
Allen Akinkunle
Aseda Addai-Deseh
Deborah Kanubala
Ejiro Onose
Emefa Duah
Femi Ogunbode
Fortune Adekogbe
Foutse Yuehgoh
Funmito Adeyemi
Joscha Cüppers
Khadija Iddrisu
Kenechi Dukor
Lawrence Francis
Olumide Okubadejo
Oluwaseun Ajayi
Oluwatoyin Yetunde Sanni
Sandra Oriji
Steven Kolawole
Tejumade Afonja
Wuraola Oyewusi

☘️ C9 Organizing Team (Alphabetical Order)

This effort is brought to you by our amazing team of volunteers — thank you for your time, dedication, and leadership.

Adetola Adetunji
Ibrahim Gana
Jesuyanmife Egbewale (cohort lead)
Kenechi Dukor
Oluwafemi Azeez
Sharon Alawode
Simon Ubi
Tejumade Afonja

For Tasks:

explore data, build ml models, analyze text, present projects, engage in community discussions

For Jobs:

data analyst, machine learning engineer, data scientist, ai researcher, data engineer

Alternative AI tools for cohort_structure

Similar Open Source Tools

Documents-Parsing-Lab

A curated collection of Jupyter notebooks for experimenting with state-of-the-art OCR, document parsing, table extraction, and chart understanding techniques. This repository enables easy benchmarking and practical usage of the latest open-source and cloud-based solutions for document image processing.

GitHub stars: 63

motia

Motia is an AI agent framework designed for software engineers to create, test, and deploy production-ready AI agents quickly. It provides a code-first approach, allowing developers to write agent logic in familiar languages and visualize execution in real-time. With Motia, developers can focus on business logic rather than infrastructure, offering zero infrastructure headaches, multi-language support, composable steps, built-in observability, instant APIs, and full control over AI logic. Ideal for building sophisticated agents and intelligent automations, Motia's event-driven architecture and modular steps enable the creation of GenAI-powered workflows, decision-making systems, and data processing pipelines.

GitHub stars: 8.7k

sktime

sktime is a Python library for time series analysis that provides a unified interface for various time series learning tasks such as classification, regression, clustering, annotation, and forecasting. It offers time series algorithms and tools compatible with scikit-learn for building, tuning, and validating time series models. sktime aims to enhance the interoperability and usability of the time series analysis ecosystem by empowering users to apply algorithms across different tasks and providing interfaces to related libraries like scikit-learn, statsmodels, tsfresh, PyOD, and fbprophet.

GitHub stars: 9.3k

LlamaV-o1

LlamaV-o1 is a Large Multimodal Model designed for spontaneous reasoning tasks. It outperforms various existing models on multimodal reasoning benchmarks. The project includes a Step-by-Step Visual Reasoning Benchmark, a novel evaluation metric, and a combined Multi-Step Curriculum Learning and Beam Search Approach. The model achieves superior performance in complex multi-step visual reasoning tasks in terms of accuracy and efficiency.

GitHub stars: 215

Video-ChatGPT

Video-ChatGPT is a video conversation model that aims to generate meaningful conversations about videos by combining large language models with a pretrained visual encoder adapted for spatiotemporal video representation. It introduces high-quality video-instruction pairs, a quantitative evaluation framework for video conversation models, and a unique multimodal capability for video understanding and language generation. The tool is designed to excel in tasks related to video reasoning, creativity, spatial and temporal understanding, and action recognition.

GitHub stars: 1.3k

spark-nlp

Spark NLP is a state-of-the-art Natural Language Processing library built on top of Apache Spark. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. Spark NLP comes with 36000+ pretrained pipelines and models in more than 200+ languages. It offers tasks such as Tokenization, Word Segmentation, Part-of-Speech Tagging, Named Entity Recognition, Dependency Parsing, Spell Checking, Text Classification, Sentiment Analysis, Token Classification, Machine Translation, Summarization, Question Answering, Table Question Answering, Text Generation, Image Classification, Image to Text (captioning), Automatic Speech Recognition, Zero-Shot Learning, and many more NLP tasks. Spark NLP is the only open-source NLP library in production that offers state-of-the-art transformers such as BERT, CamemBERT, ALBERT, ELECTRA, XLNet, DistilBERT, RoBERTa, DeBERTa, XLM-RoBERTa, Longformer, ELMO, Universal Sentence Encoder, Llama-2, M2M100, BART, Instructor, E5, Google T5, MarianMT, OpenAI GPT2, Vision Transformers (ViT), OpenAI Whisper, and many more not only to Python and R, but also to JVM ecosystem (Java, Scala, and Kotlin) at scale by extending Apache Spark natively.

GitHub stars: 4.0k

go-interview-practice

The Go Interview Practice repository is a comprehensive platform designed to help users practice and master Go programming through interactive coding challenges. It offers an interactive web interface with a code editor, testing experience, and competitive leaderboard. Users can practice with challenges categorized by difficulty levels, contribute solutions, and track their progress. The repository also features AI-powered interview simulation, real-time code review, dynamic interview questions, and progressive hints. Users can showcase their achievements with auto-updating profile badges and contribute to the project by submitting solutions or adding new challenges.

GitHub stars: 1.3k

FlowDown-App

FlowDown is a blazing fast and smooth client app for using AI/LLM. It is lightweight and efficient with markdown support, universal compatibility, blazing fast text rendering, automated chat titles, and privacy by design. There are two editions available: FlowDown and FlowDown Community, with various features like chat with AI, fast markdown, privacy by design, bring your own LLM, offline LLM w/ MLX, visual LLM, web search, attachments, and language localization. FlowDown Community is now open-source, empowering developers to build interactive and responsive AI client apps.

GitHub stars: 363

Foundations-of-LLMs

Foundations-of-LLMs is a comprehensive book aimed at readers interested in large language models, providing systematic explanations of foundational knowledge and introducing cutting-edge technologies. The book covers traditional language models, evolution of large language model architectures, prompt engineering, parameter-efficient fine-tuning, model editing, and retrieval-enhanced generation. Each chapter uses an animal as a theme to explain specific technologies, enhancing readability. The content is based on the author team's exploration and understanding of the field, with continuous monthly updates planned. The book includes a 'Paper List' for each chapter to track the latest advancements in related technologies.

GitHub stars: 1.2k

Xwin-LM

Xwin-LM is a powerful and stable open-source tool for aligning large language models, offering various alignment technologies like supervised fine-tuning, reward models, reject sampling, and reinforcement learning from human feedback. It has achieved top rankings in benchmarks like AlpacaEval and surpassed GPT-4. The tool is continuously updated with new models and features.

GitHub stars: 982

ts-bench

TS-Bench is a performance benchmarking tool for TypeScript projects. It provides detailed insights into the performance of TypeScript code, helping developers optimize their projects. With TS-Bench, users can measure and compare the execution time of different code snippets, functions, or modules. The tool offers a user-friendly interface for running benchmarks and analyzing the results. TS-Bench is a valuable asset for developers looking to enhance the performance of their TypeScript applications.

GitHub stars: 162

deepfabric

DeepFabric is a CLI tool and SDK designed for researchers and developers to generate high-quality synthetic datasets at scale using large language models. It leverages a graph and tree-based architecture to create diverse and domain-specific datasets while minimizing redundancy. The tool supports generating Chain of Thought datasets for step-by-step reasoning tasks and offers multi-provider support for using different language models. DeepFabric also allows for automatic dataset upload to Hugging Face Hub and uses YAML configuration files for flexibility in dataset generation.

GitHub stars: 533

awesome-ai-efficiency

Awesome AI Efficiency is a curated list of resources dedicated to enhancing efficiency in AI systems. The repository covers various topics essential for optimizing AI models and processes, aiming to make AI faster, cheaper, smaller, and greener. It includes topics like quantization, pruning, caching, distillation, factorization, compilation, parameter-efficient fine-tuning, speculative decoding, hardware optimization, training techniques, inference optimization, sustainability strategies, and scalability approaches.

GitHub stars: 115

actor-core

Actor-core is a lightweight and flexible library for building actor-based concurrent applications in Java. It provides a simple API for creating and managing actors, as well as handling message passing between actors. With actor-core, developers can easily implement scalable and fault-tolerant systems using the actor model.

GitHub stars: 458

dinopal

DinoPal is an AI voice assistant residing in the Mac menu bar, offering real-time voice and video chat, screen sharing, online search, and multilingual support. It provides various AI assistants with unique strengths and characteristics to meet different conversational needs. Users can easily install DinoPal and access different communication modes, with a call time limit of 30 minutes. User feedback can be shared in the Discord community. DinoPal is powered by Google Gemini & Pipecat.

GitHub stars: 104

For similar tasks

phospho

Phospho is a text analytics platform for LLM apps. It helps you detect issues and extract insights from text messages of your users or your app. You can gather user feedback, measure success, and iterate on your app to create the best conversational experience for your users.

GitHub stars: 389

OpenFactVerification

Loki is an open-source tool designed to automate the process of verifying the factuality of information. It provides a comprehensive pipeline for dissecting long texts into individual claims, assessing their worthiness for verification, generating queries for evidence search, crawling for evidence, and ultimately verifying the claims. This tool is especially useful for journalists, researchers, and anyone interested in the factuality of information.

GitHub stars: 856

open-parse

Open Parse is a Python library for visually discerning document layouts and chunking them effectively. It is designed to fill the gap in open-source libraries for handling complex documents. Unlike text splitting, which converts a file to raw text and slices it up, Open Parse visually analyzes documents for superior LLM input. It also supports basic markdown for parsing headings, bold, and italics, and has high-precision table support, extracting tables into clean Markdown formats with accuracy that surpasses traditional tools. Open Parse is extensible, allowing users to easily implement their own post-processing steps. It is also intuitive, with great editor support and completion everywhere, making it easy to use and learn.

GitHub stars: 2.4k

spaCy

spaCy is an industrial-strength Natural Language Processing (NLP) library in Python and Cython. It incorporates the latest research and is designed for real-world applications. The library offers pretrained pipelines supporting 70+ languages, with advanced neural network models for tasks such as tagging, parsing, named entity recognition, and text classification. It also facilitates multi-task learning with pretrained transformers like BERT, along with a production-ready training system and streamlined model packaging, deployment, and workflow management. spaCy is commercial open-source software released under the MIT license.

GitHub stars: 30.7k

NanoLLM

NanoLLM is a tool designed for optimized local inference for Large Language Models (LLMs) using HuggingFace-like APIs. It supports quantization, vision/language models, multimodal agents, speech, vector DB, and RAG. The tool aims to provide efficient and effective processing for LLMs on local devices, enhancing performance and usability for various AI applications.

GitHub stars: 156

ontogpt

OntoGPT is a Python package for extracting structured information from text using large language models, instruction prompts, and ontology-based grounding. It provides a command line interface and a minimal web app for easy usage. The tool has been evaluated on test data and is used in related projects like TALISMAN for gene set analysis. OntoGPT enables users to extract information from text by specifying relevant terms and provides the extracted objects as output.

GitHub stars: 584

lima

LIMA is a multilingual linguistic analyzer developed by the CEA LIST, LASTI laboratory. It is Free Software available under the MIT license. LIMA has state-of-the-art performance for more than 60 languages using deep learning modules. It also includes a powerful rules-based mechanism called ModEx for extracting information in new domains without annotated data.

GitHub stars: 102

liboai

liboai is a simple C++17 library for the OpenAI API, providing developers with access to OpenAI endpoints through a collection of methods and classes. It serves as a spiritual port of OpenAI's Python library, 'openai', with similar structure and features. The library supports various functionalities such as ChatGPT, Audio, Azure, Functions, Image DALL·E, Models, Completions, Edit, Embeddings, Files, Fine-tunes, Moderation, and Asynchronous Support. Users can easily integrate the library into their C++ projects to interact with OpenAI services.

GitHub stars: 321

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

GitHub stars: 980

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

GitHub stars: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

GitHub stars: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

GitHub stars: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

GitHub stars: 2.9k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features:

Self-contained, with no need for a DBMS or cloud service.
OpenAPI interface, easy to integrate with existing infrastructure (e.g., Cloud IDE).
Supports consumer-grade GPUs.

GitHub stars: 32.1k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

GitHub stars: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

GitHub stars: 675