Deep Learning Indaba 2026 · Nigeria

Building Africa's Human Data Infrastructure

A capacity-building workshop at DLI 2026 on community-led dataset creation and sustainable AI pipelines, focused on strengthening the people, skills, and systems Africa's AI future depends on.

About the Workshop

The rapid growth of generative AI has intensified demand for high-quality datasets, yet progress in African AI remains constrained by gaps in human data infrastructure: the people, skills, and coordinated pipelines required to create, translate, validate, and maintain datasets for African languages and contexts.

Hosted by Tonative Africa at the Deep Learning Indaba 2026 in Nigeria, this session focuses on capacity building for creators, translators, validators, and annotators. Through short talks, breakout discussions, and collaborative roadmap design, participants will explore best practices for dataset creation, quality assurance, and long-term capacity development across African language communities.

The session aims to produce shared guidelines, identify priority challenges, and co-develop a roadmap for strengthening Africa's sovereign, sustainable, and locally owned AI ecosystems. All participants must be present in person.

📍

Location

Nigeria: Deep Learning Indaba 2026 (in-person only)

🗓️

Format

Forums and dialogues (in-person; virtual or hybrid not supported)

⏱️

Duration

1 hour, 30 minutes

🎫

Registration

Via the DLI 2026 conference registration process

📡

AV & Scheduling

Coordinated directly with the Indaba organising team

Session Agenda

Full schedule to be confirmed with the Indaba organising team.

🌍
Section 1 Β· ~3 min

Opening & Context

Welcome, session goals, and introduction to the Tonative Data Academy and African AI data challenges. Facilitated by Cynthia Amol.

🎙️
Section 2 · ~10 min

Guest Talk

A short invited talk highlighting challenges in African dataset creation, community capacity-building initiatives, and sustainable data pipelines.

🀝
Section 3 Β· ~35 min

Breakout Discussions

Participants split into groups around key pipeline stages (dataset creation and collection; translation, validation, and annotation; and dataset usage and evaluation) to identify challenges, needs, and opportunities.

🗺️
Section 4 · ~30 min

Collaborative Roadmap Building

Groups share key insights and co-develop a shared roadmap for strengthening capacity, improving coordination across language communities, and designing scalable data pipelines.

💬
Section 5 · ~10 min

Synthesis & Next Steps

Key takeaways, opportunities for collaboration, and post-Indaba follow-up plans, including a shared resource toolkit and a cross-community collaboration network.

Workshop Organisers

Alfred Kondoro

Head of Research, Tonative Africa

Lead Organiser

Alfred is a Tanzanian PhD researcher in Data Science at Hanyang University, Republic of Korea. He leads community-driven research initiatives at Tonative Africa aimed at strengthening African representation in AI, with work spanning NLP, HCI, and ICTD. His publications have appeared at EACL, AAAI, ACL, CHI, IMWUT, CIKM, CUI, and AfriCHI venues.

Sharon Ibejih

Founder, Tonative Africa

Organiser

Sharon is a Senior Data Scientist at Ignite Energy Access and the Founder of Tonative Africa. Her work focuses on NLP, data curation, and AI pipeline design for low-resource African languages. She holds an MSc in Data Science and has presented at AfricaNLP, NeurIPS WiML, ICLR workshops, Deep Learning Indaba, and CVPR.

Cynthia Amol

Co-Founder & Head of Data, Tonative Africa

Organiser

Cynthia is a PhD student in Computer Science and Google NLP Fellow at Maseno University, Kenya. A Deep Learning Indaba Alele-Williams Masters Award recipient, she leads the data validation pipeline at Tonative Africa and has co-organised workshops at NeurIPS, LREC-COLING, EACL, and CHI.

Chinenye Anikwenze

Engineering Lead, Tonative Africa

Organiser

Chinenye is a Software Engineer and Automation Specialist focusing on defensive infrastructure and AI safety. As Engineering Lead at Tonative Africa, she manages technical infrastructure for 400+ contributors. Her research on Semantic Collapse and the security of tonal languages was recently presented at AFLC 2026 and Impact Fellowship Summit IREX 2026.

Joy Olusanya

NLP Researcher & Training Manager, Tonative Africa

Organiser

Joy is a linguist and NLP researcher focusing on low-resource language technologies, multilingual NLP, and benchmark evaluation. She served as Workshop Chair for the CLRLC–LLMs Workshop at NeurIPS 2025 and is Founder and CEO of the Center for Low-Resource Languages and Cultures.

Armand Bukama

Social Manager, Tonative Africa

Organiser

Armand is a Congolese computer scientist from the DRC with a degree from the Catholic University of Bukavu. He leads community engagement, outreach, and communications at Tonative Africa, and works at the intersection of AI, electronics, and sustainable energy for underserved communities.

Faisal Muhammad Adam

Hausa Language Validation Lead, Tonative Africa

Organiser

Faisal is a lecturer and data science practitioner based in Kano, Nigeria, pursuing graduate studies in Applied Data Science at WorldQuant University. He serves as Hausa Language Validation Lead at Tonative Africa, coordinating contributors on multilingual dataset validation and quality assurance.

Godspraise Okechukwu

Project Lead, Tonative Research Group

Organiser

Godspraise is a software engineer and NLP researcher based in Nigeria. As Project Lead in the Tonative Research Group, he contributes to dataset creation, validation, and multilingual resource development for African languages, building community-driven data pipelines for low-resource languages.

Who Should Attend

This workshop is designed for a broad community of African AI practitioners and contributors.

🔬

Researchers

Working on African language technologies

✏️

Dataset Creators & Annotators

Building training data for African languages

🌐

Translators & Validators

Ensuring linguistic accuracy and contextual relevance

💻

Open-Source Contributors

Supporting community-driven AI tools and pipelines

🎓

Students & Educators

Working on data-centric AI in academic settings

🏗️

AI Practitioners

Building AI products and services for African users

📖

African Linguists

With an interest in data creation for their languages

Expected Outputs

The workshop aims to produce tangible, community-owned resources and connections.

📋

Community Guidelines

Shared guidelines for dataset creation and validation across African language communities.

🧰

Resource & Toolkit List

A curated list of tools, frameworks, and training resources for African language data pipelines.

🗺️

Capacity-Building Roadmap

An actionable roadmap for scaling capacity-building initiatives and governance structures.

🀝

Collaboration Network

A cross-community network linking language teams, researchers, and practitioners.

📄

Post-Indaba Summary Report

A public synthesis of insights and recommendations to inform future collaborations and publications.

Participation Requirements

What you need to know before attending

🎟️

In-Person Attendance Required

This workshop is exclusively delivered in person at DLI 2026 in Nigeria. Virtual or hybrid participation cannot be supported.

📝

Register via DLI 2026

Workshop participation is handled through the official Deep Learning Indaba 2026 registration. See the DLI website for registration deadlines and details.

📅

Speaker & Organiser Deadline

All confirmed speakers and organisers must finalise their participation by the ticket allocation deadline communicated by the DLI team.

🎙️

AV & Logistics

Audio-visual requirements and scheduling will be coordinated directly with the Indaba organising team ahead of the event.

Get Involved

Interested in collaborating, co-organising, or presenting at this workshop? Reach out to the Tonative team... We want to hear from you.