Commit cb74bcb

GSoC Week 3 report by Mebin Thattil (#232)
1 parent 5c698dc commit cb74bcb

1 file changed

+61
-0
lines changed

@@ -0,0 +1,61 @@
1+
---
2+
title: "GSoC ’25 Week 03 Update by Mebin J Thattil"
3+
excerpt: "Re-thinking training dataset structure"
4+
category: "DEVELOPER NEWS"
5+
date: "2025-06-21"
6+
slug: "2025-06-21-gsoc-25-mebinthattil-week3"
7+
author: "@/constants/MarkdownFiles/authors/mebin-thattil.md"
8+
tags: "gsoc25,sugarlabs,week03,mebinthattil,speak_activity"
9+
image: "assets/Images/GSOCxSpeak.png"
10+
---
11+
12+
<!-- markdownlint-disable -->
13+
14+
# Week 03 Progress Report by Mebin J Thattil
15+
16+
**Project:** [Speak Activity](https://github.com/sugarlabs/speak)
17+
**Mentors:** [Chihurumnaya Ibiam](https://github.com/chimosky), [Kshitij Shah](https://github.com/kshitijdshah99)
18+
**Assisting Mentors:** [Walter Bender](https://github.com/walterbender), [Devin Ulibarri](https://github.com/pikurasa)
19+
**Reporting Period:** 2025-06-14 - 2025-06-21
20+
21+
---
22+
23+
## Goals for This Week
24+
25+
- **Goal 1:** Fix the issue of the model continuously generating a chain of responses
26+
27+
---
28+
29+
## This Week’s Achievements
30+
31+
Note: _I'm officially on leave this week and next, but I have still been taking calls, attending meetings, and doing some light work in the background._
32+
33+
1. **Re-formatted the dataset to avoid generating chain of responses**
34+
- Previously, the dataset consisted of records of conversations between a student and a teacher. Each record had around 5-10 back-and-forth questions and interactions between the two.
35+
- Because we were training on this format, the model tried to replicate it, i.e., it would start generating a chain of back-and-forth questions and answers between the student and teacher. This is not what we want.
36+
- I initially kept it this way to teach the model better conversational flow, but this approach does more harm than good.
37+
- So I have broken up the conversations and restructured the dataset around the smaller pieces.
38+
- I will now fine-tune the model again on a subset of the restructured dataset and deploy it as a test (_this is yet to be done_).
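
The restructuring described above can be pictured with a minimal sketch. The report does not show the actual dataset schema or the splitting rule that was used, so everything here is an assumption made for illustration: the `role`/`content` fields, the `prompt`/`response` output keys, the helper name `split_conversation`, and the sample conversation are all hypothetical. The sketch simply turns one multi-turn record into one record per student/teacher exchange.

```python
from typing import Dict, List

# Hypothetical record shape: a conversation is a list of turns,
# each turn being {"role": "student" | "teacher", "content": "..."}.
Turn = Dict[str, str]


def split_conversation(turns: List[Turn]) -> List[Dict[str, str]]:
    """Split one multi-turn conversation into single-exchange records."""
    pairs: List[Dict[str, str]] = []
    pending_prompt = None
    for turn in turns:
        if turn["role"] == "student":
            # Remember the most recent student message as the prompt.
            pending_prompt = turn["content"]
        elif turn["role"] == "teacher" and pending_prompt is not None:
            # Pair the teacher reply with the pending student prompt.
            pairs.append({"prompt": pending_prompt, "response": turn["content"]})
            pending_prompt = None
    return pairs


# Made-up sample data, not taken from the real dataset.
conversation = [
    {"role": "student", "content": "Why is the sky blue?"},
    {"role": "teacher", "content": "Blue light scatters the most in the air."},
    {"role": "student", "content": "Why are sunsets red?"},
    {"role": "teacher", "content": "At sunset the light passes through more air."},
]

records = split_conversation(conversation)
```

Training on records like these shows the model one question and one answer at a time, which is one way to discourage it from writing both sides of a dialogue. The trade-off is that each record loses the earlier turns as context.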
39+
40+
41+
---
42+
43+
## Key Learnings
44+
45+
- The dataset's structure needs to change so that the model becomes more conversational and understands the nuances of a chain of conversation.
46+
47+
---
48+
49+
## Next Week’s Roadmap
50+
51+
- Train the model and evaluate it
52+
- Also try to run [my model](https://huggingface.co/MebinThattil/FT-Llama3.2-1B/tree/main), which is hosted on Hugging Face, via Sugar-AI.
53+
54+
---
55+
56+
## Acknowledgments
57+
58+
Thank you to my mentors, the Sugar Labs community, and fellow GSoC contributors for their ongoing support.
59+
60+
---
61+
