Autumn Internship Reflection - LLM RAG Model Dataset Improvement
Theme: LLM RAG Model Dataset Improvement
Department: Information Technology Research Institute, Information Network Division
Comparing Research and Industry Experience
Academic Research Approach
In university research, when studying new theories, I would:
- Select textbooks and read them slowly
- Study papers to grasp the field's content
- Write code while continuing to read papers, in a structured learning cycle
Previous Industry Experience (Summer Internship)
During my previous internship, I experienced corporate development where:
- I received design documents describing the improvements, then carried out the specified procedures and test experiments
- The hard part was understanding the design documents; after that, the work was mostly following the procedures by hand, which felt somewhat tedious
- Key learning: the importance of communication. Design documents alone weren't sufficient, and with complex products and models it was difficult for one person to understand everything
This Internship: Self-Directed Problem Solving
This internship was completely different: no design documents were provided. I had to:
- Think independently about how to approach challenges
- Try various solutions and consider countermeasures
- Discuss the effectiveness of each countermeasure and the order in which to carry them out
- Iterate repeatedly through trial and error
This was incredibly challenging and required significant effort.
Results and Learning Gaps
Actual Results
The improvements achieved were relatively modest, but reaching even these small improvements required extensive trial and error.
Knowledge Gaps Challenge
Working on completely unfamiliar content presented difficulties:
- It was unclear where to start studying and how deep to go
- Preliminary study relied on internet research and personal judgment
- Shallow understanding led to misaligned problem analysis and solution evaluation
- As a result, I initially went down incorrect solution paths
Learning Through Implementation
- Understanding deepened through hands-on work
- Problem factors became clearer through iteration
- Solution evaluation became more accurate
- I found approaches that appear to be correct, but fundamental improvements require understanding the project as a whole
- Time constraints remained a significant limitation
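Original Notes (translated from Japanese)
Autumn internship, 2 weeks
In my research at school, when I study a new theory, I pick a textbook and read it slowly and study papers, which gives me a rough grasp of the field. Then I write programs while continuing to read papers.
In my previous internship I experienced corporate development. I was given a design document for the improvements and carried out the improvement procedure and test experiments. The hard part was understanding the design document; after that, following it and doing the work by hand felt somewhat tedious. But the most important thing I learned then was the importance of communication.
In this internship there was no design document. I had to work out entirely on my own how to approach the task, try various things, come up with countermeasures, and proceed while discussing their effectiveness and order of execution. It took a great many iterations and was difficult.
As for the actual results, the improvement was, if anything, quite small, but I tried many things and struggled to get even that far. The content was completely new to me, and although I wanted to study in advance, I did not know where to start or how far to go. In the end my understanding was shallow and I went down the wrong route. As I worked, my understanding deepened, I identified the causes of the problem, and I found the right route, but fundamental improvement is impossible without understanding the whole project. There was not enough time!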
Key Professional Development Insights
🔬 Research vs Industry Problem-Solving
- Academic: Structured learning with clear theoretical foundations
- Industry: Iterative experimentation with real-world constraints and unclear starting points
🎯 Self-Directed Learning Challenges
- Scope Definition: Determining learning breadth and depth without clear guidance
- Knowledge Gaps: Shallow initial understanding leading to suboptimal solution paths
- Time Management: Balancing exploration with delivery timelines
💡 Technical Skills Development
- RAG Systems: Hands-on experience with retrieval-augmented generation models
- Dataset Optimization: Understanding data quality impact on LLM performance
- Iterative Problem-Solving: Learning to pivot approaches based on experimental results
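One way to make the effect of a dataset change measurable is to compare retrieval hit rate on a fixed set of test queries before and after the change. The sketch below is purely illustrative and is not the method used in the internship: the word-overlap retriever, the toy documents, the evaluation queries, and all function names are hypothetical stand-ins for a real RAG retriever and dataset.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the query; return top-k indices.

    This is a toy stand-in for a real retriever (e.g. an embedding search).
    Ties are broken in favor of the lower document index.
    """
    q = Counter(tokenize(query))
    scored = []
    for idx, doc in enumerate(documents):
        overlap = sum((q & Counter(tokenize(doc))).values())
        scored.append((overlap, -idx))
    ranked = sorted(scored, reverse=True)
    return [-neg_idx for _, neg_idx in ranked[:k]]


def hit_rate(eval_set, documents, k=1):
    """Fraction of queries whose expected document appears in the top-k."""
    hits = sum(
        1 for query, expected in eval_set
        if expected in retrieve(query, documents, k)
    )
    return hits / len(eval_set)


# Hypothetical dataset versions: the second replaces an uninformative entry.
baseline = [
    "reset password via settings page",
    "misc notes todo",
    "vpn setup guide for remote access",
]
improved = [
    "reset password via settings page",
    "printer driver install steps",
    "vpn setup guide for remote access",
]

# Hypothetical evaluation set: (query, index of the document that should match).
eval_set = [
    ("how to reset password", 0),
    ("install printer driver", 1),
    ("vpn remote access", 2),
]

print(f"baseline hit rate: {hit_rate(eval_set, baseline):.2f}")
print(f"improved hit rate: {hit_rate(eval_set, improved):.2f}")
```

Keeping the evaluation queries fixed across iterations means each experiment changes only the dataset, which makes it easier to tell whether a pivot actually helped.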
🚀 Future Growth Areas
- Project Context Understanding: Need for broader system comprehension
- Efficient Learning Strategies: Developing better frameworks for approaching unfamiliar domains
- Solution Evaluation: Improving accuracy in assessing approach effectiveness