Fall 2008 Presentation of V-Unit Projects

Time and Location: Tuesday, November 11, 2008, 4:00 – 6:00 p.m., Newell-Simon Hall 1507
Free pizza and drinks will be served.

V-Unit Faculty Coordinator Manuela Veloso will provide an introduction to the V-Unit program
and potential project ideas.

"English-Karen Translation System for Karen Refugees in Pittsburgh"
ThuyLinh Nguyen, Ph.D. Student, Language Technologies Institute
Faculty Advisor: Stephan Vogel

"Fast Food Image Database Collection and Recognition for Obesity Research"
Wen Wu, Ph.D. Student, Language Technologies Institute
Faculty Advisor: Jie Yang

"A Smart Optical Character Recognition for Inupiaq"
Shinjae Yoo, Ph.D. Student, Language Technologies Institute
Faculty Advisor: Lori Levine

Project Abstracts

English-Karen Translation System for Karen Refugees in Pittsburgh

In Pittsburgh, the Jewish Family & Children's Service (JFCS) organization provides support to 200 Karen people from Burma who are starting their lives in the United States. One of the biggest challenges these refugees face is the language barrier. The refugees do not speak English, and some are illiterate in their own language. Without a translator, communication between JFCS staff and the refugees becomes very difficult, especially when it comes to complicated matters such as transportation, health care, or housing.

Our group of one faculty member and four students in LTI proposes an English-Karen text-to-speech machine translation project. The goal of the project is to assist Karen refugees and JFCS by providing an English-Karen text-to-speech translation system. As an individual project, ThuyLinh is responsible for the component that translates English text to Karen text. The project gives her a chance to apply existing machine translation techniques, especially word segmentation, to a new language with limited resources.

Fast Food Image Database Collection and Recognition for Obesity Research

The goal of this V-Unit project, a collaboration with Intel-Pittsburgh, is to enhance the study, diagnosis, and treatment of obesity, a condition in which the natural energy reserve, stored in the fat of humans and other mammals, is increased to such a point that it promotes serious pathologic conditions. Accurate and passive acquisition of dietary data from free-living individuals is essential for a better understanding of the etiology of obesity and the development of effective weight management programs. Currently, self-reporting is the main method for data acquisition. Although self-reporting through questionnaires and structured interviews is widely used, numerous studies have demonstrated that the data it yields seriously underestimate food intake and thus do not accurately reflect the habitual behavior of individuals in real life. Accurate computer-based programs for food recognition do not yet exist.

In this project, an image database was collected of more than 100 fast foods sold by 9 well-known restaurant chains. Wen independently studied the problem of identifying foods from videos of eating (recorded by a low-cost web camera) using off-the-shelf vision and retrieval algorithms.

A Smart Optical Character Recognition for Inupiaq

Inupiaq is one of the endangered Alaskan languages, and research on Inupiaq, such as computational linguistics, is quite limited due to the lack of soft-copy materials. Having soft copies of language resources is the first step toward computational linguistic research and real-world applications such as grammar checkers, machine translation, and information retrieval. This computational linguistics research will also improve the usability of Inupiaq through, for example, better web search and error correction in editors. However, Inupiaq is a polysynthetic language, in which words are composed of many morphemes, so the dictionary-based error correction that usually works well for English will not work in this case. To overcome this problem, we propose a character n-gram based error correction method, which we believe will improve OCR (Optical Character Recognition) performance.
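
The general idea behind character n-gram based correction can be illustrated with a minimal sketch. This is not the project's actual method: the class and function names, the add-one smoothing, the single-character confusion table, and the toy English training strings are all assumptions chosen for illustration. The sketch scores each candidate spelling by how plausible its character trigrams are under a model trained on known text, so no word dictionary is required.

```python
import math
from collections import Counter


def char_ngrams(word, n=3):
    """Return the character n-grams of a word, padded with start/end markers."""
    padded = "^" * (n - 1) + word + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class CharNgramModel:
    """Character n-gram model with add-one smoothing (illustrative only)."""

    def __init__(self, words, n=3):
        self.n = n
        self.ngrams = Counter()    # counts of full n-grams
        self.contexts = Counter()  # counts of the (n-1)-character prefixes
        self.alphabet = {"$"}
        for w in words:
            self.alphabet.update(w)
            for g in char_ngrams(w, n):
                self.ngrams[g] += 1
                self.contexts[g[:-1]] += 1

    def log_prob(self, word):
        """Smoothed log-probability of the word's character sequence."""
        v = len(self.alphabet)
        return sum(
            math.log((self.ngrams[g] + 1) / (self.contexts[g[:-1]] + v))
            for g in char_ngrams(word, self.n)
        )


def candidates(word, confusions):
    """The word itself plus every single-character substitution allowed
    by a (hypothetical) table of characters the OCR engine confuses."""
    out = {word}
    for i, ch in enumerate(word):
        for alt in confusions.get(ch, ()):
            out.add(word[:i] + alt + word[i + 1:])
    return out


def correct(word, model, confusions):
    """Pick the candidate whose character sequence the model scores highest."""
    return max(sorted(candidates(word, confusions)), key=model.log_prob)


# Toy placeholder data (English strings, not Inupiaq text).
model = CharNgramModel(["tallow", "ballot", "wallet"])
confusions = {"1": ["l", "i"]}  # assumed OCR confusion: digit 1 for l or i
print(correct("ba1lot", model, confusions))  # prints "ballot"
```

Because the score depends only on character sequences, a long word built from familiar morphemes can still be corrected even if that exact word never appeared in the training text, which is the property that matters for a polysynthetic language.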

 

Speaker Bios

ThuyLinh Nguyen is a fourth-year Ph.D. student in the Language Technologies Institute at Carnegie Mellon University. Her research interests are in applying machine learning models to natural language processing problems. She is currently working in the statistical machine translation group under the supervision of Stephan Vogel. Before coming to Carnegie Mellon, Linh received her bachelor's degree in Computer Science from Hanoi National University in Vietnam and her Master of Logic degree from the University of Amsterdam in the Netherlands.

Wen Wu is a sixth-year Ph.D. student at Carnegie Mellon University's Language Technologies Institute. His research interests include information retrieval, multimedia, and vision, and their real-world applications such as in-car navigation systems. Wen received his undergraduate degree from Tsinghua University in China and his Master's degree from the National University of Singapore. Wen proposed his Ph.D. thesis, on multimedia technologies for landmark-based vehicle navigation, in January 2008. He is advised by Jie Yang.

Shinjae Yoo is a sixth-year Ph.D. student at Carnegie Mellon University's Language Technologies Institute, where he is working on his thesis on personalized email prioritization, advised by Yiming Yang. His research interests include text classification and clustering and their real-world applications. Shinjae received his undergraduate degree at Soongsil University in Korea and his Master's degree from Seoul National University.

 

The V-Unit (Learning to Build a Vision) gives graduate students an opportunity to think about ways for computer science and technology to address non-traditional problems dealing with society, development, and the environment. The V-Unit broadens students' perspectives on important challenges in developing technology and is a part of TechBridgeWorld at Carnegie Mellon University.

Manuela Veloso and M. Bernardine Dias are the faculty coordinators for the V-Unit.

For more information on V-Unit participants and their projects, please visit www.cs.cmu.edu/~vunit.

