I recently found out that tatoeba.org is a pretty nice resource for collecting parallel text in many languages. The main reason I love it is that the entire dataset is downloadable as a dump file, with all sentences released under a Creative Commons license (although some of the sentences contain mistakes). Specifically, [...]
It’s already the final day of ACL-HLT 2011. Overall, I enjoyed the conference very much: listening to new ideas, algorithms, and tasks, catching up with old acquaintances and friends, and meeting new researchers, whether from Japan, the U.S., or elsewhere. I especially liked the presidential speech by Kevin Knight at the banquet on the second day. What would [...]
The big difference in this year’s ACL is that the best papers were made public before the main conference started. Two of the best papers are graph-based, although I don’t think this is evidence that the recent research trend is toward graph-based models. An important thing here is that what is truly [...]
The (probably most important) international NLP conference, ACL-HLT 2011, is coming up next week. I’ve prepared my presentation (after receiving my colleagues’ nice comments), so I’m uploading the camera-ready paper and the PPT slides here. We short paper presenters have only 10 minutes each, excluding questions and answers. This limitation forces me to remove [...]
About the Author
Masato Hagiwara currently works at the Rakuten Institute of Technology in New York as a Senior Scientist. He has worked on search technologies at Google, Microsoft Research, and Baidu in the past. He is an expert in Natural Language Processing (NLP) and a lead translator of the O'Reilly book "Natural Language Processing in Python." A native speaker of Japanese, he has a good command of English and Chinese (Mandarin). For more information, see About Me.
Pages
- 100 NLP Papers
- About Me
- iconlang – new ideographic writing system for better visibility and legibility
- iconlang – new ideographic writing system for better visibility and legibility (Japanese version)
- Music
- Music for Language Fans
- NLTK Japanese Corpora – Japanese corpora usable with NLTK
- Python/Romkan – a Python library for converting between romaji and hiragana
- TinySegmenter in Python
- Complete Guide to Learning Chinese | Mastering Chinese Within One Year
- Rolled-R Clinic – a site for overcoming the rolled R together