I have been working on a weekend project called “jufsisku” for the past few weeks. This project is to build a search engine where you can look up Lojban-English translations using queries in these two languages. You can try out the search here: http://lojban.lilyx.net/jufsisku/ I have shown the demo to a group of Japanese-speaking lojbanist [...]
The Japanese morphological analyzer MeCab can also be directly called from Clojure, too, by using its Java binding. I have, however, come across some pitfalls related to JNI in the process, so I’ll describe how I’ve overcome them in the following so that everyone else doesn’t have to stumble over the same issues. The first [...]
I recently found out that tatoeba.org is a pretty nice resource for collecting parallel text in many languages. The major reason why I love it is that the whole data is downloadable as a dump file, with all the sentences being under the creative commons license (although there are some mistakes in the sentences). Specifically, [...]
Since Clojure is based on JVM, you can easily pick a publicly available library for Java (machine learning, multimedia processing, or whatever) and call it. Calling Java libraries is normally straightforward thanks to Clojure’s inter-operation functionalities, but you could spend hours reading the library’s API document and tweaking around your code accordingly, especially if you [...]
It’s only been one week since I began learning Clojure programming but I already started to like it! One of the many nice features of Clojure which I like the most is it’s parallel processing by “pmap” — a parallel version of “map”. Below is a simple demonstration how to use this parallel computation and [...]
About the Author
Masato Hagiwara currently works for Rakuten Institute of Technology in New York, as a Senior Scientist. Have worked on search technologies at Google, Microsoft Research, and Baidu in the past. Expert in Natural Language Processing (NLP). Also a lead translator of the O'Reilly book "Natural Language Processing in Python." A native speaker of Japanese. Good command of English and Chinese (Mandarin). For more information, see About Me.Calender
May 2012 M T W T F S S « Apr 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 Pages
- 100 NLP Papers
- About Me
- iconlang – new ideographic writing system for better visibility and legibility
- iconlang – 視認性・識別性向上のための新しい表意文字体系
- Music
- Music for Language Fans
- NLTK Japanese Corpora – NLTKで使える日本語コーパス
- Python/Romkan ローマ字とひらがなを相互に変換する Python用のライブラリ
- TinySegmenter in Python
- 中国語学習完全ガイド | 1年以内にマスターする中国語
- 巻き舌クリニック – みんなで巻き舌を克服するサイト