We have been looking for an apartment for the past few weeks and finally found a nice one on the Upper East Side. Even though this is not our first apartment hunt in the city, the whole process is always exciting but a bit tiring. We ended up finding a no-fee one on StreetEasy (we also had some brokers help with the search, but, you know, the average broker fee, 15% of the annual rent in Manhattan, is not chump change). In retrospect, it might have been a good idea to talk directly to the doormen and ask whether there were any vacant apartments in their buildings.
When we tried to file an application with the management company, we found that another application for the same apartment had already been filed just the previous night. We had no choice but to apply for our second-best option. But this is New York; anything can happen.
Last Friday, there was a talk by Prof. Liang Huang, who is now an assistant professor at the Graduate Center, City University of New York.
One of the advantages of living in New York is that you can attend many tech talks hosted by colleges and meet-ups. For example, in the last year alone, I attended talks by Ulrich Germann, Kevin Knight, and Chin-Yew Lin, to name a few, not counting the many researchers who visited our lab privately.
The talk was titled “Structured Perceptron with Inexact Search,” based on work that also appeared at NAACL 2012, and was about how to make sure the structured perceptron works well for problems where exact inference is hard (e.g., parsing).
He focused on the concept of a “violation,” which is an incorrect hypothesis that scores higher than the correct sequence. The update is modified so that it always fixes such a violation, which leads to more efficient learning.
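To make the idea of a “violation” concrete, here is a minimal sketch in Python. It is not the algorithm from the talk: the feature map, the function names, and the choice to count ties as violations (so that learning can start from all-zero weights) are my own assumptions for illustration. It only shows the core check, that the perceptron update is applied when an incorrect hypothesis scores at least as high as the correct sequence.

```python
from collections import Counter


def features(words, tags):
    """Hypothetical feature map: counts of (word, tag) and (prev_tag, tag) pairs."""
    feats = Counter()
    prev = "<s>"
    for word, tag in zip(words, tags):
        feats[("emit", word, tag)] += 1
        feats[("trans", prev, tag)] += 1
        prev = tag
    return feats


def score(weights, feats):
    """Dot product between the weight vector and a sparse feature vector."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def is_violation(weights, words, gold, hyp):
    """True if an incorrect hypothesis scores at least as high as the gold sequence.

    Assumption: ties count as violations so that training can start from zero weights.
    """
    if list(hyp) == list(gold):
        return False
    return score(weights, features(words, hyp)) >= score(weights, features(words, gold))


def update_if_violation(weights, words, gold, hyp):
    """Apply the standard perceptron update only when hyp is a violation."""
    if not is_violation(weights, words, gold, hyp):
        return False
    for f, v in features(words, gold).items():
        weights[f] = weights.get(f, 0.0) + v  # reward features of the gold sequence
    for f, v in features(words, hyp).items():
        weights[f] = weights.get(f, 0.0) - v  # penalize features of the hypothesis
    return True
```

In a real system, `hyp` would come from an inexact search such as beam search, and the check would typically be applied to prefixes of the sequence rather than only to complete outputs.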
Over the Labor Day weekend, we went up to Bear Mountain State Park. We hadn’t imagined we could do that without a car, but it turned out to be doable: we got off at Manitou station (which is less a station than a stop in the middle of nowhere) and walked across Bear Mountain Bridge. We were simply exhausted by the time we got on the Metro-North train to head back to the city.
レイバー・デーの週末には、日帰りでベア・マウンテン州立公園まで行ってきた。車無しでも何とかなることが分かった。Metro-North の Manitou という駅（駅というか、電車が何も無いところで突然停車する感じ）で下り、ベア・マウンテン・ブリッジを徒歩で渡って公園まで行く。帰る時にはクタクタになってしまった。
Over the weekend, I finished watching “Game of Thrones,” a fantasy TV drama series that aired on HBO this spring. My main reason for watching it was to hear how Dothraki is spoken. Dothraki is an artificial language constructed specifically for this series by David J. Peterson, a member of the Language Creation Society. The entire series ended with a big “to be continued,” and I cannot wait to watch Season 2.
週末にかけて、今春にHBOで放送したファンタジーTVドラマ “Game of Thrones” を見終わった。主な目的は、ドラマ中で話される「ドスラキ語」を聞いてみたかったこと。Language Creation Society の David J. Peterson によってこのドラマのためだけに設計された人工言語らしい。ドラマ自体は壮大な「続きオチ」で終わった。シーズン2が楽しみである。
There are several “Chinatowns” in New York City. We tried Vegetarian Dim Sum in the Chinatown in Manhattan, probably the most famous one, with some Chinese-speaking friends. It turned out to be my favorite so far, with cheap and healthy substitutes for seafood and duck.
Yesterday, I went to play golf with my colleagues in Scarsdale. One of the greatest advantages of living in New York is that you can play golf very cheaply. In Japan, playing a full round would cost hundreds of dollars, while it’s only 30 bucks or so here.
(Continued from the last post)
We (my wife and I) have been hosting study group meet-ups in New York, and thought it’d be nice to have experimental meet-ups for Japanese learners of Chinese. We made this video and the event web page, and planned to have meet-ups in three of the largest cities in Japan.
Unfortunately, there were few participants for Nagoya and Osaka, but we successfully held a 6-person study group in Tokyo and enjoyed practicing and sharing information.
Later I heard from some of my friends that they had wanted to participate but didn’t know it was being held in Tokyo, or couldn’t make it because they were not available on that day. We should definitely hold meet-ups in Japan again!
4. Mitoh Conference at mixi’s office, Shibuya, Tokyo
It was fun to hear about many interesting projects and presentations (my favorites were Mr. Tanaka Taisei from Geisha Tokyo Entertainment and Mr. Goto Masaki from BestTeacher) and get to know new people in the industry.
I talked about Unnatural Language Processing (here are the slides), which I think was well received. (Actually, I didn’t sleep at all on the airplane from NY to Tokyo, partly because I wanted to avoid jet lag by being awake when it was noon in Tokyo, but mainly because I wanted to finish my demos.)
5. NLP2012 at Hiroshima City University, Hiroshima
The main reason I went back to Japan was to attend NLP2012, the largest NLP conference in Japan, which was held in Hiroshima this year.
I was one of the program committee members (I spent two full days creating the whole program with my boss, which was a first for me), had to chair a session, and had two presentations to make, so it turned out to be the busiest conference I’ve ever attended in Japan.
My main duty during the conference as a PC member was to broadcast the tutorials and the invited talks on Ustream without any problems. Despite minor issues such as low-quality audio and illegible characters on slides, overall it went all right. I hope everybody enjoyed the broadcast. The sessions where I presented, especially the one on morphological analysis, were really popular, with a roomful of people (or at least some of them) vigorously discussing different perspectives.
All in all, it was a very fruitful event: I got to know many new researchers and students, and enjoyed the wonderful local specialties of Hiroshima!
It’s been almost one month since I came back from my once-a-year trip to Japan in March.
I had been too busy (i.e., too lazy to consider it a top priority) to write anything about it, but now let me briefly note down some of the events I participated in before my memory fades.
1. IPSJ Special Event at Nagoya Institute of Technology
I attended and gave a presentation at a special event titled “Real-world Natural Language Processing.”
My talk was about the “Unnatural Language Processing”-related activities I’ve been involved in over the past few years, and I included catchy examples explaining how to “decode” Gal-Moji (a cryptic Japanese writing style based on letter substitution, used by high school girls) and “Cambridge” sentences (one kind of typoglycemia).
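For readers who haven’t seen a “Cambridge” sentence, the following small Python sketch generates one. It is only an illustration written for this post, not the code from my talk: the function names are made up, and punctuation attached to a word is simply treated as part of the word. Each word keeps its first and last letters while the letters in between are shuffled.

```python
import random


def scramble_word(word, rng):
    """Shuffle the interior letters of a word, keeping the first and last in place."""
    if len(word) <= 3:
        return word  # nothing to shuffle in very short words
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_sentence(sentence, seed=0):
    """Scramble every whitespace-separated word; the seed makes the result repeatable."""
    rng = random.Random(seed)
    return " ".join(scramble_word(w, rng) for w in sentence.split())


if __name__ == "__main__":
    print(scramble_sentence("according to a researcher at Cambridge University"))
```

“Decoding” goes in the other direction: given a scrambled word, one looks for dictionary words with the same first letter, last letter, and multiset of interior letters.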
It was nice and interesting to listen to the other talks presented by top-tier researchers, so interesting that everybody talked too much and there was little time left for the panel discussion. We ended up deciding not to have the panel discussion this time.
I heard there were 89 participants at the event, and it was also refreshing for me to see local people whom I hadn’t seen for a long time (including my advisors at Nagoya University). It was a pity I couldn’t linger there because I had to go to Tokyo on that very day. (Thanks to Inui san @ Tsukuba Univ. for organizing the event!)
2. Rakuten Tokyo office business trip in Shinagawa, Tokyo
One of the main reasons I went back to Japan was, of course, to visit the Rakuten headquarters in Shinagawa. I was really glad that I could avoid the rush hour by staying at a hotel in Ohimachi and commuting to the office with a 20-minute walk every morning. So many things are going on in the company, and it was great to see everybody after a one-year gap and refresh my memory.
I’ll write the latter half of my trip later.
I have been working on a weekend project called “jufsisku” for the past few weeks. This project is to build a search engine where you can look up Lojban-English translations using queries in these two languages. You can try out the search here:
I showed the demo to a group of Japanese-speaking lojbanists at our Skype study group the other day, and announced the initial version on the English-language mailing list for lojbanists. Overall, it was well received, and I’m glad that several people said they liked it. I personally believe that a collection of good-quality translations (and a system to search them) is essential not only when you are translating documents but also when you are writing in a foreign language. Dictionaries don’t help very much because you have to know not only which words to use but also how to use them. This issue is more serious for languages with a small number of speakers and few learning materials, like Lojban, which is why I decided to start this project.
The translation data is stored in MongoDB, a flexible “NoSQL” database system, to which users can add new sentences.
lojban jufsisku is only the first step toward my long-term goal of providing the best learning environment for Lojban. Any feedback is appreciated.
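To give a rough idea of what “looking up translations using queries in either language” means, here is a minimal sketch in Python. It is not the actual jufsisku code: it uses a plain in-memory index as a stand-in for MongoDB, and the class name, field names, and sample sentences are all assumptions made for illustration.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class TranslationIndex:
    """In-memory stand-in for a store of Lojban-English sentence pairs."""

    def __init__(self):
        self.pairs = []   # list of {"jbo": ..., "en": ...} records
        self.index = {}   # token -> set of ids of the pairs that contain it

    def add(self, lojban, english):
        """Add a sentence pair and index the tokens of both sides."""
        pair_id = len(self.pairs)
        self.pairs.append({"jbo": lojban, "en": english})
        for token in tokenize(lojban) + tokenize(english):
            self.index.setdefault(token, set()).add(pair_id)
        return pair_id

    def search(self, query):
        """Return pairs containing every query token, in either language."""
        tokens = tokenize(query)
        if not tokens:
            return []
        ids = set.intersection(*(self.index.get(t, set()) for t in tokens))
        return [self.pairs[i] for i in sorted(ids)]


if __name__ == "__main__":
    idx = TranslationIndex()
    idx.add("mi prami do", "I love you")
    idx.add("mi klama le zarci", "I go to the market")
    for pair in idx.search("klama"):
        print(pair["jbo"], "=", pair["en"])
```

Because both sides of each pair are indexed together, the same `search` call works whether the query is Lojban, English, or a mix of the two.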
The paper we submitted to IJCNLP2011 has been accepted and will be presented at the conference, which will be held a few weeks from now.
The paper describes the #ANPI_NLP project, a voluntary relief project focusing on text and safety information mining in the wake of the Great East Japan Earthquake of March 2011.
Here’s the full paper PDF (kindly uploaded by the lead co-author, Mr. Graham Neubig).
In the paper, we not only describe how the project started and evolved and what kinds of tasks we dealt with, but also focus on the lessons we learned from the experience.
Even after the submission, we have received some useful feedback from colleagues and peer researchers. In retrospect, we could have done more during the relief effort, and even BEFORE any disaster happens.
Please read the paper if you are interested, and send us any feedback. (The floods in Thailand still continue as I write this article; I hope the conference is held without any problems.)
About the Author
Masato Hagiwara currently works for Rakuten Institute of Technology in New York as a Senior Scientist. He has worked on search technologies at Google, Microsoft Research, and Baidu in the past. He is an expert in Natural Language Processing (NLP) and a lead translator of the O'Reilly book "Natural Language Processing with Python." He is a native speaker of Japanese with a good command of English and Chinese (Mandarin). For more information, see About Me.
- 100 NLP Papers
- About Me
- iconlang – new ideographic writing system for better visibility and legibility
- iconlang – 視認性・識別性向上のための新しい表意文字体系
- Music for Language Fans
- NLTK Japanese Corpora – NLTKで使える日本語コーパス
- Python/Romkan ローマ字とひらがなを相互に変換する Python用のライブラリ
- TinySegmenter in Python
- 中国語学習完全ガイド | １年以内にマスターする中国語
- 巻き舌クリニック – みんなで巻き舌を克服するサイト