---
title: Learning Japanese the Rubyist way
teaser: 'If you know Ruby then you already know some Japanese grammar.

  '
tags: japanese,ruby,new bamboo,web
author: Makoto Inoue
published_on: 2010-12-17
---

_This post was originally published on the New Bamboo blog, before [New Bamboo
joined thoughtbot in London][new-bamboo-thoughtbot]._

---

- Introduction
- Step 1: How to read Japanese characters
- Step 2: Japanese and OO
- Step 3: Japanese and functional
- Step 4: Writing Japanese programming language in Ruby
- Summary
- Ruby Advent Calendar

## Introduction

Have you ever thought about learning Japanese, only to decide that it looks too
difficult? Surprisingly, Japanese and Ruby share some common features and
concepts. This is a condensed version of my presentation "Japanese and Ruby",
which I gave at [LRUG].

When you finish reading this post, you will hopefully find the Japanese language
less magical, and may even add "Learn Japanese" to your 2011 New Year's
resolutions.

Learning a new language always involves a bit of a steep learning curve. Go and
get some coffee before you start!!

## Step 1: How to read Japanese characters

One of the first big hurdles when learning a language is remembering all the
characters. This is not an issue if you are learning a language based on the
alphabet, but many non-Western languages have their own character sets.

To make matters worse, Japanese uses three different character sets: Kanji
(Chinese characters), Hiragana, and Katakana. Hiragana and Katakana each have 46
characters, and there are a lot of Kanji (possibly 50,000, though we use ONLY
2000 ~ 3000 in daily life).

Here is the mapping of Hiragana, Katakana, and Alphabet.

![Hiragana Katakana Kanji](https://images.thoughtbot.com/new-bamboo/blog/learning-japanese-the-rubyist-way/Y7xrTZJqT9uTm9xtaOmt_531px-Nihongo_ichiran_01-converted.svg.png)

(The diagram is from [Wikipedia][jp-lang])

The point here is not to overwhelm you with the amount of information, but to
make you think about WHY Japanese uses 3 character sets.

Originally, Japanese did not have its own character set, so we borrowed
characters from China (Kanji). Since Chinese grammar and Japanese grammar are
completely different, it was not easy to map all these Kanji onto Japanese
sentences. That's when Hiragana and Katakana were born, to supplement Kanji.
Hiragana is often used as glue to combine words into a sentence where Kanji
alone is not good enough.

For example, "行" is a Kanji character which means "to go". Japanese has many
different ways to change the ending of a verb (eg: goes), and we use Hiragana to
write these endings. Here are some examples.

<table border="1" cellspacing="0">
  <tr>
    <th>Japanese</th>
    <th>Alphabet</th>
    <th>Meaning</th>
  </tr>
  <tr>
    <td>行く</td>
    <td>Iku</td>
    <td>I go</td>
  </tr>
  <tr>
    <td>行かない</td>
    <td>Ikanai</td>
    <td>I do not go</td>
  </tr>
  <tr>
    <td>行こう</td>
    <td>Ikou</td>
    <td>Let's go (casual)</td>
  </tr>
  <tr>
    <td>行きましょう</td>
    <td>Ikimashou</td>
    <td>Let's go (polite)</td>
  </tr>
  <tr>
    <td>行け</td>
    <td>Ike</td>
    <td>Go! (very impolite)</td>
  </tr>
</table>

NOTE: If the above examples do not look like Japanese, you have an encoding
issue. Make sure that your browser encoding is set to UTF-8.
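The split between a Kanji stem and a Hiragana ending in the table above can be
checked mechanically. The following is only a minimal sketch, assuming UTF-8
source and a Ruby with Unicode property support in regular expressions;
`split_verb` is a hypothetical helper name, and the pattern only handles a Kanji
stem followed by Hiragana.

```ruby
# Hypothetical helper: separate the Kanji stem from the Hiragana ending.
# \p{Han} matches Kanji and \p{Hiragana} matches Hiragana characters.
def split_verb(word)
  match = word.match(/\A(\p{Han}+)(\p{Hiragana}*)\z/)
  match && [match[1], match[2]]
end

%w[行く 行かない 行こう 行きましょう 行け].each do |word|
  stem, ending = split_verb(word)
  puts "#{word}: stem=#{stem} ending=#{ending}"
end
```

Every form shares the same Kanji stem, and only the Hiragana ending changes.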

Katakana, on the other hand, was often written alongside Kanji so that people
could understand how to pronounce the Kanji. Nowadays, Katakana is often used to
represent new words which came from foreign countries.

eg: <ruby><rb>漢字</rb><rp>（</rp><rt>カンジ</rt><rp>）</rp></ruby>,  ルビー

(Trivia: the above example is marked up with the HTML5 [ruby tag].)

(Another piece of trivia: before multibyte encodings became common, Japanese
computers could only handle the alphabet and single-byte Katakana (eg: ﾙﾋﾞｰ)
instead of multibyte Katakana (eg: ルビー). Some banks' ATM slips still use this
single-byte Katakana.)
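To see the single-byte versus multibyte difference for yourself, here is a small
sketch. It assumes UTF-8 source and a Ruby recent enough to have
`String#unicode_normalize` (2.2 or later, so newer than this post), and it uses
Shift_JIS as one example of a legacy Japanese encoding; the variable names are
mine.

```ruby
half = "ﾙﾋﾞｰ"  # half-width Katakana: ﾙ, ﾋ, the voicing mark ﾞ, and ｰ
full = "ルビー"  # full-width Katakana

# In Shift_JIS each half-width Katakana takes one byte,
# while each full-width Katakana takes two.
puts half.encode("Shift_JIS").bytesize
puts full.encode("Shift_JIS").bytesize

# NFKC normalization folds the half-width form into the full-width one.
puts half.unicode_normalize(:nfkc) == full
```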

These are only conventions, though: you can use Kanji, Katakana, and Hiragana
interchangeably.

The following 3 all mean "Cherry blossoms bloom" and are pronounced the same,
"Sakura Saku":

- 桜咲く
- サクラさく
- さくらさく
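Because Hiragana and Katakana are two spellings of the same sounds, converting
between them is a simple character mapping. This is a minimal sketch, assuming
UTF-8 source; it relies on the two scripts occupying parallel Unicode ranges and
ignores a few rarer characters outside those ranges.

```ruby
hiragana = "さくらさく"

# Hiragana (ぁ-ん) and Katakana (ァ-ン) sit in parallel Unicode ranges,
# so String#tr can map one onto the other character by character.
katakana = hiragana.tr("ぁ-ん", "ァ-ン")
puts katakana

# Mixing the two scripts gives the second spelling in the list above.
mixed = katakana[0, 3] + hiragana[3, 2]
puts mixed
```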

(Trivia: the word "Karaoke" is a combination of the Kanji "Kara" (空, meaning
"Empty") and "Oke", short for the English word "Orchestra".)

Here is a quick recap of what you have learnt so far.

- Kanji came first to import Chinese words
- Hiragana was created to suit domestic use
- Katakana is used to adopt new words

Doesn't this "There are many ways to achieve one thing" concept sound familiar
from Ruby's philosophy?

- Ruby came first to bring in the concepts of OO & Functional
- Ruby was created to suit everyday scripting use
- Ruby keeps evolving by adopting new concepts (Fiber,
  Multilingualization/M17N, Refinements etc)

## Step 2: Japanese and OO

Satoshi Nakashima is a well-known Japanese blogger who used to work at Microsoft
as a member of the development teams that shipped Windows 95 and Internet
Explorer.

He was once asked, "Did being Japanese help you in any way when creating
Windows 95?" At first he could not think of anything, but then it occurred to
him that Japanese grammar is better suited to Object Oriented programming. To
explain his idea, I will walk you through some basic Japanese grammar.

English and Japanese have very different word orders.

The English structure is called "SVO" (Subject - Verb - Object), while the
Japanese one is called "SOV" (Subject - Object - Verb).

If I put "I eat bacon" in Japanese order, it becomes "I bacon eat" (Watashi ha
bacon wo tabemasu, "私はベーコンを食べます").

At first glance, English order is clearer as "what you do"(verb) comes next to
"who does it"(subject). It's almost like command line options (eg: git clone
url).

The problem of command line options is that there are so many choices that it's
hard to figure out which command you are supposed to use.

On the other hand, Japanese word order is more like a GUI. You often
(right-mouse) click the object you are interested in, and it then suggests the
possible actions. This is much more user friendly because you do not have to
know all the possible actions and their argument options.

As you already know, Ruby is one of the best scripting languages for expressing
OO (though you can also write in a procedural, or "command-oriented", way if you
wish).

```ruby
# Procedural
open("box")
open("car")
open("file", "foo.txt")

# OO
Box.new.open
Car.new.open
File.open("foo.txt")
```

Both versions above do exactly the same thing, but their implementations will be
quite different. In the procedural version, I imagine you have to keep adding
nested "if" statements as the logic becomes more complicated. In the OO version,
on the other hand, the logic is kept isolated within each class.
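To make that difference concrete, here is a runnable sketch of both styles. The
`open_thing` method, the `Box` and `Car` classes, and the strings they return
are all made up for illustration.

```ruby
# Procedural ("verb first"): one method has to know about every kind of thing.
def open_thing(kind, name = nil)
  if kind == "box"
    "lifting the lid of the box"
  elsif kind == "car"
    "opening the car door"
  elsif kind == "file"
    "opening the file #{name}"
  else
    raise ArgumentError, "don't know how to open #{kind}"
  end
end

# OO ("object first"): each class keeps its own meaning of "open".
class Box
  def open
    "lifting the lid of the box"
  end
end

class Car
  def open
    "opening the car door"
  end
end

puts open_thing("box")
puts Box.new.open

# Like a right-click menu, the class can list the actions it defines.
p Box.public_instance_methods(false)
```

Adding a new kind of thing means another branch in `open_thing`, but only a new
class in the OO version.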

## Step 3: Japanese and functional

I often say Japanese is a politician's language. What does this mean? Here is
how I would define politicians:

- they do not commit to anything unless necessary
- they mean different things depending on context

Japanese grammar has a kind of word called a "postpositional" (English uses
"prepositions" instead, such as _for_ you, _after_ dinner, and so on). A
postpositional decides the role of the noun it follows. This lets you change the
word order very flexibly, chain as many sentences as you like, and even omit the
subject.

Here are some examples of what I just said.

<table border="1" cellspacing="0">
<tr>
  <th>Japanese</th>
  <th>English</th>
  <th>How to pronounce</th>
  <th>Structure</th>
  <th>How it is ordered if written in English</th>
</tr>
<tr>
  <td>私はベーコンを食べます</td>
  <td>I eat bacon</td>
  <td>Watashi ha bacon wo tabemasu</td>
  <td>SOV</td>
  <td>I bacon eat</td>
</tr>
<tr>
  <td>ベーコンを私は食べます</td>
  <td>I eat bacon</td>
  <td>Bacon wo watashi ha tabemasu</td>
  <td>OSV</td>
  <td>Bacon I eat</td>
</tr>
<tr>
  <td>ベーコンを食べます</td>
  <td>I eat bacon</td>
  <td>Bacon wo tabemasu</td>
  <td>OV</td>
  <td>Bacon eat</td>
</tr>
</table>
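The way postpositionals assign roles regardless of order can be sketched in a
few lines of Ruby. This is a deliberately naive illustration, assuming UTF-8
source: `PARTICLES` and `roles_of` are hypothetical names, only は and を are
handled, and any word that happens to contain one of those characters would
break it.

```ruby
# Map each postpositional to the role of the noun in front of it.
PARTICLES = { "は" => :subject, "を" => :object }

def roles_of(sentence)
  roles = {}
  rest = sentence
  # Repeatedly peel off "<noun><postpositional>" from the front.
  while (match = rest.match(/\A(.+?)(は|を)/))
    roles[PARTICLES[match[2]]] = match[1]
    rest = match.post_match
  end
  roles[:verb] = rest # whatever is left at the end is the verb
  roles
end

p roles_of("私はベーコンを食べます")
p roles_of("ベーコンを私は食べます")
p roles_of("ベーコンを食べます")
```

The first two sentences produce the same roles despite the different order, and
the third simply has no subject.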

And this is an example of chaining too many sentences together.

One of the common mistakes Japanese people make when writing a sentence is
chaining too much, because it is very hard to digest the whole sequence (One of
my friends explained this as "Don't write a sentence which could cause stack
overflow").

What makes English very logical and concise (in my opinion) is that the subject
and verb come at the beginning. Even though you can still write verbose
sentences in English, this strict ordering forces you to write relatively
concisely.

On the other hand, you can write a lot of sentences in Japanese that mean
nothing, because the subject is omitted and the verb at the very end has only a
very loose relationship to the sentence you started with.

Now let's move back to how this (loosely) relates to some of the concepts in
Ruby.

- they do not commit to anything unless necessary => Lazy evaluation
- they mean different things depending on context => Block

eg:

```ruby
10000.times # ==> #<Enumerator: 10000:times>

User.order('users.id DESC').limit(20).includes(:items)

File.open("/tmp.txt").each do |line|
  puts line
end
```
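Here is a small self-contained sketch of the same two ideas that you can run
without Rails or a file on disk. Note that `lazy` (Enumerator::Lazy) arrived in
Ruby 2.0, which is newer than this post, and `each_item` is a made-up method
name.

```ruby
# Calling times without a block commits to nothing: no iteration happens yet.
counter = 3.times
p counter.class

# A lazy enumerator only computes as many values as are finally asked for,
# so even an endless range is safe to chain on.
even_squares = (1..Float::INFINITY).lazy
                                   .map { |n| n * n }
                                   .select(&:even?)
                                   .first(3)
p even_squares

# The same method means different things depending on the block it is given.
def each_item(items)
  items.map { |item| yield item }
end

p each_item([1, 2, 3]) { |n| n * 2 }
p each_item([1, 2, 3]) { |n| "item #{n}" }
```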

The functional features of Ruby let you do crazy meta-programming. Though they
are powerful, abusing them may make the code hard to understand and may cause
unexpected bugs ;-P

## Step 4: Writing Japanese programming language in Ruby

So, how are you doing so far? Easy peasy Japanesey?

(Trivia: the above expression is apparently a common phrase in the UK, derived
from a TV commercial saying "easy peasy lemon squeezy")

When you learn a new language, reading books and articles is not enough. You
always need to practice. Having said that, speaking to real Japanese people from
day one may be a bit too difficult (or you just do not have a Japanese friend
;-( ), so here is a toy for you to play around with.

- [Japanize]

One of my colleagues once asked me, "Are there any Japanese programming
languages? I don't just mean being able to write Japanese text as strings, but
having all the programming syntax (such as 'if' and 'loop') actually in
Japanese." Yes, there are some: [Nadeshiko] and [Mind] are two of them. However,
I decided to write one myself using Ruby, and here is the result.

<iframe width="650" height="396" src="//www.screenr.com/embed/F9L" frameborder="0"></iframe>

Looks amazing, doesn't it?

Here are a few notes on the Japanese to help you understand what I just showed.

- 'に' and 'を' are postpositionals, which means that the words in front of them
  (1 and 2) are objects.
- 'たす' (hiragana) and '足す' (kanji + hiragana) are both verbs and mean "to add"
- 'て' is also a postpositional, which says that one sentence ends here and the
  next one starts (equivalent to "and")

In my programme, I simply used postpositionals as delimiters to split a Japanese
phrase into words.

(Trivia: Japanese words are not separated by spaces, so tokenizing Japanese is a
  very important part of natural language processing)

So

- １に２をたして４を掛ける

Becomes

```ruby
[1, 2, :+, 4, :*]
```
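Splitting on postpositionals could look roughly like the sketch below. This is
my own minimal illustration rather than Japanize's actual code: `VERBS` and
`tokenize` are hypothetical names, only the verbs from this one phrase are
listed, and UTF-8 source is assumed.

```ruby
# Verb stems as they appear after splitting, mapped to Ruby operators.
VERBS = {
  "たし" => :+, "足し" => :+, # "add", in the form used before て
  "掛ける" => :*              # "multiply"
}

def tokenize(phrase)
  # に, を and て act as delimiters between the words.
  phrase.split(/[にをて]/).reject(&:empty?).map do |word|
    # Anything that is not a known verb is read as a full-width number.
    VERBS.fetch(word) { Integer(word.tr("０-９", "0-9")) }
  end
end

p tokenize("１に２をたして４を掛ける")
```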

Japanese grammar is a bit like [reverse polish notation], the notation of a
stack machine, which compilers often use to process a programming language. An
array in this form is evaluated from left to right, so a longer one is
equivalent to the following mathematical calculation.

```ruby
describe Evaluator do
  it "must calculate all operands" do
    Evaluator.new([1, 2, :+, 3 , :* , 1, :-, 2, :/]).
      evaluate.must_equal ((((1+2) * 3) - 1 ) / 2)
  end
end
# NOTE: This is minitest which comes by default in Ruby 1.9
```
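The `Evaluator` class itself is not shown in this post, so here is a minimal
sketch of how such a stack-based evaluation could work. `evaluate_rpn` is a
hypothetical method of my own, not Japanize's implementation.

```ruby
def evaluate_rpn(tokens)
  stack = []
  tokens.each do |token|
    if token.is_a?(Symbol)
      # An operator consumes the two most recent values on the stack.
      right = stack.pop
      left = stack.pop
      stack.push(left.public_send(token, right))
    else
      # A number is simply pushed and waits for its operator.
      stack.push(token)
    end
  end
  stack.pop
end

p evaluate_rpn([1, 2, :+, 4, :*])
p evaluate_rpn([1, 2, :+, 3, :*, 1, :-, 2, :/])
```

Just as in a Japanese sentence, the operands arrive first and the verb that
acts on them comes last.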

There are a few more secrets.

<script src="https://gist.github.com/1333471.js?file=gistfile1.rb"></script>

If you watch the video closely, you will notice that the numbers I typed are
slightly different from normal ASCII digits. They are full-width Unicode digits,
which Ruby reads as part of a name rather than as a number, so the expression
raises an "undefined local variable or method" error.

Since Japanese does not have any spaces between words, you can catch an entire
sentence as a single method name.

<script src="https://gist.github.com/1333513.js?file=gistfile1.txt"></script>

So I just pass the entire expression as one method call and catch it in
method_missing.

<script src="https://gist.github.com/1333517.js?file=gistfile1.txt"></script>

This is how japanize works.
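If you cannot see the embedded snippets, the general idea can be sketched as
follows. This is my own simplified illustration, not the Japanize source:
`PhraseCatcher` is a hypothetical class, it only splits the sentence into words,
and UTF-8 source is assumed.

```ruby
class PhraseCatcher
  DELIMITERS = /[にをて]/

  # Any unknown method whose name contains a postpositional is treated as a
  # whole Japanese sentence and split into words.
  def method_missing(name, *args)
    return super unless name.to_s.match?(DELIMITERS)
    name.to_s.split(DELIMITERS)
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.match?(DELIMITERS) || super
  end
end

# Ruby accepts non-ASCII characters in method names, so the whole sentence
# is one method call.
p PhraseCatcher.new.１に２をたして４を掛ける
```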

When I was researching how to implement a very simple compiler/interpreter, I
learnt a lot from an article written by Koichi Sasada (the creator of the Ruby
1.9 Virtual Machine). The article is written in Japanese, but it includes a code
sample which implements some basic VM functionality in Ruby.

- [RubiMaVM]

The code handles not just maths, but also loops and if statements. If you are
curious enough, you could implement something similar on top of Japanize. I will
accept pull requests as long as they look like Japanese!!

## Summary

Here is the list of things you have learnt through this post.

- Japanese uses 3 character sets: Kanji, Hiragana, and Katakana.
- Japanese grammar structure is Subject - Object - Verb (SOV)
- Japanese word order can be flexible thanks to postpositionals

Even though Matz did not intend to reflect the Japanese language in the design
of Ruby, I think there is a certain influence, since anyone's thinking is
influenced by the language they use.

![](https://images.thoughtbot.com/new-bamboo/blog/learning-japanese-the-rubyist-way/E0VPomqHR06TvTH9gHEK_image_1.jpg)

(NOTE: \@yukihiro_matz says "Japanese and Ruby? I try not to think too much about
Japanese culture. The method chain looks like Japanese, but it's just a
coincidence. Having said that, the support of M17N is heavily influenced by the
use case of Japanese people. Otherwise, I wouldn't spend too much time on such a
hard problem". You can compare this with how the \@matz_translated bot actually
translated the sentence).

If you would like to know more, the full slides of my talk at LRUG are here.

<iframe src="//www.slideshare.net/slideshow/embed_code/6165968" width="427"
height="356" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"
style="border:1px solid #CCC; border-width:1px; margin-bottom:5px;
max-width: 100%;" allowfullscreen></iframe>

<div style="margin-bottom:5px">
  <strong>
    <a href="https://www.slideshare.net/inouemak/ruby-and-japanesepdf"
    title="Ruby and japanese" target="_blank">
      Ruby and japanese
    </a>
  </strong>
  from
  <strong>
    <a href="http://www.slideshare.net/inouemak" target="_blank">
      inouemak
    </a>
  </strong>
</div>

The talk was [videotaped and uploaded][video-upload] after the event.

## Ruby Advent Calendar

This is Day 17 of [Ruby Advent Calendar] (Each day leading up to December 25th,
one person posts an article to their blog and adds a link to their blog on the
Advent Calendar). The previous entry was written by [matschaffer] or
[gautamrege] (order seems a bit screwed up), and the next will be written by
[elight].

[elight]: http://evan.tiggerpalace.com
[gautamrege]: http://blog.joshsoftware.com/
[Japanize]: http://github.com/makoto/japanize
[jp-lang]: http://en.wikipedia.org/wiki/Japanese_language
[LRUG]: http://lrug.org
[matschaffer]: http://matschaffer.com
[Mind]: http://www.scripts-lab.co.jp/mind/whatsmind.html
[Nadeshiko]: http://nadesi.com/
[new-bamboo-thoughtbot]: https://thoughtbot.com/blog/new-bamboo-joins-thoughtbot-in-london
[reverse polish notation]: http://en.wikipedia.org/wiki/Reverse_Polish_notation
[RubiMaVM]: http://jp.rubyist.net/magazine/?0007-YarvManiacs
[Ruby Advent Calendar]: http://atnd.org/events/10439
[ruby tag]: http://www.quackit.com/html_5/tags/html_ruby_tag.cfm
[video-upload]: https://skillsmatter.com/meetups/873-japanese-and-ruby-and-processing-tweets-at-the-bbc
