About the Tanaka Corpus, Tatoeba.org and www.ManyThings.org
Attribution, Copyright, etc.
Attribution / License
- www.ManyThings.org uses data from tatoeba.org. That data is licensed under the Creative Commons Attribution 2.0 France license (see the terms of use page on tatoeba.org).
Copyright
- Though the source data is available under the CC BY license, the pages on this website have been edited and are copyrighted.
- If you would also like to do a project using this corpus, you can get the source data from the tatoeba.org downloads page.
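If you do start your own project from the source data, the first step is usually to join sentences with their translations. The sketch below is one minimal way to do that in Python. It assumes a tab-separated sentence export with three columns (sentence id, language code, text) and a tab-separated link export with two columns (sentence id, translation id); check the downloads page for the actual file layout before relying on this. The function names and the sample rows are made up for illustration and are not taken from the real data.

```python
def load_sentences(lines):
    """Build {sentence_id: (lang, text)} from tab-separated rows.

    Assumed row layout (an assumption, verify against the real export):
    sentence_id <TAB> language_code <TAB> text
    """
    sentences = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip rows that do not match the assumed layout
        sentence_id, lang, text = parts
        sentences[sentence_id] = (lang, text)
    return sentences


def pair_sentences(sentences, link_lines, src_lang, dst_lang):
    """Return (source_text, translated_text) pairs for one language direction.

    Assumed link row layout: sentence_id <TAB> translation_id
    """
    pairs = []
    for line in link_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue
        src = sentences.get(parts[0])
        dst = sentences.get(parts[1])
        if src is None or dst is None:
            continue  # one side of the link is missing from the sentence file
        if src[0] == src_lang and dst[0] == dst_lang:
            pairs.append((src[1], dst[1]))
    return pairs


# Illustrative sample rows only; these are not real corpus entries.
SAMPLE_SENTENCES = [
    "1\teng\tI like her.\n",
    "2\tjpn\t私は彼女が好きです。\n",
    "3\tfra\tJe l'aime bien.\n",
]
SAMPLE_LINKS = ["1\t2\n", "2\t1\n", "1\t3\n"]

sentences = load_sentences(SAMPLE_SENTENCES)
english_japanese = pair_sentences(sentences, SAMPLE_LINKS, "eng", "jpn")
```

With real files you would pass open file objects (opened with encoding="utf-8") instead of the sample lists. Remember that the CC BY license requires attribution in anything you publish from the data.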
Sections of This Website That Use the Data
- Daily Listen and Repeat Pronunciation Practice
- English Sentences with Audio
- Bilingual Sentence Pairs
- Japanese-English Parallel Corpus (日英パラレルコーパス)
- Japanese Reading and Translation Practice
- I have attempted to filter out sentences that are potentially offensive and not appropriate for all ages. If you find anything I missed, please let me know. (Send a message via kelly.reachby.com)
More Information about the Tatoeba Project
Tatoeba.org is a Multilingual Corpus of Sentence Equivalents
- Tatoeba.org is a large database of example sentences translated into many languages by its members who volunteer their time.
- You, too, can easily become a member and help make corrections and additions.
- If you do become a member, I highly recommend that you ONLY contribute sentences in your own native language, translating from your non-native language.
- It's very easy to sound natural in your own native language, and very easy to sound unnatural in your non-native language.
- For more information, watch "What is Tatoeba.org" on YouTube.com, a video by TRANG (the tatoeba.org admin).
- What You Can Do and How to Do It
Basic Guidelines for Contributions - The Short Version
You can click the links to read [Translations] or read the [Source] documentation.
- 1. We want complete sentences.
- [Translations] [Source]
- 2. Don't change sentences that are correct.
- [Translations] [Source]
- You can, instead, submit natural-sounding alternate translations.
- 3. Don't add sentences from copyrighted sources.
- [Translations] [Source]
- 4. We want natural-sounding translations, not word-for-word direct translations.
- [Translations] [Source]
Of course, we don't want computer translations.
- 5. Make a good translation of the sentence that you are translating. Don't let translations into other languages influence you.
- [Translations] [Source]
- 6. Don't add annotations.
- [Translations] [Source]
In other words, don't do this kind of thing.
X It's raining cats and dogs. (idiom)
X I like her/him.
Warning! There are errors!
- Please read the Warning / Caution.
- Briefly, not only are there errors, but there are also sentences that sound a bit unnatural.
- In addition to the errors still remaining from the original Tanaka Corpus, there are also errors in new sentences contributed by members who are overconfident in their abilities in their non-native languages.
Who Else Uses Data from the Tatoeba.org Project?
Other Things for Members
- A Short List of Native Speakers
- More Things for Tatoeba.org Members
And ideas for tatoeba.org programmers.