Release


"Sudachi" is now available on the AWS search service "Amazon OpenSearch Service"
Enables highly accurate search with the largest-scale dictionary in Japan

2023/10/30

Works Applications (Head Office: Chiyoda-ku, Tokyo; CEO: Osamu Hata; hereinafter referred to as "WAP") announced today that "Sudachi," an open source Japanese morphological analyzer*1 that WAP releases free of charge, has been available since October 11, 2023 as a custom plug-in for Japanese morphological analysis on Amazon OpenSearch Service, a search service of Amazon Web Services (hereinafter referred to as "AWS"). AWS users have frequently asked to be able to use "Sudachi." They can now select "Sudachi" as a customization option within the search engine, which is expected to improve search capabilities.

"Sudachi," a Japanese morphological analyzer with the largest vocabulary in Japan

Amazon OpenSearch Service is a managed service that makes it easy to deploy, operate, and scale OpenSearch clusters in the AWS Cloud*2. When AWS customers choose Sudachi as a customization feature of Amazon OpenSearch Service, Sudachi's high linguistic analysis accuracy and flexibility enable highly accurate searches in the AWS Cloud.
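As a rough sketch of what selecting Sudachi for an index might look like, the Python snippet below builds an index-creation body that routes a text field through a Sudachi tokenizer. The setting names (`sudachi_tokenizer`, `split_mode`, `sudachi_normalizedform`) follow the README of the open source elasticsearch-sudachi plug-in. That the managed service accepts them unchanged is an assumption, and the analyzer name `sudachi_analyzer`, the field name `body`, and the helper `sudachi_index_body` are hypothetical names chosen for illustration.

```python
import json


def sudachi_index_body(split_mode="C", field="body"):
    """Build an index-creation body that analyzes `field` with Sudachi.

    Assumption: setting names are taken from the elasticsearch-sudachi README;
    check the service documentation before relying on them.
    """
    if split_mode not in ("A", "B", "C"):
        raise ValueError("split_mode must be 'A', 'B', or 'C'")
    return {
        "settings": {
            "index": {
                "analysis": {
                    "tokenizer": {
                        "sudachi_tokenizer": {
                            "type": "sudachi_tokenizer",
                            # A = shortest units, C = longest units
                            "split_mode": split_mode,
                        }
                    },
                    "analyzer": {
                        # Hypothetical analyzer name
                        "sudachi_analyzer": {
                            "type": "custom",
                            "tokenizer": "sudachi_tokenizer",
                            # Index the normalized form to absorb spelling variants
                            "filter": ["sudachi_normalizedform"],
                        }
                    },
                }
            }
        },
        "mappings": {
            "properties": {field: {"type": "text", "analyzer": "sudachi_analyzer"}}
        },
    }


print(json.dumps(sudachi_index_body(), indent=2))
```

The printed JSON would be sent as the body of an index-creation request to a domain on which the Sudachi plug-in has been enabled.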

Sudachi is an open source software (OSS) Japanese morphological analyzer developed by WAP's Works Tokushima Artificial Intelligence NLP Laboratory. Morphological analysis divides text into the smallest meaningful units and assigns them information such as parts of speech. Sudachi has the following features: (1) the largest vocabulary in Japan, with over 2.9 million words; (2) multiple word segmentation units that can be selected and used together; (3) normalization of spelling variations in Japanese words, such as differences in character types and kana; and (4) extensibility through plug-ins.

The Python version of Sudachi, "SudachiPy," was released as OSS in June 2019 and surpassed 11 million downloads in September 2023.
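The segmentation units and spelling normalization described above can be tried from Python. The following is a minimal sketch, assuming the `sudachipy` package and a dictionary package such as `sudachidict_core` are installed. The helper names `describe` and `demo` and the sample word are illustrative choices, not part of SudachiPy.

```python
def describe(tokenizer, text, mode):
    """Return (surface, normalized form, top-level part of speech) per token."""
    return [
        (m.surface(), m.normalized_form(), m.part_of_speech()[0])
        for m in tokenizer.tokenize(text, mode)
    ]


def demo(text="国家公務員"):
    """Print the segmentation of `text` in each of Sudachi's three split modes."""
    # Imported here so that `describe` stays usable without SudachiPy installed.
    from sudachipy import Dictionary, SplitMode

    tokenizer = Dictionary().create()
    # A = shortest units, B = middle units, C = longest units (named entities)
    for name, mode in (("A", SplitMode.A), ("B", SplitMode.B), ("C", SplitMode.C)):
        print(name, describe(tokenizer, text, mode))
```

Calling `demo()` prints one line per split mode, so the same input can be compared at different granularities. The normalized form in each tuple is the field to look at for spelling variants.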

For more information: https://worksapplications.github.io/Sudachi/

Differences in kanji (variant, alternate, and conventional writing)

Art - Geijutsu, Amazement - Marvel, Tokuyo - Getuyo

Differences in character types

Sunflower - Himawari

Differences in okurigana (kana endings that follow kanji)

Acceptance - Acceptance - Acceptance

Condensed forms (a casual way of saying things)

~Chaaaa~te

For customer-specific help desk, troubleshooting, and information services regarding Sudachi

 WAP provides maintenance services related to Sudachi. Please contact us at:
https://landing.worksap.co.jp/SaaS_LP_Sudachi_LP.html

*For inquiries about the OpenSearch plug-in (Sudachi), please contact AWS.

Sudachi GitHub Sponsorship Opportunities

"GitHub Sponsors" is a sponsorship program offered by GitHub*3, launched in 30 countries in 2019, to financially support developers and teams of open source projects. WAP is seeking Sudachi GitHub Sponsors to strengthen Sudachi's research and development capabilities and ensure its sustainable development as OSS.
Sudachi GitHub sponsorship is open to both individuals and organizations. The sponsorship amount and the frequency of support can be set freely, starting at $1 per contribution. Sponsors will receive a sponsor badge on the GitHub page, a logo on the website, access to the development roadmap, participation in the sponsor chat space, priority bug handling, workshops, and more.

For more information on sponsorship opportunities, please visit the following website:
https://github.com/sponsors/WorksApplications

Providing "support for configuration and dictionary creation for effective use of Sudachi" as a professional service

WAP's "Sudachi Enablement Configuration and Dictionary Creation Support" will be available for purchase on the AWS Marketplace*4 as an enablement support service for Sudachi on OpenSearch. In this professional service, experienced members of the Works Tokushima Artificial Intelligence NLP Laboratory will provide a variety of assistance to improve search accuracy.

*1: Morphological analysis
A part of natural language processing (NLP): a technology that breaks down "natural language," the language we commonly use in everyday life, into morphemes (the smallest units of language that carry meaning).

*2: For more information about Amazon OpenSearch Service, please visit:
 https://docs.aws.amazon.com/ja_jp/opensearch-service/latest/developerguide/what-is.html

*3: GitHub is used by 40 million developers and is the central platform for software development, from open source projects to business use.

*4: For more information about AWS Marketplace, please visit:
  https://aws.amazon.com/jp/mp/marketplace-service/overview/

About Works Tokushima Artificial Intelligence NLP Laboratory

Established by WAP in February 2017, this research institute specializes in NLP (natural language processing) within the field of AI and conducts research to help computers correctly process language that contains ambiguous expressions, overlapping meanings, and inconsistent notation. Many of its research results are used in WAP's own products, such as chatbots and AI-OCR, and are also released free of charge as OSS for use by other companies and research institutes in their research and in corporate AI applications. In recognition of its industry-academia-government collaboration and the wide use of the OSS it has released free of charge, the laboratory received the Tokushima Prefecture Regional Informatization Award (e-Tokushima Award) for FY2022 for its contribution to the promotion of regional informatization.
https://nlp.worksap.co.jp/

*Released on January 18:
Works Tokushima Artificial Intelligence NLP Laboratory received the "Tokushima Prefecture Regional Informatization Commendation"
In recognition of its industry-academia-government joint research and natural language processing OSS with 7.6 million downloads

About Works Applications

Works Applications was founded in 1996 as an ERP package vendor in Japan. With innovative solutions such as no-customization and free version upgrades, we have supported the growth of our customers, mainly major Japanese companies. Believing in the potential of each individual, we aim to be a "growth engine" that maximizes the value of companies and individuals, and we will continue our pursuit of turning "work" into "creation" and making "work" fun.

*Company names, product names and service names are trademarks or registered trademarks of their respective companies.
*The information in this release is current as of the date of publication, and is subject to change or withdrawal without notice. Please be aware that the forecasts and other forward-looking information in this release are based on uncertainties and may differ from actual results.

For inquiries regarding this article, please contact

Public Relations, Works Applications Corporation
TEL : 03-3512-1400
FAX : 03-3512-1401
E-mail: [email protected]