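Background: computer science. Research interests: JavaEE web software development, bioinformatics, data mining and machine learning, intelligent information systems. Current work: genomes, transcriptomes, NGS high-throughput data analysis, biological data mining, plant phylogenetics and comparative evolutionary genomics.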

Tools for Natural Language Processing (repost)

2010-01-18 22:08:34 | Category: Computers


Open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OSX and Linux.


LingPipe is a suite of Java libraries for the linguistic analysis of human language.

Text simplification - Wikipedia, the free encyclopedia
Text simplification is an operation used in natural language processing to modify, enhance, classify or otherwise process an existing corpus of human-readable text in such a way that the grammar and structure of the prose is greatly simplified, while the underlying meaning and information remain the same. Text simplification is an important area of research, because natural human languages ordinarily contain complex compound constructions that are not easily processed through automation.



CoPT, Corpus Processing Tools
CoPT, Corpus Processing Tools, is a set of Java classes intended to assist field linguists, NLP researchers and developers, students and software developers in all corpus-related processing.



Jazzy - Java spell checker API
Jazzy is a Java spell checker based on the algorithms used by aspell.



JLinkGrammarParser is a Java port of the CMU link grammar parser, a syntactic parser for English.



It’s a simple statistical spelling corrector.
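The entry above does not say which method the tool uses. As a rough illustration of what a simple statistical spelling corrector can look like, here is a minimal sketch that counts word frequencies in a corpus and picks the most frequent known word within one edit of the input. The class name `SpellSketch`, its methods, and the tiny corpus are hypothetical and chosen only for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: frequency-based correction within edit distance 1.
class SpellSketch {
    private final Map<String, Integer> counts = new HashMap<>();

    // Build word counts from a plain-text corpus (lowercased, a-z only).
    SpellSketch(String corpus) {
        for (String w : corpus.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!w.isEmpty()) counts.merge(w, 1, Integer::sum);
        }
    }

    // All strings one deletion, transposition, substitution or insertion away.
    static Set<String> edits1(String w) {
        Set<String> out = new HashSet<>();
        for (int i = 0; i <= w.length(); i++) {
            String left = w.substring(0, i);
            String right = w.substring(i);
            if (!right.isEmpty()) out.add(left + right.substring(1));
            if (right.length() > 1) {
                out.add(left + right.charAt(1) + right.charAt(0) + right.substring(2));
            }
            for (char c = 'a'; c <= 'z'; c++) {
                if (!right.isEmpty()) out.add(left + c + right.substring(1));
                out.add(left + c + right);
            }
        }
        return out;
    }

    // Known words are returned unchanged; otherwise the most frequent
    // known candidate wins, and unknown input falls back to itself.
    String correct(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        if (counts.containsKey(w)) return w;
        String best = w;
        int bestCount = 0;
        for (String cand : edits1(w)) {
            int c = counts.getOrDefault(cand, 0);
            if (c > bestCount) {
                best = cand;
                bestCount = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        SpellSketch s = new SpellSketch("the quick brown fox jumps over the lazy dog the end");
        System.out.println(s.correct("teh"));
        System.out.println(s.correct("zzzz"));
    }
}
```

A real corrector would train on a much larger corpus and usually also considers candidates two edits away; this sketch only shows the basic idea.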



jTokeniser is a Java library for tokenising strings into a list of tokens. A variety of possible tokenisers are available, including a very basic whitespace tokeniser, a more flexible StringTokeniser, a couple of regular expression tokenisers, and a tokeniser that utilises Java’s BreakIterator, which provides more complex, locale-dependent tokenisation. More recently, a tokeniser that breaks text into its constituent sentences has been added. All are very simple to use.
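To make the difference between these tokenisation styles concrete, here is a minimal sketch that uses only the standard JDK (`String.split` and `java.text.BreakIterator`), not jTokeniser itself. The class name `TokeniserSketch` and its method names are hypothetical; jTokeniser's own classes and signatures may look quite different.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch built on the JDK only; this is not the jTokeniser API.
class TokeniserSketch {

    // Very basic whitespace tokenisation.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Locale-dependent word tokenisation; segments that do not start
    // with a letter or digit (spaces, punctuation) are dropped.
    static List<String> wordTokens(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (Character.isLetterOrDigit(piece.codePointAt(0))) tokens.add(piece);
        }
        return tokens;
    }

    // Sentence splitting with the sentence-level BreakIterator.
    static List<String> sentences(String text, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);
        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) result.add(sentence);
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Hello, world! How are you?";
        System.out.println(whitespaceTokens(text));
        System.out.println(wordTokens(text, Locale.ENGLISH));
        System.out.println(sentences(text, Locale.ENGLISH));
    }
}
```

The whitespace version keeps punctuation attached to words, while the BreakIterator version separates it and follows the rules of the given locale.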



Linguistic Tree Constructor
LTC is a free program for building linguistic syntax trees from text.

It lets the user build the tree in a point-and-click fashion.

The program does no analysis on its own; the user is completely free to draw the tree however he or she wishes. However, the program makes sure that the tree is a tree and not some other kind of graph.



MII Medical NLP Toolkit
This is a toolkit for medical natural language processing (NLP). The core engine is general enough to be used in a variety of text processing domains, though the toolkit includes specific support for medical reports and patient de-identification.



The nlpFarm is a Natural Language Processing (NLP) resource where early research prototypes (Java) can evolve into robust and useful open source. Our farmstead collaborates under the OpenNLP initiative, in order to make NLP software publicly available.



OpenNLP provides the organizational structure for coordinating several different projects which approach some aspect of Natural Language Processing. OpenNLP also defines a set of Java interfaces and implements some basic infrastructure for NLP components.



Open source natural language tools
Toolkit for implementing question answering systems and machine translation in both controlled languages and natural languages. Includes first order logic inference, parsing and semantic analysis, and APIs and standalone server software. Currently some t



The OpenNLP Grok Library
Grok is a library of natural language processing components, including support for parsing with categorial grammars and various preprocessing tasks such as part-of-speech tagging, sentence detection, and tokenization.



The OpenNLP Leo Project
Leo is a project to provide an architecture for defining XML specifications of grammars for different natural language parsing systems and tools for using that architecture to permit sharing of grammar resources across different systems.



The OpenNLP Maximum Entropy Package
Maximum entropy is a powerful method for constructing statistical models of classification tasks, such as part of speech tagging in Natural Language Processing. Several example applications using maxent can be found in the OpenNLP Grok Library.
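For reference, a conditional maximum entropy classifier is usually written in the following form. The symbols are generic textbook notation, not names taken from the OpenNLP package: x is the context (for example, a word and its neighbours), y is a candidate label (for example, a part-of-speech tag), the f_i are feature functions and the \lambda_i are their learned weights.

```latex
p(y \mid x) = \frac{\exp\left(\sum_i \lambda_i f_i(x, y)\right)}
                   {\sum_{y'} \exp\left(\sum_i \lambda_i f_i(x, y')\right)}
```

The denominator sums over all candidate labels y', so the probabilities for a given context add up to 1. Among all distributions whose expected feature values match those observed in the training data, this form is the one with the highest entropy, which is where the method gets its name.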



Visuwords™ online graphical dictionary - download source code
Download the source code for Visuwords.



Extraction from Text with Machine Learning and Natural Language Techniques



FerFT: Spectral Analyzer
This software is a multi-purpose power spectral analyzer based on the successive Fourier transformation method (® UTD). It was developed in Java (ver. 1.5) and works on any OS with Java ver. 1.5 or later installed.



Julius Speech Recognition Engine
Julius Speech recognition engine



Modular Audio Recognition Framework
MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.



VoxForge 0.0.1
Speech recognition support



OpenCCG: The OpenNLP CCG Library
OpenCCG, the OpenNLP CCG Library, is an open source natural language processing library written in Java, which provides parsing and realization services based on Mark Steedman’s Combinatory Categorial Grammar (CCG) formalism.



Joone (Java Object Oriented Neural Engine) is an artificial neural network Java framework. It is used to build and train neural networks with a powerful visual environment. It has a modular design and can be easily extended by writing new modules to implement new learning algorithms or architectures.

