Software tools


Perldoop

Perldoop is a new open-source tool that automatically translates Hadoop-ready Perl scripts into their Java counterparts, which can be executed directly on Hadoop clusters and perform significantly better than the original scripts.

You can download the source code from the Git repository or here.

A user manual can be downloaded here.

Perldoop v0.6.3 (November 2014) includes:

 
If you use Perldoop, please cite this article: 
J. M. Abuin, J. C. Pichel, T. F. Pena, P. Gamallo and M. Garcia. "Perldoop: Efficient Execution of Perl Scripts on Hadoop Clusters", IEEE International Conference on Big Data, pp. 766-771, 2014. (Paper, BibTeX Reference)

CitiusTagger and CitiusNec

A PoS-Tagger and Named Entity Classification tool for Portuguese and Spanish

CitiusTagger / CitiusNec is open-source software, written in Perl, that performs both PoS tagging and Named Entity Classification for Portuguese and Spanish. It was developed at CITIUS by the ProLNat@GE group and uses the same tagset as FreeLing.

You can test it in our DEMO and download it here.

If you use this tool, please cite the article: 
P. Gamallo, J. C. Pichel, M. Garcia, J. M. Abuin, T. F. Pena. "Análisis Morfosintáctico y Clasificación de Entidades Nombradas en un Entorno Big Data", Procesamiento del Lenguaje Natural, vol. 53, pp. 17-24, 2014. (Paper, BibTeX Reference)

 

How to install

# tar xzvf CitiusTool.tar.gz
# cd CitiusTool 
# sh install-citiustool.sh

How to use

# sh nec.sh
Syntax: nec.sh language file

language = pt or es
file = path of the input file
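
As an illustration of the syntax above, the following sketch checks the two arguments before the tool is run. The function name check_nec_args and the file name input.txt are hypothetical and are not part of CitiusTool; only the nec.sh syntax and the pt/es language codes come from the description above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with CitiusTool): validates the two
# arguments that nec.sh expects, namely a language code and an input file.
check_nec_args() {
    lang="$1"
    file="$2"

    # Only the two documented language codes are accepted.
    case "$lang" in
        pt|es) ;;
        *)
            echo "unsupported language: $lang (expected pt or es)" >&2
            return 1
            ;;
    esac

    # The input file must exist and be readable.
    if [ ! -r "$file" ]; then
        echo "cannot read input file: $file" >&2
        return 2
    fi

    return 0
}

# Example call, assuming the current directory is CitiusTool and
# input.txt (a hypothetical name) contains Spanish text:
#   check_nec_args es input.txt && sh nec.sh es input.txt
```

The check runs first so that a mistyped language code or a wrong path is reported before nec.sh is invoked.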

Spanish PoS-Tagger

The Spanish PoS-tagger was trained on the Ancora corpus. The current version of the lexicon contains the same forms as FreeLing.

Portuguese PoS-Tagger

The European Portuguese FreeLing POS-tagger has been trained with the following linguistic resources: