Our part-of-speech tagger (a fast implementation of the Brill algorithm)
has been trained on both mixed-case and all-upper-case text. On
mixed-case Wall Street Journal (WSJ) data we get tagging accuracy of around
96.5%, and on artificially upcased WSJ data around 94.5%.
Our tagger is available as part of the Alembic Workbench distribution,
and ships with rules and lexica for both mixed-case and upper-case English.
http://www.mitre.org/technology/alembic-workbench/
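
To illustrate what "artificially upcased" evaluation data and token-level
tagging accuracy mean, here is a minimal Python sketch. It is not the Alembic
tagger or its evaluation code; the function names upcase_corpus and
tagging_accuracy, the toy sentence, and the Penn Treebank-style tags are all
assumptions made for the example.

```python
from typing import List, Tuple

TaggedSentence = List[Tuple[str, str]]  # (token, tag) pairs


def upcase_corpus(corpus: List[TaggedSentence]) -> List[TaggedSentence]:
    """Upper-case every token, leaving the gold tags untouched."""
    return [[(tok.upper(), tag) for tok, tag in sent] for sent in corpus]


def tagging_accuracy(gold: List[TaggedSentence],
                     predicted: List[List[str]]) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for gold_sent, pred_tags in zip(gold, predicted):
        if len(gold_sent) != len(pred_tags):
            raise ValueError("sentence length mismatch")
        for (_, gold_tag), pred_tag in zip(gold_sent, pred_tags):
            correct += int(gold_tag == pred_tag)
            total += 1
    return correct / total if total else 0.0


# Toy data (hypothetical): one gold-tagged sentence and one tagger output
# in which the proper noun was mis-tagged as a common noun.
gold = [[("Stocks", "NNS"), ("fell", "VBD"), ("in", "IN"), ("Boston", "NNP")]]
upcased = upcase_corpus(gold)
predicted = [["NNS", "VBD", "IN", "NN"]]

print(upcased[0][3])
print(tagging_accuracy(upcased, predicted))
```

Upcasing removes the capitalization cue that helps separate proper nouns from
common nouns, which is one plausible reason for accuracy to be lower on
upper-case text.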
Regards,
John
-------------------------------------------------------
John Aberdeen                aberdeen@mitre.org
Senior Scientist             Natural Language Processing
The MITRE Corporation        voice +1.781.271.2840
Bedford, Massachusetts USA   fax   +1.781.271.2352
-------------------------------------------------------