[Corpora-List] CFP: IJCAI 2007 Workshop on Analytics for Noisy Unstructured Text Data

From: L V Subramaniam (lvsubram@in.ibm.com)
Date: Tue Jun 20 2006 - 15:33:29 MET DST


    CALL FOR PAPERS

                                                    AND 2007

                                        IJCAI 2007 Workshop on
                       Analytics for Noisy Unstructured Text Data
                             8 January 2007, Hyderabad, India

                             http://research.ihost.com/and2007
     
                 held at 20th Int. Jt. Conf. on Artificial Intelligence
                           (IJCAI 2007) http://www.ijcai-07.org/

                  Deadline for Papers is September 25th 2006
     
    WORKSHOP DESCRIPTION AND OBJECTIVES
    Noisy unstructured text data is found in informal settings such as online
    chat, SMS, emails, message boards, newsgroups, blogs, wikis and web
    pages. In addition, text produced by processing spontaneous speech, printed
    text or handwritten text contains processing noise. Text produced under
    such circumstances is typically highly noisy, containing spelling errors,
    abbreviations, non-standard words, false starts, repetitions, missing
    punctuation, missing case information, and pause-filling words such as "um"
    and "uh." Such text can be seen in large amounts in contact centers,
    online chat rooms, OCRed text documents, SMS corpora, etc. The theme of the
    IJCAI 2007 Conference is "AI and its benefits to society." In keeping with
    this theme, this workshop proposes to look at text analytics of the highly
    noisy text that is produced in such everyday applications in society.

    The goal of the workshop is to focus on the problems encountered in
    analyzing such noisy documents coming from various sources. The nature of
    the text warrants moving beyond traditional text analytics techniques. We
    hope that the workshop will allow researchers to present current research
    and development in addressing this challenge. We also believe that the
    workshop will lead to the sharing of real-life noisy data sets, making
    them available to a wider research community.
     
    TOPICS
    We welcome original research papers that identify key problems related to
    noisy text analytics and offer solutions. We particularly encourage
    contributions that look at solving real-life problems in the different
    settings where such data is produced in huge amounts. Potential topics
    include (but are not limited to):
    * NLP techniques for handling noisy unstructured data
    * Characterization of the types of noise in documents
    * Genre recognition based on the type of noise
    * Robust parsing
    * Characterizing, modeling and accounting for historical language change
    * Methods for detecting and correcting spelling and grammatical errors in
      noisy text
    * Information Extraction and Retrieval from noisy text
    * Automatic classification and clustering of imprecise documents
    * Noise-invariant document summarization techniques
    * Issues in keyword search in presence of noise in unstructured data
    * Machine Translation for noisy text
    * Text analysis techniques for mining call logs, transcribed calls,
      web logs, chat logs and email exchanges
    * Business Intelligence (BI) applications for contact centers that deal
      with noisy data
    * Surveys on aspects of text analytics for noisy unstructured data
     
    PARTICIPATION
    We hope that the workshop will allow researchers working in areas related
    to unstructured data analytics, Natural Language Processing, Information
    Extraction, Information Retrieval, etc., to focus on the needs of users
    extracting useful information from noisy text. The target audience is a
    mixture of academic and industry researchers working with noisy text. We
    believe this work is of direct relevance to domains such as call centers,
    the world-wide web, and government organizations that need to analyze huge
    amounts of noisy data.

    IAPR ENDORSEMENT
    This workshop is endorsed by the International Association for Pattern
    Recognition (http://www.iapr.org).

    IMPORTANT DATES
    Paper Submission: September 25th, 2006
    Notification of Acceptance: October 23rd, 2006
    Camera-Ready papers due: November 8th, 2006
    Workshop at IJCAI 2007: January 8th, 2007
     
    SUBMISSION REQUIREMENTS
    We invite papers of up to 8 pages in length in the style specified by
    IJCAI (pdf: http://www.ijcai-07.org/ijcai07_format.pdf,
    word: http://www.ijcai-07.org/ijcai07_format.dot,
    LaTeX: http://www.ijcai-07.org/ijcai07_format_latex.tar).
     
    Submissions should be made electronically to lvsubram@in.ibm.com and
    rshourya@in.ibm.com before September 25th, 2006.

    PUBLICATION
    We are currently in negotiation with a leading publisher for the
    proceedings to be available onsite. We are also arranging a journal
    special issue for post-workshop publication of selected papers.
     
    WORKSHOP CHAIRS
    Craig Knoblock
    University of Southern California
     
    Daniel Lopresti
    Lehigh University
     
    Shourya Roy
    IBM Research, India Research Lab
     
    L. Venkata Subramaniam
    IBM Research, India Research Lab

    WORKSHOP CONTACTS
    * L. V. Subramaniam lvsubram@in.ibm.com
    * Shourya Roy rshourya@in.ibm.com

    Please visit the workshop website
    ***** http://research.ihost.com/and2007 *****
    for information about participation and submitting papers.

    For general information, please visit the IJCAI website
    ***** http://www.ijcai-07.org *****


