[Corpora-List] Weblogs Corpus + 2nd CFP for the Int. Conference on Weblogs and Social Media (ICWSM)

From: Nicolas Nicolov (Nicolas@umbrialistens.com)
Date: Fri Sep 15 2006 - 18:56:09 MET DST

    =============================================
    Int. Conference on Weblogs and Social Media
    March 26-28, 2007
    Boulder, Colorado, U.S.A.
    www.icwsm.org
    =============================================

    Availability of Data

    Continuing the tradition from the WWE'06
    workshop, we are once again offering a large
    blog dataset to conference participants. The
    data release comprises a complete set of
    weblog posts collected by Nielsen BuzzMetrics
    for May 2006 (consisting of about 14M posts
    from 3M weblogs). The data set includes the
    full content of the posts plus mark-up and
    represents an unprecedented collection for
    blog researchers. Our hope is that a communal
    dataset, approached from many different
    directions, will yield many interesting
    results. More information on the dataset,
    which is available for immediate download,
    can be found at:
    http://www.icwsm.org/data.html

    Call for Papers

    Recent years have seen a flourishing of social
    media - the promise of the WWW coming to fruition.
    Across the world, individuals can share opinions,
    experiences and expertise at the push of a button.
    There has been a fundamental shift thanks to
    significant advances in the ease of publishing
    content. Creating web content was for years the
    domain of tech-savvy people; now the barrier has
    been torn down.

    Perhaps the most visible among the successes of
    social media in recent years is the blogosphere.
    Tens of thousands of new blogs are created every day;
    blog content is becoming ubiquitous, surfacing
    in news portals, search results and corporate
    public relations. Even those who are unaware of the
    blogosphere are still influenced by its content.
    Although blogs are highly visible currently, other
    forms of conversational spaces continue to flourish,
    especially message boards, mailing lists, review
    sites and Usenet.

    Social media covers all forms of sharing: from
    photos, to videos, to recommendations. In the past
    few years, many examples of social media have
    become hugely successful. Flickr is a premier photo
    sharing site; del.icio.us has become a touchstone
    for sharing recommendations of websites; Web 2.0
    applications in general abound with newcomers in
    the social media space.

    One of the fascinating aspects of social media
    has been the drive from within to study the
    ecology as it evolves. People act at once as
    creators, observers and influencers of the space
    in which they participate. At the same time,
    businesses are quickly grasping the potential
    benefit of attending to the new space of social
    media. Monitoring the aggregate trends and
    opinions revealed by social media provides
    valuable insight for a number of business
    applications, such as marketing intelligence and
    competitive intelligence.

    The fast-growing blogosphere and social media space
    is a fruitful area for investigations across many
    disciplines. For example:

      * Natural language processing and machine learning
        researchers study the extraction of factual
        information from text; can blogs be processed in
        a robust manner and can knowledge bases be
        populated with facts from blogs?
      * Social network researchers and graph theory
        researchers are concerned with inferring
        community structure; analyzing the linkage
        patterns among blog entries can provide explicit
        community structure; can we infer implicit
        communities through the content of the blogs?
      * Political scientists are looking at ways of
        identifying influencers in a community; who are
        the influential bloggers whose voice is echoed
        by others?
      * Multimedia researchers are attempting to
        categorize audio and video content, aggregate
        information from diverse sources (textual, audio,
        video); can visual & audio social media be stored
        in a way that allows search across different
        modalities?
      * Market analysis researchers are concerned with
        what people think of the products and services
        of a company; can we process blogs automatically
        and find consumer complaints and breaking reports
        about vulnerabilities of products? Also, when does
        a burst of blogging activity become a trend?
      * Social psychologists study the response to
        current events, including emotional and
        attitudinal dimensions as well as content and
        patterns of influence.

    Despite the growing relevance of blogs and social
    media, existing research has only begun to address
    the spectrum of issues that arise in their analysis.
    Blogs, for example, are a different kind of document
    than the relatively clean text that NLP research is
    based on. Such differences in terms of structure,
    content and grammaticality will be a challenge,
    considering that blogs will likely represent the most
    common form of publicly accessible personal expression.

    AREAS OF INTEREST

    The conference aims to bring together researchers
    from different subject areas (e.g., computer science,
    linguistics, psychology, statistics, sociology,
    multimedia and semantic web technologies) and foster
    discussions about ongoing research in the following
    areas:

    [01] AI methods for ethnographic analysis through
         social media.
    [02] Blogosphere vs. mediasphere; measuring the
         influence of blogs on the media.
    [03] Centrality/influence of bloggers/blogs; ranking/
         relevance of blogs; web pages ranking based on
         blogs.
    [04] Crawling/spidering and indexing.
    [05] Human Computer Interaction; social media tools;
         navigation.
    [06] Multimedia; audio/visual processing; aggregating
         information from different modalities.
    [07] Semantic analysis; cross-system and cross-media
         name tracking; named relations and fact
         extraction; discourse analysis; summarization.
    [08] Semantic Web; unstructured knowledge management.
    [09] Sentiment analysis; polarity/opinion
         identification and extraction.
    [10] Social Network Analysis; community
         identification; expertise discovery;
         collaborative filtering.
    [11] Text categorization; gender/age identification;
         spam filtering.
    [12] Time Series Forecasting; measuring
         predictability of phenomena based on social
         media.
    [13] Trend identification/tracking.
    [14] Visualization, aggregation and filtering.
    [15] New social media applications, interfaces,
         interaction techniques.

    IMPORTANT DATES

    Submissions: December 8, 2006
    Acceptance Notifications: February 2, 2007
    Camera ready copies: February 16, 2007
    Tutorials: March 25, 2007
    Conference: March 26-28, 2007

    SUBMISSION

    People interested in participating should submit
    through the conference website a technical paper
    (up to 8 pages), a short paper (up to 4 pages),
    a poster or demo description (up to 2 pages)
    by midnight (PST) of Dec 8, 2006. Each submission
    should, to the extent possible, indicate a list of
    relevant areas from the list above (e.g., 03, 04, 10).

    CHAIRS

      * Natalie Glance, Nielsen BuzzMetrics.
      * Nicolas Nicolov, Umbria Inc.

    CO-CHAIRS

      * Eytan Adar, Univ. of Washington.
      * Matthew Hurst, Nielsen BuzzMetrics.
      * Mark Liberman, Univ. of Pennsylvania.
      * Franco Salvetti, Univ. of Colorado at Boulder &
        Umbria Inc.

    LOCAL CHAIR

      * James H. Martin, Univ. of Colorado at Boulder.

    PROGRAM COMMITTEE

      * Paolo Avesani, ITC-irst, Italy
      * Bran Boguraev, IBM Research, USA
      * Chris Brooks, Univ. of San Francisco, USA
      * Claire Cardie, Cornell Univ., USA
      * Scott Carter, UC Berkeley, USA
      * Steve Cayzer, HP Labs Bristol, UK
      * Thierry Declerck, DFKI Language Lab, Germany
      * Donghui Feng, ISI, USC, USA
      * Tim Finin, UMBC, USA
      * Kathy Gill, Univ. of Washington, USA
      * Michelle Gumbrecht, Stanford Univ., USA
      * John Henderson, MITRE, USA
      * Eduard Hovy, ISI, USC, USA
      * Jussi Karlgren, SICS, Sweden
      * Laura Knudsen, OSC, USA
      * Moshe Koppel, Bar-Ilan Univ., Israel
      * Cameron Marlow, Yahoo! Research, USA
      * Lluis Marquez, Univ. Poli. de Catalunya, Spain
      * Rada Mihalcea, Univ. of North Texas, USA
      * Gilad Mishne, Univ. of Amsterdam, The Netherlands
      * Tomoyuki Nanno, Google, Japan
      * Apostol Natsev, IBM Research, USA
      * Kamal Nigam, Google, USA
      * Peter Norvig, Google, USA
      * Jon Oberlander, Univ. of Edinburgh, Scotland
      * Peter Pirolli, PARC, USA
      * Oana Postolache, Univ. of Saarland, Germany
      * John Prager, IBM Research, USA
      * Alessandro Provetti, Univ. of Messina, Italy
      * Drago Radev, Univ. of Michigan, USA
      * Jonathon Read, Univ. of Sussex, UK
      * Maarten de Rijke, Univ. of Amsterdam, The Netherlands
      * Laura Ripamonti, Univ. of Milan, Italy
      * Irina Rish, IBM Watson Research Center, USA
      * Dan Roth, Univ. of Illinois at Urbana-Champaign, USA
      * James G. Shanahan, Turn Inc., USA
      * Emma Shen, OSC, USA
      * Suresh Sood, Univ. of Tech. Sydney, Australia
      * Savitha Srinivasan, IBM Research, USA
      * Carlo Strapparava, ITC-irst, Italy
      * V.S. Subrahmanian, Univ. of Maryland, USA
      * Belle Tseng, NEC Labs America, USA
      * Janyce M. Wiebe, Univ. of Pittsburgh, USA
      * Tong Zhang, Yahoo! Research, USA
      * Liang Zhou, ISI, USC, USA
      * Ethan Zuckerman, Harvard Univ., USA

    VENUE

    The conference will take place at Marriott Boulder
    (http://marriott.com/property/propertypage/DENBO)
    located near downtown Boulder, Colorado.

    SPONSORS

    ICWSM is proud to be supported by:

      * Google, Inc.
      * Microsoft Live Labs
      * NEC Labs America
      * Sphere

    and

      * Nielsen BuzzMetrics
      * Umbria, Inc.
      * University of Pennsylvania
      * University of Maryland, Baltimore County

    ICWSM is an IW3C2-endorsed conference
    (http://www.iw3c2.org/).

    HISTORY

    The International Conference on Weblogs and Social
    Media grew out of two events: the annual series of
    Workshops on the Weblogging Ecosystem (WWE 2006,
    WWE 2005, WWE 2004) held in conjunction with the
    International World Wide Web Conference and the
    Spring Symposium organized by the American
    Association for Artificial Intelligence (AAAI)
    on Computational Approaches to Analyzing Weblogs
    (CAAW 2006).

    CONTACT

      info (at) icwsm dot org

    Best wishes
    Nicolas

    ---
    Dr Nicolas Nicolov
    Chief Scientist
    Umbria Inc.
    1655 Walnut St, Suite 300
    Boulder, CO 80302, U.S.A.
    Tel: (310) 754-5010
    


