RE: [Corpora-List] Chinese Name Gender Recognition

From: Xiaofei Lu (xflu@ling.ohio-state.edu)
Date: Thu Dec 22 2005 - 19:45:29 MET

    Are you planning to look at context at all? The pronoun resolution idea
    should definitely help. Looking at the context in which a personal name
    appears may help a bit, too, e.g., in cases where one or more names
    appear after phrases like "member(s) of the women's team".
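
A minimal sketch of the context-cue idea, assuming a small hand-made cue list. The function name `context_gender`, the cue patterns, and the window size are illustrative assumptions, not part of any existing system.

```python
import re

# Hypothetical cue patterns; a real system would need a curated list.
# The \b matters: "women's team" contains the substring "men's team".
FEMALE_CUES = re.compile(r"\bwomen's team\b|女队|女子", re.IGNORECASE)
MALE_CUES = re.compile(r"\bmen's team\b|男队|男子", re.IGNORECASE)


def context_gender(text, name, window=40):
    """Return 'F', 'M', or None based on cues in the `window` characters before `name`."""
    idx = text.find(name)
    if idx == -1:
        return None  # name not mentioned in this text
    left = text[max(0, idx - window):idx]
    female = FEMALE_CUES.search(left) is not None
    male = MALE_CUES.search(left) is not None
    if female and not male:
        return "F"
    if male and not female:
        return "M"
    return None  # no cue, or conflicting cues


sentence = "Members of the women's team, Wang Fang and Li Na, arrived."
print(context_gender(sentence, "Li Na"))
```

Returning None when the cues conflict or are absent lets a name-based classifier make the final call.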

    Xiaofei

    On Thu, 22 Dec 2005, Heng Ji wrote:

    >
    > I believe your IR idea will boost the performance. Besides, you may want to
    > try applying pronoun reference resolution before gender disambiguation,
    > since Chinese personal pronouns are clearly distinguished by gender. If
    > you could link a pronoun in the context with the name candidate, that
    > might help. In addition, a few gender-specific title words in the context
    > would be useful too.
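
A rough sketch of the pronoun and title-word suggestion. It is not real coreference resolution: it simply takes a title directly after the name, or else the first third-person pronoun (他 "he", 她 "she") within a short window after it. The function name, the title list (先生, 女士, 小姐), and the window size are assumptions made for illustration.

```python
# Written Chinese distinguishes third-person pronouns by gender.
PRONOUN_GENDER = {"他": "M", "她": "F"}
# Hypothetical list of gender-specific title words that follow a name.
TITLE_GENDER = {"先生": "M", "女士": "F", "小姐": "F"}


def gender_from_context(text, name, window=30):
    """Guess 'M', 'F', or None from a title or pronoun shortly after `name`."""
    idx = text.find(name)
    if idx == -1:
        return None
    start = idx + len(name)
    after = text[start:start + window]
    # A title immediately following the name is the strongest cue.
    for title, gender in TITLE_GENDER.items():
        if after.startswith(title):
            return gender
    # Otherwise fall back to the first gendered pronoun in the window.
    for i, ch in enumerate(after):
        if ch in PRONOUN_GENDER:
            # Skip 其他 ("other"), which contains 他 but is not a pronoun.
            if ch == "他" and i > 0 and after[i - 1] == "其":
                continue
            return PRONOUN_GENDER[ch]
    return None


print(gender_from_context("李伟昨天到了北京，他很高兴。", "李伟"))
```

The nearest-pronoun rule will be wrong whenever the pronoun refers to someone else, so a proper resolver should replace it in practice.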
    >
    > I would guess that lexical information alone can accurately recognize name
    > genders for people born before 1980, but it might not be enough for
    > names that appeared later: many names have intentionally been made
    > gender-neutral. :) So you may want to incorporate time-frame
    > information into your system.
    >
    > Heng
    >
    > On Thu, 22 Dec 2005, Jun Lang wrote:
    >
    >> Hi Mark Lewellen,
    >> Thanks for your interest in this problem.
    >> Yes. After doing some baseline research, I found that there are many
    >> related problems in gender recognition based on Chinese names. Using
    >> only the name may not achieve a better result. I am considering
    >> combining some other resources to disambiguate the gender. For example, I
    >> could query a search engine for gender-indicating words to improve the
    >> final accuracy.
    >> What do you think about it?
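
One way the search-engine idea might look, as a sketch only. `hit_count` is a hypothetical callable that returns the number of hits for a query string; no real search API is assumed. The title words used as gender-indicating words are also an assumption.

```python
def web_gender_score(name, hit_count,
                     male_words=("先生",), female_words=("女士", "小姐")):
    """Return a smoothed estimate in (0, 1) that `name` is female.

    `hit_count` is a hypothetical function: query string -> number of hits.
    """
    male = sum(hit_count(name + w) for w in male_words)
    female = sum(hit_count(name + w) for w in female_words)
    # Add-one smoothing so that a name with no hits scores exactly 0.5.
    return (female + 1) / (male + female + 2)


# Made-up counts standing in for a search engine, for illustration only.
fake_hits = {"王芳女士": 90, "王芳小姐": 30, "王芳先生": 2}
score = web_gender_score("王芳", lambda q: fake_hits.get(q, 0))
print(round(score, 3))
```

The score could be used as one more feature next to the name-based classifier rather than as a decision on its own.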
    >>
    >> Thanks!
    >>
    >> Have a nice Christmas Eve and Christmas Day!
    >>
    >> Best wishes,
    >> Bill_Lang(Jun Lang): Ph.D Candidate
    >> Information Retrieval Laboratory
    >> Harbin Institute of Technology
    >> Mail: bill_lang@gmail.com
    >> Homepage: http://ir.hit.edu.cn/~bill_lang
    >>
    >>
    >> -----Original Message-----
    >> From: Mark Lewellen [mailto:lewellen@erols.com]
    >> Sent: Wednesday, December 21, 2005 11:49 PM
    >> To: 'Jun Lang'; 'Xiaofei Lu'
    >> Cc: corpora@uib.no
    >> Subject: RE: [Corpora-List] Chinese Name Gender Recognition
    >>
    >> Since Chinese given names are not limited to a set of
    >> lexical items that are prototypically 'names' (i.e. they
    >> can be just about any lexical item), Chinese given names,
    >> as you probably know, often give no clue about gender.
    >> There has been some discussion of 'traits' that are
    >> more feminine or masculine and would be reflected in names,
    >> but a lot of ambiguity remains. I doubt there is any
    >> statistical method, algorithm, or even native speaker that
    >> can overcome that problem!
    >>
    >> Mark Lewellen
    >>
    >>> -----Original Message-----
    >>> From: owner-corpora@lists.uib.no
    >>> [mailto:owner-corpora@lists.uib.no] On Behalf Of Jun Lang
    >>> Sent: Tuesday, December 13, 2005 7:31 AM
    >>> To: 'Xiaofei Lu'
    >>> Cc: corpora@uib.no
    >>> Subject: [Corpora-List] RE: [Corpora-List] Chinese Name
    >>> Gender Recognition
    >>>
    >>>
    >>> Yeah! There are many names that could be used for both male and
    >>> female. It is a difficult problem. I have now done some simple
    >>> research on this topic. Recently, I have been trying to get more
    >>> data. Since the parameter space is huge, decision trees cannot
    >>> produce the final result quickly. I want to use a Bayes model again.
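
A minimal sketch of what a Bayes model for this task could look like, assuming a naive Bayes classifier over the characters of the given name with add-alpha smoothing. The class name, the feature choice, and the tiny training list are illustrative assumptions; the toy names are not real statistics and this is not the model used in the thread.

```python
import math
from collections import Counter, defaultdict


class NameGenderNB:
    """Naive Bayes over the characters of a given name (surname removed)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing constant
        self.class_counts = Counter()           # label -> number of names
        self.char_counts = defaultdict(Counter) # label -> character counts
        self.vocab = set()

    def fit(self, names, labels):
        for name, label in zip(names, labels):
            self.class_counts[label] += 1
            for ch in name:
                self.char_counts[label][ch] += 1
                self.vocab.add(ch)
        return self

    def log_scores(self, name):
        total = sum(self.class_counts.values())
        v = len(self.vocab) + 1  # one extra slot for unseen characters
        scores = {}
        for label, n in self.class_counts.items():
            counts = self.char_counts[label]
            denom = sum(counts.values()) + self.alpha * v
            score = math.log(n / total)  # class prior
            for ch in name:
                score += math.log((counts[ch] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, name):
        scores = self.log_scores(name)
        return max(scores, key=scores.get)


# Toy training data, made up for illustration only.
given_names = ["芳", "丽娜", "秀英", "伟", "建国", "志强"]
genders = ["F", "F", "F", "M", "M", "M"]
model = NameGenderNB().fit(given_names, genders)
print(model.predict("丽芳"), model.predict("国强"))
```

Training is a single counting pass over the data, which may help with the speed problem mentioned above; the difference between the two log scores also gives a rough confidence that could trigger a fallback to context cues for ambiguous names.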
    >>>
    >>> Can you give me some ideas about it? Thanks a lot!
    >>>
    >>> Best wishes,
    >>> Jun Lang
    >>>
    >>> -----Original Message-----
    >>> From: Xiaofei Lu [mailto:xflu@ling.ohio-state.edu]
    >>> Sent: December 13, 2005 13:56
    >>> To: Jun Lang
    >>> Subject: Re: [Corpora-List] Chinese Name Gender Recognition
    >>>
    >>> Interesting. What is the baseline, and how do you establish it?
    >>> Many names can be either male or female, can't they?
    >>>
    >>> On Tue, 13 Dec 2005, Jun Lang wrote:
    >>>
    >>>> Hi all Corpora Members,
    >>>>
    >>>> I am now studying Chinese name gender recognition. The input is a
    >>>> Chinese name. The output is the corresponding gender. I used the
    >>>> decision tree method, but in the end the accuracy is only about 70%.
    >>>>
    >>>> Do you know any other method that can achieve higher accuracy? And
    >>>> has anybody done any similar research?
    >>>>
    >>>> Thanks a lot!
    >>>>
    >>>> Best wishes,
    >>>> Bill_Lang(Jun Lang): Ph.D Candidate
    >>>> Information Retrieval Laboratory
    >>>> Harbin Institute of Technology
    >>>> Mail: bill_lang@gmail.com
    >>>> Homepage: http://ir.hit.edu.cn/~bill_lang


