Re: [Corpora-List] Phishing email corpus

From: Vlado Keselj (vlado@cs.dal.ca)
Date: Mon May 01 2006 - 14:20:45 MET DST


    There is a collection on the site of the Anti-Phishing Working Group:
    http://www.antiphishing.org/phishing_archive.html

    I have my own collection and am interested in sharing it in order to
    create a larger public-domain collection.

    Best regards,
    --Vlado

    On Sun, 30 Apr 2006 radev@umich.edu wrote:

    > I don't know him. I have been collecting these myself after I read the
    > paper in Scientific American:
    >
    > @article{bennett&al.03,
    > author = {Bennett, Charles H. and Li, Ming and Ma, Bin},
    > title = {{Chain Letters and Evolutionary Histories}},
    > journal = {{Scientific American}},
    > month = {June},
    > year = {2003},
    > pages = {76--81},
    > url = {http://www.sciam.com/article.cfm?colID=1&articleID=0003D476-1852-1EB7-BDC0809EC588EEDF},
    > }
    >
    >
    > j_kurjian@hotmail.com wrote:
    > >
    > > Well, Lawrence Kestenbaum is in Michigan somewhere so you might have more
    > > luck than I did. He has quite a few on his site, but he claims to have 15
    > > or 20 thousand on his hard drive!
    > > J
    > >
    > >
    > >
    > > >
    > > >I have a larger collection of "Nigerian" Letters, more than 2,500 of
    > > >them, collected since 1998. If anyone is interested, drop me a note.
    > > >
    > > >D.
    > > >
    > > >j_kurjian@hotmail.com wrote:
    > > > >
    > > > > Nicklas -
    > > > > I don't know if this is what you're looking for but I have a collection of
    > > > > "Nigerian Letters," about 100 of them. They are not tagged. If you are
    > > > > handy with a spider or offline browser, you might be able to get some from:
    > > > > http://potifos.com/fraud/
    > > > > Last year I contacted its owner, Lawrence Kestenbaum, and he almost agreed
    > > > > to make his massive collection available to me for a corpus. Then it fell
    > > > > through. His site has lotto letters and probably other types too. Hope
    > > > > that helps. Let me know if you want what I have.
    > > > > Jerry Kurjian
    > > > >
    > > > >
    > > > > >
    > > > > >Hi
    > > > > >
    > > > > >My name is Nicklas Karlsson, and I'm a student at Växjö University in
    > > > > >Sweden.
    > > > > >I'm working on my bachelor's degree with a phishing detection and warning
    > > > > >project.
    > > > > >The part I'm working on is a classification module, which will classify
    > > > > >emails to find the phishing emails and mark them for further investigation.
    > > > > >To develop this and test it I need a collection of phishing emails. I've
    > > > > >collected a few but I still need more.
    > > > > >
    > > > > >So I wonder if anyone has or knows where I can find a corpus with phishing
    > > > > >emails?
    > > > > >
    > > > > >I'll post a list of all replies sent directly to me.
    > > > > >
    > > > > >Thanks
    > > > > >
    > > > > >Nicklas Karlsson
    > > > > >
    > > > > >
    > > > >
    > > > >
    > > > >
    > > > >
    > > > >
    > > >
    > > >
    > > >--
    > > >Dragomir R. Radev radev@umich.edu
    > > >Associate Professor of Information, Electrical Engineering and
    > > >Computer Science, and Linguistics, the University of Michigan, Ann Arbor
    > > >Phone: 734-615-5225 Fax: 734-764-2475 http://www.si.umich.edu/~radev
    > >
    > >
    > >
    > >
    > >
    >
    >
    >


